   Multicast over TCP/IP HOWTO
   Juan-Mariano de Goyeneche <jmseyas@dit.upm.es>
   v1.0, 20 March 1998
  
Contents
   This HOWTO tries to cover most aspects related to multicast over
   TCP/IP networks. So, a lot of information within it is not
   Linux-specific (just in case you don't use GNU/Linux... yet).
   Multicast is currently an active area of research and, at the time of
   writing, many of the "standards" are merely drafts. Keep it in mind
   while reading the lines that follow.
   ______________________________________________________________________
  
   Table of Contents
  
  
  
   1. Introduction.
  
   1.1 What is Multicast.
   1.2 The problem with Unicast.
  
   2. Multicast Explained.
  
   2.1 Multicast addresses.
   2.2 Levels of conformance.
   2.3 Sending Multicast Datagrams.
   2.3.1 TTL.
   2.3.2 Loopback.
   2.3.3 Interface selection.
   2.4 Receiving Multicast Datagrams.
   2.4.1 Joining a Multicast Group.
   2.4.2 Leaving a Multicast Group.
   2.4.3 Mapping of IP Multicast Addresses to Ethernet/FDDI addresses.
  
   3. Kernel requirements and configuration.
  
   4. The MBone.
  
   5. Multicast applications.
  
   6. Multicast programming.
  
   6.1 IP_MULTICAST_LOOP.
   6.2 IP_MULTICAST_TTL.
   6.3 IP_MULTICAST_IF.
   6.4 IP_ADD_MEMBERSHIP.
   6.5 IP_DROP_MEMBERSHIP.
  
   7. The internals.
  
   7.1 IGMP.
   7.1.1 IGMP version 1.
   7.1.2 IGMP version 2.
   7.2 Kernel corner.
  
   8. Routing Policies and Forwarding Techniques.
  
   9. Multicast Transport Protocols.
  
   10. References.
  
   10.1 RFCs.
   10.2 Internet Drafts.
   10.3 Web pages.
   10.4 Books.
  
   11. Copyright and Disclaimer.
  
   12. Acknowledgements.
  
  
  
   ______________________________________________________________________
  
   1. Introduction.
  
   I'll try to give here the widest-ranging, most up-to-date and accurate
   information on multicasting over TCP/IP networks that I can. Any
   feedback is very welcome. If you find any mistakes in this document,
   or have comments on its contents, updates or additions, please send
   them to me at the address listed at the top of this HOWTO.

   1.1. What is Multicast.
  
   Multicast is... a need. Well, at least in some scenarios. If you have
   information (a lot of information, usually) that should be transmitted
   to various (but usually not all) hosts over an internet, then
   Multicast is the answer. One common situation in which it is used is
   when distributing real time audio and video to the set of hosts which
   have joined a distributed conference.
  
   Multicast is much like radio or TV in the sense that only those who
   have tuned their receivers (by selecting the particular frequency they
   are interested in) receive the information. That is: you hear the
   channel you are interested in, but not the others.
  
  
  
   1.2. The problem with Unicast.
  
   Unicast is anything that is neither broadcast nor multicast. All
   right, the definition is not very bright... When you send a packet and
   there is only one sender process -yours- and one recipient process
   (the one you are sending the packet to), then this is unicast. TCP is,
   by its own nature, unicast oriented. UDP supports a lot more
   paradigms, but if you are sending UDP packets and there is only one
   process supposed to receive them, this is unicast too.
  
   For years unicast transmissions proved to be enough for the Internet.
   It was not until 1993 that the first implementation of multicast saw
   the light in the 4.4 BSD release. It seems nobody needed it until
   then. What were the new problems that multicast addressed?
  
   Needless to say, the Internet has changed a lot since the "early
   days". In particular, the appearance of the Web strongly transformed
   the situation: people didn't just want connections to remote hosts,
   mail and FTP. First they wanted to see the pictures people placed on
   their home pages, but later they also wanted to see and hear those
   people.
  
   With today's technology it is possible to afford the "cost" of making
   a unicast connection with everyone who wants to see your web page.
   However, if you are to send audio and video, which need a huge amount
   of bandwidth compared with web applications, you have -you had, until
   multicast came onto the scene- two options: to establish a separate
   unicast connection with each of the recipients, or to use broadcast.
   The first solution is not affordable: if a single connection sending
   audio/video already consumes a huge amount of bandwidth, imagine
   having to establish hundreds or, maybe, thousands of those
   connections. Both the sending computer and your network would
   collapse.
  
   Broadcast seems to be a solution, but it is certainly not the solution.
   If you want all the hosts in your LAN to attend the conference, you
   may use broadcast. Packets will be sent only once and every host will
   receive them as they are sent to the broadcast address. The problem is
   that perhaps only a few of the hosts and not all are interested in
   those packets. Furthermore: perhaps some hosts are really interested
   in your conference, but they are outside of your LAN, a few routers
   away. And you know that broadcast works fine inside a LAN, but
   problems arise when you want broadcast packets to be routed across
   different LANs.
  
   The best solution seems to be one in which you send packets to a
   certain special address (a certain frequency in radio/TV
   transmissions). Then, all hosts which have decided to join the
   conference will be aware of packets with that destination address,
   read them when they traverse the network, and pass them to the IP
   layer to be demultiplexed. This is similar to broadcasting in that you
   send only one broadcast packet and all the hosts in the network
   recognize and read it; it differs, however, in that not all multicast
   packets are read and processed, but only those that were previously
   registered in the kernel as being "of interest".
  
   Those special packets are routed at kernel level like any packet
   because they are IP packets. The only difference might reside in the
   routing algorithm which tells the kernel where to route or not to
   route them.
  
  
  
   2. Multicast Explained.
  
   2.1. Multicast addresses.
  
   As you probably know, the range of IP addresses is divided into
   "classes" based on the high order bits of a 32 bits IP address:
  
  
   ______________________________________________________________________
   Bit -->  0                           31       Address Range:
           +-+----------------------------+
           |0|       Class A Address      |       0.0.0.0 - 127.255.255.255
           +-+----------------------------+
           +-+-+--------------------------+
           |1 0|     Class B Address      |     128.0.0.0 - 191.255.255.255
           +-+-+--------------------------+
           +-+-+-+------------------------+
           |1 1 0|   Class C Address      |     192.0.0.0 - 223.255.255.255
           +-+-+-+------------------------+
           +-+-+-+-+----------------------+
           |1 1 1 0|  MULTICAST Address   |     224.0.0.0 - 239.255.255.255
           +-+-+-+-+----------------------+
           +-+-+-+-+-+--------------------+
           |1 1 1 1 0|     Reserved       |     240.0.0.0 - 247.255.255.255
           +-+-+-+-+-+--------------------+
   ______________________________________________________________________
  
  
  
   The one which concerns us is the "Class D Address". Every IP datagram
   whose destination address starts with "1110" is an IP Multicast
   datagram.
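
   The "starts with 1110" test is easy to express in code. What follows
   is only a minimal sketch: is_class_d() is a name made up for this
   example, and it assumes a system that provides inet_pton(). Many
   systems also offer an IN_MULTICAST() macro in <netinet/in.h> that
   performs the same comparison on an address in host byte order.

```c
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct in_addr */
#include <arpa/inet.h>    /* inet_pton(), ntohl() */

/* Returns 1 if the dotted-quad string is a class D (multicast) address,
 * 0 if it is some other valid IPv4 address, and -1 if it cannot be
 * parsed. is_class_d() is a hypothetical helper, not a system call. */
int is_class_d(const char *dotted)
{
    struct in_addr addr;
    uint32_t host;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    /* Convert to host byte order, then compare the four high-order
     * bits with 1110 (0xE). */
    host = ntohl(addr.s_addr);
    return (host >> 28) == 0xE;
}
```

   With this helper, "224.0.0.1" and "239.255.255.255" are reported as
   multicast, while "223.255.255.255" and "240.0.0.0" are not.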
  
   The remaining 28 bits identify the multicast "group" the datagram is
   sent to. Following the previous analogy: just as you have to tune your
   radio to hear a program transmitted at some specific frequency, you
   have to "tune" your kernel to receive packets sent to a specific
   multicast group. When you do that, it is said that the host has joined
   that group on the interface you specified. More on this later.
  
   There are some special multicast groups, say "well known multicast
   groups", that you should not use in your own applications due to the
   special purpose they are reserved for:
  
  
   o 224.0.0.1 is the all-hosts group. If you ping that group, all
   multicast capable hosts on the network should answer, as every
   multicast capable host must join that group at start-up on all its
   multicast capable interfaces.


   o 224.0.0.2 is the all-routers group. All multicast routers must join
   that group on all their multicast capable interfaces.

   o 224.0.0.4 is the all DVMRP routers group, 224.0.0.5 the all OSPF
   routers group, 224.0.0.13 the all PIM routers group, etc.

   All these special multicast groups are regularly published in the
   "Assigned Numbers" RFC.
  
   In any case, the range 224.0.0.0 through 224.0.0.255 is reserved for
   local purposes (such as administrative and maintenance tasks) and
   datagrams destined to these addresses are never forwarded by multicast
   routers. Similarly, the range 239.0.0.0 to 239.255.255.255 has been
   reserved for "administrative scoping" (see section 2.3.1 for
   information on administrative scoping).
  
  
  
   2.2. Levels of conformance.
  
   Hosts can be in three different levels of conformance with the
   Multicast specification, according to the requirements they meet.
  
   Level 0 is the "no support for IP Multicasting" level. Lots of hosts
   and routers in the Internet are in this state, as multicast support is
   not mandatory in IPv4 (it is, however, in IPv6). Not too much
   explanation is needed here: hosts in this level can neither send nor
   receive multicast packets. They must ignore the ones sent by other
   multicast capable hosts.
  
   Level 1 is the "support for sending but not receiving multicast IP
   datagrams" level. Thus, note that it is not necessary to join a
   multicast group to be able to send datagrams to it. Very few additions
   are needed in the IP module to make a "Level 0" host "Level
   1-compliant", as shown in section 2.3.
  
   Level 2 is the "full support for IP multicasting" level. Level 2 hosts
   must be able to both send and receive multicast traffic. They must
   know the way to join and leave multicast groups and to propagate this
   information to multicast routers. Thus, they must include an Internet
   Group Management Protocol (IGMP) implementation in their TCP/IP stack.
  
  
  
   2.3. Sending Multicast Datagrams.
  
   By now, it should be obvious that multicast traffic is handled at the
   transport layer with UDP, as TCP provides point-to-point connections,
   which are not feasible for multicast traffic. (Heavy research is taking
   place to define and implement new multicast-oriented transport
   protocols. See section ``Multicast Transport Protocols'' for details.)
  
   In principle, an application just needs to open a UDP socket and fill
   in a class D multicast address as the destination address it wants to
   send data to. However, there are some operations that a sending
   process must be able to control.
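
   As an illustration of how little is needed to send, here is a minimal
   sketch. The function names fill_group_addr() and send_to_group() are
   invented for this example, and any group and port you pass in are your
   own choice. No join is performed, since sending does not require
   membership. Whether the datagram actually leaves the host depends on
   your routing setup (see section 3).

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fill 'dst' with a class D destination address and port.
 * Returns 0 on success, -1 if 'group' is not a multicast address.
 * (Hypothetical helper written for this example.) */
int fill_group_addr(struct sockaddr_in *dst, const char *group,
                    unsigned short port)
{
    memset(dst, 0, sizeof(*dst));
    dst->sin_family = AF_INET;
    dst->sin_port = htons(port);
    if (inet_pton(AF_INET, group, &dst->sin_addr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(dst->sin_addr.s_addr)))
        return -1;
    return 0;
}

/* Open a UDP socket, send one datagram to the group and close the
 * socket. Returns the number of bytes sent, or -1 on error. */
ssize_t send_to_group(const char *group, unsigned short port,
                      const void *buf, size_t len)
{
    struct sockaddr_in dst;
    ssize_t sent;
    int fd;

    if (fill_group_addr(&dst, group, port) < 0)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    sent = sendto(fd, buf, len, 0, (struct sockaddr *)&dst, sizeof(dst));
    close(fd);
    return sent;
}
```

   A call such as send_to_group("239.1.2.3", 5000, "hello", 5) would then
   send a five byte datagram using the default TTL, loopback and
   interface settings discussed in the following subsections.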
  
  
  
   2.3.1. TTL.
  
   The TTL (Time To Live) field in the IP header has a double
   significance in multicast. As always, it controls the lifetime of the
   datagram to avoid it being looped forever due to routing errors.
   Routers decrement the TTL of every datagram as it traverses from one
   network to another and when its value reaches 0 the packet is dropped.

   The TTL in IPv4 multicasting also has the meaning of "threshold". Its
   use becomes evident with an example: suppose you set up a long,
   bandwidth-consuming video conference between all the hosts belonging
   to your department. You want that huge amount of traffic to remain in
   your LAN. Perhaps your department is big enough to have various LANs.
   In that case you want the hosts belonging to each of your LANs to
   attend the conference, but in no case do you want to collapse the
   entire Internet with your multicast traffic. There is a need to limit
   how "far" multicast traffic will expand across routers. That's what
   the TTL is used for. Routers have a TTL threshold assigned to each of
   their interfaces, and only datagrams with a TTL greater than the
   interface's threshold are forwarded. Note that when a datagram
   traverses a router with a certain threshold assigned, the datagram's
   TTL is not decremented by the value of the threshold. Only a
   comparison is made. (As before, the TTL is decremented by 1 each time
   a datagram passes across a router.)
  
   A list of TTL thresholds and their associated scope follows:
  
  
   ______________________________________________________________________
   TTL     Scope
   ----------------------------------------------------------------------
      0    Restricted to the same host. Won't be output by any interface.
      1    Restricted to the same subnet. Won't be forwarded by a router.
    <32    Restricted to the same site, organization or department.
    <64    Restricted to the same region.
   <128    Restricted to the same continent.
   <255    Unrestricted in scope. Global.
   ______________________________________________________________________
  
  
  
   Nobody knows what "site" or "region" mean exactly. It is up to the
   administrators to decide what these limits apply to.
  
   The TTL-trick is not always flexible enough for all needs, especially
   when dealing with overlapping regions or trying to establish
   geographic, topologic and bandwidth limits simultaneously. To solve
   these problems, administratively scoped IPv4 multicast regions were
   established in 1994 (see D. Meyer's "Administratively Scoped IP
   Multicast" Internet draft). This approach does scoping based on
   multicast addresses rather than on TTLs. The range 239.0.0.0 to
   239.255.255.255 is reserved for this administrative scoping.
  
  
  
   2.3.2. Loopback.
  
   When the sending host is Level 2 conformant and is also a member of
   the group datagrams are being sent to, a copy is looped back by
   default. This does not mean that the interface card reads its own
   transmission, recognizes it as belonging to a group the interface
   belongs to, and reads it from the network. On the contrary, it is the
   IP layer which, by default, recognizes the to-be-sent datagram and
   copies and queues it on the IP input queue before sending it.

   This feature is desirable in some cases, but not in others. So the
   sending process can turn it on and off at will.
  
  
  
   2.3.3. Interface selection.
  
   Hosts attached to more than one network should provide a way for
   applications to decide which network interface will be used to output
   the transmissions. If not specified, the kernel chooses a default one
   based on the system administrator's configuration.
  
  
  
   2.4. Receiving Multicast Datagrams.
  
   2.4.1. Joining a Multicast Group.
  
   Broadcast is (in comparison) easier to implement than multicast. It
   doesn't require processes to give the kernel some rules regarding what
   to do with broadcast packets. The kernel just knows what to do: read
   and deliver all of them to the proper applications.
  
   With multicast, however, it is necessary to advise the kernel which
   multicast groups we are interested in. That is, we have to ask the
   kernel to "join" those multicast groups. Depending on the underlying
   hardware, multicast datagrams are filtered by the hardware or by the
   IP layer (and, in some cases, by both). Only those with a destination
   group previously registered via a join are accepted.
  
   Essentially, when we join a group we are telling the kernel: "OK. I
   know that, by default, you ignore multicast datagrams, but remember
   that I am interested in this multicast group. So, do read and deliver
   (to any process interested in them, not only to me) any datagram that
   you see in this network interface with this multicast group in its
   destination field".
  
   Some considerations: first, note that you don't just join a group.
   You join a group on a particular network interface. Of course, it is
   possible to join the same group on more than one interface. If you
   don't specify a concrete interface, then the kernel will choose it
   based on its routing tables when datagrams are to be sent. It is also
   possible that more than one process joins the same multicast group on
   the same interface. They will all receive the datagrams sent to that
   group via that interface.
  
   As said before, all multicast-capable hosts join the all-hosts group
   at start-up, so "pinging" 224.0.0.1 returns all hosts in the network
   that have multicast enabled.

   Finally, consider that for a process to receive multicast datagrams it
   has to ask the kernel to join the group and bind the port those
   datagrams are being sent to. The UDP layer uses both the destination
   address and port to demultiplex the packets and decide which socket(s)
   to deliver them to.
  
  
  
   2.4.2. Leaving a Multicast Group.
  
   When a process is no longer interested in a multicast group, it
   informs the kernel that it wants to leave that group. It is important
   to understand that this doesn't mean that the kernel will no longer
   accept multicast datagrams destined to that multicast group. It will
   still do so if there are more processes that issued a "multicast join"
   request for that group and are still interested. In that case the
   host remains a member of the group until all the processes decide to
   leave the group.
  
   Even more: if you leave the group, but remain bound to the port you
   were receiving the multicast traffic on, and there are more processes
   that joined the group, you will still receive the multicast
   transmissions.
  
   The idea is that joining a multicast group only tells the IP and data
   link layer (which in some cases explicitly tells the hardware) to
   accept multicast datagrams destined to that group. It is not a per-
   process membership, but a per-host membership.
  
  
  
   2.4.3. Mapping of IP Multicast Addresses to Ethernet/FDDI addresses.
  
   Both Ethernet and FDDI frames have a 48 bit destination address field.
   In order to avoid a kind of multicast ARP to map multicast IP
   addresses to ethernet/FDDI ones, the IANA reserved a range of
   addresses for multicast: every ethernet/FDDI frame with its
   destination in the range 01-00-5e-00-00-00 to 01-00-5e-ff-ff-ff (hex)
   contains data for a multicast group. The prefix 01-00-5e identifies
   the frame as multicast, the next bit is always 0 and so only 23 bits
   are left to the multicast address. As IP multicast groups are 28 bits
   long, the mapping can not be one-to-one. Only the 23 least significant
   bits of the IP multicast group are placed in the frame. The remaining
   5 high-order bits are ignored, resulting in 32 different multicast
   groups being mapped to the same ethernet/FDDI address. This means that
   the ethernet layer acts as an imperfect filter, and the IP layer will
   have to decide whether to accept the datagrams the data-link layer
   passed to it. The IP layer acts as a definitive perfect filter.
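
   The mapping just described can be written down in a few lines. The
   following is a sketch only; group_to_ether() is a name invented for
   this example and is not part of any system library.

```c
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct in_addr */
#include <arpa/inet.h>    /* inet_pton(), ntohl() */

/* Writes the 6-byte Ethernet/FDDI destination address for an IPv4
 * multicast group into 'mac'. Returns 0 on success, -1 if 'group' is
 * not a class D address. Hypothetical helper for this example. */
int group_to_ether(const char *group, uint8_t mac[6])
{
    struct in_addr addr;
    uint32_t host;

    if (inet_pton(AF_INET, group, &addr) != 1)
        return -1;
    host = ntohl(addr.s_addr);
    if ((host >> 28) != 0xE)          /* not 1110xxxx: not multicast */
        return -1;

    mac[0] = 0x01;                    /* fixed prefix 01-00-5e */
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (host >> 16) & 0x7f;     /* next bit is always 0 */
    mac[4] = (host >> 8) & 0xff;      /* low 23 bits of the group */
    mac[5] = host & 0xff;
    return 0;
}
```

   For example, 224.0.0.1 maps to 01-00-5e-00-00-01, and so does
   224.128.0.1, because the two groups differ only in one of the 5
   ignored bits. 239.255.0.1 maps to 01-00-5e-7f-00-01.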
  
   Full details on IP Multicasting over FDDI are given in RFC 1390:
   "Transmission of IP and ARP over FDDI Networks". For more information
   on mapping IP Multicast addresses to ethernet ones, you may consult
   draft-ietf-mboned-intro-multicast-03.txt: "Introduction to IP
   Multicast Routing".
  
   If you are interested in IP Multicasting over Token-Ring Local Area
   Networks, see RFC 1469 for details.
  
  
  
   3. Kernel requirements and configuration.
  
   Linux is, of course (you doubted it?), fully Level 2 multicast-
   compliant. It meets all requirements to send, receive and act as a
   router (mrouter) for multicast datagrams.
  
   If you just want to send and receive, you must say yes to "IP:
   multicasting" when configuring your kernel. If you also want your
   Linux box to act as a multicast router (mrouter) you also need to
   enable multicast routing in the kernel by selecting "IP:
   forwarding/gatewaying", "IP: multicast routing" and "IP: tunneling",
   the latter because new versions of mrouted rely on IP tunneling to
   send multicast datagrams encapsulated into unicast ones. This is
   necessary when establishing tunnels between multicast hosts separated
   by unicast-only networks and routers. (mrouted is a daemon that
   implements the multicast routing algorithm -the routing policy- and
   instructs the kernel on how to route multicast datagrams.)
  
   Some kernel versions label multicast routing as "EXPERIMENTAL", so you
   should enable "Prompt for development and/or incomplete code/drivers"
   in the "Code maturity level options" section.
  
   If, when running mrouted, traffic generated in the same network your
   Linux box is connected to is correctly forwarded to the other network,
   but you can't see the other network's traffic on your local network,
   check whether you are receiving ICMP protocol error messages. You
   almost certainly forgot to turn on IP tunneling in your Linux router.
   It's a kind of stupid error when you know it but, believe me, it's
   quite time-consuming when you don't, and there is no apparent reason
   that explains what is going wrong. A sniffer proves to be quite useful
   in these situations!
  
   (You can see more on multicast routing in section ``Routing Policies
   and Forwarding Techniques''; mrouted and tunnels are also explained in
   sections ``The MBone'' and ``Multicast applications''.)
  
   Once you have compiled and installed your new kernel, you should
   provide a default route for multicast traffic. The goal is to add a
   route to the network 224.0.0.0.
  
   The problem most people seem to face at this stage of the
   configuration is with the value of the mask to supply. If you have
   read Terry Dawson's excellent NET-3-HOWTO, it should not be difficult
   to guess the correct value, though. As explained there, the netmask is
   a 32 bit number filled with all-1s in the network part of your IP
   address, and with all-0s in the host part. Recall from section 2.1
   that a class D multicast address has no network/host sections. Instead
   it has a 28-bit group identifier and a 4-bit class D identifier. Well,
   these 4 bits are the network part and the remaining 28 the host part.
   So the netmask needed is 11110000000000000000000000000000 or, easier
   to read: 240.0.0.0. Then, the full command should be:
  
  
   route add 224.0.0.0 netmask 240.0.0.0 dev eth0
  
  
  
   Depending on how old your route program is, you might need to add the
   -net flag after the add.
  
   Here we supposed that eth0 was multicast-capable and that, when not
   otherwise specified, we wanted multicast traffic to be output there.
   If this is not your case, change the dev parameter as appropriate.
  
   The /proc filesystem proves here to be useful once again: you can
   check /proc/net/igmp to see the groups your host is currently
   subscribed to.
  
  
  
   4. The MBone.
  
   Using a new technology usually carries some advantages and
   disadvantages. The advantages of multicast are -I think- clear. The
   main disadvantage is that hundreds of hosts and, especially, routers
   don't support it yet. As a consequence, people who started working on
   multicast bought new equipment, modified their operating systems, and
   built multicast islands at their local sites. Then they discovered
   that it was difficult to communicate with people doing similar things
   because if even one of the routers between them didn't support
   multicast there was nothing to be done...
  
   The solution was clear: they decided to build a virtual multicast
   network on top of the Internet. That is: sites with multicast
   routers between them could communicate directly. But sites joined
   across unicast routers would send their island's multicast traffic
   encapsulated in unicast packets to other multicast islands. Routers in
   the middle would not have problems, as they would be dealing with
   unicast traffic. Finally, at the receiving site, traffic would be
   de-encapsulated and sent to the island in the original multicast way.
   Two ends converting from multicast to unicast, and then again to
   multicast, define what is called a multicast tunnel.
  
   The MBone or Multicast Backbone is that virtual multicast network
   based on multicast islands connected by multicast tunnels.
  
   Several activities take place on the MBone daily, but the profusion of
   tele-conferences with real time audio and video taking place across
   the whole Internet deserves special mention. As an example, the talk
   Linus Torvalds gave to the Silicon Valley Linux Users Group was
   recently transmitted live.
  
   For more information on the MBone, see:
  
   <http://www.mediadesign.co.at/newmedia/more/mbone-faq.html>
  
  
  
   5. Multicast applications.
  
   Most people dealing with multicast sooner or later decide to connect
   to the MBone, and then they usually need an mrouted. You'll also need
   it if you don't have a multicast-capable router and you want multicast
   traffic generated in one of your subnets to be "heard" on another.
   mrouted does circumvent the problem of sending multicast traffic
   across unicast routers -it encapsulates multicast datagrams into
   unicast ones (IP into IP)- but this is not the only feature it
   provides. Most importantly, it instructs the kernel on how to route (or
   not to route) multicast datagrams based on their source and
   destination. So, even having a multicast capable router, mrouted can
   be used to tell it what to do with the datagrams (note I said what,
   and not how; mrouted says "forward this to the network connected to
   that interface", but actual forwarding is performed by the kernel).
   This distinction between the actual forwarding and the algorithm that
   decides what to forward and where is very useful, as it allows the
   forwarding code to be written only once and placed into the kernel.
   Forwarding algorithms and policies are then implemented in user space
   daemons, so it is very easy to change from one policy to another
   without the need for kernel re-compilation.
  
   You can get a version of mrouted ported to Linux from:
  
   <ftp://www.video.ja.net/mice/mrouted/Linux/>. This site is mirrored
   all across the world. Be sure to read the
   <ftp://www.video.ja.net/mice/README.mirrors> file to choose the one
   nearest you.
  
   Next, we'll focus especially on multicast applications written to
   connect to the MBone, which have been ported to Linux. The list is
   taken from Michael Esler's "Linux Multicast Information" page
   <http://www.cs.virginia.edu/~mke2e/multicast/>. I recommend that
   page for lots of information and resources on multicast and Linux.
  
  
   Audio Conferencing
  
   o NeVoT - Network Voice Terminal <http://www.fokus.gmd.de/step/nevot>
  
   o RAT - UCL Robust-Audio Tool <http://www-mice.cs.ucl.ac.uk/mice/rat>
  
   o vat - LBL visual audio tool <http://www-nrg.ee.lbl.gov/vat/>
  
   Video Conferencing
  
   o ivs - Inria video conferencing system
   <http://www.inria.fr/rodeo/ivs.html>
  
   o nv - Network video tool
   <ftp://ftp.parc.xerox.com/pub/net-research/>

   o nv w/ Meteor - Release of nv w/ support for the Matrox Meteor (UVa)
   <ftp://ftp.cs.virginia.edu/pub/gwtts/Linux/nv-meteor.tar.gz>
  
   o vic - LBL video conferencing tool <http://www-nrg.ee.lbl.gov/vic/>
  
   o vic w/ Meteor - Release of vic w/ support for the Matrox Meteor
   (UVa)
   <ftp://ftp.cs.virginia.edu/pub/gwtts/Linux/vic2.7a38-meteor.tar.gz>
  
   Other Utilities
  
   o mmphone Multimedia phone service
   <http://www.eit.com/software/mmphone/phoneform.html>
  
   o wb - LBL shared white board <http://www-nrg.ee.lbl.gov/wb/>
  
   o webcast - Reliable multicast application for linking Mosaic
   browsers
   <http://www.ncsa.uiuc.edu/SDG/Software/XMosaic/CCI/webcast.html>
  
   Session Tools
  
   I placed session tools last because I think they deserve some
   explanation. When a conference takes place, several multicast groups
   and ports are assigned to the services you want for your conference
   (audio, video, shared white-boards, etc...). Announcements of the
   conferences that will take place, along with information on the
   multicast groups, ports and programs that will be used (vic, vat, ...),
   are periodically multicast to the MBone. Session tools "hear" this
   information and present, in an easy way, which conferences are
   taking (or will take) place, so you can decide which interest you.
   Also, they facilitate the task of joining a session. Instead of
   launching each program that will be used and telling it which
   multicast group/port to join, you usually just need to click and the
   session tool launches the proper programs, supplying them with all the
   information needed to join the conference. Session tools usually let
   you announce your own conferences on the MBone.
  
  
   o gwTTS - University of Virginia tele-tutoring system
   <http://www.cs.Virginia.EDU/~gwtts>
  
   o isc - Integrated session controller
   <http://www.fokus.gmd.de/step/isc>
  
   o mmcc - Multimedia conference control
   <ftp://ftp.isi.edu/confctrl/mmcc>
  
   o sd - LBL session directory tool
   <ftp://ftp.ee.lbl.gov/conferencing/sd>
  
   o sd-snoop - Tenet Group session directory snoop utility
   <ftp://tenet.berkeley.edu/pub/software>
  
   o sdr - UCL's next generation session directory
   <ftp://cs.ucl.ac.uk/mice/sdr>
  
  
  
   6. Multicast programming.
  
   Multicast programming... or writing your own multicast applications.
  
   Several extensions to the programming API are needed in order to
   support multicast. All of them are handled via two system calls:
   setsockopt() (used to pass information to the kernel) and getsockopt()
   (to retrieve information regarding multicast behavior). This does not
   mean that 2 new system calls were added to support multicast. The pair
   setsockopt()/getsockopt() has been there for years, since 4.2 BSD at
   least. The addition consists of a new set of options (multicast
   options) that are passed to these system calls and that the kernel
   must understand.
  
   The following are the setsockopt()/getsockopt() function prototypes:
  
  
   /* optlen is declared as socklen_t on current systems (int on older ones) */
   int getsockopt(int s, int level, int optname, void* optval, socklen_t* optlen);

   int setsockopt(int s, int level, int optname, const void* optval, socklen_t optlen);
  
  
  
   The first parameter, s, is the socket the system call applies to. For
   multicasting, it must be a socket of the family AF_INET and its type
   may be either SOCK_DGRAM or SOCK_RAW. The most common use is with
   SOCK_DGRAM sockets, but if you plan to write a routing daemon or
   modify some existing one, you will probably need to use SOCK_RAW ones.
  
   The second one, level, identifies the layer that is to handle the
   option, message or query, whatever you want to call it. So, SOL_SOCKET
   is for the socket layer, IPPROTO_IP for the IP layer, etc... For
   multicast programming, level will always be IPPROTO_IP.
  
   optname identifies the option we are setting/getting. Its value
   (either supplied by the program or returned by the kernel) is optval.
   The optnames involved in multicast programming are the following:
  
  
   ______________________________________________________________________
                           setsockopt()    getsockopt()
   IP_MULTICAST_LOOP           yes             yes
   IP_MULTICAST_TTL            yes             yes
   IP_MULTICAST_IF             yes             yes
   IP_ADD_MEMBERSHIP           yes              no
   IP_DROP_MEMBERSHIP          yes              no
   ______________________________________________________________________
  
  
  
   optlen carries the size of the data structure optval points to. Note
   that in getsockopt() it is a value-result argument rather than a
   value: the kernel writes the value of optname in the buffer pointed to
   by optval and informs us of that value's size via optlen.
  
   Both setsockopt() and getsockopt() return 0 on success and -1 on
   error.
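
   Since both calls only say "-1" on failure, it is worth looking at
   errno to know what went wrong. The following is a minimal sketch of a
   wrapper that does so; the name set_ip_option() is mine, not part of
   any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: set an IPPROTO_IP option on socket s and report
 * failures on stderr. Returns 0 on success and -1 on error, mirroring
 * setsockopt() itself; errno is left as set by the failed call. */
int set_ip_option(int s, int optname, const void *optval, socklen_t optlen)
{
    if (setsockopt(s, IPPROTO_IP, optname, optval, optlen) == -1) {
        int saved = errno;          /* fprintf() may overwrite errno */

        fprintf(stderr, "setsockopt(IPPROTO_IP, %d): %s\n",
                optname, strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

   Calling it on something that is not an open socket, for instance,
   returns -1 with errno set to EBADF.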
  
  
  
   6.1. IP_MULTICAST_LOOP.
  
   You have to decide, as the application writer, whether you want the
   data you send to be looped back to your host or not. If you plan to
   have more than one process or user "listening", loopback must be
   enabled. On the other hand, if you are sending the images your video
   camera is producing, you probably don't want loopback, even if you
   want to see yourself on the screen. In that latter case, your
   application will probably receive the images from a device attached to
   the computer and send them to the socket. As the application already
   "has" that data, it is improbable it wants to receive it again on the
   socket. Loopback is by default enabled.
  
   Note that optval is a pointer. You can't write:
  
  
   setsockopt(socket, IPPROTO_IP, IP_MULTICAST_LOOP, 0, 1);
  
  
  
   to disable loopback. Instead write:
  
  
   u_char loop;
   setsockopt(socket, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
  
  
  
   and set loop to 1 to enable loopback or 0 to disable it.
  
   To know whether a socket is currently looping back or not, use
   something like:


   u_char loop;
   socklen_t size = sizeof(loop);

   getsockopt(socket, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, &size);
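
   Putting both directions together, a pair of small helpers could look
   like the sketch below. The function names are my own, and I use plain
   unsigned char instead of u_char so that no extra header is needed.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Enable (on != 0) or disable (on == 0) loopback of outgoing multicast
 * datagrams on socket s. Returns 0 on success, -1 on error. */
int set_mcast_loop(int s, int on)
{
    unsigned char loop = on ? 1 : 0;

    return setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
}

/* Return 1 if loopback is enabled on s, 0 if disabled, -1 on error. */
int get_mcast_loop(int s)
{
    unsigned char loop = 0;
    socklen_t size = sizeof(loop);  /* value-result: buffer size on entry */

    if (getsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, &size) == -1)
        return -1;
    return loop ? 1 : 0;
}
```

   On a freshly created SOCK_DGRAM socket get_mcast_loop() should report
   that loopback is enabled, as that is the default.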
  
  
  
   6.2. IP_MULTICAST_TTL.
  
   If not otherwise specified, multicast datagrams are sent with a
   default TTL of 1, to prevent them from being forwarded beyond the
   local network. To change the TTL to the value you desire (from 0 to
   255), put that value into a variable (here I name it "ttl") and write
   somewhere in your program:
  
  
   u_char ttl;
   setsockopt(socket, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
  
  
  
   The behavior with getsockopt() is similar to the one seen on
   IP_MULTICAST_LOOP.
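
   As the option takes a single byte, a value outside 0..255 would be
   silently truncated by the cast. A hedged sketch that checks the range
   first (set_mcast_ttl() and get_mcast_ttl() are names I made up) could
   be:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Set the TTL used for outgoing multicast datagrams on socket s.
 * Rejects values outside 0..255 instead of truncating them.
 * Returns 0 on success, -1 on error (errno set). */
int set_mcast_ttl(int s, int ttl)
{
    unsigned char value;

    if (ttl < 0 || ttl > 255) {
        errno = EINVAL;
        return -1;
    }
    value = (unsigned char)ttl;
    return setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &value, sizeof(value));
}

/* Return the current multicast TTL of socket s (0..255), or -1 on error. */
int get_mcast_ttl(int s)
{
    unsigned char value = 0;
    socklen_t size = sizeof(value);

    if (getsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &value, &size) == -1)
        return -1;
    return value;
}
```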
  
  
  
   6.3. IP_MULTICAST_IF.
  
   Usually, the system administrator specifies the default interface
   multicast datagrams should be sent from. The programmer can override
   this and choose a concrete outgoing interface for a given socket with
   this option.
  
  
   struct in_addr interface_addr;
   setsockopt (socket, IPPROTO_IP, IP_MULTICAST_IF, &interface_addr, sizeof(interface_addr));
  
   From now on, all multicast traffic generated in this socket will be
   output from the interface chosen. To revert to the original behavior
   and let the kernel choose the outgoing interface based on the system
   administrator's configuration, it is enough to call setsockopt() with
   this same option and INADDR_ANY in the interface field.
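
   A minimal sketch of both operations, selecting an interface by its
   dotted-quad address and reverting to the default, follows. The helper
   name set_mcast_if() and the convention that NULL means "revert" are
   assumptions of mine.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Choose the outgoing interface for multicast datagrams sent on s.
 * local_ip is the interface's own address in dotted-quad form; passing
 * NULL stores INADDR_ANY, which reverts to the system default.
 * Returns 0 on success, -1 on error. */
int set_mcast_if(int s, const char *local_ip)
{
    struct in_addr interface_addr;

    if (local_ip == NULL) {
        interface_addr.s_addr = htonl(INADDR_ANY);
    } else if (inet_pton(AF_INET, local_ip, &interface_addr) != 1) {
        errno = EINVAL;             /* not a valid IPv4 address */
        return -1;
    }
    return setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
                      &interface_addr, sizeof(interface_addr));
}
```

   The address given must belong to one of the host's interfaces;
   otherwise the kernel rejects the call.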
  
   In determining or selecting outgoing interfaces, the following ioctls
   might be useful: SIOCGIFADDR (to get an interface's address),
   SIOCGIFCONF (to get the list of all the interfaces) and SIOCGIFFLAGS
   (to get an interface's flags and, thus, determine whether the
   interface is multicast capable or not -the IFF_MULTICAST flag-).
  
   If the host has more than one interface and the IP_MULTICAST_IF option
   is not set, multicast transmissions are sent from the default
   interface, although the remaining interfaces might be used for
   multicast forwarding if the host is acting as a multicast router.
  
  
  
   6.4. IP_ADD_MEMBERSHIP.
  
   Recall that you need to tell the kernel which multicast groups you are
   interested in. If no process is interested in a group, packets
   destined to it that arrive at the host are discarded. In order to
   inform the kernel of your interests and, thus, become a member of that
   group, you should first fill in an ip_mreq structure, which is later
   passed to the kernel in the optval field of the setsockopt() system
   call.
  
   The ip_mreq structure (taken from /usr/include/linux/in.h) has the
   following members:
  
  
   struct ip_mreq
   {
           struct in_addr imr_multiaddr;   /* IP multicast address of group */
           struct in_addr imr_interface;   /* local IP address of interface */
   };
  
  
  
   (Note: the "physical" definition of the structure is in the file
   specified above. Nonetheless, you should not include <linux/in.h> if
   you want your code to be portable. Instead, include <netinet/in.h>
   which, in turn, includes <linux/in.h> itself).
  
   The first member, imr_multiaddr, holds the group address you want to
   join. Remember that memberships are also associated with interfaces,
   not just groups. This is the reason you have to provide a value for
   the second member: imr_interface. This way, if you are on a multihomed
   host, you can join the same group on several interfaces. You can
   always fill this last member with the wildcard address (INADDR_ANY)
   and then the kernel will deal with the task of choosing the interface.
  
   With this structure filled (say you defined it as: struct ip_mreq
   mreq;) you just have to call setsockopt() this way:
  
  
   setsockopt (socket, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
  
  
  
   Notice that you can join several groups to the same socket, not just
   one. The limit to this is IP_MAX_MEMBERSHIPS and, as of version
   2.0.33, it has the value of 20.



   6.5. IP_DROP_MEMBERSHIP.
  
   The process is quite similar to joining a group:
  
  
   struct ip_mreq mreq;
   setsockopt (socket, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
  
  
  
   where mreq is the same structure with the same data used when joining
   the group. If the imr_interface member is filled with INADDR_ANY, the
   first matching group is dropped.
  
   If you have joined a lot of groups to the same socket, you don't need
   to drop memberships in all of them in order to terminate. When you
   close a socket, all memberships associated with it are dropped by the
   kernel. The same occurs if the process that opened the socket is
   killed.
  
   Finally, keep in mind that a process dropping membership for a group
   does not imply that the host will stop receiving datagrams for that
   group. If another socket joined that group on that same interface
   before this IP_DROP_MEMBERSHIP, the host will remain a member of that
   group.

   Both IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP are nonblocking
   operations. They should return immediately indicating either success
   or failure.
  
  
  
   7. The internals.
  
   This section aims to provide some information that is not needed to
   reach a basic understanding of how multicast works, nor to be able to
   write multicast programs, but which is very interesting, gives some
   insight into the underlying multicast protocols and implementations,
   and may be useful to avoid common errors and misunderstandings.
  
  
  
   7.1. IGMP.
  
   When talking about IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP, we said
   that the information provided by these "commands" was used by the
   kernel to choose which multicast datagrams to accept or discard. This
   is true, but it is not the whole truth. Such a simplification would
   imply that multicast datagrams for all multicast groups around the
   world would be received by our host, and then it would check the
   memberships issued by processes running on it to decide whether to
   pass the traffic to them or to throw it out. As you can imagine, this
   would be a complete waste of bandwidth.
  
   What actually happens is that hosts tell their routers which
   multicast groups they are interested in; then, those routers tell
   their upstream routers they want to receive that traffic, and so on.
   The algorithms employed for deciding when to ask for a group's traffic
   or to say that it is not desired anymore vary a lot. There's
   something, however, that never changes: how this information is
   transmitted. IGMP is used for that. It stands for Internet Group
   Management Protocol. It is a new protocol, similar in many aspects to
   ICMP, with a protocol number of 2, whose messages are carried in IP
   datagrams, and which all level 2-compliant hosts are required to
   implement.

   As said before, it is used both by hosts giving membership information
   to their routers, and by routers to communicate among themselves. In
   the following I'll cover only the host-router relationship, mainly
   because I was unable to find information describing router-to-router
   communication other than the mrouted source code (RFC 1075, describing
   the Distance Vector Multicast Routing Protocol, is now obsolete, and
   mrouted implements a modified DVMRP not yet documented).
  
   IGMP version 0 is specified in RFC-988, which is now obsolete. Almost
   no one uses version 0 now.

   IGMP version 1 is described in RFC-1112 and, although it is updated by
   RFC-2236 (IGMP version 2), it is still in wide use. The Linux kernel
   implements all of the IGMP version 1 requirements and some, but not
   all, of the version 2 ones.

   Now I'll try to give an informal description of the protocol. You can
   check RFC-2236 for an in-depth formal description, with lots of state
   diagrams and time-out boundaries.
  
   All IGMP messages have the following structure:
  
   ______________________________________________________________________
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     | Max Resp Time |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Group Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ______________________________________________________________________
  
  
  
   IGMP version 1 (hereinafter IGMPv1) labels the "Max Resp Time" as
   "Unused", zeroes it when sent, and ignores it when received. Also, it
   breaks the "Type" field into two 4-bit wide fields: "Version" and
   "Type". As IGMPv1 identifies a "Membership Query" message as 0x11
   (version 1, type 1) and IGMPv2 as 0x11 too, the 8 bits have the same
   effective interpretation.
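
   For those who prefer code to diagrams, the 8-byte message can be
   decoded as in the sketch below. The structure and function names are
   invented for this example (they are not the kernel's), and the
   checksum routine is the usual Internet checksum: the 16-bit one's
   complement of the one's complement sum of the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one 8-byte IGMP message (names are illustrative). */
struct igmp_msg {
    uint8_t  type;           /* e.g. 0x11 = Membership Query          */
    uint8_t  max_resp_time;  /* labelled "Unused" by IGMPv1           */
    uint16_t checksum;       /* value carried in the message          */
    uint32_t group;          /* group address, in host byte order     */
};

/* Internet checksum over len bytes of buf. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                        /* odd trailing byte, zero padded */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Parse the first 8 bytes of buf. Returns 0 on success, -1 if the
 * buffer is too short or the checksum does not verify. */
int igmp_parse(const uint8_t *buf, size_t len, struct igmp_msg *out)
{
    if (len < 8 || inet_checksum(buf, 8) != 0)
        return -1;
    out->type = buf[0];
    out->max_resp_time = buf[1];
    out->checksum = (uint16_t)(((unsigned)buf[2] << 8) | buf[3]);
    out->group = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                 ((uint32_t)buf[6] << 8) | buf[7];
    return 0;
}

/* IGMPv1 reading of the first octet: high nibble is the version,
 * low nibble is the type. */
unsigned igmpv1_version(uint8_t octet) { return octet >> 4; }
unsigned igmpv1_type(uint8_t octet)    { return octet & 0x0F; }
```

   A message verifies when the checksum computed over all 8 bytes,
   checksum field included, comes out as 0.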
  
   I think it is more instructive to give first the IGMPv1 description
   and next point out the IGMPv2 additions, as they are mainly that,
   additions.
  
   For the following discussions it is important to remember that
   multicast routers receive all IP multicast datagrams.
  
  
  
   7.1.1. IGMP version 1.
  
   Routers periodically send IGMP Host Membership Queries to the all-
   hosts group (224.0.0.1) with a TTL of 1 (once every minute or two).
   All multicast-capable hosts hear them, but don't answer immediately,
   to avoid an IGMP Host Membership Report storm. Instead, they start a
   random delay timer for each group they belong to on the interface on
   which they received the query.
  
   Sooner or later, the timer expires in one of the hosts, and it sends
   an IGMP Host Membership Report (also with TTL 1) to the multicast
   address of the group being reported. As it is sent to the group, all
   hosts that joined the group -and which are currently waiting for their
   own timer to expire- receive it, too. Then, they stop their timers and
   don't generate any other report. Just one is generated -by the host
   that chose the smallest timeout-, and that is enough for the router.
   It only needs to know that there are members for that group in the
   subnet, not how many nor which.
  
   When no reports are received for a given group after a certain number
   of queries, the router assumes that no members are left, and thus it
   doesn't have to forward traffic for that group on that subnet. Note
   that in IGMPv1 there are no "Leave Group" messages.

   When a host joins a new group, the kernel sends a report for that
   group, so that the respective process need not wait a minute or two
   until a new membership query is received. As you can see, this IGMP
   packet is generated by the kernel as a response to the
   IP_ADD_MEMBERSHIP command, seen in section ``IP_ADD_MEMBERSHIP''.
   Note the emphasis on the adjective "new": if a process issues an
   IP_ADD_MEMBERSHIP command for a group the host is already a member of,
   no IGMP packets are sent, as we must already be receiving traffic for
   that group; instead, a counter for that group's use is incremented.
   IP_DROP_MEMBERSHIP generates no datagrams in IGMPv1.
  
   Host Membership Queries are identified by Type 0x11, and Host
   Membership Reports by Type 0x12.
  
   No reports are sent for the all-hosts group. Membership in this group
   is permanent.
  
  
  
   7.1.2. IGMP version 2.
  
   One important addition to the above is the inclusion of a Leave Group
   message (Type 0x17). The reason is to reduce the bandwidth wasted
   between the time the last host in the subnet drops membership and the
   time the router times out waiting for answers to its queries and
   decides there are no more members present for that group (leave
   latency). Leave Group messages should be addressed to the all-routers
   group (224.0.0.2) rather than to the group being left, as that
   information is of no use for other members (kernel versions up to
   2.0.33 send them to the group; although it does no harm to the hosts,
   it's a waste of time as they have to process them, but don't gain
   useful information). There are certain subtle details regarding when
   and when not to send Leave Messages; if interested, see the RFC.
  
   When an IGMPv2 router receives a Leave Message for a group, it sends
   Group-Specific Queries to the group being left. This is another
   addition. IGMPv1 has no group-specific queries. All queries are sent
   to the all-hosts group. The Type in the IGMP header does not change
   (0x11, as before), but the "Group Address" is filled with the address
   of the multicast group being left.
  
   The "Max Resp Time" field, which was set to 0 in transmission and
   ignored on reception in IGMPv1, is meaningful only in "Membership
   Query" messages. It gives the maximum time allowed before sending a
   report in units of 1/10 second. It is used as a tuning mechanism.

   IGMPv2 adds another message type: 0x16. It is a "Version 2 Membership
   Report" sent by IGMPv2 hosts if they detect an IGMPv2 router is
   present (an IGMPv2 host knows an IGMPv1 router is present when it
   receives a query with the "Max Resp Time" field set to 0).
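
   The message types seen so far can be told apart with a few lines of
   code. The sketch below is only illustrative: the enum and function
   names are mine, and it relies on one detail taken from RFC-2236
   rather than from the text above, namely that a general query carries
   a Group Address of zero.

```c
#include <stdint.h>

/* Kinds of IGMP message discussed in this section (names are mine). */
enum igmp_kind {
    IGMP_KIND_GENERAL_QUERY,   /* 0x11, Group Address == 0            */
    IGMP_KIND_GROUP_QUERY,     /* 0x11, Group Address != 0 (IGMPv2)   */
    IGMP_KIND_V1_REPORT,       /* 0x12                                */
    IGMP_KIND_V2_REPORT,       /* 0x16                                */
    IGMP_KIND_LEAVE,           /* 0x17                                */
    IGMP_KIND_UNKNOWN
};

/* Classify a message from its Type octet and Group Address field. */
enum igmp_kind igmp_classify(uint8_t type, uint32_t group)
{
    switch (type) {
    case 0x11:
        return group == 0 ? IGMP_KIND_GENERAL_QUERY : IGMP_KIND_GROUP_QUERY;
    case 0x12:
        return IGMP_KIND_V1_REPORT;
    case 0x16:
        return IGMP_KIND_V2_REPORT;
    case 0x17:
        return IGMP_KIND_LEAVE;
    default:
        return IGMP_KIND_UNKNOWN;
    }
}

/* A query whose Max Resp Time is 0 was sent by an IGMPv1 router. */
int query_is_from_v1_router(uint8_t max_resp_time)
{
    return max_resp_time == 0;
}

/* Max Resp Time is expressed in units of 1/10 second. */
unsigned max_resp_ms(uint8_t max_resp_time)
{
    return max_resp_time * 100u;
}
```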
  
   When more than one router claims to act as querier, IGMPv2 provides a
   mechanism to avoid "discussions": the router with the lowest IP
   address is designated as querier. The other routers keep timeouts. If
   the router with the lowest IP address crashes or is shut down, the
   decision of who will be the querier is taken again after the timers
   expire.
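
   The rule itself fits in a few lines. This is just a toy model of the
   decision (the function name is mine, and real routers reach it by
   listening to each other's queries, not by sharing a table). Addresses
   are compared as numbers in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the addresses (host byte order) of the routers currently
 * alive on a subnet, return the index of the one that becomes
 * querier: the lowest address. Returns -1 if the list is empty. */
int elect_querier(const uint32_t *addrs, size_t n)
{
    size_t i, best = 0;

    if (n == 0)
        return -1;
    for (i = 1; i < n; i++)
        if (addrs[i] < addrs[best])
            best = i;
    return (int)best;
}
```

   If the elected router disappears, running the same rule over the
   remaining addresses gives the new querier.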
  
  
   7.2. Kernel corner.
  
   This sub-section gives some starting points for studying the multicast
   implementation of the Linux kernel. It does not explain that
   implementation. It just says where to find things.

   The study was carried out on version 2.0.32, so it could be a bit
   outdated by the time you read it (network code seems to have changed A
   LOT in 2.1.x releases, for instance).

   Multicast code in the Linux kernel is always surrounded by #ifdef
   CONFIG_IP_MULTICAST / #endif pairs, so that you can include/exclude
   it from your kernel based on your needs (this inclusion/exclusion is
   done at compile time, as you probably know if you are reading this
   section... #ifdefs are handled by the preprocessor. The decision is
   made based on what you selected when doing either a make config, make
   menuconfig or make xconfig).
  
   You might want multicast features, but if your Linux box is not going
   to act as a multicast router you will probably not want multicast
   router features included in your new kernel. For this you have the
   multicast routing code surrounded by #ifdef CONFIG_IP_MROUTE / #endif
   pairs.
  
   Kernel sources are usually placed in /usr/src/linux. However, the
   place may change so, both for accuracy and brevity, I will refer to
   the root directory of the kernel sources as just LINUX. Then,
   something like LINUX/net/ipv4/udp.c should be the same as
   /usr/src/linux/net/ipv4/udp.c if you unpacked the kernel sources in
   the /usr/src/linux directory.
  
   All multicast interfaces with user programs shown in the section
   devoted to multicast programming were driven across the
   setsockopt()/getsockopt() system calls. Both of them are implemented
   by means of functions that make some tests to verify the parameters
   passed to them and which, in turn, call another function that makes
   some additional tests, demultiplexes the call based on the level
   parameter to either system call, and then calls another function
   which... (if interested in all these jumps, you can follow them in
   LINUX/net/socket.c (functions sys_socketcall() and sys_setsockopt()),
   LINUX/net/ipv4/af_inet.c (function inet_setsockopt()) and
   LINUX/net/ipv4/ip_sockglue.c (function ip_setsockopt())).
  
   The one which interests us is LINUX/net/ipv4/ip_sockglue.c. Here we
   find ip_setsockopt() and ip_getsockopt() which are mainly a switch
   (after some error checking) verifying each possible value for optname.
   Along with unicast options, all multicast ones seen here are handled:
   IP_MULTICAST_TTL, IP_MULTICAST_LOOP, IP_MULTICAST_IF,
   IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP. Before the switch, a test
   is made to determine whether the options are multicast router
   specific, and if so, they are routed to the ip_mroute_setsockopt() and
   ip_mroute_getsockopt() functions (file LINUX/net/ipv4/ipmr.c).
  
   In LINUX/net/ipv4/af_inet.c we can see the default values we talked
   about in previous sections (loopback enabled, TTL=1) provided when the
   socket is created (taken from function inet_create() in this file):
  
  
  
   ______________________________________________________________________
  
   #ifdef CONFIG_IP_MULTICAST
           sk->ip_mc_loop=1;
           sk->ip_mc_ttl=1;
           *sk->ip_mc_name=0;
           sk->ip_mc_list=NULL;
   #endif
   ______________________________________________________________________
  
  
  
   Also, the assertion of "closing a socket makes the kernel drop all
   memberships this socket had" is corroborated by:
  
   ______________________________________________________________________
   #ifdef CONFIG_IP_MULTICAST
           /* Applications forget to leave groups before exiting */
           ip_mc_drop_socket(sk);
   #endif
   ______________________________________________________________________
  
  
   taken from inet_release(), in the same file as before.
  
   Device independent operations for the Link Layer are kept in
   LINUX/net/core/dev_mcast.c.
  
   Two important functions are still missing: the processing of input and
   output multicast datagrams. Like any other datagrams, incoming
   datagrams are passed from the device drivers to the ip_rcv() function
   (LINUX/net/ipv4/ip_input.c). This function is where perfect filtering
   is applied to multicast packets that crossed the device layer (recall
   that lower layers only perform best-effort filtering, and it is IP
   that knows for certain whether we are interested in that multicast
   group or not). If the host is acting as a multicast router, this
   function also decides whether the datagram should be forwarded and
   calls ipmr_forward() appropriately. (ipmr_forward() is implemented in
   LINUX/net/ipv4/ipmr.c).
  
   Code in charge of outputting packets is kept in
   LINUX/net/ipv4/ip_output.c. Here is where the IP_MULTICAST_LOOP
   option takes effect, as it is checked to see whether to loop back the
   packets or not (function ip_queue_xmit()). Also the TTL of the
   outgoing packet is selected based on whether it is a multicast or
   unicast one. In the former case, the argument passed to the
   IP_MULTICAST_TTL option is used (function ip_build_xmit()).
  
   While working with mrouted (a program which gives the kernel
   information about how to route multicast datagrams), we detected that
   all multicast packets originated on the local network were properly
   routed..., except the ones from the Linux box that was acting as the
   multicast router!! ip_input.c was working OK, but it seemed
   ip_output.c wasn't. Reading the source code for the output functions,
   we found that outgoing datagrams were not being passed to
   ipmr_forward(), the function that had to decide whether they should be
   routed or not. The packets were output to the local network but, as
   network cards are usually unable to read their own transmissions,
   those datagrams were never routed. We added the necessary code to the
   ip_build_xmit() function and everything was OK again. (Having the
   sources for your kernel is not a luxury or pedantry; it's a need!)
  
   ipmr_forward() has been mentioned a couple of times. It is an
   important function, as it clears up one important misunderstanding
   that appears to be widespread. When routing multicast traffic, it is
   not mrouted that makes the copies and sends them to the proper
   recipients. mrouted receives all multicast traffic and, based on that
   information, computes the multicast routing tables and tells the
   kernel how to route: "datagrams for this group coming from that
   interface should be forwarded to those interfaces". This information
   is passed to the kernel by calls to setsockopt() on a raw socket
   opened by the mrouted daemon (the protocol specified when the raw
   socket was created must be IPPROTO_IGMP). These options are handled in
   the ip_mroute_setsockopt() function from LINUX/net/ipv4/ipmr.c. The
   first option (it would be better to call them commands rather than
   options) issued on that socket must be MRT_INIT. All other commands
   are ignored (returning -EACCES) if MRT_INIT is not issued first. Only
   one instance of mrouted can be running at the same time on the same
   host. To keep track of this, when the first MRT_INIT is received, an
   important variable, struct sock* mroute_socket, is pointed to the
   socket MRT_INIT was received on. If mroute_socket is not null when
   handling an MRT_INIT, this means another mrouted is already running
   and -EADDRINUSE is returned. All remaining commands (MRT_DONE,
   MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC, MRT_DEL_MFC and MRT_ASSERT)
   return -EACCES if they come from a socket other than mroute_socket.
  
   As routed multicast datagrams can be received/sent across either
   physical interfaces or tunnels, a common abstraction for both was
   devised: VIFs, Virtual InterFaces. mrouted passes vif structures to
   the kernel, indicating physical or tunnel interfaces to add to its
   routing tables, and multicast forwarding entries saying where to
   forward datagrams.
  
   VIFs are added with MRT_ADD_VIF and deleted with MRT_DEL_VIF. Both
   pass a struct vifctl to the kernel (defined in
   /usr/include/linux/mroute.h) with the following information:
  
   ______________________________________________________________________
   struct vifctl {
           vifi_t          vifc_vifi;        /* Index of VIF */
           unsigned char   vifc_flags;       /* VIFF_ flags */
           unsigned char   vifc_threshold;   /* ttl limit */
           unsigned int    vifc_rate_limit;  /* Rate limiter values (NI) */
           struct in_addr  vifc_lcl_addr;    /* Our address */
           struct in_addr  vifc_rmt_addr;    /* IPIP tunnel addr */
   };
   ______________________________________________________________________
  
  
  
   With this information a vif_device structure is built:
  
   ______________________________________________________________________
   struct vif_device
   {
           struct device   *dev;               /* Device we are using */
           struct route    *rt_cache;          /* Tunnel route cache */
           unsigned long   bytes_in,bytes_out;
           unsigned long   pkt_in,pkt_out;     /* Statistics */
           unsigned long   rate_limit;         /* Traffic shaping (NI) */
           unsigned char   threshold;          /* TTL threshold */
           unsigned short  flags;              /* Control flags */
           unsigned long   local,remote;       /* Addresses(remote for tunnels)*/
   };
   ______________________________________________________________________
  
  
  
   Note the dev entry in the structure. The device structure is defined
   in the /usr/include/linux/netdevice.h file. It is a big structure, but
   the field that interests us is:

   ______________________________________________________________________
   struct ip_mc_list* ip_mc_list; /* IP multicast filter chain */
   ______________________________________________________________________
  
  
  
   The ip_mc_list structure -defined in /usr/include/linux/igmp.h- is as
   follows:
  
   ______________________________________________________________________
   struct ip_mc_list
   {
           struct device *interface;
           unsigned long multiaddr;
           struct ip_mc_list *next;
           struct timer_list timer;
           short tm_running;
           short reporter;
           int users;
   };
   ______________________________________________________________________
  
  
  
   So, the ip_mc_list member from the dev structure is a pointer to a
   linked list of ip_mc_list structures, one entry for each multicast
   group the network interface is a member of. Here again we see that
   membership is associated with interfaces.
   LINUX/net/ipv4/ip_input.c traverses this linked list to decide whether
   the received datagram is destined to any group the interface that
   received the datagram belongs to:
  
   ______________________________________________________________________
   #ifdef CONFIG_IP_MULTICAST
           if(!(dev->flags&IFF_ALLMULTI) && brd==IS_MULTICAST
                   && iph->daddr!=IGMP_ALL_HOSTS
                   && !(dev->flags&IFF_LOOPBACK))
           {
                   /*
                    *      Check it is for one of our groups
                    */
                   struct ip_mc_list *ip_mc=dev->ip_mc_list;
                   do
                   {
                           if(ip_mc==NULL)
                           {
                                   kfree_skb(skb, FREE_WRITE);
                                   return 0;
                           }
                           if(ip_mc->multiaddr==iph->daddr)
                                   break;
                           ip_mc=ip_mc->next;
                   }
                   while(1);
           }
   #endif
   ______________________________________________________________________
  
  
  
   The users field in the ip_mc_list structure is used to implement what
   was said in section ``IGMP version 1'': if a process joins a group and
   the interface is already a member of that group (i.e., another process
   joined that same group on that same interface before) only the count
   of members (users) is incremented. No IGMP messages are sent, as you
   can see in the following code (taken from ip_mc_inc_group(), called by
   ip_mc_join_group(), both in LINUX/net/ipv4/igmp.c):
  
   ______________________________________________________________________
           for(i=dev->ip_mc_list;i!=NULL;i=i->next)
           {
                   if(i->multiaddr==addr)
                   {
                           i->users++;
                           return;
                   }
           }
   ______________________________________________________________________
  
  
  
   When dropping memberships, the counter is decremented and additional
   operations are performed only when the count reaches 0
   (ip_mc_dec_group()).
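
   The same bookkeeping can be played with in user space. What follows
   is a simplified model of my own, NOT the kernel code: the structure
   keeps only the fields needed for counting, and the return values
   stand in for "an IGMP report would be sent" and "the group would
   really be left".

```c
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-in for one entry of an interface's group list. */
struct mc_entry {
    uint32_t multiaddr;
    int users;
    struct mc_entry *next;
};

/* Add one user of group addr. Returns 1 if the group is new on this
 * list (this is where a report would be sent), 0 if only the counter
 * was incremented, -1 if memory could not be allocated. */
int mc_inc_group(struct mc_entry **list, uint32_t addr)
{
    struct mc_entry *i;

    for (i = *list; i != NULL; i = i->next) {
        if (i->multiaddr == addr) {
            i->users++;
            return 0;
        }
    }
    i = malloc(sizeof(*i));
    if (i == NULL)
        return -1;
    i->multiaddr = addr;
    i->users = 1;
    i->next = *list;
    *list = i;
    return 1;
}

/* Remove one user of group addr. Returns 1 if that was the last user
 * and the entry was deleted, 0 if other users remain, -1 if the group
 * is not on the list. */
int mc_dec_group(struct mc_entry **list, uint32_t addr)
{
    struct mc_entry **pp;

    for (pp = list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->multiaddr == addr) {
            struct mc_entry *victim = *pp;

            if (--victim->users > 0)
                return 0;
            *pp = victim->next;
            free(victim);
            return 1;
        }
    }
    return -1;
}
```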
  
   MRT_ADD_MFC and MRT_DEL_MFC set or delete forwarding entries in the
   multicast routing tables. Both pass a struct mfcctl to the kernel
   (also defined in /usr/include/linux/mroute.h) with this information:
  
   ______________________________________________________________________
   struct mfcctl
   {
           struct in_addr mfcc_origin;             /* Origin of mcast */
           struct in_addr mfcc_mcastgrp;           /* Group in question */
           vifi_t mfcc_parent;                     /* Where it arrived */
           unsigned char mfcc_ttls[MAXVIFS];       /* Where it is going */
   };
   ______________________________________________________________________
  
  
  
   With all this information in hand, ipmr_forward() "walks" across the
   VIFs, and if a match is found it duplicates the datagram and calls
   ipmr_queue_xmit() which, in turn, uses the output device specified by
   the routing table and the proper destination address if the packet is
   to be sent across a tunnel (i.e., the unicast destination address of
   the other end of the tunnel).
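
   To give an idea of what such a "walk" involves, here is a toy model.
   It is not a transcription of ipmr_forward(): the table size, the
   function name and, above all, the way I interpret the per-VIF values
   (0 means "do not forward here"; otherwise forward only if the
   packet's TTL is greater than the value) are assumptions made for the
   sake of the example.

```c
#include <stdint.h>

#define SKETCH_MAXVIFS 32   /* assumed table size, for this sketch only */

/* Decide on which VIFs a copy of a datagram would be sent.
 * ttls   : per-VIF thresholds, in the spirit of mfcc_ttls[]
 * parent : VIF the datagram arrived on (never forwarded back to it)
 * pkt_ttl: TTL carried by the datagram
 * Returns a bitmask with bit v set for every VIF v that gets a copy. */
uint32_t forward_mask(const uint8_t ttls[SKETCH_MAXVIFS],
                      unsigned parent, uint8_t pkt_ttl)
{
    uint32_t mask = 0;
    unsigned v;

    for (v = 0; v < SKETCH_MAXVIFS; v++) {
        if (v == parent)
            continue;               /* don't send it back where it came from */
        if (ttls[v] == 0)
            continue;               /* this VIF is not a destination */
        if (pkt_ttl > ttls[v])
            mask |= (uint32_t)1 << v;
    }
    return mask;
}
```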
  
   Function ip_rt_event() (not directly related to output, but which is
   in ip_output.c too) receives events related to a network device, like
   the device going up. This function ensures that the device then joins
   the ALL-HOSTS multicast group.
  
   IGMP functions are implemented in LINUX/net/ipv4/igmp.c. Important
   information for those functions appears in /usr/include/linux/igmp.h
   and /usr/include/linux/mroute.h. The IGMP entry in the /proc/net
   directory is created with ip_init() in LINUX/net/ipv4/ip_output.c.
  
  
  
   8. Routing Policies and Forwarding Techniques.
  
   One trivial algorithm to make worldwide multicast traffic available
   everywhere could be to send it... everywhere, regardless of whether
   anyone wants it or not. As this does not seem very efficient, several
   routing algorithms and forwarding techniques have been implemented.
  
   DVMRP (Distance Vector Multicast Routing Protocol) is, perhaps, the
   one most multicast routers use now. It is a dense mode routing
   protocol, that is, it performs well in environments with high
   bandwidth and densely distributed members. However, in sparse mode
   scenarios, it suffers from scalability problems.
  
   Together with DVMRP we can find other dense mode routing protocols,
   such as MOSPF (Multicast Extensions to OSPF -Open Shortest Path
   First-) and PIM-DM (Protocol-Independent Multicast Dense Mode).
  
   To perform routing in sparse mode environments, we have PIM-SM
   (Protocol Independent Multicast Sparse Mode) and CBT (Core Based
   Trees).
  
   OSPF version 2 is explained in RFC 1583, and MOSPF in RFC 1584. PIM-
   SM and CBT specifications can be found in RFC 2117 and 2201,
   respectively.
  
   All these routing protocols use some type of multicast forwarding, such
   as flooding, Reverse Path Broadcasting (RPB), Truncated Reverse Path
   Broadcasting (TRPB), Reverse Path Multicasting (RPM) or Shared Trees.
  
   It would be too long to explain them here and, as short descriptions
   for them are publicly available, I'll just recommend reading the
   draft-ietf-mboned-in.txt text. You can find it in the same places RFCs
   are available, and it explains in some detail all the above techniques
   and policies.
  
  
  
   9. Multicast Transport Protocols.
  
   So far we have been talking about multicast transmissions using UDP.
   This is the usual practice, as it is impossible to do it with TCP.
   However, intense research has been taking place for a couple of years
   in order to develop some new multicast transport protocols.
  
   Several of these protocols have been implemented and are being tested.
   A good lesson from them is that it seems no multicast transport
   protocol is general and good enough for all types of multicast
   applications.
  
   If transport protocols are complex and difficult to tune, imagine
   dealing with delays (in multimedia conferences), data loss, ordering,
   retransmissions, flow and congestion control, group management, etc.,
   when the receiver is not one, but perhaps hundreds or thousands of
   sparse hosts. Here scalability is an issue, and new techniques are
   implemented, such as not acknowledging every packet received but,
   instead, sending negative acknowledgements (NACKs) for data not
   received. RFC 1458 gives the proposed requirements for multicast
   protocols.
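
   The NACK idea is easy to illustrate. The sketch below is not taken
   from any of the protocols named in this section; it is a generic
   receiver-side gap detector I wrote for the example, it assumes the
   sender numbers its packets 0, 1, 2... and it ignores sequence number
   wrap-around.

```c
#include <stddef.h>
#include <stdint.h>

/* Receiver state: the next sequence number expected in order. */
struct nack_rx {
    uint32_t next_expected;
};

/* Process the arrival of packet seq. Every sequence number skipped
 * between the last in-order packet and seq is written to missing[]
 * (at most cap of them). Returns how many numbers were written, that
 * is, how many NACKs the receiver would send; 0 means nothing to ask
 * for, so no feedback at all goes back to the sender. */
size_t nack_on_receive(struct nack_rx *rx, uint32_t seq,
                       uint32_t *missing, size_t cap)
{
    size_t n = 0;
    uint32_t s;

    if (seq < rx->next_expected)
        return 0;                   /* duplicate or late retransmission */
    for (s = rx->next_expected; s < seq && n < cap; s++)
        missing[n++] = s;
    rx->next_expected = seq + 1;
    return n;
}
```

   With thousands of receivers and little loss, most packets produce no
   feedback at all, which is the whole point.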
  
   Giving descriptions of those multicast protocols is out of the scope
   of this section. Instead, I'll give you the names of some of them and
   point you to some sources of information. Real-Time Transport Protocol
   (RTP) is concerned with multi-party multimedia conferences. Scalable
   Reliable Multicast (SRM) is used by wb (the distributed White-Board
   tool, see section ``Multicast applications''). Uniform Reliable Group
   Communication Protocol (URGC) enforces reliable and ordered
   transactions based on centralized control. Muse was developed as an
   application-specific protocol: to multicast news articles over the
   MBone. The name of the Multicast File Transfer Protocol (MFTP) is
   quite descriptive by itself; people "join" file transmissions
   (previously announced) in much the same way they would join a
   conference. Log-Based Receiver-reliable Multicast (LBRM) is a curious
   protocol that keeps track of all packets sent in a logging server that
   tells the sender whether it has to retransmit the data or can drop it
   safely as all receivers got it. One protocol with a funny name
   -especially for a multicast protocol- is STORM (STructure-Oriented
   Resilient Multicast). Lots and lots of multicast protocols can be
   found searching the Web, along with some interesting papers proposing
   new uses for multicast (for instance, www page distribution using
   multicast).
  
   A good page providing comparisons between reliable multicast protocols
   is
  
   <http://www.tascnets.com/mist/doc/mcpCompare.html>.
  
   A very good and up-to-date site, with lots of interesting links
   (Internet drafts, RFCs, papers, links to other sites) is:
  
   <http://research.ivv.nasa.gov/RMP/links.html>.
  
   <http://hill.lut.ac.uk/DS-Archive/MTP.html> is also a good source of
   information on the subject.
  
   Katia Obraczka's "Multicast Transport Protocols: A Survey and
   Taxonomy" article gives short descriptions for each protocol and tries
   to classify them according to different features. You can read it in
   the IEEE Communications magazine, January 1998, vol. 36, No. 1.
  
  
  
   10. References.
  
   10.1. RFCs.
  
  
   o RFC 1112 "Host Extensions for IP Multicasting". Steve Deering.
   August 1989.
  
   o RFC 2236 "Internet Group Management Protocol, version 2". W.
   Fenner. November 1997.
  
   o RFC 1458 "Requirements for Multicast Protocols". Braudes, R and
   Zabele, S. May 1993.
  
   o RFC 1469 "IP Multicast over Token-Ring Local Area Networks". T.
   Pusateri. June 1993.
  
   o RFC 1390 "Transmission of IP and ARP over FDDI Networks". D. Katz.
   January 1993.
  
   o RFC 1583 "OSPF Version 2". John Moy. March 1994.
  
   o RFC 1584 "Multicast Extensions to OSPF". John Moy. March 1994.
  
   o RFC 1585 "MOSPF: Analysis and Experience". John Moy. March 1994.
  
   o RFC 1812 "Requirements for IP version 4 Routers". Fred Baker,
   Editor. June 1995.
  
   o RFC 2117 "Protocol Independent Multicast-Sparse Mode (PIM-SM):
   Protocol Specification". D. Estrin, D. Farinacci, A. Helmy, D.
   Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, and
   L. Wei. July 1997.
  
   o RFC 2189 "Core Based Trees (CBT version 2) Multicast Routing". A.
   Ballardie. September 1997.
  
   o RFC 2201 "Core Based Trees (CBT) Multicast Routing Architecture".
   A. Ballardie. September 1997.
  
  
  
   10.2. Internet Drafts.
  
  
   o "Introduction to IP Multicast Routing".
   draft-ietf-mboned-intro-multicast-03.txt. T. Maufer, C. Semeria.
   July 1997.

   o "Administratively Scoped IP Multicast".
   draft-ietf-mboned-admin-ip-space-03.txt. D. Meyer. June 10, 1997.
  
  
  
   10.3. Web pages.
  
  
   o Linux Multicast Homepage.
   <http://www.cs.virginia.edu/~mke2e/multicast.html>
  
   o Linux Multicast FAQ.
   <http://andrew.triumf.ca/pub/linux/multicast-FAQ>
  
   o Multicast and MBONE on Linux.
   <http://www.teksouth.com/linux/multicast/>
  
   o Christian Daudt's MBONE-Linux Page.
   <http://www.microplex.com/~csd/linux/mbone.html>
  
   o Reliable Multicast Links.
   <http://research.ivv.nasa.gov/RMP/links.html>

   o Multicast Transport Protocols.
   <http://hill.lut.ac.uk/DS-Archive/MTP.html>
  
  
  
   10.4. Books.
  
  
   o "TCP/IP Illustrated, Volume 1: The Protocols". Stevens, W. Richard.
   Addison Wesley Publishing Company, Reading MA, 1994.

   o "TCP/IP Illustrated, Volume 2: The Implementation". Wright, Gary
   and W. Richard Stevens. Addison Wesley Publishing Company, Reading
   MA, 1995.

   o "UNIX Network Programming, Volume 1: Networking APIs: Sockets and
   XTI". Stevens, W. Richard. Second Edition, Prentice Hall, Inc.
   1998.

   o "Internetworking with TCP/IP, Volume 1: Principles, Protocols, and
   Architecture". Comer, Douglas E. Second Edition, Prentice Hall,
   Inc. Englewood Cliffs, New Jersey, 1991.
  
  
  
   11. Copyright and Disclaimer.
  
   Copyright 1998 Juan-Mariano de Goyeneche.
  
   This HOWTO is free documentation; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of the
   License, or (at your option) any later version.
  
   This document is distributed in the hope that it will be useful, but
   without any warranty; without even the implied warranty of
   merchantability or fitness for a particular purpose. See the GNU
   General Public License for more details.
  
   You can obtain a copy of the GNU General Public License by writing to
   the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA
   02111-1307, USA.
  
   If you publish this document on a CD-ROM or in hardcopy form, a
   complimentary copy would be appreciated; mail me for my postal
   address. Also consider making a donation to the Linux Documentation
   Project or the Free Software Foundation to help support free
   documentation for GNU/Linux. Contact the Linux HOWTO coordinator, Tim
   Bynum <linux-howto@sunsite.unc.edu>, for more information.
  
  
  
   12. Acknowledgements.
  
   This is the best opportunity I've ever had to thank so many people I
   feel grateful to. So, I'm afraid this is going to be a long
   section... It is, in any case, the most important one of this paper
   (for me, at least...).
  
   First, I want to thank Elena Apolinario Fernández de Sousa (yes, Elena
   is the first name; the REST is THE surname ;-) ). I tried to reflect
   in this Howto all the knowledge I collected while working with her in
   connecting our Department to the MBone and debugging problems with
   locally generated CSCW software across multicast tunnels. She was of
   invaluable help in finding and correcting network problems,
   discovering and fixing kernel bugs that puzzled us for days, ... and
   keeping the sense of humor alive while problems appeared and appeared,
   but solutions didn't. She also read and corrected the drafts for this
   document and provided important ideas and suggestions. If this howto
   is here and is useful for somebody, it will be, in many aspects,
   thanks to her. Thanks, Elena!
  
   There is something I have been lucky enough to find all my
   (still-not-too-long) life and that, despite being repeated, has never
   stopped amazing me. I'm talking about people who altruistically
   devote part of their time and/or resources to helping other people
   learn new things; and, what is better, they enjoy doing it. This
   means not only explaining things they already know, but also lending
   their books, providing access to their sources and making it easier
   for you to learn everything they know; sometimes, even more... I
   know quite a few such people, and I'd like to thank them for all
   their help.
  
   Pablo Basterrechea was my "first source of documentation" while I was
   in my pre-Internet stage. I learned assembly and advanced structured
   programming entirely from his books (well, the latter also from his
   programs...). Thanks for everything, Pablo.
  
   In my first year at the University that "primary source of
   documentation" moved to Pepe Mañas. He was then teaching Computer
   Programming there, and I soon became addicted to his bookshelf. He
   lent me his books lots of times without asking for the slightest
   guarantee that I was going to return them, not even my name! My
   first approach to TCP/IP was also thanks to him: he lent me Comer's
   "Internetworking with TCP/IP, Volume 1" for the whole summer. He did
   not even know my name by then, but he lent me the book... That book
   influenced me a lot, and TCP/IP has become one of my primary fields
   of interest since that summer.
  
   If there are two people I must thank most, they are (in alphabetical
   order ;-) ) José Manuel and Paco Moya. There is nobody I have asked
   more things more times (C, C++, Linux, security, Web, OSs, signals &
   systems, electronics, ... anything!) and, despite my persistence, I
   always got thorough and friendly answers and help. If I'm using
   GNU/Linux now, it is, again, thanks to them. I feel particularly
   lucky to have friends like them. THANKS.
  
   Iñigo Mascaraque also helped (from him I got my first System
   Administration book) and encouraged me in my beginnings, but never
   stopped reminding me that, although this was a fascinating world and
   an important part of my career, I should not forget the other, less
   interesting, parts. (I don't forget, Iñigo!).
  
   As I am on the topic, I'd like to thank my parents, too. They always
   tried to make the best opportunities available for me. Many thanks for
   everything.
  
   I also feel grateful to Joaquín Seoane, the first who trusted me enough
   to give me a root password at the time I was learning system
   administration by myself, and Santiago Pavón, the one who gave me my
   first opportunity here at DIT.
  
   W. Richard Stevens' books have been a real revelation for me (it's a
   pity they are so expensive...). If he ever reads this paper, I'd like
   to thank him for them, and encourage him to keep on writing. Anything
   that comes out of his hands will -undoubtedly- be good for all of us.
  
   Finally, I'd like to thank Richard Stallman, Linus Torvalds, Alan Cox
   and all contributors to the Linux kernel and to free software in
   general, for giving us such a great OS.
  
   I'm sure I'm forgetting someone here... Sorry. I'm certain they know
   I'm grateful to them too, so if they tell me, everybody will know
   it... :-)
  
  
  