Editor's Note: These minutes have not been edited.

MBONED WG Meeting
April 7, 1997
Memphis, TN

David Meyer/University of Oregon, Chair

Reported by Matt Crawford and David Meyer

--------------------------------------------------------------------
Agenda
--------------------------------------------------------------------

I.  Monday, April 7 1300-1500

    (i).    Introduction, Agenda Bashing and Status
    (ii).   Status of Pruning/Pruning Draft
    (iii).  Administratively Scoped Multicast
    (iv).   Using TTLs with Admin Scoped Addresses
    (v).    Using DHCP to allocate multicast addresses
    (vi).   Issues for an Inter-domain Multicast Routing Protocol
    (vii).  Current IDMR Inter-domain Routing Proposals

II. Monday, April 7 1930-2200

    (viii). M-BGP Proposal Overview
    (ix).   Multicast Diagnostic Tools
    (x).    Intro to Multicast Routing
    (xi).   Rate Limiting Draft

--------------------------------------------------------------------
(i). Introduction, Agenda Bashing, and Status
--------------------------------------------------------------------

David Meyer convened the meeting and gave a status report on the
progress made by the MBONED WG. Meyer noted that there is an MBONED
web page at:

    http://www.antc.uoregon.edu/MBONED/

and a mailing list that one can subscribe to:

    majordomo@ns.uoregon.edu.edu
    subscribe mboned

He then asked for additions and changes to the agenda.

--------------------------------------------------------------------
(ii). Progress on the Pruning Draft
--------------------------------------------------------------------

John Hawkinson gave an update on the state of the pruning draft. The
draft was intended to become a BCP, but things got politically
complicated. The IESG wrote a lot of comments, but they never reached
the author. That was fixed, then the ball was dropped. A new draft is
expected in about two weeks.

--------------------------------------------------------------------
(iii).
Update on the admin scoping draft
--------------------------------------------------------------------

Dave Meyer reported that the IANA has allocated the administrative
scope. The draft needs a minor update to the reserved ranges. A new
draft will be posted soon.

--------------------------------------------------------------------
(iv). Using TTLs with admin. scoped addresses
--------------------------------------------------------------------

Ross Finlayson gave an overview of his idea for using TTLs with
admin. scoped addresses.

The problem: pruning doesn't happen if the TTL expires inside the
MBONE, because the router doesn't know that a higher-TTL packet won't
arrive later.

The one-line answer -- make sure the TTL is high enough to reach
admin. scope boundaries. If there were no flood-and-prune protocols,
this wouldn't be an issue.

What TTL should an app. use (when, e.g., sdr doesn't tell it)?

Easy answer: always 255 -- OK on a well-managed intranet. But it's
dangerous for the global Internet -- there's no guarantee scopes are
universally set up correctly. Border routers will get many
flood-and-prune hits.

Proposal -- each admin scope range has an effective TTL large enough
to reach all intended members. Defined along with the range -- by
IANA or dynamically? Apps should always use exactly this TTL. (But a
lower TTL is allowed for expanding-ring search.)

Implications for mrouters

  - No effect on admin scope implementation.
  - mrouters may allow configuration in terms of TTL thresholds
    alone.
      - completely optional feature
      - packets to non-admin-scoped addresses are checked against
        the TTL threshold as usual (but still no pruning).
  - Sysadmins may find TTL-threshold-only configuration simpler.
  - Lazy sysadmins get an easy migration path to admin. scoping.
  - Admin scope boundaries would come automatically with some s/w
    update.

Key Issues

  - App developers need guidance about what TTLs to use w/ admin
    scoped addrs, even if the answer is always 255.
  - What effective TTLs are appropriate for the ranges already
    proposed? (E.g., 15 for site local 239.192/10?)

Q: Van: TTL is used for too many things:

  - loop suppression
  - expanding ring search
  - admin boundaries

The final recommendation was to abolish the use of TTL as a boundary
mechanism. No action was taken on this draft.

--------------------------------------------------------------------
(v). Using DHCP to allocate multicast addresses
--------------------------------------------------------------------

Baiju Patel discussed his draft on using DHCP for multicast address
allocation.

Motivation:

  - Coordinated address allocation
  - Avoid collisions
  - Support admin. scoped addresses
  - Common solution for all applications

Problem:

  1. mechanism for allocating addresses
  2. allocation policy
  3. usage policy

Address #1 now, defer 2 & 3.

Current art:

  - Application-specific guess (SDR)
  - A gatekeeper manages a set of addresses
  - Or individual solutions -- hard to manage!

Parameters which need to be associated with a multicast address:

  - Scope
      Classes of scopes: IANA defined or locally administered
      Examples: Intel/Jones Farm, Intel/Oregon, Intel/USA, ...
      Implementation of scope: by TTL or router configuration --
      not part of this proposal.
  - Start time
  - Lease time
  - TTL? You should be given an upper bound when you get the right
    to use an address.

Tried to leverage what exists, where possible ...

DHCP -- used for two purposes:

  - Allocation of IP addresses
  - Distribution of host configuration parameters (DNS server, NNTP
    server, Router, ...)

Can we meet requirements with DHCP extensions?

  - Extend DHCP to provide scope information (a number and a string)
  - Provide mechanisms to request, allocate, renew lease, and release
    assignments of mcast addrs

Details

  - Client obtains scope list and unicast or multicast address for
    MDHCP server using DHCPINFORM.
  - Client selects a scope, then sends a unicast or multicast
    DHCPDISCOVER
  - Server(s) send unicast DHCPOFFER of multicast address(es)
  - Client sends unicast or multicast DHCPREQUEST
  - Server sends DHCPACK or DHCPNAK.

When a lease is created, some cookie is given out. Any participant
can use the cookie to renew the lease in order, for example, to
continue a conference after its creator has left.

Other features --

  - Client MDHCP and DHCP could be in a single implementation.
  - Client MDHCP and DHCP could be separate; the MDHCP client would
    use (a new?) PORT option.
  - Servers could be combined or separate.

Use of multicast or unicast ensures that DHCP-only implementations
are not affected by MDHCP messages.

Guidelines

  - Server MUST NOT allocate the same address with overlapping scope
    & time to two different requests.
  - Server SHOULD avoid allocating an address that is already
    allocated for a different time in the same scope.

Questions:

  Crawford: Use of multicast for MDHCP non-interference with DHCP
  won't work in v6.

  Handley: SDR's scaling is good.

  Jacobson: Yes, and the scaling of MDHCP is bad. No structure -
  space is flat - multiple servers must coordinate, either by
  structure (which wastes a lot) or by talking between servers. How?
  Block of addresses, if your app. needs more than one.

  Someone: but at a corporate-sized scale, this can work.

No action taken by the working group.

--------------------------------------------------------------------
(vi). Issues for an Inter-domain Multicast Routing Protocol
--------------------------------------------------------------------

David Meyer discussed his draft on issues for an Inter-domain
Multicast Routing Protocol. During the description, he pointed out
that many common Internet applications exhibit point-to-multipoint
or "multicast" behavior.
The world wide web's data distribution and caching models, soft
real-time applications such as video conferencing and application
sharing, and USENET News (NNTP) are examples of common applications
which assume a point-to-multipoint distribution model.

These applications have historically relied on unicast technologies
to implement point-to-multipoint or multipoint-to-multipoint
communication topologies. In particular, multipoint communication in
the Internet has been, and in many cases still is, implemented by
replicating unicast flows, one for each receiver. Replicating unicast
flows is clearly neither efficient nor scalable; the problem is
exacerbated in the inter-domain environment, where resources are
particularly scarce.

Over the past few years, however, efficient IP layer multicasting has
been recognized as one of the essential architectural features
required in an Internet that can scale to very large sizes. In recent
years we have seen the first multicast routing and forwarding
protocols designed and deployed on the Internet [DVMRP, PIMARCH,
CBT]. While these protocols have been relatively successful in
small-scale deployment [MBONE], they have exhibited various
deficiencies when scaling to Internet sizes while still providing
adequate policy control for network service providers.

The seminal work on multicast routing is contained in Steve Deering's
thesis [DEERING89]. Deering outlined a mechanism for building
shortest path multicast distribution trees (SPT) using Reverse Path
Forwarding (RPF). For each (S,G) pair, the method builds a
distribution tree rooted at the source such that each receiver is on
the shortest path back to the source (the notation (S,G) represents a
source,group pair). When the path between a sender and receiver is
symmetrical, RPF computes the shortest path tree. The method is
data-driven: data is flooded down all the branches of the
distribution tree.
Nodes that have no downstream receivers send "prune" packets upstream
(toward the source) to prune branches of the tree which have no
receivers. This protocol has become known as Dense-Mode [PIM-DM], and
is most useful when group members are densely clustered in some part
of the topology.

To address the case of sparsely distributed group members, Minimal
Shared-Tree (MST) distribution algorithms have been introduced.
Shared distribution trees have their origin as approximations of
Steiner Minimal Trees, and use a variation on Center-based trees
[WALL80] as their basis. Current shared tree multicast distribution
algorithms include Core Based Trees [CBT] and Protocol Independent
Multicasting Sparse Mode [PIM-SM]. The root of the shared tree is
often called a Core or Rendezvous Point (RP).

Shortest Path and Shared Tree algorithms represent trade-offs [WEI93]
at three fundamental points in the design space:

- Delay

  Shared tree algorithms have worse delays for large groups, since no
  known RP placement can produce shortest paths. Shared tree
  algorithms also don't handle dynamic group membership as well as
  shortest path trees, since optimal RP placement is a function of
  group membership distribution. In summary, SPT multicast routing
  algorithms like DVMRP or MOSPF [MOSPF] have worst case delay
  bounded by the round-trip time (RTT) from the receiver to the
  sender, whereas shared tree algorithms like PIM-SM and CBT have
  worst case delay bounded by twice the RTT from receiver to RP/core
  (assuming symmetrical unicast routing from sender to receiver).

- Traffic Concentration

  Traffic concentration is a well known problem for shared tree
  protocols. For some important classes of topologies, shared tree
  and source trees yield the same delay characteristics.

- State information

  Shortest path trees require much more state information. Shared
  tree approaches only require a single tree, used by all, while the
  shortest path trees are relative to each site as source.
  It is possible that shortest path trees could require an (S,G) pair
  for every active sender or receiver in the Internet.

Today's multi-provider Internet reveals a fourth issue in the
traditional design space: the ability to express and implement
inter-provider policies. However, unlike current inter-domain unicast
routing protocols (which have a rich and well developed policy
model), neither of the two classes of algorithms described is
adaptable in a straightforward way to the policy oriented
multi-provider environment found in today's Internet.

A simple example illustrates the problem: Consider three providers,
A, B, and C, that have connections to a shared-media exchange point.
Assume that connectivity is non-transitive due to some policy (the
common case, since bi-lateral agreements are a very common form of
peering agreement). That is, A and B are peers, B and C are peers,
but A and C are not peers. Now, consider a source S covered by a
prefix P, where P belongs to a customer of A (i.e., P is advertised
by A). Multicast packets forwarded by A's border router will be
correctly accepted by B's border router, since it sees the RPF
interface for P to be the shared media of the exchange. By contrast,
C's border router will reject the packets forwarded by A's border
router because, by definition, C's border router does not have A's
routes through its interface on the exchange (so packets sourced
"inside" A fail the RPF check in C's border router).

In the example above, RPF is a powerful enough mechanism to inform C
that it should not accept packets sourced in P from A over the
exchange. However, consider the common case in which P is multi-homed
to both A and B. C now sees a route for P from B through its
interface on the exchange.
Without some form of multi-provider cooperation and/or packet
filtering (or a more sophisticated RPF mechanism), C could accept
multicast packets sourced by S from A across the shared media
exchange, even though A and C have not entered into any agreement on
the exchange.

The situation described above is caused by the overloading of the
semantics of unicast routes, and the reliance on the RPF check as a
policy mechanism.

Note: During the IDMR meeting, it was suggested that this document be
reworked into a requirements document. Meyer agreed to edit the
document.

--------------------------------------------------------------------
(vii). Current IDMR Inter-domain Routing Proposals
--------------------------------------------------------------------

Dave Thaler (Deborah Estrin, Dino Farinacci) -- IDMRP concepts and
deployment

"A number of ideas that a number of us have been banging around for
the last year or so."

Goals of an IDMRP:

  - low BW overhead
  - minimize state
  - low CPU consumption
  - fast convergence

Today's interoperability can achieve source-specific trees in
flood-and-prune regions.

  - Either no pruning, or state per source.
  - Must flood either the data or the membership information.
  - Result: NOT acceptable

Not a new problem - inter-domain mcast has the same menu of choices
as intra-domain mcast.

Concentrate work on an O(G) state O(C) mumble protocol, somewhat like
HDVMRP.

Future goal: "Group-shared tree of domains"

Features of IDMRP scheme in progress:

  - independent of M-IGP
  - Group-shared trees of domains (low state volume)
  - Bidirectional group-shared tree (minimize 3rd party dependence)
  - Group state is aggregatable

A group-shared tree of domains has effects on policy:

  - Can't have source-specific policies on group shared trees
  - Can have group-specific policies

Requiring a switch to SPT first (before receiving data) would
introduce a bursty-source problem, so don't require that.

Summary: tradeoff between amount of state and amount of policy
control.
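
The state side of this tradeoff can be made concrete with a small
sketch. It is illustrative only and not taken from any of the drafts
discussed: the workload, the function names, and the "(*,G)" label
for a group-shared entry are assumptions made for the example. It
counts the forwarding entries one router would hold if it kept one
entry per (S,G) pair versus one entry per group.

```python
# Illustrative sketch: forwarding-state size under two hypothetical models.
# The workload and all names here are assumptions, not from the minutes.


def source_tree_entries(senders_per_group):
    """Source-specific trees: one (S,G) entry per active sender per group."""
    return sum(len(senders) for senders in senders_per_group.values())


def shared_tree_entries(senders_per_group):
    """Group-shared trees: one entry per group, however many senders it has."""
    return len(senders_per_group)


# Hypothetical workload: group address -> set of active sender addresses.
workload = {
    "224.2.0.1": {"10.0.0.1", "10.0.1.1", "10.0.2.1"},
    "224.2.0.2": {"10.0.0.1"},
    "224.2.0.3": {"10.0.3.1", "10.0.4.1"},
}

print("source-specific (S,G) entries:", source_tree_entries(workload))
print("group-shared (*,G) entries:", shared_tree_entries(workload))
```

With this made-up workload the per-source model holds six entries
and the group-shared model three. The shared entry no longer names a
source, which is one way to see why source-specific policy cannot be
expressed on a group-shared tree.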
Using center-based trees requires mapping group G to a root for RPF;
however, for locality, we want every domain to be a potential root
domain (the root gets all data). E.g., a group for a "lecture" should
be rooted at the lecturer's domain. The advertising-to-BSR model is
not as scalable when the number of candidate RPs is unbounded. We
want group state to be aggregatable.

Proposal: put some structure into multicast addresses

  -- Dynamically assign group prefixes to user domains with
     topological significance (and repeat for each scope level).
  -- What do you do the RPF check against? Use a "representative IP
     address" (aka Landmark address) for the RPF check at BRs. BRs
     must map G to the Landmark address ...
  -- Implications: Local-prefix periodically announced in each
     domain. Address allocators (e.g., SDR) see a prefix per scope
     and allocate from that range.
  -- Other allocation schemes merely give a poorer tree of domains
     for the traffic.

Policy II - Tragedy of the commons

  -- Transiting multicast means you're willing to transit between
     subtree branches.
  -- Since no single domain is the root for all groups, this (bold
     assertion follows) evens out.
  -- Result: less overhead for everyone. If people honor the
     recommended address allocation, the groups you provide transit
     for are usually your customers' groups.

Q: Draw the picture in the real internet and "usually" ->
"occasionally."

Q: A long three-way discussion among Randy Bush, Van Jacobson and Sue
Hares about transit, policy, peering, and so on.

--------------------------------------------------------------------
(viii). M-BGP
--------------------------------------------------------------------

Tony Ballardie described his draft with Martin Tatham that extends
BGP to support inter-domain multicast routing (policy). The main
issues here include the fact that this proposal just builds paths,
and not distribution trees. An interesting "convergence" is occurring
with the IDR. They are looking at various multi-protocol BGP
proposals.
Of course, M-BGP could be viewed as another address in this context.

--------------------------------------------------------------------
(ix). Multicast Diagnostic Tools
--------------------------------------------------------------------

Bernard Aboba outlined the current draft. Few comments were given. No
working group action was taken.

--------------------------------------------------------------------
(x). Intro to Multicast Routing
--------------------------------------------------------------------

Tom Maufer introduced himself and asked for additional comments.
draft-ietf-mboned-intro-multicast-00.txt is in WG last call.

--------------------------------------------------------------------
(xi). Rate Limiting Draft
--------------------------------------------------------------------

Doug Junkins outlined the current draft. A few comments were given. A
new draft will be posted.