DNS and routing of IPv6 micro allocations

Currently, there are very few people who want to run an IPv6-only network. And that's a good thing too, as presently, there is no way to do this. One of the big hurdles is the DNS. Right now, very few, if any, top level domains accept IPv6 glue records. However, there are no technical reasons why those can't be added.

Unfortunately, there is a technical reason why making the existing root nameservers perform their function over IPv6 is problematic. When a nameserver starts up, it reads a list of root servers from a local file. However, it will only use this list for a single query: one that results in the list of current root servers. In order to avoid problems, it's important that the answer to this query contains all the addresses for the root servers as additional information. The problem is that the original DNS specifications limit the size of a DNS message sent over UDP to 512 bytes. This allows for the current 13 root servers and their IPv4 addresses with little room to spare.
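To get a feel for how tight the 512 bytes are, here is a back-of-the-envelope sketch in Python. It assumes the usual DNS wire format with name compression, root server names of the form a.root-servers.net, and no extensions that allow larger packets. The function names are made up for this illustration.

```python
# Rough size estimate for the answer to the ". NS" query a nameserver
# sends at startup. Assumes standard DNS wire format with name compression.

HEADER = 12                 # fixed DNS header
QUESTION = 1 + 2 + 2        # root name (1 byte) + QTYPE + QCLASS
RR_FIXED = 2 + 2 + 4 + 2    # TYPE + CLASS + TTL + RDLENGTH
A_RR = 2 + RR_FIXED + 4     # owner is a 2-byte pointer, 4-byte address
AAAA_RR = 2 + RR_FIXED + 16 # same, but with a 16-byte address


def wire_len(name):
    """Uncompressed length of a domain name in wire format."""
    labels = [label for label in name.split(".") if label]
    return sum(1 + len(label) for label in labels) + 1  # +1 for the root


def priming_size(servers=13, aaaa=0):
    """Estimated response size with `servers` NS/A pairs and `aaaa` AAAA records."""
    # The first NS target is spelled out in full; the others are a single
    # one-letter label followed by a 2-byte compression pointer.
    first_ns = 1 + RR_FIXED + wire_len("a.root-servers.net.")
    other_ns = 1 + RR_FIXED + (1 + 1 + 2)
    return (HEADER + QUESTION + first_ns + (servers - 1) * other_ns
            + servers * A_RR + aaaa * AAAA_RR)


base = priming_size()
print(base, "bytes for 13 NS records plus 13 A records")
print((512 - base) // AAAA_RR, "AAAA records fit in what is left")
```

Under these assumptions, the remaining space is nowhere near enough to add an IPv6 address for every root server.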

But in the meantime, some root server operators are experimenting with making the root service available over IPv6. (See http://www.root-servers.org/ for more information.) At the time of this writing, four root servers have IPv6 addresses:

B: 2001:478:65::53
F: 2001:500::1035
H: 2001:500:1::803f:235
M: 2001:dc3::35

However, only B and M are reachable (for me). A closer look at the addresses used provides the following information:

B: 2001:478:65::53 -> 2001:478::/32 @ ARIN: EP.NET
F: 2001:500::1035 -> 2001:500::/48 @ ARIN -> Internet Software Consortium
H: 2001:500:1::803f:235 -> 2001:500:1::/48 @ ARIN: U.S. Army Research Laboratory
M: 2001:dc3::35 -> 2001:dc3::/32 @ APNIC: M-ROOT-DNS-IPv6-20030619

The plot thickens... Since everyone and their little sister can easily obtain a /48 worth of IPv6 address space (I have two of those for personal use), it's expected that the global IPv6 routing table will suffer a lot of pollution from /48s, much like what happens with /24s in the IPv4 routing table, only worse. So it's unavoidable to filter on prefix length and not accept /48s.

(Additionally, it looks like the H /48 isn't announced at all: the route doesn't show up on the AMS-IX IPv6 looking glass, which does show the F /48 and other more specifics.)

When this issue came up on the IETF mailing list, Paul Vixie, operator of the F root server, indicated that he had simply followed ARIN guidelines and obtained a /48 "micro allocation" from ARIN. It turns out ARIN has set aside some address space for internet exchanges and "critical infrastructure". This address space is given out as /48s, see List of IPv6 Micro-allocations. (RIPE has a somewhat similar page at Smallest RIPE NCC Allocation / Assignment Sizes but it doesn't mention micro allocations.) All of this seems perfectly reasonable, except for one thing:

the existence of micro allocations is never mentioned in the RIR's IPv6 policy document.

This document, which is available in slightly different layouts and versions from LACNIC, APNIC, RIPE, ARIN and, for good measure, from IANA, says:

"4.3. Minimum Allocation

RIRs will apply a minimum size for IPv6 allocations, to facilitate prefix-based filtering.

The minimum allocation size for IPv6 address space is /32."

And this is exactly what many ISPs that offer IPv6 service do: they filter on a prefix length of 32 bits as indicated above, or 35 bits, the old allocation size. Obviously someone dropped the ball big time here, and this needs to be fixed in one way or another. Watch this space for more information. In the meantime, be sure to selectively relax your filters if you do prefix-based filtering in IPv6. Gert Döring maintains a set of IPv6 BGP filter recommendations.
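As a rough illustration of what such a filter does to the four root server prefixes listed earlier, here is a small Python sketch. The accepted() function and the exception list are made up for this example and are not taken from any real router configuration or from Gert's recommendations.

```python
import ipaddress

# Prefixes covering the four IPv6-enabled root servers listed earlier.
ROOT_PREFIXES = {
    "B": "2001:478::/32",
    "F": "2001:500::/48",
    "H": "2001:500:1::/48",
    "M": "2001:dc3::/32",
}


def accepted(prefix, max_len=32, exceptions=()):
    """Accept a route if it is short enough or falls inside an exception."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen <= max_len:
        return True
    return any(net.subnet_of(ipaddress.ip_network(e)) for e in exceptions)


# Hypothetical "selectively relaxed" filter: let the two root /48s through.
RELAXED = ("2001:500::/48", "2001:500:1::/48")

for letter, prefix in ROOT_PREFIXES.items():
    strict = accepted(prefix)
    relaxed = accepted(prefix, exceptions=RELAXED)
    print(letter, prefix, "strict:", strict, "relaxed:", relaxed)
```

With the strict /32 filter only the B and M prefixes get through. Setting max_len=35 for the old allocation size gives the same result for these four prefixes.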

Permalink - posted 2003-12-09

Station Den Haag Centraal - Koningin Julianaplein

Image link - posted 2003-10-03 in

RIPE 46 Wednesday - Routing, IPv6

Wednesday

Routing

Wednesday brought sessions about my two favorite subjects: routing and IPv6. However, I didn't find most of the routing subjects very interesting: RIS Update, Verification of Zebra as a BGP Measurement Instrument, Comparative analysis of BGP update metrics. The last one sounds kind of interesting but it comes down to a long analysis of what you get when you compare BGP updates gathered at different locations such as the Amsterdam Internet Exchange looking glass and the Oregon Internet Exchange Route Views.

Yesterday's presentation in the routing wg about bidirectional forwarding detection (that I completely forgot about during all the train rerouting) was much more interesting. Dave Katz and Dave Ward wrote an Internet Draft, draft-katz-ward-bfd-01.txt, for a new protocol that makes it possible for routers to check whether the other side is still forwarding. This goes beyond the link keepalives that many protocols employ, because it also tests whether any actual forwarding is happening. The protocol also works for unidirectional links and, to top it all off, it works at millisecond granularity. There is a lot of interest in this protocol, so there is considerable pressure to get it finished soon.

But Wednesday's routing session wasn't a complete write-off as Pascal Gloor presented the Netlantis Project. This is a collection of BGP tools. Especially the Graphical AS Matrix Tool is pretty cool: it shows you the interconnections between ASes. I'm not exactly sure how it decides which ASes to include, but it still provides a nice overview.

IPv6

In the afternoon there was the IPv6 working group session which conflicted with the Technical Security working group session which I would also have liked to attend...

Kurtis Lindqvist presented an IETF multi6 wg update.

Gert Doering talked about the IPv6 routing table. Apart from the size, there are some notable differences with IPv4: IPv6 BGP interconnection doesn't reflect business relationships or anything close to physical topology: people are still giving away free IPv6 transit and tunneling all over the place. This is getting better, though. (The problem with this is that you get lots of routes but no way to know in advance which are good. Nice to have free transit, not so nice when it's over a tunnel spanning the globe.) There are now nearly 500 entries in the global IPv6 table, which is nearly twice as many as two years ago. About half of those are /32s from the RIRs (2001::/16 space), and the rest are more or less equally distributed over /35s from the RIRs and /24s, /28s and /32s from 6bone space (3ffe::/16).

I'm not sure if it was Gert, but someone remarked during a presentation: "In Asia, they run IPv6 for production. In Europe, they run it for fun. In the US, they don't run it at all."

Jeroen Massar talked about "ghost busting". When the Regional Internet Registries started giving out IPv6 space, they assigned /35s to ISPs. Later they changed this to /32s. The assignments were done in such a way that an ISP could simply change their /35 announcement to a /32 announcement "in place". However, this is not entirely without its problems as the BGP longest match first rule dictates that a longer prefix is always preferred (such as a /35 over a /32), regardless of the AS path length or other metrics. With everyone giving away free transit, there are huge amounts of potential longer paths that BGP will explore before the /35 finally disappears from the routing table and the /32 is used.

To add insult to injury, there appear to be bugs that make very long AS paths stay around when they should have disappeared. These are called "ghosts", hence the ghost busting. See the Ghost Route Hunter page for more information.
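The longest match rule behind the /35 versus /32 problem can be sketched in a few lines of Python. The prefixes come from the 2001:db8::/32 documentation range and the AS numbers from the range reserved for documentation; they stand in for a real ISP's announcements, and best_route() is a made-up helper, not how any particular router implements its lookup.

```python
import ipaddress


def best_route(table, destination):
    """Return the matching route with the longest prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [r for r in table if addr in ipaddress.ip_network(r["prefix"])]
    return max(matches,
               key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen,
               default=None)


table = [
    # The new /32 announcement with a short AS path.
    {"prefix": "2001:db8::/32", "as_path": [64500]},
    # A leftover /35 that is still wandering around over a long path.
    {"prefix": "2001:db8::/35",
     "as_path": [64501, 64502, 64503, 64504, 64500]},
]

route = best_route(table, "2001:db8::1")
print(route["prefix"], "via AS path", route["as_path"])
```

As long as the /35 is in the table, traffic for addresses inside it follows the long path, no matter how short the path for the /32 is. Only addresses in the /32 that fall outside the /35 use the /32.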

Permalink - posted 2003-09-16

IPv4 Address Lifetime Expectancy Revisited

At the end of the Thursday plenary at the RIPE 46 meeting, Geoff Huston presented IPv4 Address Lifetime Expectancy Revisited (PDF).

If looking at the slides leaves you puzzled, have a look at one of Geoff's columns from a few months ago that (as always) explains everything both in great detail and perfect clarity: IPv4 - How long have we got? (The full archive is available at http://www.potaroo.net/papers.html.)

Geoff looks at three steps in the address usage process:

Allocation of a /8 from IANA to a Regional Internet Registry
Assignment of address space from a RIR to someone who asks for it (usually an ISP)
The addresses showing up in the global BGP routing table

It looks like the free IANA space is going to run out in 2019. But the RIRs hold a lot of address space in pools of their own; if we include this, the critical date becomes 2026. Initially, projections of BGP announcements would indicate that all the regularly available address space would be announced in 2027, or a year later if we include the class E (240.0.0.0 and higher) address space. However, Geoff didn't stop there. After massaging the data, his conclusion was that the growth in BGP announcements doesn't seem exponential after all, but linear. With the surprising result:

"Re-introducing the held unannounced space into the routing system over the coming years would extend this point by a further decade, prolonging the useable lifetime of the unallocated draw pool until 2038 - 2045."

Now of course there are lots of disclaimers: whatever happened in the past isn't guaranteed to happen in the future, that kind of thing. This goes double for the BGP data, as this extrapolation is only based on three years of data. But still, many people were pretty shocked. It was a good thing this was the last presentation of the day, because there were soon lines at the microphones.

So what gives? Ten years ago the projections indicated that the IPv4 address space would be depleted by 2005. Now, an internet boom and large scale adoption of always-on internet access later, it isn't going to be another couple of years, but four decades? Seems unlikely. Obviously CIDR, VLSM, NAT and Ethernet switching that allows much larger subnets have all slowed down address consumption. But I think there are some other factors that have been overlooked so far. One of those is that some of the old assignments (such as entire class A networks) are being used up right now. For instance, AT&T Worldnet holds 12.0.0.0/8, but essentially this space is used much like a RIR block: ranges are further assigned to end-users. We're probably also seeing big blocks of address space disappearing from the global routing table because announcing such a block invites too much worm scanning traffic. On the other hand there are also reports of "ISPs" that assign private address space to their customers and use NAT. So I guess there is a margin of error in both directions.

At the same time, the argument can be made that for all intents and purposes the IPv4 address space has already run out: it's way too hard to get the address space you need (let alone want). This is in line with what Alain Durand and Christian Huitema explain in RFC 3194. They argue that the logarithm of the number of actually used addresses divided by the logarithm of the number of usable addresses (the HD ratio) represents a pain level: below a ratio of 80% there is little or no pain, trouble starts at 85% and 87% represents a practical maximum. For IPv4 that would be 211 million addresses used (note that in the RFC the number is 240 million, but this is based on the full 32 bits while a little over an eighth of that isn't usable).
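The numbers above are easy to check with a few lines of Python. This is only a sketch: the figure of 3.7 billion usable addresses is an approximation, and the function names are made up for the example.

```python
import math


def hd_ratio(used, usable):
    """HD ratio: log(addresses in use) / log(addresses available)."""
    return math.log(used) / math.log(usable)


def addresses_at(ratio, usable):
    """Number of addresses in use that corresponds to a given HD ratio."""
    return usable ** ratio


FULL_SPACE = 2 ** 32   # all 32 bits, as used in the RFC
USABLE = 3.7e9         # assumption: roughly what is left after unusable ranges

rfc_limit = addresses_at(0.87, FULL_SPACE)
real_limit = addresses_at(0.87, USABLE)

print(f"87% of the full 32 bits: {rfc_limit / 1e6:.0f} million")
print(f"87% of the usable space: {real_limit / 1e6:.0f} million")
```

Calling hd_ratio() with a host count and USABLE gives the ratio for any other figure you want to compare against the 80%, 85% and 87% levels.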

According to the latest Internet Domain Survey we're now at 171 million. This is counting the number of hosts that have a name in the reverse DNS, so the real number is probably higher.

I think RFC 3194 is on the right track but rather than simply do a log over the size of the address space, what we should look at is the number and flexibility of aggregation boundaries. In the RFC phone numbers are cited. Those have one aggregation boundary: area code vs local number, with a factor 10 flexibility. This means wasting a factor 10 (worst case) once. Classful IPv4 also had a single boundary, but the jumps are 8 bits, so a waste of a factor 256. With classless IPv4 we have more boundaries, but they're only one bit most of the time: IANA->RIRs, RIR->ISPs, ISP->customers and subnet. That's four times a factor two, so a factor 16 in total. That means we can use 3.7 billion addresses / 16 = 231 million IPv4 addresses without pain. Hm... But we can collapse some boundaries to achieve better utilization.
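The arithmetic for the aggregation boundaries can be written out as follows. This is just my back-of-the-envelope reasoning in Python form; the boundary names and the 3.7 billion figure are the ones used above, and painless_capacity() is a made-up name.

```python
import math


def painless_capacity(usable, boundaries):
    """Divide the usable space by the worst-case waste at every boundary."""
    waste = math.prod(boundaries.values())
    return usable / waste, waste


USABLE = 3.7e9  # approximate number of usable IPv4 addresses

classless = {
    "IANA->RIRs": 2,
    "RIR->ISPs": 2,
    "ISP->customers": 2,
    "subnet": 2,
}

capacity, waste = painless_capacity(USABLE, classless)
print(f"waste factor {waste}: {capacity / 1e6:.0f} million addresses")

# Collapsing one boundary (which one is picked arbitrarily here) removes
# one factor of two.
collapsed = {k: v for k, v in classless.items() if k != "RIR->ISPs"}
capacity2, waste2 = painless_capacity(USABLE, collapsed)
print(f"waste factor {waste2}: {capacity2 / 1e6:.0f} million addresses")
```

Every boundary that is removed halves the worst-case waste, which is why collapsing boundaries improves utilization.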

Permalink - posted 2003-09-16

RIPE 46 Intro

RIPE 46, September 1 - 5, Amsterdam

Three times a year there is a RIPE meeting. Twice a year it's in the Krasnapolsky hotel in Amsterdam, and one is elsewhere in Europe.

This week there is the RIPE 46 meeting, once again in Amsterdam. Note that if you can't attend, you can follow what's happening using the experimental streaming service. The streaming bandwidth is around 225 kbps for the highest quality but you can fall back to lower quality or audio only, I think. It works with both Windows Media Player and VideoLAN Client on my Mac.

There are also archives of the streamed sessions. The presentation slides are also generally available.

Permalink - posted 2003-09-01

RIPE 46 Monday - EOF

Monday

VoIP

On Monday there were talks about voice over IP the whole day as part of the European Operator Forum (EOF). I had somewhat mixed feelings about this. I'm pretty interested in VoIP, but I haven't done anything with it in practice, so I was expecting to learn a few things. Unfortunately, many of the talks were way too detailed, explaining stuff like the old electromechanical switching mechanisms in the phone network. There was also lots of stuff on how to interconnect your VoIP stuff with the plain old telephone system. This could/should be interesting but I found it again too detailed. I guess I would have liked a smaller scale, more practical approach on how to call over the net, without immediately focusing on the POTS network, as I'm not going to get rid of my existing phone just yet.

SIP/H.323

But some cool stuff: I got to know a little bit more about SIP vs H.323. Apps like Microsoft's Netmeeting use the ITU H.323 protocol family as their signalling protocol, but today's products are more inclined towards SIP, which is an IETF standard. Note that the actual voice packets are governed by a host of other protocols. Usually, it's even possible to call using an IP address without using SIP or H.323 on VoIP (Ethernet) phones. But a SIP server/proxy provides all the features that usually come from a PABX: implementing numbering plans, connecting to gateways, transferring calls, putting calls on hold, that kind of thing.

There are now VoIP phones that cost about 60 - 75 dollars/euros and there is the free Asterisk "Open Source Linux PBX" software.

MPLS DoS traffic shunt

In the afternoon, there also was a presentation about using an MPLS DoS traffic shunt. (Also presented at NANOG.) This is basically similar to what I talk about in my anti-DoS article, but they use MPLS to backhaul the traffic to a location where there is a Riverhead anti-DoS filtering box and then push the traffic out to where it needs to go. The MPLS paths are automatically created when the right iBGP routes are present, but COLT (who implemented this) doesn't want its customers to enable this automatically: the NOC must create the iBGP routes manually. Note that the reason this makes sense is that COLT has huge amounts of bandwidth and around 60 locations where they interconnect with other networks, but the Riverheads are way too expensive to buy 60 of them.

Permalink - posted 2003-09-01
