Border Gateway Protocol and Routing Scalability

In computer science, an important concept is scalability, or how well a way to handle a certain task keeps working as the size of the task increases. For instance, writing phone numbers on scraps of paper works fairly well when you need to keep track of a dozen phone numbers: it only takes ten seconds to find a given one. But for a city with 100,000 people, it now takes a hundred thousand seconds (about a day) to find a number. Using a phone book for a city with a population of 100,000, it takes about half a minute to find a phone number that goes with a given name. The big advantage isn't so much that using a book is much faster than using individual scraps of paper, but rather that when doubling the size of the problem, you're not doubling the amount of work to solve it: searching through a phone book that's twice as big only takes a couple of extra seconds: is the name I'm looking for in the first half of the second half? It doesn't take twice as long, and thus phone books are scalable but scraps aren't. Routing scalability is applying the notion of scalability to the problem of delivering packets to the right destination over the Internet.

Scalability in Data Routing

Routing scalability consists of two issues: the management plane and the data plane.

The data plane is the central or distributed module in a router that takes incoming packets and forwards them to the next router on their way to their destination. This function must for each forwarded packet find the next hop in the forwarding table. The two main mechanisms for doing this are a TCAM, a specialized memory with built-in hardware support for searching through it, and regular memory that is searched using advanced algorithms. The speed of the lookups doesn't go down as the table size increases. However, the TCAM or memory size goes up linearly (or a little faster than that for the multi-level lookups), which increases cost and power use. Additionally, as the number of forwarding table lookups per second increases, more expensive and power-hungry technologies must be used. Such increases are unavoidable as interface speeds go up, but also depend on average or worst-case packet sizes and the number of interfaces per device or per blade/module in certain router architectures.

During the Internet Architecture Routing and Addressing workshop held in Amsterdam in 2006, it was argued that the required memory speed increases outstrip the performance increases in off-the-shelf components, especially now that separate SRAMs aren't in wide use anymore. Previously, computers used high-speed SRAM as memory cache, but these days that function is included on the CPU itself, so SRAM is no longer an easily available commodity chip anymore. This means that costs for the highest-end routers will go up much faster than they have been so far. However, after the IAB routing and addressing workshop, several router vendors have come out and stated in conversations and on mailing lists that this problem is not immediate at this time and that growth at the currently predicted levels will not pose problems in the foreseeable future.

Border Gateway Protocol

The management plane consists of a route processor that executes the BGP routing protocol and related tasks that must be performed by a router to be able to create a forwarding table. BGP is the protocol that ISPs and some other networks use to tell each other which IP addresses are used where, so the packets destined for those IP addresses can be forwarded correctly. BGP scalability is affected by the need to communicate updates, store them in the router and process them. At this time, bandwidth to propagate updates isn't a problem at all. In practice, the memory requirements to store increasingly large BGP tables can pose a problem, this is usually due to implementation limitations in commercially available routers, not because of inherent technological issues. A route processor is basically a general-purpose computer, which can now easily be built with 16 gigabytes or more RAM. Currently, the Route Views public route server runs with 1 GB RAM and has around 40 full BGP feeds of approximately 560,000 prefixes each (December 2015 figures).

However, this leaves the processing. The amount of processing required for BGP depends on the number of BGP update messages and the number of prefixes per message. Since the number of prefixes per update message is rather small, we'll ignore that aspect and just look at the number of updates. Presumably, apart from any autonomous growth, the number of updates goes up linearly with the number of prefixes. The actual processing of BGP updates is very limited, so the bottleneck is the time it takes to access memory for performing an update. Also during the IAB routing and addressing workshop, information was presented that indicates that the increase in DRAM speed is quite limited and won't be able to keep up with routing table growth.

Forwarding Table Syncing

Apart from the separate forwarding and data plane issues, there's the problem of syncing the forwarding table with the BGP/routing table after updates. Depending on the architecture of the forwarding table, updating it may be relatively time consuming. BGP is often described as a "path vector" routing protocol, very similar to distance vector protocols. As such, it implements a slightly modified version of the Bellman-Ford algorithm, which, in theory at least, requires a number of iterations equal to the number of nodes (in the case of BGP: external autonomous systems as well as internal iBGP routers) in the graph minus one to converge. In practice, convergence happens much faster because it's not a viable design to use the longest possible path between two locations in the network. However, a significant number of iterations in the form of distinct updates that must be processed can occur after a single event because of multiplication effects. For instance, in the case where two ASes interconnect in two locations, one update in the first AS will be propagated twice to the second AS through each interconnecting link. This leads to the following possible options:

Nature of the update	After first update	After second update
New path is better, first update is best	Install new path, propagate	No action
New path is better, second update is best	Install new path, propagate	Install new path, propagate
New path is worse, first update is best	Reroute, propagate	No action

This aspect of BGP is not explicitly recognized by many people, although studies such as Route Flap Damping Exacerbates Internet Routing Convergence do address the resulting behavior.

With the above in mind, we can conclude that BGP has some scaling issues: the protocol and the routers implementing it aren't prepared for an Internet where perhaps five million and certainly 50 million individual prefixes need to be managed by BGP. However, the current growth is relatively stable at about 16% per year for IPv4, so there is no cause for immediate concern. This is especially true for IPv6, which currently only has 25,000 prefixes in BGP.