GroupedLink-StateRouting

Privacy

Link-state routing has desirable qualities (optimal routes, quite scalable, well-studied) for use in the Ripple protocol. Its main drawback is that it requires that each routing node have a map of the entire network, which potentially reveals all node connections to prying eyes. Even if routing identifiers are not related to node IP address, XMPP ID, or other communication identifier that could be used to identify the owner of the node, the payment process must make the link between the two at some point.

Some solutions:

  • Group highly connected sets of nodes together and use these groups as the atomic routing entity, thereby providing obscurity in numbers. Even if a group consisted of only one node, outsiders would not know this, and could not prove that any connection involved the node in question.
  • Allow payment recipients or payers to decline to reveal their routing ID. If neither one reveals their routing ID, then allow them to negotiate a rendezvous intermediary (whose true node ID need never be known) as described below in Scalability
  • Allow nodes to masquerade as several connected nodes/node groups in routing adjacency advertisements. This would obfuscate the nature of their connections, as outsiders would have no way of discerning real nodes from "dummy" nodes.

In practice, a combination of these techniques might allow for sufficient privacy in a link-state routing scheme. The latter two are already built into the current protocol design.

Efficiency

Hazy-sighted link-state routing is an approach to optimizing the tradeoff between optimal routing and minimizing routing overhead in the network. Essentially, adjacency advertisements are sent out by each node/node group at regular intervals with hop limits in the sequence {1, 2, 1, 4, 1, 2, 1, 8, 1,...}, so nearby nodes are frequently updated and further nodes are updated less frequently.

Scalability

Even with the hazy-sighted partial information mechanism, the number of link advertisements in a Ripple network may overwhelm the network, or the actual number of links may become too high for individual nodes/routing groups to track. (The goal is to design a system that can work on a global scale.)

My idea here is to limit the number of hops that a link is advertised to be proportional to its capacity (or, more precisely, the log of its capacity), at least for links with capacities below a certain threshold. This would mean that small-capacity links are only known locally, and only high-capacity links are known globally. An initial guess would be to to advertise links of value 0-10 (maybe use SDRs as universal units?) for 1 hop, value 10-100 for 2 hops, value 100-1000 for 3 hops, etc. Above a certain threshold capacity (10^5?) links would be broadcast over the entire network.

If two nodes wanted to find a path between them, but neither was in the other's routing table, they would have to agree on a certain rendezvous intermediary. (If no such intermediary could be determined, then the only option would be to go on a wild goose chase across the network. For this reason, this type of link-state routing might be layered on top of Metropolis-Hastings routing, which is a simpler, slower mechanism, but works between any two nodes in the network.)

The theory is that in order for link-state routing to truly scale, every node cannot know about every link in the network. In Ripple, the utility of knowing about a link is something like log(capacity) / distance from node in question. (Log because the number of links grows exponentially with distance.)

Information Contained in Routing Advertisements

In addition to the two routing IDs for a link, routing advertisements for each node could theoretically contain:

  • credit range with each neighbour
  • current balance with each neighbour (or possibly average balance since last update?)
  • base fees in each direction

Initially these will probably be excluded for scalability and privacy.

Computing Routes

One possibility is to compute a routing table of distances (number of hops) to each node through every neighbour (one run of Dijkstra's algorithm for each neighbour), then route each query so as to minimize:

[sum over all forks(credit required * distance to target)] * # of forks

This routes in reasonable directions while minimizing branching.

Alongside the routing tables of number of hops through each neighbour to any given node, extra credit information from throughout the network could be used to compute an expected maximum payment to any given node through any neighbour. That number could be used as an upper bound on available credit for sending a query though a neighbour, rather than simply the current available credit on that account. Whether this actually improves routing performance would depend on the accuracy of out-of-date non-local credit information, which remains to be seen. (How fast will balances fluctuate? Will averages give an accurate picture?)