Design Outline
by Ryan Fugger
Contributors
Goal
The eventual goal is to define the Ripple network with a protocol enabling nodes to make and maintain mutual-credit accounts between them, and use the network of these accounts for making payments by finding paths to the payment recipient and passing obligations down those paths. Nodes should also be able to verify another's reputation by checking the available credit through the network to the target node.
For background on Ripple, see Money as IOUs in Social Trust Networks & A Proposal for a Decentralized Currency Network Protocol (PDF).
This document outlines in an abstract way the types of messages that need to be passed between nodes and the data contained in those messages.
To Do
Summary
The Ripple network consists of user agents, or nodes, connected pairwise by abstract mutual-credit accounts which allow searching for paths in the network, and sending and tracking of obligations between the two parties through the passing of messages. Payers must also be able to establish messaging connections to the recipients of their payments. The nature of those connections depends on the particular message transport system being used. Right now, there is a tentative binding to XMPP?.
Messaging Semantics
Each Ripple message has an ID unique for the node-to-node connection, and must be immediately responded to by either a reply or an error with the same ID upon receipt. This serves to establish that the message was received and either understood or malformed or otherwise inappropriate as detailed by an error code and message.
Nodes need not wait for the reply or error to send another message.
When a messaging connection between two Ripple nodes is established, the node accepting the conection must send a time message, consisting of its current time, UTC.  The node receiving the time message echoes back its own time message in reply, indicating that it agrees to use the other node's time as the official time for that connection.  If it disagrees, it replies with an error and sends its own time message.  If that time message is rejected, the connection is closed.  There is little reason to disagree with someone's time at the start of a connection, unless you have oustanding open transactions with the other node that are time-sensitive.  
During the course of the connection, either node may send a time message to the other to verify the time. If at any point the two nodes fail to agree on the current time, the connection should be closed.
Time messages can be used to discover connection latency.
An account is a mutual-credit connection between two nodes defined by the following data, a copy of which is kept by the node of each account partner.
Unless otherwise specified, all account messages must be signed by the sender's authentication key.
Creating an Account
Accounts are created by one node offering to accept another's IOUs with an offer message containing:
As with every message not requiring any immediate reply of data, the node receiving an offer replies with the affirmative result containing no data.
If the offer recipient accepts the offer, his node sends back an offer-accept message:
To indicate that the account has been created, the node receiving the offer-accept responds with the affirmative reply. The reply may also be the negative error with one of the following error codes indicating why the account has not been created:
Otherwise it sends an offer-reject message.  
In the case that there is an initial balance agreed upon for the account, a payment should be made immediately upon account creation to back up that balance with the proper receipts (see Payments).
Changing Credit Limits
If a node wishes to increase its partner's or its own credit limit, it sends a limit-request message containing:
In the case of wishing to increase one's partner's credit limit, the request is interpreted as an offer.
The change is considered pending until the requesting node receives back a limit message containing the same information as the limit-request, plus:
limit-request message
If a node wishes to decrease either limit, it sends a limit message, which is implemented immediately upon acknowledgment.  In this case, the request ID field is not required.  A node may not reject a decreased limit.
If a node receives a request to change the same limit as an existing open limit request, the second request should supercede the first.
If a node receives a request to change to same limit as an open limit request that it has sent, it should reply with an error: "conflicting limit request ID xxx already open". [Error codes to be defined...]
A node may cancel a limit request by responding to a limit acceptance message with a "request withdrawn" error.
Changing Routing Information
If a node's routing information changes, it must send a routing-change message to its neighbours:
For more information on routing information, see Query Routing below. The routing information messages do not have to be signed.
Changing Network ID
To change network ID, a node must send a signed id-change message containing the new network ID.  After the reply to this message, the receiving node must send all Ripple messages to the new ID.
Changing Authentication Key
Send a signed key-change message containing the new key.  (Perhaps this should contain a revocation message for the old key?)
Send a verify-account message to which the reply contains:
Especially important to verify is the balance. If there are discrepancies, they will have to be worked out between the node owners. In future it may be possible to compare payments receipt-by-receipt to find the source of the discrepancy.
Payment Messages
Several payment messages are also sent over account channels. See the next section for details.
To allow payments between two nodes that don't share an account, Ripple finds a path of account-connected intermediaries between the payer and recipient, and propagates IOUs along the path.
There are four stages to the payment process:
Requesting Payment
A node wishing to receive a payment from another node may initiate contact with the potential payer node by sending a payment-request message to his node's network ID:
The payment request may be followed by an payment-init message from the potential payer, or ignored.
The authorization number field may be used to contain a single-use password to pre-authorize payments, so the node can automatically make payments when it receives a request containing a password from a pre-determined list that it holds. The payment-request message may also be authorized by signing it with a certain key, contained in the node owner's smart card, for example, for point-of-sale scenarios.
Initializing Payment
The payer node contacts recipient node and communicates the following in a payment-init message:
Transaction units allow the payer to mask the true units of the payment by inventing custom units. However, this will hinder intermediaries' ability to properly conduct zero-fee path searches. To guarantee overall fee limits on paths, transaction units must be recognizable to the intermediaries. See Path Queries.
The recipient may accept the payment initiation with an payment-accept:
This must be signed by the recipient's identifying certificate key, whose certificate may be authorized by an authority recognized by the payer and which identifies the recipient in a way acceptable to the payer. This message serves as evidence that the recipient participated in the transaction. The explanatory text can contain the reason for the payment. It can be used to connect the recipient's signature on receipts to the recipient (see Finalizing Payment.)
Otherwise the recipient rejects the payment initiation with an error code.
The payer and recipient send out path-query messages to one or more account partners, which they in turn pass on to one or more of their partners as described in Query Routing.  A path-query message consists of:
Final payment will be made using the paths found by queries within a single swarm. Therefore, queries within a single swarm may not reserve the same credit. Queries within different swarms, but with the same payment ID, may reserve the same credit. Queries with different payment IDs may not reserve the same credit. When a query forks in two separate directions, it splits into two different paths, but each maintain the same swarm ID. To avoid overlap, payer-initiated (forward) queries should use even-numbered swarm IDs, and recipient-initiated (backward) queries odd-numbered swarm IDs. See Query Routing.
The payer sends out "forward" queries, and the recipient sends out "backward" queries, refering to the direction the payment will ultimately take. If fees are permitted, the backward queries are more precise: the payment amount is generally interpreted as the amount to be received by the recipient (Paypal notwithstanding), and the backward queries begin by specifying this amount and accumulate fees as they propagate towards the payer. Forward queries that permit fees must begin with an estimate of the payment amount including intermediary fees, which would generally be too high and result in too much credit being reserved. However, forward queries play an important role as signposts towards the payer for the more-precise backward queries.
The transaction units allow the payer and recipient to recognize what percentage of the payment is actually being paid and received down a given path, regardless of fees or varying conversion rates between account units along the path. For example, suppose node A is paying node C $100 through intermediary node B. A's account with B may be in Euros, and B's account with C may be in Pounds. Before paying 82 Euros to B, A must know how many dollars that represents to recipient C. Intermediaries that do not recognize the transaction units cannot enforce overall fee limits on queries.
The account ID changes with each step the query is forwarded along, and lets the query recipient know which account with the sender is intended as part of the path, if there is more than one. The amount in account units is the amount that is to be reserved on that account for this query's payment path. It also defines a conversion rate between account units and transaction units that can be used to calculate whether or not the full query amount can be passed on to a single neighbour, and what proportion to pass on to various neighbours otherwise. This conversion rate is also used in subsequent payment messages.
The target location is used by each intermediary to route the query. Generally the target for forward queries is the recipient's location, and the target for backward queries is the payer's location, but this is not strictly required. Either payer or recipient may direct the other to route their queries to a different target location, and then ensure that the node with this target location receives a query from the opposite direction to point back to the correct target.
The credit guarantee expiry time gives a deadline for the next phase of the payment to occur. If this time limit is reached, the credit held for this path will be released and made available to other payments. Each intermediary should subtract from this time before passing it on to give themselves a small buffer in which to process messages from the next payment phase. The time is given in the connection time between each of the intermediaries, so each one must convert if the time on the incoming account is not in sync with the time on the outgoing account. See Connection Time.
The promise penalty deadline, penalty rate, and final deadline refer to fields in the promise message in the next payment phase.  In the forward query, they represent maximum bounds for those fields, and in the backward query, minimum bounds.  The times are again given in connection time, and the penalty rate is a daily percentage.
The hop limit is the maximum number of hops for all queries spawned from the one received, including forks. When a query forks in several different directions, the hop limit must be divided up amongst them.
The bloom filter containing aliases of the locations already visited is to prevent loops -- no location should occur twice in the same path. Nodes can use any alias they want as long as it is unique among potential intermediaries with high probability and they remember it to compare against incoming queries with the same payment and swarm IDs.
The per-hop fee limit is a limit on the percentage fee charged per hop in the path. The accumulated fee limit is an overall limit to the fees on this query. This can only be enforced if intermediaries recognize the transaction units.
When the query reaches the opposing payment endpoint, or creates a loop or otherwise hits a dead end, the node receiving the query message returns a path-response message:
These path-responses are propagated back to the query originator in the following fashion: When an intermediary receives a path-response, it checks whether the available amounts equal the amount on the query to which it is a response. If the sum of the path amounts is less than the original query, it sends out new secondary queries for the remaining amounts, if possible. (Multiple queries are permitted on the same accounts, but each must have a different path ID as described in Query Routing.) Unused hop limit may be used for this purpose. Credit amounts held for the outgoing query should be reduced to reflect the amounts in the path-response.
If the path amounts equal the original query amount, if there is no unused hop limit, or if there are no directions remaining to query, it waits until it has received responses to all its queries, and then packages the nonzero paths together into a new query response. The new expiry time is the soonest expiry time of any of the path-responses. The unused hop limit is the total unused hop limit from all the responses that hasn't been used in a secondary query fork.
Fees charged may be integrated as a percentage into conversion rates between units. There is no provision for flat fees at the moment. Minimum fees for any payment through an account are shared with account partners as part of the account data.
Ripple uses the Metropolis-Hastings algorithm for routing, as described in Clarke and Sandberg's presentation Routing in the Dark (PDF).
Each node partitions its accounts into separate routing classes such that obligations are convertible between any accounts within a class (in other words, an incoming payment on any account might be routed out along any other account in its routing class). Each routing class is then given a location, which is a decimal number between 0 and 1 representing a position on the boundary of a circle. The distance between any two locations is defined as the distance along the boundary of the circle between their locations:
d(x, y) = min(abs(x - y), 1 - abs(x - y))
Each node regularly swaps locations with other randomly-chosen nodes in a systematic way to minimize the product of the distances to its neighbours. Thus a node's location comes to be close in a certain sense to the location of its neighbours. Path-finding messages can then be routed by passing them to those neighbours with locations closest to the target location.
See below for a specification of location-swapping.
A path-query is routed first in directions path-queries in the opposite direction were received for the same payment ID.  From among these directions, the neighbour with the location closest to the query's target location is chosen first.  If there is not enough credit, as determined by available credit specified on the opposing query received from that direction, to fulfill the entire query, new queries with new path IDs are forked off in subsequent directions which have received opposing queries.  Path IDs must be unique within a swarm.  Once directions having received opposing path-queries are used up, all the accounts in the routing class may be considered on the basis of distance to the target location, using available credit as determined by the account balances and limits.
Remember that credit amounts reported as available on queries must be held for that path and made unavailable for other payments until they time out. Only other swarms from the same payment may access amounts held.
At all times, per-hop fee limits must be obeyed. If a node would charge more than the per-hop fee limit, it must respond as though it were a dead end. An account with a minimum fee on through payments higher than the per-hop fee limit on the query should therefore be ignored for the purposes of routing that query.
Note that with Metropolis-Hastings routing there is no requirement for nodes to report their own exact location as the routing target of messages from transaction partners.  They may report slightly inexact locations for privacy reasons, or the location of a neighbour, or an average of a group of neighbours to route the message through them.  This requires informing those neighbours ahead of time that the message is coming, by sending a path-query.
The performance of path-finding will be critical to the success of Ripple, and therefore the protocol should be as flexible as possible in terms of its ability to incorporate new and diverse methods of path-finding.
Making Promises (Commit-Request)
When the payer has found payment paths with enough combined available credit to complete the payment, it sends a promise message to the first node on each path, representing its commitment to redeem a valid receipt that can be authenticated with the promise's authentication key up to the amount of credit indicated on the receipt, if presented before the stated deadline.  A promise renews the credit held for each path and gives a new deadline for the expiry of these holds.
path-query message
Each node passes the promise message on to the next node in the path, as determined by path-responses received in the previous phase.  At each step, the promise message is signed to the next node by the passing node's authentication key.  This serves to establish the integrity and non-repudiability of the promise.
The penalty deadline is the time after which the promising node may begin charging a penalty to its partner for having to hold credit it could be using for payment.  The penalty is charged continuously at the daily rate specified.  Intermediaries should always set a shorter deadline and higher penalty rate on promises out than they have received on promises in, to ensure that they are compensated for overdue receipts.  This provides incentive for the recipient to ensure quick completion or cancellation of each promise path, without requiring hard-and-fast deadlines to come into play so soon that minor delays spawn major disputes.
Nodes should only accept promises for paths in a single swarm for any given payment.
The recipient sends the payer promise-received messages, each indicating that a promise has been received:
The recipient's receipt deadline to the payer should be well before the penalty deadline on any promise received. Therefore, the recipient should wait until it has received promises equal to the entire payment amount before sending back any promises-received to the payer.
The recipient can cancel the transaction at any time by sending the payer a payment-cancel and releasing its neighbours from their promises with a path-cancel (see Cancelling Unneeded Paths).  The payer can cancel the transaction by sending the recipient a payment-cancel, informing it that it can release its neighbours from their promises.  Released promises should be propagated back down each payment path to free up those funds for other payments.
When the payer receives promise-received messages equal to or exceeding the amount of the payment, he generates receipt messages matching the promises received by the recipient, –- one receipt per promise/path --  authenticates them individually with each path's authentication key, and sends them each to the recipient:
The collection of receipts signed by the payer represents the value of the payment, and once they are given to the recipient, the recipient is considered to have been paid. If any of the receipts does not match its promise (eg, one of the amounts is wrong) or is otherwise invalid (to be rigorously defined), the amount of that receipt is not considered to have been paid until a valid receipt is issued. The recipient must reply with an error code when receiving an invalid receipt.
The payer and recipient should send a path-cancel down each path they know they will not need:
When passing a path-cancel before having returned a path-response, intermediaries should fork the path-cancel just as they did the path-query with the same identifiers.
Redeeming Promises (Commit)
Once the recipient has the receipts for a payment, and has verified the payer's signature on each of them, she must sign each of them with her authentication key for the payment to establish that she has accepted it from the payer. This validates the receipt so that it will be acceptable to the payment intermediaries. This recipient-signed receipt must then be passed back to the payer as proof-of-payment.
Then the recipient redeems each receipt by passing a receipt message to each of its neighbours that has promised to redeem a receipt. Each neighbour in turn passes the receipt back along the path to the payer. At each step, an outstanding promise is redeemed for the promised IOUs by adjusting the account balance according to the conversion rate between transaction and account units given by the promise.
When a node accepts a receipt, it must stamp the amount to be redeemed in account units, sign the receipt with its authentication key, and reply by sending the signed receipt back to the node from which it accepted the receipt. This creates an undeniable record that the promise has been redeemed. The sum of signed receipts from each partner to an account equals the account balance. The original receipt is passed down the chain unchanged.
The payer may optionally volunteer payment forward up a path without delay, as long as she has received a copy of the receipt signed by the recipient, which is her proof-of-payment. She can do this by appending the amount to be redeemed in account units to the receipt, signing it, and passing it forward. Any intermediary receiving such a volunteered payment may in turn volunteer it forward to the next node up the chain. This capability serves to speed up the finalization of the transaction, minimizing uncertainty and the time that credit must be held frozen.
Releasing Promises (Rollback)
A node may release a neighbour from all or part of its promise at any time with a signed path-cancel message: 
If an intermediary node is at any time released from its promise, it should immediately release the preceding node in the path from its related promise to free up available credit. Failure to release promises in a timely fashion will result in penalty fees being assessed according to the time-to-penalty and daily penalty rate thereafter.
Promise Penalties and Expiry
A promise has two deadlines: a penalty deadline and a final deadline. After the first, penalty, deadline, the promise issuer may begin penalizing the promise holder for failing to provide either a receipt or a promise-release. Penalties are charged at the daily rate stated on the promise applied to the full amount of the promise. The penalty grows continually applied to fractions of days since the penalty deadline, not just in chunks every 24 hours.
Penalties are collected by issuing a payment-request to the offending node, containing the following special fields:
The offending node may automatically pay the fine when requested, or hold the payment request for approval by the human being who owns the node. Penalties continue to accrue even though a fine may be paid, until a receipt or promise-release is produced, or until the promise's second, final, deadline is reached.
If any promise's final deadline for accepting receipts passes, the issuing node should issue a promise-expired notification, containing:
Then, it may issue a promise-release back down the chain, triggering a wave of promise-releases back to the payer.
Any node can fulfill an expired promise if it wishes and has available credit.  Any node can request that an expired promise be fulfilled by sending a promise-renew request:
This may cause a chain of such requests to be passed back to the payer. If accepted, the promise should be reissued with a new deadline, and the receipt can then be presented safely for redemption. There is no way to enforce promise renewal.
Timing Disputes
Neighbouring nodes should accept receipts and promise-releases received soon after deadlines with no penalty if doing so causes them no loss further down the chain. If two neighbours can not cooperate in this fashion due to high latency on their connection and a refusal to extend long deadlines to each other, then they cannot continue being neighbours.
Timing disputes between payer and recipient are slightly different, since they have no presumed trust between them. There may be a dispute about whether a receipt sent to the recipient by the payer reached the recipient before the recipient's deadline. The payer may suspect that the recipient used the receipt to collect payment from the path intermediaries, but is refusing to acknowledge that fact. Ultimately, this type of dispute is resolved by the payer waiting to see if a payer-signed receipt comes back down the payment path before the final deadline. If so, that receipt provides proof-of-payment, regardless whether the payer met the deadline for passing the receipt to the payment recipient.
For greater accountability in either case, a third party timing authority could timestamp and relay messages to provide a definitive message time. This capability could be integrated in the future.
Other Disputes
Generally, the receipts should propagate in an orderly fashion back to the payer. But it is possible that an intermediary node goes offline during this redemption phase and remains offline until after the redemption deadlines have passed. In this case, the intermediary directly after the offline node may get stuck with the receipt and thus an undeserved burden of payment. This intermediary will rightly blame the node that went offline, who is a trusted associate, and request that he take responsibility for the loss. To do this, the node that went offline can reissue his promise and accept the receipt. This allows him to continue attempting to renew promises back to the payer and recoup his loss. The node with the receipt and the node that went offline must negotiate a workable settlement in any case. This negotiation may become part of a future Ripple protocol.
It would be good etiquette for a node that is stuck with a receipt and unable to communicate with the next node up the chain to make an effort to contact the owner of that node, by email for example, ahead of the deadline to give her warning that her node is down and that she will be expected to cover any loss incurred by her node being unavailable.
Reputable node hosts may offer insurance against such downtime losses if they are due to a server failure, power outage, etc. One can imagine the possibility of dumb fall-back nodes that could perform minimal operations in case the primary node went down.
A neighbour that fails to acknowledge a receipt and then fails to take responsibility for any loss incurred should not remain a neighbour for long.
Credit/Reputation Requests
A node may wish to check how much another node can pay it without actually receiving payment, as a means of checking that node's credit or reputation. This can be accomplished by using the payment process, but by indicating that transaction is a credit check and not a payment, so that no credit is actually held and no actual payments are prevented by it. The total value of promises received is the result of the credit check.
Units of Account
National currencies and other value units can be identified by arbitrary strings. A natural ID for a national currency is its three-letter code. Other standard units should include ounces of gold, grams of gold, hours, joules, and kilowatt-hours. Since units of payment will have to be communicated between payer and recipient, it is important that unit codes be standardized.
Other types of units may be useful for credit/reputation checks, although not for payment. For example, a satisfied customer may give a merchant one unit of "satisfied customer" credit for each successful transaction with her. Then a prospective customer could learn about that merchant's reputation by doing a credit check against "satisfied customer" units.
For Metropolis-Hastings routing to work, locations must be continually shuffled around as the network changes to maintain the property that each node's location is close to its neighbours'. The general mechanism is that described in Clarke and Sandberg's presentation Routing in the Dark (PDF): A node wishing to swap finds another node in the network by random walk. If the product of the two nodes' distances to their neighbours would be lower after swapping locations, then the swap occurs. If it would be higher, then the swap occurs with probability A/B, where A is the product of distances to their neighbours before the swap, and B is what the product of distances to their neighbours would be after the swap.
To accomplish this in a distributed setting, first, a node must decide that it wishes to swap. Nodes should routinely initiate swaps, perhaps once a day, in order to maintain desirable overall network state, but the specific swapping strategy is up to each node.
The swap initiator sends out a swap-init message containing:
to one of its neighbours, chosen randomly.  That neighbour passes it on randomly, decrementing the TTL by one.  When a node receives a swap-init with a hop limit of one, it has been selected as a potential swap partner.  If the swap partner is not currently participating in another swap, it responds with a swap-accept, containing:
This is passed back to the swap initiator along the path followed by the swap-init.  If the swap partner is currently participating in a swap, then it replies with a swap-reject, and the previous intermediary tries another direction.
When the swap initiator receives a swap-accept, it sends back a swap-initiator-info:
Note that if the swap partners happen to be neighbours, then care must be taken to change both their locations when calculating this last number.  The swap partner replies with a swap-partner-info:
Now the swap initiator has all the information to determine whether the swap should occur or not.  Let log(A/B) be the sum of pre-swap distance logarithms minus the sum of the post-swap distance logarithms.  If log(A/B) is positive, or if log(P) is above/below (as specified in the swap-accept) a negative log(A/B), then the swap should occur.  
If the swap should occur, the swap initiator sends a swap-ready containing values that the swap partner can verify against the prior hashes:
If the swap should not occur, the swap initiator sends a swap-cancel containing the same information.
When the swap partner receives the swap-ready, it replies with a swap-commit containing the swap ID, and informs its neighbours of the its new location.  When the swap initiator receives the swap commit, it does the same.
Note that every swap messages passes along a chain of nodes, each of whom can see every message and check if there is cheating. Of course, any intermediary could substitute any numbers it wanted, which is a potential security hole if someone wanted to actively subvert the routing system -- although I can't imagine an effective attack at the moment. The initiator and partner can cheat in calculating their distance products and in so doing sway the probability of a swap occurring, however, there is no reason to do so since it is much simpler to simply invent a new desired location and report a swap that never happened. Nodes should not do this, as it could degrade the routing scheme. Future versions of the protocol may have to guard against this explicitly, at the cost of revealing the local topology of the network.
Nodes may decide to weight the importance of each connection differently in computing the distance product by giving them coefficients. This is fine as long as it is applied consistently to pre-swap and post-swap distance product computations.