Protocol Specification

campadrenalin edited this page Dec 25, 2012 · 17 revisions

About this document

This is not going to be a formal, IEEE-style specification. It is just a simple space to work out how the protocol will work, ideally as simply as possible.


Basics about the protocol

DEJE is based on EJTP. It includes the following tenets:

  • Metadata is stored within documents themselves
  • Changes require consensus
  • There is one canonical version of the document, verified by the blockchain
  • Validation is automatic and involves the document's handler.

Protocol messages

Document retrieval

deje-get-version

docname: string

Get the current version of the document.

deje-get-block

docname: string
version: int

Get the successful checkpoint object for the given version.

deje-get-snapshot

docname: string
version: int

Get a "snapshot" of the state of the document at the given version. Peers may reject requests for snapshots they do not currently have cached or stored (for example, anything older than the current version), since recreating them can incur unreasonable computational cost.
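As a rough sketch of the rejection behavior described above, a peer might answer snapshot requests only from a local cache. The `cache` mapping, the `"type"` key on each message, and the error code `1` are assumptions made for illustration; this page does not define them.

```python
def handle_get_snapshot(message, cache):
    """Answer a deje-get-snapshot request from a local snapshot cache.

    `cache` is a hypothetical dict mapping (docname, version) -> snapshot.
    """
    docname = message["docname"]
    version = message["version"]
    key = (docname, version)
    if key not in cache:
        # Not cached: refuse rather than pay the cost of rebuilding it.
        return {
            "type": "deje-error",
            "docname": docname,
            "code": 1,  # placeholder, no error codes are defined here
            "explanation": "snapshot not cached for requested version",
            "data": {"version": version},
        }
    return {
        "type": "deje-doc-snapshot",
        "docname": docname,
        "version": version,
        "snapshot": cache[key],
    }
```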

deje-doc-version

docname: string
version: int

The current version of the document according to the sender.

deje-doc-block

docname: string
block: {
    author: string
    content: arbitrary checkpoint object
    version: int
    signatures: { name : sig }
}

A serialized block. Currently this is the only deje-doc-* message that does not contain a version key at the top level of the message. This may change in the future for convenience, but the version will always be provided inside the block for backwards compatibility, even if it becomes redundant.
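One way a receiver could cope with this difference is a small helper that looks for the version in both places. This is only a sketch; the function name is hypothetical and messages are assumed to arrive as plain dicts.

```python
def message_version(message):
    """Return the document version carried by a deje-doc-* message.

    Prefers a top-level "version" key and falls back to the copy
    inside "block", which deje-doc-block always carries.
    """
    if "version" in message:
        return message["version"]
    return message["block"]["version"]
```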

deje-doc-snapshot

docname: string
version: int
snapshot: { path : {serialized resource} }

As shown, resources are provided keyed on their respective paths. Resources are serialized as follows:

{
    path: string
    type: string (MIME)
    content: string
    comment: string
}
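To make the nesting concrete, here is a minimal sketch that builds a deje-doc-snapshot message from a list of resources. The helper names and the `"type"` key on the outer message are assumptions; the field names come from the layouts above.

```python
def serialize_resource(path, mime_type, content, comment=""):
    """Serialize one resource using the four fields listed above."""
    return {
        "path": path,
        "type": mime_type,
        "content": content,
        "comment": comment,
    }


def build_snapshot_message(docname, version, resources):
    """Build a deje-doc-snapshot message, keying each resource on its path."""
    return {
        "type": "deje-doc-snapshot",
        "docname": docname,
        "version": version,
        "snapshot": {res["path"]: res for res in resources},
    }
```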

Checkpoint validation and subscribing

deje-checkpoint*

version: int
checkpoint: object
author: ident (string)

Proposes a checkpoint with the given arbitrary object, as a transition from the given version. This is done within the locking mechanism of the document, and needs a write consensus to accomplish. Members of the quorum may reject checkpoints for being out of date or failing handler tests. Writes contend for quorum space, so only one may succeed at a time.

* Not for direct use, see deje-lock-acquire
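The two rejection reasons mentioned above could be checked roughly like this. The sketch assumes "out of date" means the proposal's version no longer matches the member's current version, and `handler_test` is a hypothetical stand-in for whatever validation the document's handler performs.

```python
def review_checkpoint(proposal, current_version, handler_test):
    """Return (accepted, reason) for a deje-checkpoint proposal.

    `handler_test` is a hypothetical callable that takes the checkpoint
    object and returns True if the document handler accepts it.
    """
    if proposal["version"] != current_version:
        # The proposal is a transition from a version we have moved past
        # (or have not reached yet), so it cannot be applied here.
        return False, "out of date"
    if not handler_test(proposal["checkpoint"]):
        return False, "failed handler test"
    return True, "ok"
```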

deje-subscribe*

subscriber: string (ident name)

Requests that all new checkpoint successes be forwarded to you. Requires a read consensus, but these do not "rent" quorum space the way writes do, so read consensuses do not collide with each other or with writes. You can think of them as "ghost quorums" for this reason.

* Not for direct use, see deje-lock-acquire

Locking mechanism

deje-lock-acquire

docname: string
content: object

Attempt to acquire a lock to perform the action in the content variable. This will be a message with the type being either "deje-subscribe" or "deje-checkpoint".
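On the sending side, wrapping an operation might look like the following sketch. The function name is hypothetical, and messages are assumed to be plain dicts carrying a `"type"` key.

```python
LOCKABLE_TYPES = {"deje-subscribe", "deje-checkpoint"}


def wrap_in_lock_acquire(docname, content):
    """Wrap a lockable operation in a deje-lock-acquire message."""
    if content.get("type") not in LOCKABLE_TYPES:
        raise ValueError("content must be a deje-subscribe or deje-checkpoint message")
    return {
        "type": "deje-lock-acquire",
        "docname": docname,
        "content": content,
    }
```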

deje-lock-acquired

docname: string
signer: string
content-hash: string
signature: string

Response indicating success in acquiring a remote host's lock permission. Content is hashed for bandwidth normalization. May be sent to multiple idents at the remote host's discretion, to help prevent lockouts (the subscription system will also eventually be used in such a way that designated "server" participants can re-send information when necessary).

The signature contains two parts, separated by a null byte: the expiration date (ISO 8601) and the signed hash. The value that gets signed is the concatenation of the expiration date and the content-hash. Something along these lines:

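def lock_sign(content_hash):
    # get_expiration() and sign() are placeholders for the host's own helpers.
    expires = get_expiration()  # ISO 8601 expiration date, as a string
    return expires + "\x00" + sign(expires + content_hash)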
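The receiving side has to split the signature again and check both halves. Below is a self-contained sketch of signing and verifying. This page does not say which signature scheme is used, so HMAC-SHA256 with a shared key stands in for it here; the `key` and `now` parameters and the function names are likewise assumptions for illustration.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def sign(data, key):
    # Stand-in signer (HMAC-SHA256); the real scheme is not specified here.
    return hmac.new(key, data.encode("utf-8"), hashlib.sha256).hexdigest()


def lock_sign(content_hash, expires, key):
    """Produce '<expires>\\x00<signed hash>' for the given content hash."""
    return expires + "\x00" + sign(expires + content_hash, key)


def lock_verify(signature, content_hash, key, now):
    """Check that a lock signature is well formed, unexpired, and valid.

    `expires` must include a UTC offset so it can be compared with the
    timezone-aware `now`.
    """
    expires, sep, signed = signature.partition("\x00")
    if not sep:
        return False  # no null-byte separator
    if datetime.fromisoformat(expires) <= now:
        return False  # lock permission has expired
    expected = sign(expires + content_hash, key)
    return hmac.compare_digest(signed.encode("utf-8"), expected.encode("utf-8"))
```

Putting the expiration date inside the signed value means a third party cannot extend a lock by editing the date in front of the null byte.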

deje-lock-complete

docname: string
content-hash: string
signatures: { ident: string }

Finalizes the locking procedure. When a write consensus has been achieved and the appropriate signatures are accumulated, this message should be sent to every ident who signed it (as well as admins). This allows them to release their locks and enact the action.

Signatures should be in the format given above in deje-lock-acquired.
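A proposer could assemble this message from the deje-lock-acquired responses it has collected, roughly as sketched below. The function name and the `"type"` key are assumptions, and the sketch does not check whether a consensus has actually been reached.

```python
def build_lock_complete(docname, content_hash, acquired):
    """Build a deje-lock-complete message from deje-lock-acquired responses.

    Responses for a different document or a different content hash are
    ignored, since they do not vouch for this proposal.
    """
    signatures = {
        msg["signer"]: msg["signature"]
        for msg in acquired
        if msg["docname"] == docname and msg["content-hash"] == content_hash
    }
    return {
        "type": "deje-lock-complete",
        "docname": docname,
        "content-hash": content_hash,
        "signatures": signatures,
    }
```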

Misc

deje-error

docname: string
code: int
explanation: string
data: object

Reports an error. Used to report all sorts of failures, including lock rejections.


Locking model

For synchronization, we have to lock writes so that each operation requires some degree of consensus. But we also have to do this in a way such that performance isn't terrible. To do this, we limit the situations where we need to acquire or verify locks, taking advantage of the existing versioning chain.

For reading...

We don't lock for any read actions except subscription, and even that is a non-colliding lock.

For writing...

A transition (a pair of version and checkpoint object) is proposed to the quorum through the locking mechanism. If it achieves a write consensus, it is appended to the end of the blockchain among those nodes, and the changes will be picked up by the rest of the quorum over time.

How the locking mechanism works

Operations that require locking should not be sent directly - they should be wrapped in locking messages. Such operations should be rejected if received directly. The operations that must be wrapped are marked with an asterisk above.
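On the receiving side, this rule amounts to a check on the message type before anything else happens. A minimal sketch, assuming dict messages with a `"type"` key; the function name and the three return labels are hypothetical:

```python
LOCK_REQUIRED = {"deje-checkpoint", "deje-subscribe"}


def classify(message):
    """Decide how an incoming message should be treated.

    Returns "lock" for a properly wrapped operation, "reject" for a
    lock-requiring operation sent directly (or a wrapper around anything
    else), and "direct" for messages that need no lock.
    """
    mtype = message.get("type")
    if mtype == "deje-lock-acquire":
        inner = message.get("content", {}).get("type")
        return "lock" if inner in LOCK_REQUIRED else "reject"
    if mtype in LOCK_REQUIRED:
        return "reject"
    return "direct"
```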

In locking, an operation must accumulate a certain amount of vote power. On success, the operation runs and the lock is released.
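How vote power is assigned and how much is needed are not pinned down on this page, so the following is only an illustration of the accumulation step. The per-ident `weights` table and the `threshold` value are hypothetical.

```python
def has_consensus(signers, weights, threshold):
    """Return True if the distinct signers hold enough vote power.

    `weights` maps ident name -> vote power; unknown idents count as 0.
    Each ident is counted once, however many times it signed.
    """
    power = sum(weights.get(ident, 0) for ident in set(signers))
    return power >= threshold
```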

Dealing with deadlocks

In any lock-based system, even (especially?) one based on an abstract lockspace of multiple weighted authorities, you have to worry about the dreaded deadlock. This is any situation where the locking mechanism "jams" because of an unforeseen logical error, most commonly a catch-22. An entire document locking up because of one misbehaving numskull is a bad thing.

Individually, locks happen per ident, per document. A lock can be released, or devoted to some content object. If it is already devoted to one thing, it cannot be devoted to another until it is first released. Release is implicit, and happens in several different scenarios:

  • A lock proposal succeeds, and the content is acted on.
  • The individual lock expires.
  • In the case of writes, each write proposal is tied to a specific version. If the document version ends up superseding this in any way, the lock is released. This lets the losing proposals in a write contention fail cleanly instead of holding their locks.
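The release rules above can be modelled as a small state holder, one instance per ident per document. This is a sketch under assumptions: the class and method names are hypothetical, and "superseded" is taken to mean the document version has moved past the version the write proposal was tied to.

```python
from datetime import datetime, timezone


class DocumentLock:
    """One ident's lock on one document: released, or devoted to one content hash."""

    def __init__(self):
        self.content_hash = None  # None means the lock is released
        self.expires = None
        self.version = None       # only set for write proposals

    def release(self):
        self.content_hash = self.expires = self.version = None

    def refresh(self, now, doc_version):
        """Apply the implicit release rules for expiry and superseded versions."""
        if self.content_hash is None:
            return
        expired = now >= self.expires
        superseded = self.version is not None and doc_version > self.version
        if expired or superseded:
            self.release()

    def devote(self, content_hash, expires, now, doc_version, version=None):
        """Try to devote the lock; fails if it is still devoted elsewhere."""
        self.refresh(now, doc_version)
        if self.content_hash is not None:
            return False
        self.content_hash, self.expires, self.version = content_hash, expires, version
        return True

    def complete(self, content_hash):
        """Release the lock after the proposal it was devoted to succeeds."""
        if self.content_hash != content_hash:
            return False
        self.release()
        return True
```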

Even so, this still allows for deadlocks when there is lock contention combined with expiration dates set unreasonably far in the future. The only solution to this is to choose sane expiration dates. If your project requires high-latency (hours/days) locking and experiences a lot of contention, the proper solution is to break up your data into more separate documents. Eventually, DEJE will support document referencing transparently, but until then, it's up to the DEJE-using developer to implement this manually.


v0.1