2025 Jan 21
As we’ve seen in past lab notes, Beehive provides access control for local-first applications. We support both server-based collaboration and peer-to-peer operation without a trusted server. And individuals might work offline for extended periods of time. In the context of Automerge, our goal is to control access to documents, collections of documents, and parts of documents.
Every document has a group of users with access to that document. That group might include other groups as members (in which case the members of those groups are also members of the document). Importantly, a document group’s membership is dynamic, with new members added and removed over time. We must be able to handle concurrent changes in a distributed context.
Of course, if we want to limit read access to just our group, we can’t safely share our document as plaintext via sync servers. We need a way to encrypt and decrypt our data that is accessible to only our document’s members. This means we need a way for our group to agree on the keys that will be used for encryption and decryption over time.
In the literature, this problem is known as Continuous Group Key Agreement (CGKA). A CGKA protocol enables a dynamic group to agree on a sequence of keys over time. CGKAs ordinarily guarantee two properties: forward secrecy (FS) and post-compromise security (PCS). Imagine a successful attacker compromises a single key. In the simplest terms, forward secrecy means that this key will not enable access to past data. And post-compromise security means it will not enable access to future data. If you can guarantee both, then you can limit the damage from a key compromise.
One way to achieve forward secrecy is through “ratcheting”. With a ratchet, honest users employ a key derivation function (KDF) to deterministically transform a key in a way that is effectively impossible to reverse. A cryptographic hash function is one way to achieve this. Ratcheting with such a one-way function prevents an attacker from discovering past keys since there is no feasible way to reverse the function. But a one-way function on its own does not prevent an attacker from discovering future keys, since you can derive all future keys from a compromised key by repeatedly applying the hash function.
Of course, we don’t want a system that once compromised is always insecure. That’s where post-compromise security comes in. The intuitive idea is that a system with post-compromise security has some mechanism to deny access after an attack. Compromised information will no longer be enough to derive future keys. One way to achieve this is to periodically rotate information required for determining future keys in a way that is not accessible to a past attacker.
In practice, ratcheting protocols mix in fresh information with each ratchet so that knowledge of a key is not by itself sufficient to derive future keys. For example, Signal’s Double Ratchet protocol includes sending a Diffie-Hellman public key with each message so that the receiver can derive a shared Diffie-Hellman secret to use as a side input to the key derivation function (KDF) that is used for ratcheting.
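To make the ratcheting idea concrete, here is a minimal Rust sketch (not Signal’s or Beehive’s actual code) of a KDF ratchet that mixes fresh key material into each step, using the blake3 crate; the function and context-string names are invented for illustration:

// Each ratchet step hashes the previous chain key together with fresh key
// material (e.g. a newly agreed DH secret). The KDF is one-way, which gives
// forward secrecy; the fresh input is what provides post-compromise security.
fn ratchet(chain_key: [u8; 32], fresh_dh_secret: &[u8; 32]) -> [u8; 32] {
    let mut input = Vec::with_capacity(64);
    input.extend_from_slice(&chain_key);
    input.extend_from_slice(fresh_dh_secret);
    blake3::derive_key("example.ratchet.v1", &input)
}

fn main() {
    let k0 = [0u8; 32];    // stand-in for an initial shared key
    let fresh = [7u8; 32]; // stand-in for a freshly agreed DH secret
    let k1 = ratchet(k0, &fresh);
    let k2 = ratchet(k1, &fresh);
    assert_ne!(k1, k2);    // each step yields a new key
}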
The current Messaging Layer Security (MLS) protocol uses TreeKEM for its CGKA, a protocol for asynchronous, decentralized key agreement for dynamic groups1. TreeKEM uses a binary tree with group members’ public keys at the leaves and the current group secret encrypted at the root. All other inner nodes act like the root for their subtrees (and subtrees act like subgroups with their own shared, encrypted secrets). Members can be dynamically added and removed from the tree.
For post-compromise security, each member periodically rotates out its public keys on its leaf, which leads to cascading secret updates all the way to the root. Both updating and decrypting the root secret require traversing the path from the member’s leaf to the root, performing log(n) operations (although there is a linear worst case under certain conditions).
Unfortunately, Beehive’s requirements rule out TreeKEM as it stands. That’s because TreeKEM depends on a central server to create a total order of operations, and to pick winners among concurrent operations. Beehive’s local-first model is peer-to-peer compatible and does not require such a central server. And for Beehive, concurrent operations can be merged in long after they were actually performed (for example, if a member made changes while aboard a long-haul flight).
An alternative that is more aligned with our requirements is the Decentralized Continuous Group Key Agreement (DCGKA) protocol developed by Matthew Weidner and Martin Kleppmann. This protocol assumes a decentralized network that does not depend on a trusted central server. However, unlike TreeKEM, it provides linear rather than logarithmic performance. As a result, they target groups on the order of 100 members as opposed to MLS’s target of 50k members. A design goal for Beehive is to target at least thousands of members2.
Matthew Weidner has also proposed an alternative to TreeKEM called Causal TreeKEM. Whereas TreeKEM requires a total order imposed by a central server, Causal TreeKEM only requires a causal order, which is much better suited to a decentralized network. Like TreeKEM, it has logarithmic performance (with a linear worst case) and is meant to ensure both forward secrecy and post-compromise security.
However, Causal TreeKEM depends on fancier crypto than we’d prefer in order to merge concurrent updates in any order. It requires a cryptographic operation to combine updates at a node that is both associative and commutative. One option here would be BLS, but this is far less common than the standard options and there is not currently a great library option for Rust (the language Beehive is written in). And we have definitely ruled out rolling our own crypto (you probably should too).
For these reasons, we’ve proposed our own alternative3 for Beehive that we call “BeeKEM”. It is closely modelled on TreeKEM with insights from Causal TreeKEM. It requires no central server and only a causal order of operations. It provides logarithmic performance (with a linear worst case). And like the other TreeKEM variants, it provides forward secrecy and post-compromise security4. Furthermore, it relies exclusively on standard crypto, such as Diffie Hellman key exchange and BLAKE3 hashing.
In this section, we’ll see how BeeKEM works in more detail.
In BeeKEM (as in TreeKEM), the current group secret is stored encrypted at the root node of a binary tree. We’ll call this the “root secret”. The root secret is used for encrypting and decrypting document chunks shared with our group over the network5.
Each leaf of the tree corresponds to a member of the group and contains its ID and latest Diffie Hellman (DH) public key. A member’s ID is persistent over time but each member will periodically rotate its DH public key. When a member rotates its DH public key, that will cause the root secret to change as well. Thus, member key rotations help provide post-compromise security. From the point of view of an adversary, they introduce fresh randomness.
Each leaf has an implicit secret known only to the corresponding member (i.e. not stored in the tree). All other “inner” nodes in the tree contain a DH public key for that node and a corresponding secret key that is stored encrypted at the node.
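As a rough illustration (not Beehive’s actual types), the nodes described above might be modelled in Rust along these lines, with PublicKey, MemberId, and EncryptedSecret as stand-ins:

type PublicKey = [u8; 32];
type MemberId = [u8; 32];
type EncryptedSecret = Vec<u8>;

// A leaf: one member's persistent ID and its latest DH public key. The
// matching secret key lives only on that member's devices, not in the tree.
struct LeafNode {
    id: MemberId,
    dh_pk: PublicKey,
}

// An inner node (including the root): a DH public key plus the node's
// secret, stored encrypted so that only the subtree below it can recover it.
struct InnerNode {
    dh_pk: PublicKey,
    encrypted_secret: EncryptedSecret,
}

enum Node {
    Leaf(LeafNode),
    Inner(InnerNode),
}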
Each node in a binary tree (other than the root) has a single sibling node, as illustrated in the following diagram:
When encrypting or decrypting a new secret at a parent node, a child node performs a Diffie Hellman key exchange with its sibling. That means it will use its sibling’s DH public key and its own secret key to derive what we’ll call a “shared DH secret”. The shared DH secret is used to encrypt and decrypt the new secret at the parent.
A brief (simplified) aside on how Diffie Hellman key exchange works. Imagine Alice and Bob each have their own DH public keys (alice_pk and bob_pk) and DH secret keys (alice_sk and bob_sk). If Alice combines her secret key with Bob’s public key, she can derive a shared DH secret. If Bob combines his secret key with Alice’s public key, he can derive the same shared DH secret. In this way, they can agree on a shared secret just by exchanging their public keys in the open.
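Here is a minimal Rust sketch of that exchange, assuming the x25519-dalek and rand_core crates (Beehive’s actual key handling may differ):

use rand_core::OsRng;
use x25519_dalek::{PublicKey, StaticSecret};

fn main() {
    let alice_sk = StaticSecret::random_from_rng(OsRng);
    let alice_pk = PublicKey::from(&alice_sk);
    let bob_sk = StaticSecret::random_from_rng(OsRng);
    let bob_pk = PublicKey::from(&bob_sk);

    // Alice combines her secret with Bob's public key; Bob does the reverse.
    let alice_shared = alice_sk.diffie_hellman(&bob_pk);
    let bob_shared = bob_sk.diffie_hellman(&alice_pk);

    // Both sides arrive at the same shared DH secret without ever
    // transmitting a secret key.
    assert_eq!(alice_shared.as_bytes(), bob_shared.as_bytes());
}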
We use this same principle to derive a shared DH secret for any sibling pair in our tree. For example, to decrypt Alice’s parent node, Alice can use her secret key alice_sk and her sibling’s public key bob_pk to derive a shared DH secret. She can then use that shared secret to decrypt the secret at the parent node.
In pseudocode, this might look like:
shared_dh_secret = DH(bob_pk, alice_sk)
parent_secret = encrypted_parent_secret.decrypt_with(shared_dh_secret)
That parent secret can in turn be used for a Diffie Hellman exchange with the parent’s sibling’s DH public key.
For a member to decrypt the root secret, it must start from its leaf and traverse the tree one parent at a time until it reaches the root. The sequence of nodes from leaf to root is called that leaf’s “path”. At each node in its path, it will derive a shared DH secret with its sibling to decrypt the secret at its parent. Once it’s decrypted the root secret, it’s done.
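As a sketch, that leaf-to-root walk might look like the following Rust, where dh, decrypt, and secret_key_from are hypothetical stand-ins for the real cryptographic operations:

type PublicKey = [u8; 32];
type SecretKey = [u8; 32];
type Secret = [u8; 32];

// One step of the path: our sibling's DH public key at this level, plus the
// ciphertext of the parent's secret stored at the parent node.
struct PathStep {
    sibling_pk: PublicKey,
    encrypted_parent_secret: Vec<u8>,
}

fn dh(_their_pk: &PublicKey, _our_sk: &SecretKey) -> Secret { unimplemented!() }
fn decrypt(_ciphertext: &[u8], _key: &Secret) -> Secret { unimplemented!() }
fn secret_key_from(_secret: &Secret) -> SecretKey { unimplemented!() }

// Walk from our leaf to the root, decrypting each parent's secret in turn.
fn decrypt_root_secret(leaf_sk: SecretKey, path: &[PathStep]) -> Secret {
    let mut current_sk = leaf_sk;
    let mut current_secret = [0u8; 32];
    for step in path {
        // The shared DH secret with our sibling at this level...
        let shared = dh(&step.sibling_pk, &current_sk);
        // ...unlocks the secret stored at our parent.
        current_secret = decrypt(&step.encrypted_parent_secret, &shared);
        // The parent's secret determines the secret key we use one level up.
        current_sk = secret_key_from(&current_secret);
    }
    current_secret // the last parent decrypted is the root
}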
In the following diagram, the decrypting leaf’s path is marked in green. The siblings used as Diffie Hellman partners along the way are marked in purple:
There are three mutating operations that can be performed on the tree: Update Key (i.e. key rotation), Remove Member, and Add Member. Let’s look at these in more detail.
Every member must periodically update the DH public key at its leaf in order to guarantee post-compromise security. When we update our leaf DH public key, we must then update the secrets for all the nodes on our path, eventually updating the root secret for the entire group.
Before traversing our path, we can derive a sequence of path secrets by applying BLAKE3’s key derivation function to an initial secret once for each node on the path. As we move up each parent on our path, we will encrypt the next derived secret and store it on that parent.
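A minimal sketch of that derivation using the blake3 crate (the context string and function name are ours, not Beehive’s):

// Derive one fresh secret per node on the update path by chaining BLAKE3's
// key derivation function: each path secret is the KDF of the previous one.
fn derive_path_secrets(initial_secret: [u8; 32], path_len: usize) -> Vec<[u8; 32]> {
    let mut secrets = Vec::with_capacity(path_len);
    let mut current = initial_secret;
    for _ in 0..path_len {
        current = blake3::derive_key("example.beekem.path.v1", &current);
        secrets.push(current);
    }
    secrets
}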
In order to encrypt the secret for a parent, we use Diffie Hellman key exchange as described above. We then derive a new Diffie Hellman public key from the secret for the parent, and store both that new DH public key and the corresponding encrypted secret at the parent.
In pseudocode:
parent_secret = derived_secrets[next_idx]
shared_dh_secret = DH(child_sibling_pk, child_sk)
encrypted_parent_secret = parent_secret.encrypt_with(shared_dh_secret)
parent_pk = DH_pk_from(parent_secret)
parent_node.insert(parent_pk, encrypted_parent_secret)
Later on, when the sibling wants to decrypt that parent secret, it can do Diffie Hellman the other way, using the encrypter node’s DH public key with the sibling node’s secret to derive the same shared DH secret that was used to encrypt the parent.
shared_dh_secret = DH(encrypter_pk, sibling_sk)
encrypted_parent_secret = parent_node.encrypted_secret
parent_secret = encrypted_parent_secret.decrypt_with(shared_dh_secret)
In order to explain membership changes, we must introduce the concept of “blanking” a node. Blanking a node means that we remove all key and secret information from that node. If the root node is blank, then the tree does not currently hold a shared group key. Some nodes are blanked after membership change operations, and all leaves beyond the last member leaf on the right are blank.
If a tree has a blank root, then at least one member must perform an Update Key operation to restore a root secret. An update will replace all blank nodes on its update path with key information.
When we perform a Remove Member operation, we first blank the leaf corresponding to that member. We then blank the entire path from that leaf up to the root node.
Notice that if a removed member performs an update concurrently with its removal, we need to ensure that the update does not survive (or else the member will continue to have access to the root secret). When merging concurrent removes with other operations, BeeKEM ensures that the remove paths are blanked after all other concurrent operations are merged.
When we perform an Add Member operation, we add the new member’s ID and public key to the next blank leaf on the right. We then blank the path from that leaf to the root.
Notice that if two members add a member concurrently to the same tree, they will add them to the same leaf. BeeKEM resolves such conflicts on merge by sorting all concurrently added leaves and blanking their paths.
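To illustrate the effect of these two operations on the tree, here is a sketch using a hypothetical parent-pointer representation (not Beehive’s actual data structure), where a node whose keys are None is blank:

struct KeyMaterial; // stand-in for a node's public key and (encrypted) secret

struct Node {
    parent: Option<usize>,     // index of the parent node; None for the root
    keys: Option<KeyMaterial>, // None means the node is blank
}

struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    // Blank every node from `idx` up to and including the root.
    fn blank_path(&mut self, mut idx: usize) {
        loop {
            self.nodes[idx].keys = None;
            match self.nodes[idx].parent {
                Some(parent) => idx = parent,
                None => break, // reached the root
            }
        }
    }

    // Remove Member: blank the member's leaf and its whole path. The group
    // then has no root secret until someone performs an Update Key.
    fn remove_member(&mut self, leaf_idx: usize) {
        self.blank_path(leaf_idx);
    }

    // Add Member: place the new member at the next blank leaf on the right
    // (finding that leaf is left to the caller here), then blank the path
    // above it so that existing secrets are not reused for the newcomer.
    fn add_member(&mut self, leaf_idx: usize, leaf_keys: KeyMaterial) {
        self.nodes[leaf_idx].keys = Some(leaf_keys);
        if let Some(parent) = self.nodes[leaf_idx].parent {
            self.blank_path(parent);
        }
    }
}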
So far, we’ve assumed that every node has a sibling with key information. That’s what allowed us to use Diffie Hellman to derive a shared DH secret. But what happens when a node’s sibling is blank?
In that case, we must find the blank node’s resolution. A node’s resolution is the set of its highest non-blank descendants. Here’s an example:
If you have a blank sibling, you must do a separate encryption of the new parent secret for every member of your sibling’s resolution. For each of those members, you use its Diffie Hellman public key with your secret to derive a shared DH secret. You then encrypt the new parent secret using that shared secret and store it for that member.
This means that if the resolution of a sibling node contains 5 members, you will need to store the parent secret 5 times, each one encrypted for a separate member.
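In sketch form, this resolution-based encryption might look as follows, again with hypothetical types and with dh and encrypt standing in for the real operations:

type PublicKey = [u8; 32];
type SecretKey = [u8; 32];
type Secret = [u8; 32];

enum Node {
    Blank { left: Option<Box<Node>>, right: Option<Box<Node>> },
    Keyed { pk: PublicKey, left: Option<Box<Node>>, right: Option<Box<Node>> },
}

// The resolution of a node: itself if it holds a key, otherwise the union of
// its children's resolutions (its highest non-blank descendants).
fn resolution(node: &Node) -> Vec<PublicKey> {
    match node {
        Node::Keyed { pk, .. } => vec![*pk],
        Node::Blank { left, right } => {
            let mut out = Vec::new();
            if let Some(l) = left { out.extend(resolution(l)); }
            if let Some(r) = right { out.extend(resolution(r)); }
            out
        }
    }
}

fn dh(_their_pk: &PublicKey, _our_sk: &SecretKey) -> Secret { unimplemented!() }
fn encrypt(_plaintext: &Secret, _key: &Secret) -> Vec<u8> { unimplemented!() }

// One ciphertext of the new parent secret per key in the blank sibling's
// resolution, so every subtree below the blank can still recover the parent.
fn encrypt_for_resolution(
    parent_secret: &Secret,
    our_sk: &SecretKey,
    blank_sibling: &Node,
) -> Vec<(PublicKey, Vec<u8>)> {
    resolution(blank_sibling)
        .into_iter()
        .map(|pk| (pk, encrypt(parent_secret, &dh(&pk, our_sk))))
        .collect()
}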
The worst-case scenario is when the entire inner tree is blank. Encrypting a new path will no longer be a logarithmic operation since every leaf will be contained in the resolution of some blank node on your path. Instead, the cost will be linear in the number of leaves: you will have to perform a separate encryption for every other leaf somewhere on your path.
When decrypting a leaf with blanks on its path, you simply skip those blanks. This works because the resolution of a blank node on your path will include the last non-blank node you passed through below it. So when you encounter a blank on your path, you hold onto the last secret you’ve seen and keep skipping. When you eventually reach a non-blank node, you use the secret you’re holding onto to derive the shared DH secret you need to decrypt that node’s stored secret.
Beehive assumes that concurrent operations can be merged in any causal order. Concurrent updates will always have some overlapping nodes in their paths (at least the root is shared by all paths). How does BeeKEM resolve these conflicts?
We must first consider our potential vulnerabilities. Imagine that an adversary has compromised a group member and their leaf secrets. They can use a compromised leaf secret to decrypt the root secret at some point in time. Recall that knowing a leaf secret means you can decrypt all of the inner node secrets along your path.
If an adversary knows the secret for a leaf, it’s possible they will continue to be able to decrypt the group secret even if that leaf is rotated during a future concurrent update. This depends on how we handle merging concurrent updates.
If we just naively pick a winner for updates to a series of overlapping nodes, then the new information added by the loser’s key rotation will no longer be necessary to decrypt the root secret. We effectively forget that information.
Notice that the winner used the outdated keys from the loser for its update (since the winner’s and loser’s updates were concurrent). That means an adversary with the loser’s outdated leaf keys will still be able to decrypt the winner’s nodes. Subsequent updates by other leaves that only intersect with the winner’s path will also fail to exclude our adversary.
In BeeKEM, when merging concurrent updates, we ensure that all updates contribute information along their entire paths. We keep conflicting information around at each node until it is overwritten by a causally subsequent operation (or blanked by a membership change).
If two leaves update the same node concurrently, then they would have each written a distinct Diffie Hellman public key and encrypted secret to that node. In this scenario, we call these “conflict keys” and store them both when merging conflicts.
Imagine a member subsequently updates the tree. If a node on its leaf’s path has a sibling with conflict keys, this means there is an unresolved merge at that sibling. An adversary could have access to both sides of the corresponding fork. So it wouldn’t be secure to use those conflict keys for Diffie Hellman. Instead, we take the resolution of the node, just as we did with blank nodes. We then separately encrypt the secret for every DH public key in the resolution, again just as we did with blank nodes.
This means that for BeeKEM we update the definition of the “resolution of a node” to mean either (1) the single DH public key at that node if there is exactly one or (2) the set of highest non-blank, non-conflict descendants of that node.
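In sketch form, that extended resolution might look like this (the types are illustrative only):

type PublicKey = [u8; 32];

struct TreeNode {
    pks: Vec<PublicKey>, // empty == blank; more than one == unresolved conflict keys
    left: Option<Box<TreeNode>>,
    right: Option<Box<TreeNode>>,
}

fn resolution(node: &TreeNode) -> Vec<PublicKey> {
    if node.pks.len() == 1 {
        // Exactly one key: this node has integrated all causally prior
        // updates below it, so it can stand in for its whole subtree.
        return node.pks.clone();
    }
    // Blank or conflicted: descend to the highest clean descendants instead.
    let mut out = Vec::new();
    if let Some(l) = &node.left { out.extend(resolution(l)); }
    if let Some(r) = &node.right { out.extend(resolution(r)); }
    out
}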
If we merged in both sides of a fork, then we know we’ve updated both corresponding leaves with their latest rotated DH public keys. Since taking the resolution skips all conflict nodes, it ensures that we integrate the latest information when encrypting a parent node. That’s because any non-conflict nodes have successfully integrated all causally prior information from their descendants.
This means an adversary needs to compromise one of the latest leaf secrets to be able to decrypt an entire path to the root. Even knowing outdated leaf secrets at multiple leaves will not be enough to accomplish this. An honest user, on the other hand, will always know the latest secret for its leaf.
During a future update (key rotation), if you find a conflict key node on the path you’re updating, you can remove all conflict keys at that node and replace them with a single new public key and encrypted secret (as with normal parent encryption). That’s because your update operation is the causal successor of all the operations that placed those conflict keys. This means your tree contains the necessary information from all of those past updates, which is integrated into your update.
BeeKEM’s approach comes with two downsides. First, before conflicts are resolved by subsequent updates or blanks, we must store extra information at each conflict node. Second, conflict keys add extra encryption and decryption overhead. In the worst case, where the tree is populated with the maximum number of possible conflict keys, the space cost would be n log(n) (as opposed to the best case of 2n). The time cost in the worst case would be linear (as opposed to logarithmic), as when the tree is maximally blanked. Our current set of benchmarks reflects these time costs when we intentionally exercise our worst cases.
BeeKEM provides Beehive with a Continuous Group Key Agreement protocol that is well-suited to distributed local-first applications that require end-to-end encryption for groups on the order of thousands of members. It exhibits logarithmic performance in the common case with linear worst case. And it provides both forward secrecy and post-compromise security.
In the future, we plan to write a paper explaining this protocol and its security and performance characteristics in more detail. But hopefully this has given you a sense for how it works.
TreeKEM was inspired by earlier work on Asynchronous Ratcheting Trees (ART). ↩︎
Weidner and Kleppmann argue that secure messaging for large groups does not have a plausible threat model since it would be too easy to infiltrate them. But Beehive is designed for shared documents. In the context of private documents shared within a company with thousands of employees, for example, we would still expect access control. It’s also worth mentioning that in Beehive, a single user might have multiple device-specific keys (each of which will count as a member from Beehive’s perspective). ↩︎
BeeKEM in isolation provides forward secrecy, but Beehive as a whole does not. That’s because users require access to an entire document and Beehive is used to encrypt that document in chunks. If you can decrypt a chunk, you will gain access to the key for decrypting the previous chunk (as described in an earlier lab note). ↩︎
More precisely, we use the root secret as one input into deriving an “application secret”. It is the application secret that is directly used for encrypting and decrypting document chunks. There can be multiple application secrets derived from one root secret, but each application secret is used to encrypt exactly one document chunk. Updating the root secret provides post-compromise security by ensuring no prior key can be used to derive application secrets associated with it. We are glossing over these details in this lab note since they strictly speaking happen outside BeeKEM, which is concerned with group agreement on the root secrets. ↩︎