Page:Efficient and Secure Group Messaging.pdf/2

 In effect, the group is the complete graph $$K_n$$ and each pair of members must be connected over a secure communication channel in order for each member to be able to broadcast messages to the entire group. This naive scheme leads to a linear $$O(n)$$ number of encryption operations over the group size n which Alice (or any other member) has to perform to send a message. Moreover, the size of the transmitted cipher data is likewise of size $$O(n)$$.

A less-naive approach is that Alice's first pair-wise broadcasted message should be ciphertext that decrypts to a shared symmetric key that can then be used henceforce [sic] using a symmetric encryption scheme like AES_GCM. However, while indeed these messages would subsequently require only $$O(1)$$ encryption operations, any addition of a new member, removal of a member, or a simple desire to rotate to a new symmetric key (in case the old might have been compromised) would again require $$O(n)$$ operations on the group. For groups of large size this can be problematic if the group wishes to perform operations such as an add, a removal, or an update, fairly frequently.

In fact, as shown in this paper, it is instead possible to achieve an optimization leading to an $$O(log(n))$$ number of operations for accomplishing the establishment of a new symmetric key for the group.

This optimization is possible by use of a tree. I will define more formally how this works in the sections proceeding, however, I first provide here in this introduction a basic bird's eye view understanding for the reader:

Instead of having one member establish the key as in the example above, each member will instead generate their own secret, which will contribute slightly to the final shared group secret. Such a structure will then allow all group members to establish a symmetric key and update the key or the members of the group with a small number of operations.

Before proceeding to a definition of the desired security features and a description of the tree structure that accomplishes the desired optimizations, it is important that the reader understands the idea of a ratchet step.

A ratchet step in cryptography describes a step that can only move in one direction: Just as mechanical ratchets used for lifting heavy objects can only take steps in one direction and do not move backwards.

The simplest idea for a ratchet step in cryptography would be the use of a one-way function such as a hash function. Let $$H: X\times Y\to\{0, 1, \dots, 2^n - 1\}$$ be a collision-resistant hash function. If H acts like a random oracle, meaning that the images seem to be completely randomly mapped in comparison to the pre-images (i.e an adversary cannot guess what the pre-image is by looking at the image) then we consider the function to be one-way because it is too difficult to find the inverse of the image and go backwards. Even the Diffie-Hellman key exchange demonstrates the mechanics of a ratchet step in that it relies on the fact that it is difficult to take the discrete log of a generator g put to a power, so the function $$f(a) = g^a$$ is effectively one way (under the appropriate group).

Indeed, the Diffie-Hellman exchange algorithm could rather naively be used to achieve forward secrecy and break-in recovery. Namely, suppose that each time Alice sent Bob an encrypted message, Bob then generated a new public-key pair, replied to Alice with his new public key, and the next message from Alice to him would use the new public key. In this fashion, even if the first 3 messages get compromised, any new messages would still be secured (break-in recovery); and if today's key was compromised, all the old messages were encrypted with old keys that may have been deleted by now (forward secrecy). Each generation of a new public-key pair would be considered a ratchet step.

In practice, the way Signal uses this idea is in combination with the idea of a chain generated from a key-deriving function. A key deriving function is a function which derives a seemingly random function from an input (a simple example would be a collision-resistant hash function like SHA256). The symmetric key continues to go through this function to generate the next one, while then also being combined with new Diffie-Hellman values being generated with each message as described above. In this fashion, by combining both a one-way key chain ratchet step along with a new public/private key-pair information ratchet step, only the parties that have access both to the new Diffie-Hellman values as well as all of the old Diffie-Hellman