Paper Speedrun: zk-creds
This article is cross-posted from the Geometry Notebook.
Title: zk-creds: Flexible Anonymous Credentials from zkSNARKs and Existing Identity Infrastructure
Authors: Michael Rosenberg, Jacob White, Christina Garman and Ian Miers
Links: paper https://eprint.iacr.org/2022/878 , code https://github.com/rozbb/zkcreds-rs
TL;DR: zk-creds is an anonymous credentials system. A credential is a Pedersen commitment to attributes. The credential is shown to be valid in ZK and issued by adding it to a public list. A credential can be “shown” by proving in ZK that the committed values match some access criteria and that the credential is included in the issuance list. Privacy is maintained by blinding proofs. The paper also introduces Merkle forests and a protocol to blind and link Groth16 proofs.
The Need for Anonymous Credentials
Typically when we want to prove that we are over some legal age limit we show our identity document in full, revealing our age but also our legal name, address, nationality and much more. The risks of such privacy leaks are compounded by the fact that we now show our identity documents digitally and our precious information is stored by third parties.
Anonymous credentials offer a privacy-preserving alternative to the scenario above. A User could use their anonymous credential to enter a bar without revealing anything more than the validity of their credential and that they meet legal drinking age. Importantly, multiple shows of the same credential should not be traceable.
Previous Work
Classic solutions for anonymous credentials usually involve re-randomisable signatures. An official authority issues an anonymous credential by signing a User’s attributes: the signature proves that the Issuer approved the User’s attribute while re-randomisation provides unlinkability.
Such signature schemes are often custom-built for specific anonymous credentials schemes. Although these papers do present security proofs, their signatures are still seen as more “exotic” than standardised and battle-tested signatures such as DSA or ECDSA.
Unfortunately anonymous credentials are rarely deployed in practice, usually because official authorities are reticent to change the format of their (often physical) credentials, maintain signing keys and the relevant public key infrastructure, or deploy non-standard cryptography. zk-creds addresses these concerns and moves away from re-randomisable signatures to instead explore re-randomisable ZK proofs. Additionally, it reduces the burden and trust assumption put on the credential issuer.
zk-creds Overview
In zk-creds, a credential is a Pedersen commitment to some attributes. The credential is issued by adding the commitment to a public Merkle tree or “Merkle forest” as detailed later. Finally, the owner of the credential can anonymously “show” it by producing a ZK proof that 1) the attributes inside the credential match the access criteria and 2) the credential is included in the issuer’s tree (or forest).
Notice how in zk-creds, the Issuer is no longer equated to the official authority that emits identity documents. A zk-creds Issuer is only expected to verify a ZK proof and add a credential to their tree. In fact as we will see, this role can be fulfilled by a smart contract.
Let’s immediately consider a concrete example given in the paper: accessing age-restricted content online using a credential formed from a US passport.
An end-to-end demo: accessing age-restricted content online
To access age-restricted content online, Alice must prove that she is over 18, that her identity document is not expired and that the proof isn’t being replayed. Let’s see how zk-creds allow her to build a credential from her US passport, how she convinces an issuer that her credential is valid and finally how she can convince a third party that her credential passes the age restriction.
1. Generate a credential
Alice scans her NFC-enabled passport with a mobile application that extracts signed data from it. She can then commit to the extracted attributes, namely nationality, full name, date of birth, passport expiry date and a hash of her ID picture. To these, Alice adds a “rate key” which can be used for more complex access criteria such as creating pseudonyms or enforcing rate limiting. Alice produces a Pedersen commitment to these attributes (this is her credential) along with a ZK proof that the committed values are also signed in the passport by the relevant authority (an issuance proof).
2. ZK issuance
Alice sends her credential and the issuance proof to an Issuer. Upon verifying the proof, the Issuer adds Alice’s credential to a public list (Merkle tree or forest).
3. Verify age
Alice wishes to access age-restricted content from a third party, the Verifier. For ease of exposure, we will ignore protection against replay attacks at first. Alice produces:
- a Groth16 proof that her credential commits to a date of birth value $\mathsf{dob}$ such that $\text{today} - \mathsf{dob} \geq 18\text{yrs}$
- a Groth16 proof that her credential commits to an expiry date $\mathsf{exp}$ such that $\mathsf{exp} > \text{today}$.
- a Groth16 proof that she knows an authentication path for her credential in one of the Issuer’s trees.
She then blinds and links these proofs using the LinkGroth16 protocol described in Techniques below. This operation ensures that multiple “shows” of the same credential cannot be tracked and that all these separate proofs are about the same credential.
The Verifier can verify all these proofs in zero knowledge and upon successful verification grant access to Alice. Notice how Alice’s private information has remained private throughout the whole run of zk-creds!
Note however that these proofs are inherently re-randomisable: a malicious actor could copy Alice’s proofs, re-randomise them and convince an unsuspecting Verifier. zk-creds prevent these replay attacks by enforcing that proofs are “session binding”: a credential show is only valid with respect to a unique verifier-chosen challenge. As noted in the paper, this property can be achieved by including an additional “empty” proof that takes the challenge as public input.
Implementation and Benchmarks
The zk-creds team have released code that covers all steps of this example, repo here. Their benchmarks show that the system is usable in practice. $\mathsf{ShowCred}$ is the time it takes to compute all the proofs except the inclusion of the credential in the Issuer’s list, $\mathsf{ShowCred} , (\text{full})$ includes the latter.
Notice how the inclusion proof takes approximately 800ms and dominates the proving time of $\mathsf{ShowCred} , (\text{full})$.
Some Noteworthy Properties
Here is a quick overview of some noteworthy properties of zk-creds:
- this system does not require official authorities such as a passport issuer to change their format or include heavyweight cryptography. All that is needed is that documents are digitally signed.
- the access criteria and circuits used to prove them in zero-knowledge can be set and updated after credentials are created and issued.
- credentials are auditable: anyone can see the whole list of issued credentials.
- issuance lists can be maintained by an automated, trustless party (smart contract).
- revocation is achieved by removing a credential from the issuer’s list.
There are however some drawbacks, the main one being that users must prove that their credential appears is the Issuer’s Merkle tree. Such a proof can be costly for deep trees as they require to prove a lot of hashes. Furthermore, if the tree changes (i.e. new credentials are issued), then the tree root and authentication path also change; users will have to recompute their witness and proof. The authors introduce the idea of a Merkle forest - a group of Merkle trees - to minimise this adverse effect while not compromising performance elsewhere.
Cryptographic Techniques
In this section we look at specific techniques introduced in the paper to achieve the system’s goals efficiently.
Merkle Forests: tunable trade-off for Merkle Trees
Merkle forests address the “witness management” issue mentioned above. Rather than having one large monolithic tree, a Merkle forest will separate the data into $m$ Merkle trees, each with their own tree root as we would expect. Each of these are shorter than the would-be monolithic tree. Proving membership of an item in the Merkle forest is done in two stages: first prove that the item is included in one of the tree roots (classic inclusion proof) and then prove that the tree root is one of the forest’s many trees.
This allows for a tunable trade-off: a larger number of shorter trees allows for less hashing but requires to keep a larger set of tree roots. The other benefit is that a smaller tree fills up faster; once it is full, the authentication path will no longer change.
Blind Groth16 Proofs
zk-creds rely on a result from [BKSV21] to re-randomise Groth16 proofs. Using random scalars, the three group elements that compose a Groth16 proof are blinded in such a way that they follow the same distribution as a regular proof and that the Groth16 verification equation still holds. The procedure is shown in the screen capture below, I encourage you to plug in these new values into the verification equation to convince yourself that it works.
Linking Groth16 Proofs
zk-creds also introduce a protocol to link Groth16 proofs. This non-interactive zero-knowledge proof (NIZK) shows that the prover knows $k$ valid Groth16 proofs over different circuits that all share the same first $t$ wires. The protocol does so while keeping the proofs and shared wires private. It takes all $k$ common reference strings and all the non-shared wire values as public inputs.
The proving strategy is as follows: first blind each proof, then produce a proof that all the input proofs share the same first $t$ wire values; finally send the blinded proofs and wire equality proof. The verifier can then verify each proof individually and accepts if they all accept. Notice that this protocol is not succinct: overall proof size and verifier computation grow linearly in the number of input proofs.
The wire equality proof is a classic sigma protocol. The Groth16 verification equation uses a sum of common reference string elements (EC points) multiplied by wire values (scalars). The sigma protocol proves that although these sums are over different $\mathsf{crs}$ elements (the EC point basis), they involve the same scalars (wire values). Similar proofs are used to show that two Pedersen commitments commit to the same values.
References
[BKSV21] Baghery, K., Kohlweiss, M., Siim, J. and Volkhov, M., 2021, March. Another look at extraction and randomization of Groth’s zk-SNARK. In International Conference on Financial Cryptography and Data Security (pp. 457-475). Springer, Berlin, Heidelberg.