Decentralized, Privacy-Preserving, Single Sign-On



Introduction
Authenticated Key Exchange (AKE) is one of the most widely used cryptographic primitives: it enables two parties to establish a shared key over a public network. Typically, the parties need to hold authentication tokens, e.g., cryptographic keys (asymmetric or symmetric high-entropy keys) or short secret values (low-entropy passwords). These authentication tokens are securely stored with a trusted service provider during the registration phase.
There are various types of authentication factors, such as knowledge, possession, and physical presence; among these, low-entropy passwords are the most widely used in practice. An example of an authentication protocol that relies on passwords is Password-Based Authenticated Key Exchange (PAKE) [1].
However, passwords are usually vulnerable to both online and offline attacks [2,3]. An attacker who compromises the data stored with the service provider (user account data, consisting of usernames and associated, potentially salted, password hashes) can run an offline dictionary attack on that data. Such an attack leads to the disclosure of user accounts, and this has happened several times in the past, cf. [2,4,5]. Even if low-entropy passwords are correctly salted and hashed, they still do not withstand the brute force of modern hardware: already in 2012, a rig of 25 GPUs could test up to 350 billion guesses per second in an offline dictionary attack [6].
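As a concrete illustration, an offline dictionary attack simply replays the server's hash computation against a list of candidate passwords. The following sketch (toy dictionary and illustrative function names of our own; real attacks use GPU-accelerated tools and far larger wordlists) shows why salting alone does not stop the attack once the record is leaked:

```python
import hashlib

def hash_password(password: str, salt: bytes) -> bytes:
    # Plain salted SHA-256; real systems should use a slow KDF (argon2, scrypt).
    return hashlib.sha256(salt + password.encode()).digest()

def offline_dictionary_attack(leaked_salt: bytes, leaked_hash: bytes, dictionary):
    # Try every candidate password against the leaked (salt, hash) record.
    for guess in dictionary:
        if hash_password(guess, leaked_salt) == leaked_hash:
            return guess
    return None

salt = b"\x01\x02\x03\x04"
stolen = hash_password("sunshine", salt)   # record leaked from the server
cracked = offline_dictionary_attack(salt, stolen, ["123456", "password", "sunshine"])
assert cracked == "sunshine"
```

The salt only prevents precomputed rainbow tables; it does not slow down per-record guessing.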
Multi-Factor Authentication (MFA) schemes overcome this risk by adding additional authentication factors. MFA combines (low-entropy) passwords with, e.g., secret values stored in physical tokens. Recent advancements in fingerprint readers and other sensors have led to the increased usage of smartphones and biometric factors in MFA schemes (e.g., the use of biometrics to securely retrieve private information [8]; see Figure 1). These methods make guessing the authentication factors more difficult. However, some MFA schemes incorporate password authentication and second-factor authentication as separate mechanisms and store a salted password hash (or biometric) on the server, leading to different vulnerabilities such as spoofing and offline attacks [7,9]. In other words, an adversary compromising the server is still able to recover the actual password (even if that password is no longer usable without the additional associated factors). Moreover, mobile devices (smartphones, wearables, FIDO U2F, etc.) are considered more likely to be subject to loss or theft, and particularly smartphones and wearables open a large, high-risk attack surface for malware [10,11].
In general, authentication schemes are designed to uniquely identify a user. Consequently, they do not aim at protecting user privacy, and users' activity in the digital world can easily be logged and analyzed. Leakage of individual information may have serious consequences for users (including financial losses). To meet the increasing need for privacy protection in the digital world, multi-factor authentication schemes are enhanced with privacy-preserving technologies. For instance, anonymous authentication schemes allow a member of a legitimate group, called a prover, to convince a verifier that it is a member of the group without revealing any information that would uniquely identify the prover within the group. Various schemes for anonymous password authentication have been proposed, e.g., [12][13][14][15]. In particular, anonymous password authentication promises unlinkability: The verifier (e.g., the server of a service or identity provider) should not be able to link user authentications. Therefore, for any two authentication sessions, the verifier is unable to determine whether they were performed by the same user or by two different users.

Building a Fully Decentralized Authentication Architecture.
An Identity Provider (IDP) with a centralized database of the authentication data of all users could easily provide an MFA scheme and offer convenient single sign-on (SSO) to other services for its users [16]. SSO allows users to receive a single token (identity) from the IDP once and to repeatedly authenticate themselves to service providers. Several initiatives such as PRIMA [17], OAuth [18], SAML [19], and OpenID [16] let service providers take advantage of a centralized identity provider to authenticate users without becoming responsible for managing account passwords. In all these systems, the authentication follows a similar scheme (see Figure 2) [20]: (1) In the registration phase, the user creates credentials (e.g., a username/ID and a password) and passes them to the IDP (a trusted server), which stores the username together with the hash of the password. (2) In the authentication phase, the IDP verifies the user-supplied sign-on credential by matching the username and password hash. (3) After successful verification, the IDP issues an authentication credential (a digital signature or a message authentication code) using a master secret key that authenticates the user to the service provider (e.g., a website) they want to visit.
However, this kind of centralized system poses several challenges: (1) The IDP represents a single point of failure and an obvious target for attacks, such as: (a) extraction of the secret key to forge tokens, which enables access to arbitrary services and data in the system; (b) capturing hashed passwords (or biometrics) to run offline dictionary attacks in order to recover user credentials, both potentially resulting in severe damage to the reliability of the system [20]. (2) The IDP is actively involved in each authentication session and can, therefore, track user activity, leading to serious privacy issues [21,22]. (3) The IDP takes a significant amount of control over the digital identity away from the user. Users cannot fully manage and store their identity by themselves but always need to rely on and interact with an available IDP that offers the identity management system to them and the service providers they want to interact with (active verification).

Our Contribution.
To address the above challenges, we construct a novel decentralized privacy-preserving single sign-on scheme using a new Decentralized Anonymous Multi-Factor Authentication (DAMFA) scheme, where the process of user authentication no longer depends on a single trusted third party. Instead, it is fully decentralized onto a shared ledger to preserve user privacy while maintaining the single sign-on property. That is, users do not need to register their credentials with each service provider individually. The scheme also permits services where authenticating users remain anonymous within a group of users. In addition, our scheme does not require the IDP to be online during verification (passive verification). Moreover, since there is no single third party (i.e., the IDP) in control of the whole authentication process, user and usage tracking by the IDP is inhibited. The passive verification property of our scheme allows service providers to authenticate users at any time without requiring any interaction with an IDP beyond what is available on the shared ledger. This property removes the cost of running secure channels between the service provider and the identity provider. Simultaneously, the IDP is eliminated as a single point of failure and attack within the authentication process. The scheme relies on personal identity agents as auxiliary devices that assist the user in the authentication process. The personal identity agents participate in a threshold secret sharing scheme to store the distributed private key of their users. In the authentication phase, the user unlocks their private key through a combination of biometrics and a password, combining biometric, knowledge, and possession factors. The distributed architecture prevents offline attacks against data extracted from compromised agents, as long as only a set of agents below the threshold is compromised or corrupted.
We define the ideal functionality and real-world security definitions for our DAMFA scheme. We prove our construction secure via ideal-real simulation, showing in particular that offline dictionary attacks are infeasible. Finally, we demonstrate that our protocol is efficient and practical through a prototypical implementation and a comparison with other SSO schemes.

Single-Factor (Password) Authenticated Key Exchange.
For a long time, knowledge was (and still is) used as a primary means of authentication. Single-factor authentication based on passwords and PINs is a well-studied mechanism. Bellovin and Merritt [24] proposed Encrypted Key Exchange (EKE), where a client and a server share a password and use it to exchange encrypted information to agree on a common session key. EKE was followed by several enhancements (cf. [25][26][27]). Bellare et al. [1] expanded this into a general formal provable model for Password-Authenticated Key Exchange (PAKE). After that, two generic PAKE schemes were proposed by Gennaro and Lindell [28] and by Groce and Katz [29], which are among the most efficient constructions of PAKE in the standard model. Benhamouda and Pointcheval [30] explicitly introduce a verifier into the authenticated key exchange, where the verifier is a hash value or transformation V = H(s, pw) of the secret password pw with a public salt s, and the server stores the pair (s, V) for each user.
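A minimal sketch of such verifier-based storage, using a plain salted hash in place of the algebraic verifier of [30] (function names are illustrative, not from the cited scheme):

```python
import hashlib, os

def make_verifier(pw: str, salt: bytes = None):
    # Server-side record: store (s, V) with V = H(s, pw) instead of pw itself.
    s = salt if salt is not None else os.urandom(16)
    V = hashlib.sha256(s + pw.encode()).digest()
    return s, V

def check(pw_attempt: str, record) -> bool:
    # Recompute H(s, pw') and compare against the stored verifier V.
    s, V = record
    return hashlib.sha256(s + pw_attempt.encode()).digest() == V

record = make_verifier("correct horse battery staple")
assert check("correct horse battery staple", record)
assert not check("wrong guess", record)
```

The server never stores the password, but as noted above, a leaked (s, V) pair still admits an offline dictionary attack.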

Multi-Factor Authentication.
A single knowledge-based authentication factor has the disadvantage that an adversary only needs to compromise that single factor. Multi-factor authentication (MFA) overcomes this by combining multiple different factors. A widely used combination is long-term passwords with secret keys, possibly stored in tokens (e.g., FIDO U2F). Shirvanian et al. [31] introduce a framework to analyze such two-factor authentication protocols. In their framework, the participants are a user, a client (e.g., a web browser), a server, and a device (e.g., a smartphone). In the authentication phase, the user sends a password and some additional information provided by the device. In most existing solutions, including Refs. [31][32][33], during the registration process, the user obtains a value called the "token," while the server records a hashed password. During the authentication phase, the two required factors (the password and the token) are sent to a verifier.
Jarecki et al. [34] provide a device-enhanced passwordauthenticated key exchange protocol employing mobile device storage as a token. is setting serves two purposes: Firstly, for an adversary to successfully mount an offline dictionary attack, they must corrupt the login server in addition to the mobile device storage. Secondly, the user must confirm access to the mobile device storage during login.
Another popular factor used to authenticate users to remote servers is biometrics [35][36][37][38]. Fleischhacker et al. [39] propose a modular framework called MFAKE which models biometrics following the liveness assumption of Pointcheval and Zimmer [37]. However, Zhang et al. [40] demonstrate that their scheme does not adequately protect privacy. Indeed, biometric authentication becomes a weak point when the framework directly uses the biometric template for authentication. In addition, the framework requires executing many sub-protocols, which makes the scheme inefficient.

Anonymous Authentication.
Another approach towards user authentication is the anonymous password authentication protocol proposed by Viet et al. [12]. They combine an oblivious transfer protocol and a password-authenticated key exchange scheme. Further enhancements were proposed by Refs. [14,15,38].
An anonymous authentication protocol permits users to authenticate themselves without disclosing their identity and has become an important method for constructing privacy-preserving authenticated public channels. Zhang et al. [40] presented a new anonymous authentication protocol that relies on a fuzzy extractor. They consider a practical application and suggest several authentication factors such as passwords, biometrics (e.g., fingerprints), and hardware with reasonably secure storage (e.g., a smartphone).

Summary of Related Works.
Single-factor authentication based on passwords is the primary means of many authentication protocols [1,25,28,41]. Multi-factor authentication (MFA) overcomes the problem of compromise of a single factor by combining multiple different factors [31,34,40,42,43]. An anonymous authentication protocol permits users to authenticate themselves without disclosing their identity [12,14,15,44]. Finally, SSO allows users to receive a single token from an IDP once and to repeatedly authenticate themselves to service providers [16,17,19,45,46].
The PS signature scheme [47] can be modified to obtain a signature on a hidden message (commitment) and also offers a protocol to show a zero-knowledge proof of a signature σ = (σ_1, σ_2).

Oblivious Pseudo-random Function (OPRF).
A pseudorandom function (PRF) F is a function that takes two inputs: a secret function key k and a value x to compute on. It outputs F_k(x). A PRF family is secure if, for all probabilistic polynomial-time (PPT) distinguishers, a randomly keyed member of the family is indistinguishable from a random function with the same domain and range, except with negligible probability. An oblivious PRF (OPRF, cf. [48]) is a protocol between two parties, a sender holding the key k and a receiver holding the input x, that securely computes F_k(x) such that the sender learns nothing and the receiver learns nothing beyond F_k(x).
A threshold OPRF (TOPRF, cf. [49]) is an extension of the OPRF which allows a group of servers to secret-share a key k for a PRF F, with a shared PRF evaluation protocol that lets the user compute F_k(x) on an input x, so that both x and k remain secret as long as no more than t of the n servers are corrupted (see Figure 3).
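To make the blinded threshold evaluation concrete, the following toy sketch combines a 2HashDH-style OPRF in the spirit of [48,49] with Shamir sharing over a small Schnorr group. The 10-bit parameters and the exponent-based hash-to-group map are insecure stand-ins for a real elliptic-curve instantiation; this is a sketch of the mechanics only.

```python
import hashlib, random

# Toy group: the order-q subgroup of Z_p^* with p = 2q + 1 (both prime).
# g = 4 = 2^2 is a square, hence a generator of the order-509 subgroup.
p, q, g = 1019, 509, 4

def H1(x: bytes) -> int:
    # Toy hash-to-group: g^h for a hashed exponent h (NOT a secure hash-to-group).
    h = 1 + int.from_bytes(hashlib.sha256(x).digest(), "big") % (q - 1)
    return pow(g, h, p)

def H2(x: bytes, y: int) -> bytes:
    return hashlib.sha256(x + y.to_bytes(8, "big")).digest()

def share_key(k: int, t: int, n: int):
    # Shamir-share the PRF key so that any t+1 agents can evaluate jointly.
    coeffs = [k] + [random.randrange(q) for _ in range(t)]
    return {i: sum(c * pow(i, j, q) for j, c in enumerate(coeffs)) % q
            for i in range(1, n + 1)}

def toprf_eval(x: bytes, shares: dict) -> bytes:
    r = random.randrange(1, q)
    a = pow(H1(x), r, p)                       # user blinds the input
    S = list(shares)                           # indices of participating agents
    b = 1
    for i in S:
        lam = 1                                # Lagrange coefficient at 0
        for j in S:
            if j != i:
                lam = lam * j * pow(j - i, -1, q) % q
        b = b * pow(pow(a, shares[i], p), lam, p) % p   # combine replies a^{k_i}
    return H2(x, pow(b, pow(r, -1, q), p))     # unblind to H1(x)^k and hash

k = random.randrange(1, q)
shares = share_key(k, t=2, n=5)
subset = {i: shares[i] for i in (1, 3, 5)}     # any t+1 = 3 agents suffice
assert toprf_eval(b"pw|bio", subset) == H2(b"pw|bio", pow(H1(b"pw|bio"), k, p))
```

Note that each agent sees only the blinded value a = H1(x)^r, so no individual agent learns the input x, matching the OPRF privacy goal.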
A formal definition of the TOPRF protocol as a realization of the TOPRF functionality is given in Figure 4. Note that we reproduce these functionalities from Ref. [49] so that readers can easily follow our ideal functionality and construction (for more details, see [49]).

Secret Sharing Scheme.
A secret sharing scheme consists of two PPT algorithms [50]: First, TSSGen generates n shares of the secret key K as ⟨k_1, . . . , k_n⟩ ← TSSGen(K), and second, TSSRecon uses any t shares to recover the secret value K as K ← TSSRecon(k_{i_1}, . . . , k_{i_t}).
The security requirement of this scheme is that any number of shares below the threshold t discloses no information about the secret key.
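A minimal Shamir-style instantiation of TSSGen/TSSRecon over a prime field (the field size and parameters are illustrative) might look as follows:

```python
import random

P = 2**127 - 1  # a Mersenne prime; secret and shares live in GF(P)

def tss_gen(K: int, t: int, n: int):
    # TSSGen: split K into n shares; any t of them reconstruct K.
    # K is the constant term of a random degree-(t-1) polynomial.
    coeffs = [K] + [random.randrange(P) for _ in range(t - 1)]
    poly = lambda x: sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
    return [(i, poly(i)) for i in range(1, n + 1)]

def tss_recon(shares):
    # TSSRecon: Lagrange interpolation at x = 0 recovers K from t shares.
    K = 0
    for i, (xi, yi) in enumerate(shares):
        lam = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                lam = lam * xj % P * pow(xj - xi, -1, P) % P
        K = (K + yi * lam) % P
    return K

secret = 123456789
shares = tss_gen(secret, t=3, n=5)
assert tss_recon(shares[:3]) == secret     # any 3 shares work
assert tss_recon(shares[2:5]) == secret
```

With fewer than t shares the polynomial is undetermined, so every candidate secret remains equally likely, matching the stated security requirement.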

Public Append-Only Ledger.
A ledger allows us to keep a list of public information and maintains the integrity of the dataset. It guarantees a consistent view of the ledger for every party. Every user can insert information into the ledger and, once some data are uploaded, nobody can delete or modify them. Moreover, the ledger ensures the correctness of pseudonyms and guarantees that no one can impersonate another participant to release information. Furthermore, it distributes up-to-date data to all participants. In this paper, we assume these properties hold and construct our system on blockchain technology as a public append-only ledger.
There are already some works constructing advanced applications under this assumption, such as Refs. [51][52][53]. Yang et al. [54] formally define a public append-only ledger, which we use for constructing our DAMFA system (see Figure 5). F_B executes the following steps with parties PA_1, . . . , PA_n and an ideal adversary S: (1) Initialize: creates an empty list L_p in the beginning. (2) Insert: on input (Insert, PA_i, data), appends data to the list L_p. (3) Retrieve: on input (Retrieve, PA_i), returns the list L_p to PA_i.
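The Initialize/Insert/Retrieve interface can be sketched as a toy class; the hash chaining below stands in for the integrity guarantees a real blockchain provides (class and method names are our own, not from [54]):

```python
import hashlib, json

class AppendOnlyLedger:
    """Toy model of F_B: Initialize / Insert / Retrieve, with hash chaining
    so that any retroactive modification of an entry is detectable."""
    def __init__(self):
        self.entries = []          # Initialize: the empty list L_p
        self.head = b"genesis"
    def insert(self, party: str, data: dict) -> None:
        # Insert: append a record bound to all previous records via its hash.
        record = json.dumps({"party": party, "data": data,
                             "prev": self.head.hex()}, sort_keys=True).encode()
        self.head = hashlib.sha256(record).digest()
        self.entries.append(record)
    def retrieve(self):
        # Retrieve: every party obtains the same consistent view of L_p.
        return list(self.entries)

ledger = AppendOnlyLedger()
ledger.insert("PA_1", {"PC": "protected-credential", "Nym": "pseudonym"})
ledger.insert("PA_2", {"PC": "another-credential", "Nym": "other-pseudonym"})
assert len(ledger.retrieve()) == 2
```

In the real system, consensus among the blockchain nodes replaces the single trusted object here.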

Zero-Knowledge Proof of Knowledge.
In a zero-knowledge proof of knowledge system [55], a prover convinces a verifier that it possesses the witness for a statement without revealing any additional information.
In this paper, we use noninteractive zero-knowledge proofs obtained via the Fiat-Shamir heuristic [56], as they have the advantage of being noninteractive. For example, NIZKPoK{(x, y) : h = g^x ∧ c = g^y} denotes a noninteractive zero-knowledge proof of knowledge of elements x and y that satisfy both h = g^x and c = g^y. The values (x, y) are assumed to be hidden from the verifier. The proof algorithm can additionally admit a message as input; it is then called a signature proof of knowledge.

3.6. Dynamic Accumulators.
A dynamic accumulator is a primitive allowing a large set of values to be accumulated into a single short quantity, the accumulator. For each value, there exists a witness which is the evidence attesting that the value is indeed contained in the accumulator. The proof that a value is part of an accumulator can be done in zero-knowledge, revealing neither the value nor the witness to the verifier. Camenisch et al. [57] define a concrete construction of dynamic accumulators with the five algorithms AccSetup, AccAdd, AccUpdate, AccWitUpdate, and AccVerify: (1) AccSetup: This is the algorithm to output the public parameters. Select bilinear groups params_BM = (p, G, G_T, e, g) with prime order p and a bilinear map e. Select a generator g ∈ G and a trapdoor c ∈ Z_p. Generate a key pair (msk, pk) for a secure signature scheme. Compute and publish {p, G, G_T, e, g, g_1 = g^(c^1), . . . , g_n = g^(c^n), g_{n+2} = g^(c^{n+2}), . . . , g_{2n} = g^(c^{2n})} and z = e(g, g)^(c^{n+1}) as the public parameters.
(2) AccAdd and (3) AccUpdate: These are the algorithms to add values and to compute the accumulator using the public parameters; the accumulator acc_V of a set V is computed from the published parameters. (4) AccWitUpdate: This is the algorithm to compute the witness that values are included in an accumulator, using the public parameters: given V and the accumulator acc_V, it computes the witness ω for the values i_1, . . . , i_k in U. (5) AccVerify: This is the algorithm to verify, using the witness ω and the public parameters, that the values in U are included in the accumulator: given acc_V, state_U, and ω, it accepts if the witness equation holds.

Security and Communication Networks
As Camenisch et al. [57] point out, the purpose of an accumulator is to have an accumulator and witnesses of size independent of the number of accumulated elements.
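The Fiat-Shamir-compiled Schnorr proofs used throughout for statements like NIZKPoK{x : h = g^x} can be sketched as follows; the toy group parameters are illustrative stand-ins for a real elliptic-curve group:

```python
import hashlib, random

# Toy order-q subgroup of Z_p^* (p = 2q + 1); insecure sizes, sketch only.
p, q, g = 1019, 509, 4

def fs_hash(*vals) -> int:
    # Fiat-Shamir challenge: hash the statement and commitment into Z_q.
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def nizk_prove(x: int):
    # Prove knowledge of x with h = g^x without revealing x.
    h = pow(g, x, p)
    v = random.randrange(q)
    t = pow(g, v, p)            # commitment
    c = fs_hash(g, h, t)        # challenge derived by hashing (noninteractive)
    s = (v - c * x) % q         # response
    return h, (t, c, s)

def nizk_verify(h: int, proof) -> bool:
    t, c, s = proof
    # Check the challenge was honestly derived and g^s * h^c == t.
    return c == fs_hash(g, h, t) and t == pow(g, s, p) * pow(h, c, p) % p

h, proof = nizk_prove(42)
assert nizk_verify(h, proof)
```

Hashing a message into the challenge alongside (g, h, t) turns the same construction into a signature proof of knowledge.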

Pedersen Commitments.
Using a commitment scheme, users can bind themselves to a chosen value without revealing the actual value to a third party receiving the commitment. Thereby, a user cannot change their choice (binding), and, at the same time, the recipient of a commitment does not learn anything about the actual value the user committed to (hiding). Pedersen commitments [58] have a group G of prime order q and generators (g_0, . . . , g_m) as public parameters. To commit to the values (z_1, . . . , z_m) ∈ Z_q, a user picks a random r ∈ Z_q and sets C = g_0^r · ∏_{i=1}^{m} g_i^{z_i}.
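A toy sketch of a Pedersen commitment for a single value follows. Note that in practice the generators must be chosen so that their mutual discrete logarithms are unknown; the hard-coded toy values here do not guarantee that, so this illustrates only the mechanics:

```python
import random

# Toy order-q subgroup of Z_p^*; in practice use an elliptic-curve group and
# generators with no known discrete-log relation (e.g., via hashing to the group).
p, q = 1019, 509
g0, g1 = 4, 16

def commit(z: int, r: int) -> int:
    # C = g0^r * g1^z: r uniform in Z_q hides z; binding rests on the
    # hardness of finding the discrete log of g1 base g0.
    return pow(g0, r, p) * pow(g1, z, p) % p

def open_commit(C: int, z: int, r: int) -> bool:
    # Opening reveals (z, r); the recipient recomputes and compares.
    return C == commit(z, r)

r = random.randrange(q)
C = commit(7, r)
assert open_commit(C, 7, r)        # correct opening accepted
assert not open_commit(C, 8, r)    # a different value is rejected
```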

Decentralized Anonymous Multi-Factor Authentication (DAMFA)
We build a new practical decentralized multi-factor authentication scheme, Decentralized Anonymous Multi-Factor Authentication (DAMFA), where the process of user authentication no longer depends on a single trusted third party. The scheme also permits services where authenticating users remain anonymous within a group of users. In addition, our scheme does not require the IDP to be online during verification. To protect the private key of the user, we use personal identity agents as auxiliary devices that participate in a threshold secret sharing scheme to store the user's distributed private key.

System Model.
The overall system model of DAMFA is shown in Figure 6. The protocol is executed between four participants: (1) User U: A user who wants to access various services offered by different service providers. During the registration phase (which runs only once), U obtains a biometric template Bio from a sensor and chooses a password pw. In the authentication phase, the user U interacts with a set of personal identity agents to authenticate themselves in an anonymous manner. (2) Personal identity agent PA_i: We associate each user with a set of personal agents, which are auxiliary devices that assist a user in creating a credential for authentication. These personal agents remain under the administrative control of their associated users, who can freely choose where to run them. For example, they could be run on a smart home controller, at a cloud provider, or even on a mobile phone. U generates a private key and executes threshold secret sharing on the private key to generate secret shares of that private key. The user stores the secret shares among their personal agents such that each PA_i has one share of the overall secret key. (3) Service provider (verifier) SP: These are the service providers (untrusted and distributed servers) that require authentication from a user U. After verifying a user's credentials, they provide access to the corresponding service. (4) Identity provider IDP: The identity provider is an entity that issues credentials to users. These credentials grant permission to use specific services by proving membership of a specific permission group (clients, employees, department members, account holders, subscribed users, etc.).
In addition, users act as nodes in the blockchain network: They collaboratively maintain a list of credentials in a public ledger (blockchain) and enforce a specific credential-issuing policy when adding to that list. For more details on how these steps work, we refer to Subsection 4.3 (High-Level View).

Threat Model.
In order to demonstrate the security of the proposed protocol, we determine the capabilities and possible actions of an attacker. We consider a PPT attacker who has full control of the communication channels.
They can eavesdrop on all messages in public channels and also modify, add, and remove messages on the network. The attacker can, at any time, corrupt up to t − 1 of the user's agents (i.e., fewer than the threshold t), in which case the attacker learns all the long-term secrets (such as private keys or master shared keys) held by those agents.
In the proposed protocol, we consider some privacy requirements such as unlinkability, identity privacy, and user data privacy: Unlinkability means that an adversary cannot distinguish a user who is authenticating from any (other) user who has authenticated in the past. Identity privacy means that an adversary cannot determine if a given authentication credential belongs to a specific user. User data privacy means that an adversary cannot learn anything about the user's sensitive authentication data (i.e., biometric data, password).

High-Level View.
To build a fully decentralized authentication architecture, we need to set up a small distributed shared database (to store credentials) between nodes. Data are highly available, but nobody has control over the database. Furthermore, data written in the past must not be modifiable: user data need to be immutable, and data should be publicly accessible. We employ a public append-only ledger to fulfill these requirements. A ledger (blockchain) maintains the integrity of the dataset and guarantees a consistent view of the data for every party. Every participant can append information to the ledger and, once uploaded, nobody can delete or modify the data.
Definition 1 (DAMFA). A DAMFA system consists of a global transaction ledger instead of a single party representing the organization. Moreover, the DAMFA scheme consists of the following phases: (1) Setup: In the setup phase, we define the public parameters and execute the following algorithm: U generates a private key and executes threshold secret sharing (TSS) on the private key to generate shares of that secret. The user stores the secret shares among their personal agents (similar to the initialization of TOPRF [49], done via a distributed key generation for discrete-log-based systems, e.g., Ref. [59]). (2) Registration: In the registration phase, the user U first selects a password pw and has their biometric Bio collected at a sensor. Then, U runs the TOPRF protocol by interacting with the personal agents to reconstruct the TOPRF secret key. After that, the IDP issues a membership credential showing that U is a valid member (employee, account holder, subscribed user, etc.). For this purpose, U sends a request with a pseudonym and a (noninteractive) zero-knowledge proof (NIZK) which shows they are the owner of the pseudonym (i.e., they know the secret key that belongs to the pseudonym), and authenticates themselves to the IDP. Then, U receives a membership credential, which is a signature on their pseudonym. The user U creates a pseudonym Nym_u^o and verification information, namely a protected credential PC_i, by encrypting the membership credential with the TOPRF secret key. Subsequently, U computes a NIZK proof that (1) the credential PC_i and the pseudonym contain the same secret key and (2) they know a signature issued by the identity provider (i.e., they have valid group membership). Note that the user can execute these actions offline because no interaction with the public ledger is required. Finally, the nodes accept the credential to the ledger if and only if this proof is valid.
(3) Authentication: The user U attempts to access the services of an SP in an anonymous and unlinkable way. SP authenticates the user if and only if the user provides a valid credential. First, the service provider sends an authentication request (which is a signature) to U. The user enters the password pw* and the biometric Bio* and runs the TOPRF protocol by interacting with the personal agents to reconstruct the TOPRF secret value. U first scans the public ledger to obtain the accumulator AC over the set PC = {PC_1, . . . , PC_n} of all credentials belonging to a specific IDP. Then, U finds their own protected credential PC_i* within this set (via the pseudonym Nym_u^o). U decrypts PC_i* using the TOPRF secret key and recovers the initial credential (a signature from the IDP). U presents the credential under a different pseudonym Nym_u^v by proving in zero-knowledge that (1) they know a credential PC_i on the ledger from the IDP, (2) the credential opens to the same secret key as their pseudonym Nym_u^v, and (3) they possess a membership credential (the signature) from the IDP, cf. [52]. SP scans the public ledger to obtain the accumulator AC over the set PC = {PC_1, . . . , PC_n} of all credentials belonging to the specific organization. Then, it checks the validity of the candidate credential by finding it in the set (PC_i* ∈ PC) and checking the proof of knowledge on the credential and pseudonym.

4.4. The DAMFA Functionality. We formally define the proposed scheme's security by presenting its ideal functionality, implemented via a trusted party F_TOPRF together with a public ledger. All communication takes place through this ideal trusted party. In the UC framework [60,61], several copies of the ideal functionality may run in parallel. Each copy has a unique session identifier (SID), and every message sent to a specific copy contains the SID of the copy it is intended for. As in Ref. [49], we use the ticketing mechanism, which ensures that in order to test a password and biometric guess, the attacker must impersonate t + 1 agents. To this end, a counter tx(p, PA_i) is defined for each PA_i ∈ SI, where the parameter p is also used to identify it. When an agent PA_i ∈ SI completes its interaction, the functionality increases the counter tx(p, PA_i). Conversely, when a user, either honest or corrupt, completes an interaction associated with PA_i, tx(p, PA_i) decreases by 1. This ensures that, for any honest agent PA_i, the number of user-completed OPRF evaluations with PA_i is no more than the number of agent-completed OPRF evaluations of PA_i. The functionality grants t + 1 agent tickets for accessing the proper TOPRF result by reducing (nonzero) ticket counters tx(p, PA_i) for an arbitrary set of t + 1 agents in SI.
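The ticket bookkeeping can be sketched as follows (a toy model of the counters tx(p, PA_i); class and method names are ours, and the real functionality interleaves this with the TOPRF interface):

```python
from collections import defaultdict

class TicketCounter:
    """Toy model of the tx(p, PA_i) bookkeeping: each password/biometric
    test must consume tickets from t + 1 distinct agents, so an attacker
    cannot test guesses without completing t + 1 agent interactions."""
    def __init__(self, t: int):
        self.t = t
        self.tx = defaultdict(int)
    def agent_completed(self, p, agent):
        # PA_i finished an OPRF evaluation: increase its counter.
        self.tx[(p, agent)] += 1
    def user_completed(self, p, agent):
        # A user interaction tied to PA_i completed: decrease its counter.
        self.tx[(p, agent)] -= 1
    def try_evaluate(self, p, agents) -> bool:
        # Release a TOPRF result only if t + 1 agents hold a nonzero ticket.
        usable = [a for a in agents if self.tx[(p, a)] > 0]
        if len(usable) < self.t + 1:
            return False
        for a in usable[: self.t + 1]:
            self.tx[(p, a)] -= 1
        return True

tc = TicketCounter(t=2)
for a in ("PA1", "PA2", "PA3"):
    tc.agent_completed("p", a)
assert tc.try_evaluate("p", ["PA1", "PA2", "PA3"])        # tickets available
assert not tc.try_evaluate("p", ["PA1", "PA2", "PA3"])    # tickets consumed
```

This mirrors the invariant stated above: user-completed evaluations with an honest agent never exceed agent-completed ones.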

Setup Phase
We select a bilinear pairing e: G_1 × G_2 → G_T that is efficiently computable and nondegenerate, over three groups of prime order p. We let g_1 and g_2 be generators of G_1 and G_2, respectively, and g_t = e(g_1, g_2) the corresponding generator of G_T. We assume a one-way Bio-hash function H_1, which resolves the recognition errors of general hash functions on biometric inputs [62]. We consider two additional hash functions H_2: M → {0, 1}^λ and H_g: M → G_1. We publish params ← (G_1, G_2, g_1, g_2, p, h_nym, H_1, H_2, H_g) as the set of system parameters, where h_nym ∈ G_1. The user U generates a private key K and then executes the threshold secret sharing scheme on K to create a secret share for each personal agent, ⟨k_1, k_2, . . . , k_n⟩ ← TSSGen(K). U stores the secret shares among the personal agents.

Registration Phase
To register with the system, U first chooses a password pw and scans her biometric impression Bio at the sensor. Then, U runs the following steps to register herself in the system.
(i) A user runs the TOPRF protocol [49] with the agents to compute the secret value usk = F_K(pw, Bio) as follows: (a) The user U picks a random number r ∈ Z_p, computes A = H_g(pw, H_1(Bio))^r, and sends this message to the personal agents. (ii) In order to obtain a membership credential from the IDP, we use the PS signature protocol [47] to derive a signature on a hidden committed message as follows: (a) KeyGen(pp): The IDP runs this algorithm to generate its private and public keys. The algorithm selects (x, y, y_1) and sets sk ← (X, y, y_1) and pk ← (g_1, g_2, Y, X′, Y′). (b) Protocol: A user first selects a random r_2 ← Z_p and computes C = g_1^{r_2} · Y^{usk}, which is a commitment on her secret key. She then sends C to the IDP, and they run a proof of knowledge of the opening of the commitment (authentication). If the signer is convinced, the IDP selects a random u ← Z_p and returns σ ← (σ_1 = g_1^u, σ_2 = (X · C · Y_1^{m_1})^u). The user can now unblind the signature σ and obtain a valid signature over her secret key and the message m_1 by computing σ ← (σ_1, σ_2/(σ_1)^{r_2}), as described in Sect. 3.1.
(c) Verify(pk, m, σ): To verify the signature, the user executes the verification algorithm of the PS scheme. (iii) CreatePC: The user generates a protected credential with the TOPRF secret key usk derived from the password and the biometric: U picks a random number s ∈ Z_p to generate a pseudonym Nym_u^o = g_1^s · h_nym^{usk} and computes an ElGamal encryption of the credential σ under the secret TOPRF value usk into a ciphertext PC_i = [σ]_{usk}. (iv) Proof: A NIZK proof of knowledge of the credential (a PS signature [47]) works as follows: U selects random r_3, t_1 ← Z_p, computes a randomized signature σ′, sends it to the verifier, and carries out a zero-knowledge proof of knowledge (such as Schnorr's interactive protocol) of m, usk, and t_1. (v) At the end of this phase, U submits the resulting values (PC_i, π, Nym_u^o) to the public ledger nodes, where π is a proof of knowledge on Nym_u^o and PC_i. If the proof verifies successfully, the verification outputs 1, otherwise 0; the nodes accept the values to the ledger only if it returns 1.

Authentication Phase
In this phase, a user authenticates herself to the service provider and establishes a session key with it. The following steps are executed by U, PA, and SP: (i) First, the service provider chooses a secret key y ← Z_p and computes Z ← g_1^y. Then, SP generates a signature σ_s on the message Z (e.g., a Schnorr signature [55]) using its secret key and sends the message M_1 = (Z, σ_s) to the user. (ii) Upon receiving the pair (Z, σ_s), the client verifies whether σ_s is a valid signature on Z under SP's public key. If σ_s is valid, U enters pw* and scans her personal biometric impression Bio* at the sensor. (iii) The user interacts with the personal agents and runs the necessary steps of the TOPRF protocol to compute usk = F_K(pw*, Bio*) = H(pw*, (∏_{i∈SR} b_i)^{r^{-1}}). Then, U decrypts the ciphertext [σ]_{usk} with the TOPRF secret key usk to recover the credential σ.
(iv) Show: The user creates a NIZK proof π ensuring that the credential is well formed and relates to the same secret values as her pseudonym. Here we prove that: (1) she knows a credential on the ledger issued by the IDP, (2) the credential includes the same secret key as her pseudonym, and (3) she possesses a valid credential (signature). We use the bilinear-map accumulator [57] to accumulate the group elements g_1, …, g_n instead of, e.g., the integers {1, …, n}. In addition, Camenisch et al. [57] describe an efficient zero-knowledge proof of knowledge, in the style of Schnorr's protocol [55,56], that a committed value is contained in an accumulator. See Refs. [57,63] for how this proof works.
U runs the following steps to authenticate herself: (a) The user selects a random number r_4 ∈ Z_p to generate a pseudonym Nym_u^v = g_1^{r_4} · h_nym^{usk} for communication with service providers. (b) U picks random numbers d, t_2 ← Z_p, computes a randomized commitment to the credential (as in the previous step) as σ′ together with PC_i^*, carries out a zero-knowledge proof of knowledge of the credential, and outputs the proof π = NIZK{(usk, ω, d, t_2, m, r_4): …}. She also computes her key share D = g_1^d, the session key SK = Z^d, and the key-confirmation tag Hmac over (D, Z). Finally, U sends the message M_4 = {Nym_u^v, D, Hmac, π} to the service provider. (c) After receiving the message M_4 = {Nym_u^v, D, Hmac, π} from the user, the service provider first scans the ledger to obtain the set PC consisting of all credentials issued by the IDP and computes the accumulator AC = Accumulate(params, PC). Then, it verifies that π is the aforementioned proof of knowledge on PC_i and Nym_u^v using the known public values. If the proof verifies successfully, SP computes the session key SK = D^y = g_1^{y·d}, computes Hmac* over (SK, D, Z), and checks Hmac = Hmac*. If π verifies and the Hmac check holds, SP accepts SK as the session key and accepts the user as authentic.
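The key-confirmation arithmetic of the exchange above can be sketched as follows: the user derives SK from SP's share Z, SP derives the same SK from the user's share D, and the HMAC over (D, Z) must match on both sides. This is a toy sketch over a small group; parameters and encodings are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

P, Q, G = 1019, 509, 4   # toy Schnorr group: P = 2Q + 1, G generates the order-Q subgroup

# SP's ephemeral share (already signed with sigma_s and verified by the client)
y = secrets.randbelow(Q - 1) + 1
Z = pow(G, y, P)                      # Z = g^y

# user's side: share D and session key SK = Z^d
d = secrets.randbelow(Q - 1) + 1
D = pow(G, d, P)                      # sent inside M4
SK_user = pow(Z, d, P)                # SK = Z^d = g^{y*d}
tag = hmac.new(SK_user.to_bytes(2, "big"),
               D.to_bytes(2, "big") + Z.to_bytes(2, "big"),
               hashlib.sha256).hexdigest()

# SP's side: recompute SK = D^y and the confirmation tag, then compare
SK_sp = pow(D, y, P)                  # SK = D^y = g^{y*d}
tag_sp = hmac.new(SK_sp.to_bytes(2, "big"),
                  D.to_bytes(2, "big") + Z.to_bytes(2, "big"),
                  hashlib.sha256).hexdigest()

assert SK_user == SK_sp
assert hmac.compare_digest(tag, tag_sp)
```

A mismatching tag tells SP that the sender does not know d (or received a different Z), so the session is rejected before any key is used.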
Note that we can simply send σ′ alongside the proof-of-knowledge message. With this, we can prove that the construction is a Σ-protocol (see Ref. [47] for how the proof of knowledge of a PS signature works).

Optimization.
Because the accumulator AC in our construction can be computed incrementally, any node mining a new block can add this block's accumulator to the previous one. The node stores the result as a new accumulator value in a transaction at the beginning of the new block, the so-called accumulator checkpoint. Peer nodes validate this computation before accepting the new block into the blockchain. With this optimization, SP no longer needs to compute the accumulator AC from scratch. Instead, SP can merely reference the current block's accumulator checkpoint, and any computation walking the ledger can start from the checkpoint preceding the relevant entry (instead of starting at the beginning of the chain).
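The incremental-checkpoint idea can be sketched as follows. The concrete scheme uses the bilinear-map accumulator of [57]; here a simple commutative fold stands in for Accumulate, since only the incremental structure matters for the optimization. All names and values are illustrative.

```python
from functools import reduce

P = 1019  # toy modulus; placeholder for the real accumulator parameters

def accumulate(acc: int, credential: int) -> int:
    # placeholder for the real Accumulate(params, ...) operation of [57];
    # any commutative, incrementally extendable fold illustrates the point
    return acc * pow(credential, 2, P) % P

blocks = [[3, 5], [7, 11], [13]]   # credentials appearing in each block (toy data)

# each mining node extends the previous checkpoint with its block's credentials
checkpoints = []
acc = 1
for creds in blocks:
    for c in creds:
        acc = accumulate(acc, c)
    checkpoints.append(acc)        # stored as the block's accumulator checkpoint

# a validating peer re-derives the newest checkpoint from the previous one only
prev = checkpoints[-2]
recomputed = reduce(accumulate, blocks[-1], prev)
assert recomputed == checkpoints[-1]

# SP simply reads the latest checkpoint instead of folding the whole chain
full = reduce(accumulate, [c for b in blocks for c in b], 1)
assert full == checkpoints[-1]
```

The peer validation above is exactly the check described in the text: one incremental step per block, rather than a full recomputation over the entire ledger.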

Theorem 1.
Our proposed protocol is secure against any nonuniform PPT adversary corrupting t − 1 of the personal agents PA, assuming that the El-Gamal encryption, the zero-knowledge proof of signature, and the TOPRF protocol are secure and that the hash function is collision resistant.

4.7. Security Proofs of Theorem 1

4.7.1. Proof Sketch. Our construction DAMFA is modular and relies directly on the TOPRF and the zero-knowledge proof; its security is then straightforwardly inherited from those building blocks. Credential security requires that no adversary is able to present a credential (i.e., guess the password and biometric) and generate a session key without ever having had access to them. Since we apply a TOPRF to the users' passwords and biometrics, the security properties of the TOPRF make them hard to guess. The proof is twofold: (i) First, authentication is done through a zero-knowledge proof. At this step, the adversary either presents an invalid credential yet manages to build a valid proof, and hence breaks the soundness of the underlying proof of knowledge, or else uses a valid credential. (ii) We now assume the adversary wins by using a valid credential and rely on the obliviousness of the TOPRF. We interact with a TOPRF challenger to answer every adversarial request, and at the end, we can use the (valid) credential output by the adversary to break the TOPRF obliviousness, which concludes the argument.

Anonymity.
During the registration phase, when a user reveals her pseudonym, she does not (intentionally) reveal her secret key usk, and no adversary should learn any information about the secret key or the identity. Moreover, during the authentication phase, a user proves possession of her credential using a zero-knowledge proof, which reveals no additional information about her secret key or identity to the SP. The simulator S is essentially an ideal-world adversary that interacts with the functionality F_DAMFA and the environment ξ. We also assume that our zero-knowledge signature of knowledge has an efficient extractor and a simulator, and that the signature is unforgeable. To guarantee that the view of the environment in the ideal world is indistinguishable from its view in the real world, S invokes the real-world adversary A′ by simulating all other entities for A′. Then, for the most part, the simulator follows the actions of the adversary A′ appropriately.

Description of the Simulator.
Once the adversary registers a new user to the system by storing a tuple (Nym_u^o, PC_i, π_i) on the bulletin board, the simulator registers this user in the ideal world via the following process. It acts as an interface between the honest parties in the real world (the user U and the n − t + 1 personal agents denoted by PA_i, where i = t, …, n, wlog., since all personal agents in our solution are identical) and the corrupted parties in the ideal world (the service provider SP and the t personal agents denoted by PA_ic, where ic = 1, …, t). The simulator behaves as follows:

Remark 1. Since S simulates PA_ic in the ideal world, S receives whatever they receive from F_DAMFA.
(2) After receiving (sid, PA_ic, PC_i, Nym_u^o, π_i) from A′ for some PA_i ∈ SI, it checks whether it has a record (U, k_ic, Nym_u^o) on its list of users. If a user with Nym_u^o exists, then S retrieves the K associated with (U, k_ic, Nym_u^o) and proceeds. The simulator then employs the knowledge extractor to obtain usk. If it is not on the list, S follows the protocol to register Nym_u^o as a user by choosing a random password pw* and biometric Bio*. It generates secret shares k′_ic of K for each corrupted personal agent, records 〈Reg, U, sid, SI, pw*, Bio*, k_ic, K〉, and sends 〈k_ic〉 to PA_ic ∈ SI and A′.
(3) Upon receiving (RegComplete, sid, SI) from A′, S retrieves 〈Reg, U, sid, SI, pw*, Bio*, k_ic, K〉 and computes a pseudonym Nym_u^v and a credential PC′_i = h · g^{usk_ic}, where usk_ic = F_K(pw*, Bio*). It records 〈Nym_u^v, PC′_i, U, SI, usk_ic〉 and sends (sid, PA_ic, PC_i, Nym_u^v, π_i) to its public ledger and A′, where π_i is a proof of knowledge. S stores (pw*, Bio*, K, usk_ic, Nym_u^v, PC′_i, π_i) in its list of granted credentials.

Remark 2.
When an honest user wants to establish a credential through the functionality, the simulator creates a credential and uses the extractor of the signature of knowledge to simulate the associated proof. It then transmits the credential information (PC′_i, π_i, Nym_u^v) to the trusted store.
(2) Authentication

(1) Upon receiving (Auth, U*, sid, ssid, SR), where |SR| ≥ t + 1, from A′, S retrieves the 〈Nym_u^v, PC′_i, U, SI, usk_ic〉 corresponding to U as stored in the registration phase. If a tuple (Bio, pw, K) was stored in the registration phase and usk_ic is defined, then S executes the TOPRF protocol with each personal agent using the password pw* and biometric Bio*, receives ρ_ic = T(p, (pw*, Bio*)) from F_TOPRF, and sends (Auth, sid, ssid, U, SR) to A′.

Remark 3.
The initialization also specifies a parameter p used to identify a table T(p, ·) of random values that define the proper PRF values computed by the user when interacting with any subset of t + 1 honest servers from the set SI. An additional parameter p*, with corresponding table T(p*, ·), can be specified by the adversary to represent rogue tables with values computed by the user in interactions with corrupted servers (see [49] for more details).
(2) Upon receiving (Auth, sid, ssid, U, ρ_ic) from F_TOPRF, S recovers SR and the usk_ic corresponding to U as stored in the database during the registration phase (and ignores this message if no corresponding tuple exists). S checks whether ρ_ic = usk_ic, i.e., whether each PA_ic used the correct corresponding share_ic = (usk_ic, k_ic). It ignores this message if either of the following conditions fails: if ρ_ic = usk_ic, then |{S : tx(p, S) > 0}| > t, or all servers in SR are honest. Otherwise, it sends (Auth, sid, SR, pw*, Bio*, sk) to F_DAMFA, where sk is a random secret key, and sets (flag, pw*, Bio*, sk) as follows: (a) Case 1: the correct share_ic = (ρ_ic, k_ic) was employed by the adversary in the real protocol. S detects this by verifying that usk_ic = ρ_ic. Therefore, S sets (flag, pw*, Bio*, sk) = (1, ·, ·) and sends the (usk_ic, k_ic) in its database to F_DAMFA, where (usk_ic, k_ic) was sent by F_DAMFA. (b) Case 2: otherwise, an incorrect (usk_ic, k_ic) was employed by the adversary in the real protocol. S detects this by verifying that usk_ic ≠ ρ_ic. So, S sets (flag, pw*, Bio*, sk) = (0, ·, ·) and defines x as the set of values pw and Bio in the dictionary such that T(p*, (pw, Bio)) is defined. For every x in lexicographic order, it sets v := T(p*, x) and checks whether v = usk_ic. If so, it sets (flag, pw*, Bio*, sk) := (2, x, sk*) and breaks the loop. If the loop processes all pw and Bio without breaking, it sets (flag, pw*, Bio*, sk) = (0, ·, ·).
(3) On receiving (Auth, sid, ssid, SR, x = (pw*, Bio*)) from a party P ∈ {U, A′} and (Auth, sid, ssid, P, ρ_ic) from A′, S recovers the usk_ic corresponding to U as stored in step 1. It ignores this message if either of the following conditions fails: if ρ_ic = usk_ic, then |{S : tx(p, S) > 0}| > t, or all servers in SR are honest. Otherwise, it picks T(p*, x) ← {0, 1}^l if it has not yet been defined and sends (Auth, sid, ssid, T(p*, x)) to A′. If ρ_ic = usk_ic (without the above conditions failing), it adds every PA ∈ SR to tested(x) and sends (TestPwBio, sid, PA, pw*, Bio*) to F_DAMFA. If F_DAMFA replies with sk, it records it.
Remark 4. F_DAMFA employs the ideal user-provided password and biometric test in the ideal world. Therefore, if the adversarial personal agents in the real world acted honestly, the simulator provided correct pairs (usk_i, k_i).
Then the calculated credential and pseudonym will be valid (consistent with the ledger), since they are computed using the actual password and biometric. On the other hand, if the personal agents acted maliciously in the real world, S would have detected this in the previous step and would have provided wrong pairs to F_DAMFA in the ideal world. So, in both worlds, the response will be invalid.
(4) Upon receiving (Auth, sid, ssid, SR, Nym_u^v, PC_i) from F_DAMFA, S forwards 〈Nym_u^v, PC_i〉 to A′ in the real world.
(3) Indistinguishability (i) Game_Real. This is the real world: the system constructed in this work is run between n − t + 1 honest parties and t parties controlled by the adversary. (ii) Game_1. This is identical to Game_Real except that the encryption generated in the registration phase by honest users is replaced with a simulated one. Indistinguishability between Game_Real and Game_1 follows from the security properties of El-Gamal encryption. (iii) Game_2. This is identical to Game_1 except that, in the TOPRF, each share (b_i and usk) generated by honest users using the actual password pw and biometric Bio is replaced by a randomly chosen pw* and Bio*. Since S does not have the correct password and biometric, indistinguishability between Game_1 and Game_2 follows from the indistinguishability of the TOPRF algorithm and the TSS construction.
(a) Reduction 1. The TOPRF security ensures that the senders (adversarial personal agents) cannot distinguish whether the receiver's (the simulated user's) inputs are the actual password pw and biometric Bio or another randomly chosen pair pw* and Bio*. (b) Reduction 2. The TSS security ensures that fewer than the threshold number of agents can neither reconstruct the secret nor check whether the shares are indeed related to the same secret. Therefore, there is no efficient way for the adversary to distinguish this from the real behavior, since one more agent would need to be corrupted to mount a successful offline attack.
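The TSS property invoked in Reduction 2 can be made concrete with a small Shamir sketch: t + 1 shares reconstruct the secret, while any t shares are consistent with every candidate secret, so they reveal nothing information-theoretically. The field size and helper names are illustrative assumptions.

```python
import secrets

Q = 509  # toy prime field; real deployments use a cryptographic group order

def share(secret, t, n):
    # random degree-t polynomial with f(0) = secret; share i is f(i) mod Q
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t)]
    return {i: sum(c * pow(i, j, Q) for j, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def interpolate_at(points, x):
    # Lagrange interpolation of the unique polynomial through the given points
    s = 0
    for i, yi in points.items():
        lam = 1
        for j in points:
            if j != i:
                lam = lam * (x - j) % Q * pow(i - j, -1, Q) % Q
        s = (s + yi * lam) % Q
    return s

K = 123
sh = share(K, t=2, n=5)

# t + 1 = 3 shares reconstruct the secret (evaluate the polynomial at x = 0)
assert interpolate_at({i: sh[i] for i in (1, 3, 5)}, 0) == K

# t = 2 shares are consistent with EVERY candidate secret: for each guess there
# is a valid degree-2 sharing that agrees with the two observed shares
for guess in (0, 42, K):
    sh3 = interpolate_at({0: guess, 1: sh[1], 2: sh[2]}, 3)  # fabricate share 3
    assert interpolate_at({1: sh[1], 2: sh[2], 3: sh3}, 0) == guess
```

The loop at the end is exactly the reduction's point: an adversary holding only t shares cannot even test a password guess, because every guess is equally consistent with its view.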
(iv) Game_3. This game is identical to Game_2 except that an authentication response (Nym_u^v and PC*_i) consisting of two random group elements generated by the adversary will be rejected if the extracted secret key does not fulfill the requirements. Indistinguishability between Game_2 and Game_3 follows from the verified consistency of the bilinear pairing algorithm; a successful deviation would break the soundness of the underlying proof of knowledge used before (assuming no hash collisions).
(v) Game_4. This is the world simulated by S. It is easy to check that Game_Ideal is identical to Game_4.
We already know that the probability of breaking the TOPRF or the NIZK proofs is negligible.

Implementation
In this section, we illustrate the practicability of the proposed protocol. To this end, we realize the public-ledger part with well-known blockchains, namely, Namecoin and Ethereum. The results are summarized in Table 1. Here, the initial data size is the size of the blockchain that needs to be downloaded and stored. The initial sync time is the time required to sync and connect to the blockchain. The confirmation time is the time required to confirm that the data have been uploaded to the blockchain.

Namecoin Implementation.
The public ledger can be implemented by a blockchain system. One straightforward way to realize a public ledger is the Namecoin blockchain. Namecoin allows registering names and storing related values in the blockchain, which is a securely distributed shared database. It also provides a basic feature to query the database and retrieve the list of existing names and associated data. Thus, we can store credentials, scan them by namespace, and then verify them. We execute the following steps to participate in the Namecoin system and store credentials under a Namecoin id as a pseudonym: (i) We need to install a Namecoin client that holds a full copy of the Namecoin blockchain and keeps it in sync with the P2P network by fetching and validating new blocks from connected peers. We use the Namecoin client implementation [64], which can be controlled via HTTP JSON-RPC, the command line, or a graphical interface. It automatically connects to the Namecoin network and downloads the blockchain. (ii) The Namecoin client also creates the user's wallet, which includes the private key of the user's Namecoin address.
(iii) To save credentials in the blockchain, the user needs to register a namespace "id/name" as the owner of the name by paying a very small fee (currently 0.0064 USD). An id name can be registered using the Namecoin graphical interface or the commands "name_new" and "name_firstupdate." The following description shows how an id name is registered in the Namecoin namespace and how such names can be accessed.
namecoind name_new id/3608a30756b0...
The resulting transaction shows a hashed version of the name, salted with a random value (which is "9f213…" for transaction ID "0e0e0351…").
(1) Cost: Initially, a reasonable transaction fee of either 0.00 or 0.01 NMC is charged; we can choose this fee based on how fast we want the transaction to be processed. (2) Latency: Namecoin and Bitcoin both attempt to generate blocks every 10 minutes; on average, it takes nearly 5 minutes for the data to appear on the blockchain. In practice, it then takes additional time to solidify the transactions and verify the data. For Namecoin, it takes about 2 hours to confirm that the data are uploaded to the blockchain (12 confirmations). That is why name_firstupdate is only accepted after a mandatory waiting period of 12 additional blocks.
Remark 5. Note that these costs and delays occur only once, during the setup and registration phases. They do not affect the authentication phase. Thus, we focus on the computation time of the authentication phase, which is the frequently used part of the authentication system (see Section 5.3).

Ethereum Implementation. (1) Cost: All transactions need some amount of gas to motivate processing. A transaction fee lies between 0 and 0.000424 ETHER, depending on how fast we want the transaction to be approved. (2) Latency: Ethereum creates a new block every few seconds, so the data appear on the blockchain almost instantly. As mentioned in the Ethereum Blog, 10 confirmations suffice to achieve a security level similar to that of 6 confirmations in Bitcoin. It takes around 3 minutes to confirm the transaction/data. Note that these costs and delays occur only once, during the setup and registration phases.

Performance of the Authentication System.
We now examine the performance of our anonymous authentication system. There are two main steps: the registration phase and the authentication phase. Since the time-critical operations of both phases are the same, we concentrate our evaluation on the efficiency of these processes, which include the OPRF, issuing/receiving a credential, and proving knowledge of the signature and pseudonym. To simplify the evaluation criteria of the experiment, we assume a simple policy with a threshold of t = 2 for two agents. The experiments were run on a laptop with an Intel Core i5-6200U CPU at 2.30 GHz, 8.00 GB RAM, and 64-bit Ubuntu, in Java 8, building on the upb.crypto library (available at https://github.com/cryptimeleon) [65]. This library offers elliptic-curve math and several useful building blocks for anonymous credentials, such as Pointcheval-Sanders signatures [47], Pedersen commitments [58], Nguyen's accumulator [66], Shamir secret sharing, generalized Schnorr protocols, proofs of partial knowledge [67], Damgård's technique for concurrently black-box secure Sigma protocols, and the Fiat-Shamir heuristic [56]. Table 2 shows the computational performance of the protocols over 50 iterations. For the issuing and proving protocols, where a credential must satisfy a certain policy, we assume equality of two attributes, with policy StuID = "11111" and GENDER = "male" and a credential certifying only these attributes.

Computational and Communication Complexity.
We analyze the communication and computation complexity of our proposed protocol in terms of the size of each element exchanged in the protocol and the number of exponentiations needed for issuing a credential (executed only once, in the registration phase) and for proving a credential (the most frequently executed phase), respectively. The efficiency analysis is shown in Table 3. r, t, E_{G_1}, and P denote the number of attributes that can be certified, the number of agents that need to be contacted, the cost of an exponentiation in G_1, and the cost of a pairing computation, respectively. By POK{E_{G_1}[n]} (resp. POK{P[n]}), we denote the cost of proving knowledge of n secrets involved in a multi-exponentiation (resp. pairing-product) equation, and Ver(POK) denotes the cost of verifying this proof.

Comparison.
We provide a comparison of DAMFA with some of the most popular SSO schemes in Table 4. We compare the schemes in terms of decentralization (Decent.), passive verification (PV), multi-factor authentication (MF), formal definitions (FD), anonymity (Anony.), and selective disclosure (SD). Decent. denotes the decentralization of the SSO scheme (i.e., the user authentication process no longer depends on a trusted third party); we achieve this by applying a distributed transaction ledger and the blind issuing protocol. PV means that service providers can verify users (who have registered a particular credential) without interacting with an identity provider; we fulfill this property using a distributed transaction ledger and anonymous credentials. Anonymity guarantees that no one can trace or learn information about the user's identity during the authentication process; we fulfill this property by combining NIZK proofs, PS signatures, and pseudonyms. Here, • denotes that it is infeasible for IDPs to track users' sign-on activity at different SPs, and that multiple accounts created from the same credential at different SPs cannot be correlated. Accordingly, ◐ indicates that either IDPs or SPs can correlate different accounts of the same user. FD indicates whether a scheme provides a formal security definition; DAMFA is the only scheme that provides a formal security definition and proof. SD allows disclosing a subset of user attributes and proving statements about them. Finally, to protect the user's private information against offline attacks (OA), we use the TOPRF primitive. Here, ◐ means that the related schemes resist offline attacks only as long as the IDP is not compromised and the user's device, when used as a 2FA token, is not lost, stolen, or corrupted.
• means that resistance to offline attacks holds even in the presence of a corrupted IDP or user device.

Conclusion
In this paper, we proposed DAMFA, a decentralized authentication and key exchange (SSO) scheme based on the TOPRF protocol and standard cryptographic primitives. The proposed scheme builds upon a trustworthy global append-only ledger and does not rely on a trusted server. DAMFA fulfills the following properties: (1) Decentralization means that the process of user authentication no longer depends on a trusted party; to realize such a distributed ledger, we propose using a blockchain system already in real-world use, such as the one underlying the cryptographic currency Bitcoin. (2) Passive verification means that service providers with access to the shared ledger can verify users without interacting with an identity provider. (3) The single sign-on property ensures that a user logs in with a single ID at the identity provider and then gains access to any of several related systems, so users do not need to register with each service provider individually. (4) Anonymity guarantees that no one can trace or learn information about the user's identity during the authentication process. Finally, our evaluation shows that the protocol is efficient and practical for authentication systems.
Moreover, we compared our scheme (DAMFA) with some of the most prominent SSO schemes. For a more detailed performance analysis, we analyzed the communication and computation complexity of our protocol in terms of the size of each element exchanged and the number of exponentiations, respectively. We proved our construction's security via an ideal-real simulation, showing the impossibility of offline dictionary attacks. Finally, we demonstrated that our protocol is efficient and practical through a prototypical implementation, realizing the public ledger with the Ethereum and Namecoin blockchains.

Data Availability
No additional data are available.

Disclosure

This paper is an extended version of the paper entitled "DAMFA: Decentralized Anonymous Multi-Factor Authentication" [23], including complete proofs, formal security models, an Ethereum implementation, a comparison with other SSO schemes, a computation and communication complexity analysis, and improved experimental results.

Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments

This work was supported by the Johannes Kepler Open Access Publishing Fund and has been carried out within the scope of Digidow, the Christian Doppler Laboratory for Private Digital Authentication in the Physical World. It has partially been supported by the LIT Secure and Correct Systems Lab. The authors are grateful for the financial support by the