An Authentication Scheme Based on Novel Construction of Hash Chains for Smart Mobile Devices

With the increasing number of smart mobile devices, applications based on mobile networks play an indispensable role in the Internet of Things. Because mobile devices have limited computing power and restricted storage capacity, a secure and lightweight authentication scheme is needed for them. As a lightweight cryptographic primitive, the hash chain is widely used in various cryptographic protocols and one-time password systems. However, most existing research focuses on overcoming its inherent limitations and deficiencies while ignoring its security issues. We propose a novel construction of the hash chain that consists of multiple different hash functions with different output lengths and employ it in a time-based one-time password (TOTP) system for mobile device authentication. The security of our construction rests on keeping the order of the hash functions confidential, and the security analysis demonstrates that it is more secure than other constructions. Moreover, we discuss the degeneration of our construction and implement the scheme on a mobile device. The simulation experiments show that the attacker cannot increase the probability of guessing the order by eavesdropping on the invalid passwords.


Introduction
The Internet of Things (IoT) is becoming more and more closely connected with people's daily lives, owing to the popularity of mobile devices, which play a central role in the IoT. Users can use applications installed on mobile devices to obtain and transfer sensitive data, so user authentication is a necessary process to ensure security and privacy. For instance, SMS-based (Short Message Service) authentication is widely employed in many applications, such as Gmail. However, in the latest draft of the Digital Authentication Guideline, NIST (National Institute of Standards and Technology) has announced that SMS-based authentication is deprecated and may no longer be allowed in future releases of the guideline [1]. Furthermore, unlike traditional personal computers and laptops, mobile devices have limited energy, computing power, and storage capacity. Thus, it is not practical for authentication schemes to employ expensive cryptographic primitives.
The hash chain used in Lamport's one-time password (OTP) scheme [2] has been an important lightweight cryptographic primitive since it was proposed, and it greatly improves the security of the simple password system. It has also been widely adopted to design key management and authentication mechanisms. The time-based one-time password (TOTP) system based on the hash chain, in which each password is valid only for a fixed time interval, is the most widely applied authentication system. The TOTP system is more secure than SMS-based authentication, and the user can be authenticated implicitly. For example, some multifactor key management and authentication schemes [3][4][5] adopt the hash chain to ensure forward security, because the security of the hash chain is guaranteed by its one-way nature. The hash chain is also employed by some broadcast authentication protocols [6] as one of the building blocks. In many countries, the TOTP system has been chosen as a major security component of Internet banking to authenticate account holders. Thus, the TOTP system can be an alternative to SMS-based authentication [7,8].
The hash chain is a very important tool for designing the TOTP system. However, it still has some limitations and disadvantages that are discussed in the literature. First, one hash chain can only perform one-way authentication for two parties. Mutual authentication implemented by the hash chain requires both parties to store a secret value known as the "seed," which cannot be implemented or may cause serious security issues in many scenarios. Second, because the length of the hash chain is limited, a new hash chain must be generated and both parties have to reregister when the last hash chain has been consumed. Park [9] proposed an endless hash chain composed of many short chains in which each authentication message contains the commitment of the next short hash chain. The studies of [8,9] designed a multilevel hash chain to avoid its exhaustion, and [10][11][12][13][14][15] presented a self-renewal hash chain in which a new hash chain is established when the old one is consumed. Third, the computational burden is high when the sender generates a new one-time password, because multiple hash operations are required. The works in [16][17][18] proposed construction methods to store and recover an intermediate value of a hash chain, and [18,19] further discussed the optimal time-memory tradeoff for sequential hash chain traversal. Hu et al. [10] gave two construction methods enabling faster verification.
Although most existing research has addressed the issues caused by the above limitations and disadvantages, there is little literature on the security of the hash chain itself. Given the complex mobile environment and the open nature of the (wireless) channels connected to mobile devices, Lamport's hash chain is easier to invert than a single hash function if the attacker can eavesdrop on many values of the hash chain. For this reason, Kogan et al. [20] improved the security of Lamport's hash chain by domain separation.
1.1. Our Work. In this paper, we introduce a novel construction of the hash chain and design a TOTP scheme for the authentication of mobile devices. In our scheme, the hash chain is composed of $k$ different hash functions with different output lengths. Compared to Lamport's hash chain, our construction addresses the issue that invalid passwords may help the attacker to invert the hash chain: keeping the order of these hash functions confidential can effectively prevent the attacker from inverting the hash chain. This design raises the question of whether an attacker can easily find the order, which we examine through a simulation. Besides, because of its simple structure, our scheme can be easily adopted by multifactor authentication schemes and is especially suitable for mobile devices.
The remainder of this paper is organized as follows. Section 2 illustrates the overview of hash chains. Our construction of hash chain and a TOTP system is given in Section 3. In Section 4, we analyse the security of our construction. We discuss the degeneration of our construction and run simulation experiments in Section 5. Section 6 concludes this paper.

Overview of Hash Chains
In this section, we briefly review Lamport's and Kogan et al.'s hash chains and their security.
2.1. Lamport's Hash Chain. The hash chain $\{x_i = h(x_{i-1}),\ i = 1, \dots, N\}$ proposed by Lamport is the continuous iteration of a secret value $x_0$ with the same hash function, as shown in Figure 1.
The secret value is generated by the user, and the root of the hash chain $x_N$, which is called the commitment, should be registered on the server in advance. Whenever the server receives a password $x_i$ from the user, the server verifies whether it matches the commitment $x_{i+1}$ by a hash operation, namely, $h(x_i) \stackrel{?}{=} x_{i+1}$. If the verification passes, the server stores $x_i$ as the new commitment for the next authentication.
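The registration and verification steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation evaluated in this paper: SHA-256 is an arbitrary choice for $h$, and the names `build_chain` and `Server` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One chain step; SHA-256 is an illustrative choice for h."""
    return hashlib.sha256(data).digest()


def build_chain(x0: bytes, n: int) -> list:
    """Return [x_0, x_1, ..., x_N] with x_i = h(x_{i-1})."""
    chain = [x0]
    for _ in range(n):
        chain.append(h(chain[-1]))
    return chain


class Server:
    """Stores only the current commitment (initially the root x_N)."""

    def __init__(self, commitment: bytes):
        self.commitment = commitment

    def verify(self, password: bytes) -> bool:
        # Accept x_i only if h(x_i) equals the stored commitment x_{i+1}.
        if h(password) == self.commitment:
            self.commitment = password  # x_i becomes the new commitment
            return True
        return False


chain = build_chain(b"secret seed", 5)
server = Server(chain[-1])
```

The user reveals the values in reverse order ($x_{N-1}$, then $x_{N-2}$, and so on), so a replayed or out-of-order value no longer hashes to the stored commitment and is rejected.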
A classic attack on the hash function is the birthday attack. Lamport's hash chain is generated by iterating the same hash function, so the birthday attack remains effective against the hash chain. Hu et al. [10] discussed the susceptibility of the iterated hash function to the birthday attack. Furthermore, Håstad and Näslund [21] concluded that inverting the $k$-th iteration is $k$ times easier than inverting a single hash function if the same hash function is used in each step of the hash chain. Nevertheless, this research still considers a hash chain based on a single hash function.

Kogan et al.'s Construction.
Kogan et al. [20] proposed a TOTP system called T/Key that is designed as a second-factor authentication scheme and can be used on mobile devices. The hash chain used in T/Key is composed of independent hash functions $h_k$ in every step, as shown in Figure 2.
Here, $t_{init}$ is the initialization time and $id \in \{0,1\}^s$ is a random salt. For a number $t$, $\langle t \rangle_c$ denotes the $c$-bit binary representation of $t$, and $t|_n$ denotes the $n$-bit prefix of $t$. These functions share the same domain so that the construction can withstand length extension attacks. In T/Key, each password is valid for a specific time interval $I$. The user first computes the commitment and sends it along with the random salt $id$ to the server. At a later time $t$, the user sends a password $x_i$ to the server. When the server receives it, the server verifies whether it matches the stored commitment $x_{i+j}$ by $j$ hash operations. If the verification passes, the server stores $x_i$ as the new commitment for the next authentication.
Kogan et al. also demonstrated that inverting this construction is almost as difficult as inverting a single hash function. However, as the length of the hash chain increases, a larger domain of $H$ is required.
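The idea of domain separation can be illustrated with a short Python sketch. The step function below is an assumption made for illustration only: it derives each step function from a single hash $H$ (SHA-256 here) by prefixing a fixed-width step counter and the salt and truncating the output, following the notation $\langle t \rangle_c$ and $t|_n$ above. The counter convention, the widths `C_BYTES` and `N_BYTES`, and the function names are hypothetical and are not taken from [20].

```python
import hashlib

C_BYTES = 4   # assumed width of the counter encoding <t>_c (c = 32 bits)
N_BYTES = 16  # assumed truncated output length (n = 128 bits)


def h_step(step: int, salt: bytes, x: bytes) -> bytes:
    """Hypothetical step function: H(<step>_c || id || x) truncated to n bits."""
    counter = step.to_bytes(C_BYTES, "big")
    return hashlib.sha256(counter + salt + x).digest()[:N_BYTES]


def commitment(x0: bytes, salt: bytes, t_init: int, length: int) -> bytes:
    """Iterate `length` domain-separated steps starting from the secret x0."""
    value = x0
    for i in range(1, length + 1):
        value = h_step(t_init + i, salt, value)
    return value


salt = b"example-salt"
same_input = b"\x00" * N_BYTES
# The same input is mapped to different outputs at different steps.
print(h_step(1, salt, same_input) != h_step(2, salt, same_input))
```

Because the counter is part of the hashed input, a value captured at one step is of no direct use as a query result for another step, which is the property the separation is meant to provide.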

Our Construction
The hash chain used in T/Key has the limitation that the domain must become larger as the length of the hash chain increases, and the independent hash functions are actually generated from the same hash function. Therefore, the basic idea of our construction is that multiple hash functions, e.g., SHA-2 and MD5, are selected to build the hash chain, and these hash functions have different output lengths. The order of these hash functions, $ord$, is confidential. We denote these hash functions as $h_0, \dots, h_{k-1}$ and their output lengths as $n_i$ $(i = 0, \dots, k-1)$. For example, if two hash functions ($k = 2$), SHA-2 and MD5, have been chosen, there are two possible orders, $ord = \{h_0 = \text{SHA-2}, h_1 = \text{MD5}\}$ or $ord = \{h_0 = \text{MD5}, h_1 = \text{SHA-2}\}$. These hash functions are used cyclically in the hash chain: the $j$-th iteration applies $h^j_i$, one of the $k$ hash functions, where $i = (j-1) \bmod k$ is the index of the hash function $h_i$ $(i = 0, \dots, k-1)$ used in the $j$-th iteration, and $m$ is the hash chain length. Since the hash function used in the $j$-th iteration can be inferred from its index, $h^j_i$ is simplified to $h^j$, and $h_{[a,b]}$ $(b > a)$ denotes the successive application of iterations $a$ through $b$. The secret information $x = sk \| ord$ consists of two parts: the secret key $sk$ and the order of these hash functions $ord$.
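The cyclic use of the hash functions can be sketched as follows. This is a minimal illustration under stated assumptions: the table `HASHES`, the function `build_chain`, and the byte encoding of $x = sk \| ord$ are hypothetical, and SHA-256 stands in for SHA-2.

```python
import hashlib

# Candidate hash functions with distinct output lengths (16, 20, 32, 64 bytes).
HASHES = {
    "md5": hashlib.md5,
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "blake2b": hashlib.blake2b,
}


def build_chain(x: bytes, order: list, m: int) -> list:
    """Return the chain values; values[j] is the result of j iterations.

    Iteration j (1-based) uses the hash function order[(j - 1) % k].
    """
    k = len(order)
    values = [x]
    for j in range(1, m + 1):
        name = order[(j - 1) % k]
        values.append(HASHES[name](values[-1]).digest())
    return values


order = ["sha256", "md5"]                        # the secret order ord (k = 2)
x = b"secret-key" + b"|" + ",".join(order).encode()  # assumed encoding of sk || ord
values = build_chain(x, order, 4)
print([len(v) for v in values[1:]])              # output length of each iteration
```

With the order above, the output lengths alternate between 32 and 16 bytes; swapping the two functions changes every value in the chain.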
The hash chain we constructed is used in a time-based one-time password scheme to authenticate the mobile device, where each password is valid for a short time interval. The scheme consists of only two phases so that it can be easily integrated into many multifactor authentication schemes or used as a separate auxiliary authentication method for mobile devices. Since a hash chain can only be used for authentication between two parties, the roles in our scheme are reduced to two. The party requesting authentication is a mobile device $M$ (or a mobile application), and the verifier that checks the password sent by the mobile device or application is a server $S$. Figure 3 shows the design of our proposal.
3.1. Setup. The mobile device $M$ selects $k$ hash functions $h_0, \dots, h_{k-1}$ with different output lengths $n_i$ and then chooses and stores the order $ord$, the hash chain length $m$, and a random secret key $sk$. The mobile device $M$ (or the server $S$) fixes the time interval $I$ (in seconds), which is the validity period of each password, and the initial time of the hash chain $t_{init}$ (measured in units of $I$). The public parameters $(h_0, \dots, h_{k-1}, n_0, \dots, n_{k-1}, I, t_{init}, m)$ can be sent to both parties through an insecure channel. The order of hash functions $ord$ should be sent to the server over a secure channel.
Moreover, $M$ computes the initial commitment and sends it to the server $S$. Then, $S$ stores it as $mes_v$ for the next verification and records $t_{init}$ as $t_v$.
3.2. Authentication. At some time $t > t_{init}$, $M$ wants to access the server. The mobile device $M$ and the server $S$ proceed as follows:
(i) $M$ generates the password $mes_t$ using the secret key $sk$ and the order $ord$ and sends it to the server. In particular, if $t = m$, then $mes_t = x$.
(ii) Upon receiving $mes_t$, $S$ first checks whether the password length $len(mes_t)$ is valid. If $len(mes_t) = n_{(m-t-1) \bmod k}$, $S$ accepts it; otherwise, $S$ refuses it.
(iii) $S$ then verifies the password according to $ord$ by computing $mes_t'$ from $mes_t$.
(iv) If $mes_t' = mes_v$, then $S$ sets $mes_v = mes_t$ and $t_v = t$, and the authentication succeeds; otherwise, the authentication fails.
3.3. Clock Synchronization. In the TOTP system, synchronized clocks are necessary for the authentication process. However, time skew and network delay are unavoidable and may cause authentication failures. If the time skew can be quickly repaired, no additional mechanism is needed, as it only causes one authentication failure when it happens. Otherwise, when the time skew or network delay persists, one solution is to make each password valid for several time intervals (of length $I$) instead of only one. In this case, the server $S$ needs to verify the password against each valid time, and $t_v$ should be updated to the time of the successful verification.
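The setup and authentication phases of Sections 3.1 and 3.2 can be sketched in Python as follows. The sketch rests on assumptions that should be read as one interpretation, not as the authors' implementation: time $t$ is counted in slots of length $I$ since $t_{init}$ (so $t_{init}$ is slot 0), the password $mes_t$ is taken to be the chain value after $m - t$ iterations (which agrees with the length check in step (ii) and with $mes_m = x$), and the server recomputes the iterations from $m - t + 1$ up to $m - t_v$. All names are hypothetical, and the multi-interval window of Section 3.3 is omitted.

```python
import hashlib
import hmac

HASHES = {"md5": hashlib.md5, "sha1": hashlib.sha1,
          "sha256": hashlib.sha256, "blake2b": hashlib.blake2b}


def iterate(value: bytes, order: list, start: int, stop: int) -> bytes:
    """Apply iterations start..stop (inclusive); iteration j uses order[(j-1) % k]."""
    k = len(order)
    for j in range(start, stop + 1):
        value = HASHES[order[(j - 1) % k]](value).digest()
    return value


def password(x: bytes, order: list, m: int, t: int) -> bytes:
    """Assumed mes_t: the chain value after m - t iterations (mes_m = x)."""
    return iterate(x, order, 1, m - t)


class Server:
    def __init__(self, order: list, m: int, commitment: bytes):
        self.order, self.m = order, m
        self.mes_v, self.t_v = commitment, 0  # t_v starts at t_init (slot 0)

    def verify(self, mes_t: bytes, t: int) -> bool:
        if not (self.t_v < t <= self.m):
            return False  # expired, replayed, or beyond the chain
        k = len(self.order)
        j = self.m - t  # number of iterations behind mes_t
        # Step (ii): the length must match n_{(m-t-1) mod k}.
        if j > 0:
            expected = HASHES[self.order[(j - 1) % k]]().digest_size
            if len(mes_t) != expected:
                return False
        # Step (iii): hash forward until the position of the stored commitment.
        candidate = iterate(mes_t, self.order, j + 1, self.m - self.t_v)
        # Step (iv): compare with mes_v and update the verifier state.
        if hmac.compare_digest(candidate, self.mes_v):
            self.mes_v, self.t_v = mes_t, t
            return True
        return False


order = ["sha256", "md5", "sha1"]  # secret order shared with the server
m, x = 10, b"sk|ord"
server = Server(order, m, password(x, order, m, 0))
print(server.verify(password(x, order, m, 3), 3))  # first login at slot 3
```

After a successful login the server only keeps the latest accepted password, so a later login at slot 7 needs four hash operations on the server side, while a replay of the slot 3 password is rejected by the time check.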
Figure 2: The structure of T/Key's hash chain. $x_0 \in \{0,1\}^n$ is a uniformly random secret key.

Wireless Communications and Mobile Computing
In the worst case, where $ord$ is no longer a secret, any $mes_i$ computed by the adversary has already expired, so the adversary cannot pass the verification. Thus, our protocol achieves forward security.

A Lower Bound of Inverting Hash Chain.
Our construction is hard to invert because it is composed of multiple different hash functions with different output lengths. However, it is vulnerable to length extension attacks, especially when some of the hash functions are MD-based. To prevent this kind of attack, the server has to verify the length of the password first after receiving it.
The next crucial question is how difficult it is to invert our construction in our TOTP system. For clarity and completeness, we review some key results before analysing the security of our scheme.

Lamport's Construction.
In Lamport's hash chain, the attacker can utilize invalid passwords that have been authenticated in the past to invert the hash chain. As more and more passwords are collected, the attacker has a greater chance of obtaining a legitimate preimage by oracle queries.
Furthermore, Håstad and Näslund show that, in a hash chain composed of the same hash function, inverting the $k$-th iteration is actually $k$ times easier than inverting a single hash function. The result is restated here for completeness and better understanding.
Theorem 1 (Inverting the original hash chain) (see [21]). Let $A$ be an algorithm that tries to invert a hash function $h : \{0,1\}^n \to \{0,1\}^n$ and makes at most $T$ oracle queries. Given $y = h^{(k)}(x)$ for a randomly chosen $x$, $A$ succeeds with probability at most $O(Tk/2^n)$.
Theorem 2 (Inverting the T/Key's hash chain) (see [20]). Let the functions $h_1, \dots, h_k : \{0,1\}^n \to \{0,1\}^n$ be chosen independently and uniformly at random. Let $A$ be an algorithm that has oracle access to all of the functions $h_1, \dots, h_k$ and makes at most $T$ oracle queries overall. Then the probability that $A$ inverts the chain is bounded as shown in [20].
Proof. We briefly sketch the proof, which prepares for the proofs of the following theorems. Let $P = (p_0, p_1, \dots, p_k)$ be the values of the hash chain, i.e., $p_0 = x$ and $p_i = h_i(p_{i-1})$ for $i = 1, \dots, k$. The algorithm $A$ also needs to maintain a list $L = \{(i, x, y) : h_i(x) = y\}$, which records the oracle queries $x$ and their answers $y$. For any query $(i, x)$, if $x = p_{i-1}$, $A$ responds with $y = p_i$ and adds $(i, p_{i-1}, p_i)$ to the list. Else if $(i, x) \in L$, $A$ replies with the recorded $y$. Otherwise, $A$ chooses $y \in \{0,1\}^n$ uniformly at random. Thus, to invert the hash chain, at least one query result must collide with $P$, and the probability of such a collision bounds the success probability of $A$.
Figure 3: The design of our proposal. $x = sk \| ord$ is the secret information and $t_{end}$ is the moment at which the hash chain is consumed.
Theorem 2 demonstrates that inverting a hash chain built from independent hash functions results in a loss of security by a factor of 2, compared with a factor of $O(k)$ if the same hash function is used in the entire chain. Inverting this construction is therefore about as hard as inverting a single hash function. Moreover, in the proof, algorithm $A$ is given oracle access to all the functions $h_1, \dots, h_k$. In fact, these functions have the same domain as the function $H$, which means that algorithm $A$ actually only gets oracle access to one hash function $H$.
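To make the gap between the two bounds concrete, consider purely illustrative values $T = 2^{40}$, $k = 2^{20}$, and $n = 128$. These numbers are assumptions chosen for the example and are not taken from the analysis; the constants hidden in the $O(\cdot)$ notation are ignored, and the factor of 2 is the security loss stated above for independent hash functions.

```latex
\[
\underbrace{\frac{Tk}{2^{n}}}_{\text{same hash function}}
  = \frac{2^{40}\cdot 2^{20}}{2^{128}} = 2^{-68},
\qquad
\underbrace{\frac{2T}{2^{n}}}_{\text{independent hash functions}}
  = \frac{2\cdot 2^{40}}{2^{128}} = 2^{-87}.
\]
```

Under these assumptions, the bound for the chain with independent functions is smaller by a factor of $k/2 = 2^{19}$.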

Our Construction.
In our proposal, the hash chain is composed of $k$ different hash functions, and it is hard for the attacker to invert the chain because the order of these functions is secret. For an attacker who does not know $ord$, the entire hash chain can be regarded as consisting of multiple independent hash functions.
When the attacker wants to invert the chain, he has two choices. The attacker either "guesses" a hash function that may be the right one and queries it (see Theorem 3) or spreads the $T$ oracle queries evenly over all hash functions (see Theorem 4). Both choices are analysed.

Theorem 3. Let the functions $h_0, \dots, h_{k-1} : \{0,1\}^* \to \{0,1\}^{n_i}$ be different hash functions with distinct output lengths $n_i$. Let $A$ be an algorithm that has oracle access to one chosen hash function and makes at most $T$ oracle queries overall. Then the success probability of $A$ is bounded by combining the probability of choosing the right function with the bound of Theorem 2.
Proof. We say the attacker successfully inverts the hash chain when he finds a preimage that meets the length requirement. Let $A$ be an algorithm as in the statement of Theorem 2; we use it to construct another algorithm $A_1$ for finding the preimage of the last iteration. The first step of $A_1$ is to select a hash function $H$ as its guess for the function used in $h^{m-1}$. Then, algorithm $A_1$ runs algorithm $A$ with oracle access to the hash function $H$, making at most $T$ queries, and inputs the result to $h^m$. The bound follows by combining the probability of a correct guess with Theorem 2.
Proof (Theorem 4). Let $A$ be an algorithm as in the statement of Theorem 2. We use it to construct another algorithm $A_2$ which finds the preimage of the last iteration of the hash chain. The algorithm $A_2$ first runs algorithm $A$, which makes at most $T$ queries in total to all $k$ hash functions, that is, $T/k$ oracle queries for each function. A legal preimage can only be obtained from the results returned by the oracle for $h^{m-1}$, to which algorithm $A$ makes $T/k$ oracle queries. The bound therefore follows from Theorem 3 with $T/k$ queries.
The above two theorems establish the difficulty of finding a preimage of the last iteration of the hash chain in the authentication scheme. The attacker's success probability against a hash chain composed of $k$ different hash functions with different output lengths is thus further reduced to $O(T/(k \cdot 2^n))$, where $n$ is the average output length of these functions.

Formal Security Analysis.
In the following, we show that our protocol is provably secure in the random oracle model, since the hash function behaves closely like a random oracle [22][23][24]. Before defining the security game, we first present a formal description of the proposed protocol, which is a tuple $P = (s, k, p, v)$.
In the setup phase, the polynomial-time algorithm $s(k, m) \to (n_i)$ takes as input the number of hash functions $k$ and the length of the hash chain $m$ and outputs the password lengths $n_i$. Next, the algorithm $k(n_i, m) \to (sm, vs)$ takes as input the password lengths $n_i$ and the hash chain length $m$ and outputs the secret message $sm$ and the verifier state $vs$. In the authentication phase, the prover $p(sm, t) \to pwd$ takes as input the secret message $sm$ and a time $t$ and outputs a one-time password $pwd$, while the verifier $v(vs, pwd, t) \to (\text{ACCEPT}/\text{REJECT}, vs')$ takes as input the verifier state $vs$, a one-time password $pwd$, and a time $t$ and outputs whether the password is accepted or rejected together with a new verifier state $vs'$. We are now ready to define the attack game.
Attack Game 5. Let $P$ be a one-time password protocol and let $R$ be a random oracle. Given a challenger $C$ and an adversary $A$, the attack game proceeds as follows:
(i) Setup. The challenger generates the password lengths $(n_i) \leftarrow s_C(k, m)$.
(ii) KeyGen. The challenger generates $(sm, vs) \leftarrow k_C(n_i, m)$ using the random oracle.
(iii) Password Query. The adversary sends a time $t$ to the challenger. The challenger generates a password $pwd \leftarrow p_C(sm, t)$, which is fed to the verifier $(\text{ACCEPT}/\text{REJECT}, vs') \leftarrow v_C(vs, pwd, t)$.
(iv) Test. The adversary submits a password $(t_A, pwd_A)$.
The attacker wins the game if the verifier $v_C(vs, pwd_A, t_A)$ outputs ACCEPT. We denote by $\mathrm{Adv}^{\mathrm{Test}}_P(A)$ the probability that $A$ successfully impersonates a legal user in the execution of protocol $P$.
Theorem 6. Let $P$ be the proposed protocol in Section 3. Let $A$ be a probabilistic polynomial-time adversary attacking the protocol that makes at most $T$ oracle queries, where $m$ is the hash chain length and $k$ is the number of hash functions. Then $\mathrm{Adv}^{\mathrm{Test}}_P(A)$ is bounded by the probability of inverting the last iteration of the hash chain.
Proof. Let $A$ be an algorithm as stated above. Using algorithm $A$, we construct an algorithm $A_3$ that makes oracle queries for the last iteration of the hash chain, so that $\mathrm{Adv}^{\mathrm{Test}}_P(A)$ is bounded by the success probability of $A_3$.

Discussions and Experiment
5.1. Degeneration. The hash chain we construct consists of $k$ different hash functions with different output lengths, which can be freely selected by the mobile device or the server. This freedom of choice leads to variations of our construction that need to be discussed.
First and foremost, as shown in the security analysis, the length of the hash chain is no longer a key factor affecting the difficulty of inverting the hash chain; the number of hash functions selected is. The larger the value of $k$, the more difficult it is to invert the hash chain. Furthermore, when there is only one hash function ($k = 1$) in the hash chain, our construction degenerates to Lamport's hash chain.
Second, the number of hash functions with different output lengths is limited. So, what happens if several functions have the same output length, or the same hash function is used twice or more in the hash chain? In general, it would be easier for an attacker to forge a password of this length because there is a greater probability of "guessing" the preimage correctly. If all the hash functions used in the hash chain are different but have the same output length, the construction degenerates into a special instance of T/Key's construction. The chain is then equivalent to concatenating T/Key's construction multiple times, and the two are similarly difficult to invert.
Last but most importantly, the security of our construction is guaranteed by the confidentiality of the order of these hash functions $ord$. If $ord$ is leaked, the attacker's success probability rises to $O(T/2^n)$, which is the same as for inverting a single hash function, because the invalid passwords provide only limited help to the attacker. In theory, the probability that an attacker finds the order without any prior knowledge is $1/k!$. In practice, however, the mobile device (or application) authenticates to the server according to a Poisson process. In the next subsection, we show through simulation how difficult it is for an attacker to guess $ord$ if the attacker can eavesdrop on the authentication messages.
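The size of the order space can be checked directly. The following Python sketch enumerates all orders for a hypothetical choice of $k = 4$ hash function names and computes the blind-guess probability $1/k!$.

```python
import itertools
import math

hash_names = ["md5", "sha1", "sha256", "blake2b"]  # example choice, k = 4
orders = list(itertools.permutations(hash_names))  # every possible ord
blind_guess = 1 / len(orders)                      # probability 1/k!

print(len(orders), round(blind_guess, 4))  # prints: 24 0.0417
```

Each additional hash function multiplies the number of candidate orders by the new value of $k$, so the blind-guess probability shrinks factorially.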

Experimental Evaluation.
As analysed above, in our scheme, the order of hash functions $ord$ is one part of the secret information $x$. Once it is leaked, the security of our mechanism is greatly affected. In this section, we run a simulation to estimate how difficult it is for the attacker to find $ord$ and then implement our construction on a real smart mobile device.
Before starting the simulation, two crucial issues have to be discussed: the login pattern of the mobile device and the attacker's behaviour.
In the real world, it is reasonable to assume that the login behaviour of a mobile device (or application) follows a Poisson process, which means that the time interval between consecutive logins can be modelled using the exponential distribution. Thus, the probability density function of the time $t$ until the next authentication of the mobile device is $f(t) = \lambda e^{-\lambda t}$, where $\lambda$ is the average login rate. Then, we need to know how the attacker guesses $ord$, which has $k!$ possible permutations $ord_i$, $i = 1, 2, \dots, k!$. The authentication messages intercepted by the attacker can help him/her to guess $ord$. Since the lengths of these authentication messages are totally random, it is hard for an attacker to guess $ord$ based on the context between them. However, for $ord = h_0, \dots, h_{k-1}$, in the sequence held by the attacker, $h_j$ only appears after $h_i$ $(j < i)$ before $h_0$ appears. Thus, the attacker can count the number of times each $ord_i$ appears in the sequence, and we define the probability that the attacker finds the real $ord$ from these counts.
Figure 4 shows how the probability that the attacker guesses $ord$ changes over time (per month). If the attacker can eavesdrop on every authentication message, the probability of the attacker's success decreases as the number of chain values held by the attacker increases. This is because, when the attacker has fewer chain values, the higher probability of the correct $ord$ appearing makes the attacker more likely to succeed.
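The login and eavesdropping model described above can be sketched in Python. The sketch only generates what the eavesdropper sees; the attacker's counting rule is not reproduced here. It assumes that time is measured in slots of length $I$ since $t_{init}$ and that the password sent in slot $t$ is the output of the hash function with index $(m - t - 1) \bmod k$, as in the length check of Section 3.2. The function name, its parameters, and the fixed seed are hypothetical.

```python
import random


def observed_hash_indices(rate, interval, m, k, p, seed=0):
    """Simulate logins as a Poisson process and return, for every captured
    password, the index of the hash function that produced it.

    rate: average logins per second (lambda); interval: slot length I in
    seconds; m: chain length; k: number of hash functions; p: probability
    that a given login is overheard.
    """
    rng = random.Random(seed)
    now = 0.0
    captured = []
    while True:
        now += rng.expovariate(rate)  # exponential time to the next login
        slot = int(now // interval)   # time t in units of I
        if slot >= m:                 # the chain has been consumed
            break
        if slot >= 1 and rng.random() < p:
            captured.append((m - slot - 1) % k)
    return captured


# One login per day on average, 30 s slots, a one-year chain, k = 4.
full = observed_hash_indices(1 / 86400, 30, 1_051_200, 4, p=1.0)
half = observed_hash_indices(1 / 86400, 30, 1_051_200, 4, p=0.5)
print(len(full), len(half))
```

With one login per day on average, roughly 365 passwords are sent over the lifetime of the chain, and an eavesdropper with $p = 0.5$ captures about half of them.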
The conclusion still holds when we change the eavesdropping probability of the attacker. Figure 5 compares the probability that the attacker finds $ord$ for the eavesdropping probabilities $p = 0.2$, $0.5$, and $0.8$. When $p \le 0.5$, the probability peaks at the third and fifth months, respectively. Meanwhile, owing to the increasing number of hash chain values captured by the attacker, the probability of the attacker's success still decreases. According to the law of large numbers, the probability that the attacker observes the correct $ord$ approaches the theoretical value when the attacker has a large number of hash chain values.
Theoretically, the probability that an attacker finds the correct $ord$ is $1/4! = 1/24 \approx 0.0417$. In our simulation, even if the attacker can intercept every authentication message, the probability does not exceed 0.0400. Therefore, eavesdropping on the channel does not increase the attacker's probability of finding the correct $ord$.
The experiment environment is set up as follows: The smart device is a HUAWEI Mate 30 smartphone with 2.86 GHz CPU and 6 GB RAM running Android 10, and the server is executed on a 2-core CPU and 1024 MB memory running Ubuntu 18.04.
We use 4 hash functions to instantiate our scheme discussed in Section 3, namely MD5, SHA-1, SHA-2, and BLAKE2b, whose output lengths are 128 bits, 160 bits, 256 bits, and 512 bits, respectively. Each password is valid for a time interval $I$ of 30 seconds. We generate a hash chain of length $1.05 \times 10^6$, which is valid for one year. We assume that the mobile device logs in once a day on average ($\lambda = 1/86400$). Furthermore, an attacker eavesdrops on each authentication message with a certain probability $p$. The parameters used in the simulation experiments are shown in Table 1.
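The stated parameters can be cross-checked with a few lines of Python. The sketch assumes a 365-day year and takes SHA-256 as the 256-bit member of the SHA-2 family; the variable names are hypothetical.

```python
import hashlib

SECONDS_PER_YEAR = 365 * 24 * 3600             # assumes a 365-day year
INTERVAL = 30                                  # I, in seconds
chain_length = SECONDS_PER_YEAR // INTERVAL    # one password slot per interval
expected_logins = SECONDS_PER_YEAR / 86400     # lambda = 1/86400 logins per second

# Output lengths in bits, read from hashlib's digest sizes.
output_bits = {name: hashlib.new(name).digest_size * 8
               for name in ("md5", "sha1", "sha256", "blake2b")}

print(chain_length, expected_logins, output_bits)
```

The chain length evaluates to 1,051,200 slots, which matches the stated $1.05 \times 10^6$, and the login rate corresponds to 365 expected logins over the lifetime of the chain.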
As a comparison, we also implement Lamport's hash chain with 4 kinds of hash functions, respectively. We evaluate the following time: mobile device setup time, average password generation time (mobile device), and average verification time (server). Table 2 shows the results.
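A timing comparison of this kind can be sketched as follows. This is not the measurement code behind Table 2: it times plain single-function chains on whatever machine runs it, and the chain length of 10,000 and the function name `time_chain` are hypothetical choices for a quick check.

```python
import hashlib
import time


def time_chain(hash_name: str, length: int) -> float:
    """Seconds needed to iterate one hash function `length` times."""
    value = b"seed"
    start = time.perf_counter()
    for _ in range(length):
        value = hashlib.new(hash_name, value).digest()
    return time.perf_counter() - start


timings = {name: time_chain(name, 10_000)
           for name in ("md5", "sha1", "sha256", "blake2b")}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms")
```

Absolute numbers depend heavily on the device and the hash library, so only the relative ordering of the functions on one machine is meaningful.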
As shown in Table 2, the MD5 and SHA-1 hash functions have better performance, but they are not a good choice for constructing Lamport's hash chain because neither of them is secure enough [20,21]. Our solution has the best computational performance on a mobile device while ensuring security. The server takes only several milliseconds to verify a password, which is acceptable in general. More experimental results can be obtained through online resources (https://github.com/qinglong-huang/hash_chains_experiments).
Both the theoretical analysis in Section 4 and the simulation experiments that we performed demonstrate that the hash chain proposed in this paper is harder to invert. Therefore, our scheme performs better in terms of both computation and security.

Conclusion
In this paper, a novel construction of the hash chain was presented, and a TOTP system based on this construction was designed for mobile device authentication. This system can be easily employed by lightweight authentication schemes to ensure forward security or be applied as a second-factor authentication method replacing SMS-based authentication for mobile devices. We gave a formal security analysis of the difficulty of inverting the hash chain and demonstrated that the probability of an attacker inverting the hash chain with $T$ queries is at most $O(T/(k \cdot 2^n))$. Besides, we discussed several situations that may reduce its security when the selection of hash functions changes. Finally, we implemented the scheme on a smartphone, and the simulation results showed that even if an attacker can eavesdrop on every password, the probability that she/he uses these invalid passwords to guess $ord$ successfully is not higher than the theoretical value. Therefore, our scheme meets the higher security requirements of mobile device authentication.

Data Availability
Data are available on request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.