An Encryption Technique for Provably Secure Transmission from a High Performance Computing Entity to a Tiny One



Introduction
It is well recognized that communications should be secure, and accordingly encrypted, in order to prevent misuse of the transmitted information. Consequently, contemporary encryption algorithms play a very important role in data communication systems across many application areas. A particular challenge is posed by resource-constrained environments, where the requirements include lightweight algorithms and hardware designs. To select a suitable encryption algorithm for an application or an environment, both the algorithmic requirements and the implementation constraints have to be taken into account. This is also in line with the discussion recently reported in [1].
On the other hand, in a number of scenarios the communicating parties have very different capabilities: one party may have tiny capabilities and the other much higher ones. As an illustration, we point to a communication scenario in the Internet of Things (IoT) where a tiny machine (e.g., a tiny sensor) should communicate with a more powerful one (e.g., the sink of a sensor network or a gate). According to the current state of the art, the following two problems are still open: (i) developing encryption/decryption techniques which take into account the asymmetric capabilities of the entities involved in encryption/decryption and (ii) enhancing the cryptographic security of encryption in a lightweight and provable manner.
Consequently, in this paper we consider the problem of designing a dedicated encryption/decryption algorithm for communication scenarios with the following features: (i) a high performance computing party should deliver encrypted messages in a one-way communication scenario to a number of parties which have tiny computational capabilities; (ii) implementation limitations at the tiny entity imply the use of a lightweight keystream generator (from among the reported lightweight stream ciphers); (iii) the developed encryption scheme should have enhanced security in comparison with that offered by the employed keystream generator.
Mathematical Problems in Engineering

A number of reported encryption approaches jointly employ elements of traditional stream ciphers and elements of coding theory, as well as features of certain communication channels (see, e.g., [2–8]), and this paper follows the same track. We consider an encryption approach which involves a communication channel with synchronization errors that appear in the form of inserted bits. In this approach, the transmitting/encrypting side requires a source of random bits and the capability to insert them between message bits. Under the assumption that the transmitter has a method to inform the intended receiver about the locations (and not necessarily the values) of the inserted random bits, the intended receiver can perform decimation (i.e., discard the inserted bits) of the received sequence so that it can then be subjected to simple traditional decryption.
Summary of the Results. This paper focuses on the following two issues which have not been addressed in the literature: (i) development of an encryption/decryption technique which has asymmetric implementation complexity and provides lightweight decryption and (ii) security enhancement of the involved keystream generator by employing the paradigm of binary channels with random insertions. An encryption/decryption technique for data transfer between a computationally powerful party and a party with limited computational capabilities is proposed. It provides a tradeoff between the implementation complexities at the involved parties: the implementation overhead is reduced at the low-capability party at the expense of a higher (but still moderate) one at the party with high capabilities. In order to enhance the security of the employed traditional keystream generator, the proposed encryption technique involves, at the transmitting side, a simulator of the binary channel with synchronization errors. The security enhancement achieved by the proposed scheme, in comparison with the security of the employed keystream generator, is based on the design paradigm and on results on the mutual information between the inputs and outputs of channels with bit insertion.
Organization. The paper is organized as follows. In Section 2, we give the underlying ideas for the design and propose an encryption/decryption framework. In Section 3, we provide some information-theoretic results for the proposed scheme; that is, we mostly derive various mutual information rates of interest for the security evaluation. In Section 4, we provide the cryptographic security evaluation based on implications which link the information-theoretic quantities to the ones based on computational complexity. Accordingly, Sections 5 and 6 evaluate the computational-complexity security enhancement by employing, respectively, numerical estimation of the mutual information and enumeration of the input candidates for a given output of a binary channel with insertion of random bits. (Also note that this paper is a significantly revised and expanded version of [8].)

A Proposal of a Dedicated Encryption Technique
This section proposes an encryption/decryption technique which provides asymmetric implementation complexity at the communicating parties and provably enhanced cryptographic security. Both the asymmetric implementation complexity and the enhanced security are consequences of a design based on a simulator for binary channels with insertion errors.
2.1. Underlying Ideas. Our main design goals/approaches can be summarized as follows:
(i) Enhance security based on information-theoretic and coding results over channels with synchronization errors.
(ii) Assuming that Party I is more powerful than Party II, move the more complex operations to the side of Party I without implications for the cryptographic security.
This paper proposes a stream cipher based on the following two construction principles: (i) adjustment of the construction to the asymmetric capabilities of the involved parties; (ii) employment of results regarding binary channels with insertion errors for enhancing security. The goals are that the party with more powerful resources performs the more complex operations and that the entire scheme provides a high and provable level of cryptographic security resulting from the insertion channel paradigm.
Our design is based on the following building blocks: (i) a lightweight binary keystream generator; (ii) a block for insertion (embedding) of $m$ random bits into a given $n$-dimensional binary vector; (iii) a block for decimation of a given $(n+m)$-dimensional binary vector which selects certain $n$ bits.
Accordingly, we assume that the employed keystream generator outputs certain pseudo-random sequences denoted as $C^n$ and ${G'}^n$. Also, we assume that a deterministic mapping exists which maps a given ${G'}^n$ into $G^n$. We assume that the message $M^n$ is additively combined (i.e., encrypted) with the shared pseudo-randomness $C^n$ to obtain $X^n$, that is,
$$X^n = M^n \oplus C^n,$$
and that $X^n$ is subject to further mapping by a simulated binary channel with random insertions, where the positions of the embedded random bits are specified by $G^n$, so that the channel outputs $Y^{(n)}$. The intended receiver (Bob), knowing both $C^n$ and $G^n$, can easily decimate $Y^{(n)}$ to obtain $X^n$ and then compute $M^n = X^n \oplus C^n$ to obtain the message $M^n$. Since Bob can easily recover the transmitted message using a simple decimation technique, the system requires no special hardware overhead for decryption. This is especially useful if the intended receiver is a low-power device. On the transmitter's side, encryption requires simulation of a binary channel with insertion errors, and the transmitter needs to send $(1-i)^{-1}$ times more symbols than it otherwise would, where $i$ denotes the insertion probability; this means that the power consumption of the transmitter goes up by a factor of $(1-i)^{-1}$. Hence, it may be reasonable to use this scheme when the transmitter is a high computation/power device and the receiver is a low computation/power device. In essence, a properly adjusted synchronization error scheme (an insertion scheme) seems to be well suited for a resource-asymmetric communication scenario in which a base station has ample resources while each of the numerous distributed nodes has severely constrained resources.

2.2. Framework for Encryption and Decryption. This section proposes an encryption/decryption technique for one-way communication from a transmitting party with high computational and other resources towards a receiving party with limited computational capabilities. Accordingly, the design follows the asymmetric implementation and execution constraints and the requirement regarding provable security.
As usual, it is assumed that the encrypting and decrypting parties share a secret key and that, before a transmission session, both parties establish a session key from the common secret key and the public data, to be used for that session.
The encryption/decryption technique is designed employing the following components:
(a) Encryption side: (i) a lightweight stream cipher (keystream generator); (ii) a block which provides a deterministic mapping (see Figure 1) of a given keystream segment of dimension $n+m$ into a vector with predetermined weight equal to $m$, that is, with the number of ones equal to $m$, which determines the positions of the embedded bits; (iii) a simulator of a binary channel with random bit insertions, controlled by the keystream generator, which performs the mapping $\{0,1\}^n \to \{0,1\}^{n+m}$.
(b) Decryption side: (i) a lightweight stream cipher (keystream generator); (ii) a block for deterministic mapping of a given keystream segment into a vector with predetermined weight, that is, the number of ones, the same as that at the encryption side; (iii) a block for decimation, controlled by the keystream generator, which performs the mapping $\{0,1\}^{n+m} \to \{0,1\}^n$.
We assume that implementation and execution complexity of a keystream controlled simulator of a binary channel with random insertions is highly dominant in the considered encryption/decryption scheme.
Assuming that $n$ and $m$ are the parameters, the following notation is employed for specification of the proposed encryption/decryption: (i) $\mathbf{M}$ is an $n$-dimensional binary vector of data which should be encrypted; (ii) $\mathbf{C}$ is an $n$-dimensional binary vector of keystream for stream ciphering; (iii) $\mathbf{G}'$ is an $(n+m)$-dimensional binary vector of keystream nonoverlapping with $\mathbf{C}$; (iv) $\mathbf{G}$ is an $(n+m)$-dimensional binary vector of weight exactly $m$ obtained by a deterministic mapping of $\mathbf{G}'$; (v) $\mathbf{X}$ is an $n$-dimensional binary vector defined as $\mathbf{X} = \mathbf{M} \oplus \mathbf{C}$; (vi) $\mathbf{Y}$ is an $(n+m)$-dimensional binary vector which is equal to $\mathbf{X}$ with $m$ inserted random bits.
The proposed encryption/decryption is displayed in Figure 1.
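The data flow of this framework can be illustrated with a minimal sketch. It rests on assumptions that are not part of the scheme as specified: SHAKE-128 stands in for the lightweight keystream generator, and the function `weight_m_mask` is one hypothetical choice of the deterministic mapping of $\mathbf{G}'$ into a weight-$m$ vector $\mathbf{G}$ (the text does not fix this mapping, and the choice below is not claimed to have good statistical properties).

```python
import hashlib
import secrets


def keystream_bits(key, nonce, nbits):
    """Stand-in keystream generator (assumption): SHAKE-128 used as a bit source."""
    stream = hashlib.shake_128(key + nonce).digest((nbits + 7) // 8)
    return [(stream[k // 8] >> (k % 8)) & 1 for k in range(nbits)]


def weight_m_mask(g_prime, m):
    """Hypothetical deterministic mapping G' -> G with exactly m ones.

    Positions of ones in G' are taken first, then positions of zeros,
    and the first m of them become the insertion positions.
    """
    ones = [k for k, bit in enumerate(g_prime) if bit == 1]
    zeros = [k for k, bit in enumerate(g_prime) if bit == 0]
    mask = [0] * len(g_prime)
    for k in (ones + zeros)[:m]:
        mask[k] = 1
    return mask


def encrypt(message, key, nonce, m):
    """Map n message bits to n + m ciphertext bits (X = M xor C, then insertion)."""
    n = len(message)
    ks = keystream_bits(key, nonce, 2 * n + m)
    c, g_prime = ks[:n], ks[n:]          # C and G' do not overlap
    x = iter(mb ^ cb for mb, cb in zip(message, c))
    g = weight_m_mask(g_prime, m)
    # A one in G marks an embedded random bit; a zero carries the next bit of X.
    return [secrets.randbits(1) if gb else next(x) for gb in g]


def decrypt(ciphertext, key, nonce, m):
    """Decimate the n + m received bits, then remove the keystream C."""
    n = len(ciphertext) - m
    ks = keystream_bits(key, nonce, 2 * n + m)
    c, g_prime = ks[:n], ks[n:]
    g = weight_m_mask(g_prime, m)
    x = [yb for yb, gb in zip(ciphertext, g) if gb == 0]
    return [xb ^ cb for xb, cb in zip(x, c)]
```

Note that only `encrypt` needs a source of true random bits; `decrypt` consists of keystream generation, decimation, and an XOR, which reflects the intended asymmetry between the two sides.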

Information-Theoretic Analysis
This section presents an information-theoretic analysis of a (statistical) model of the considered encryption displayed in Figure 1.
A random variable is denoted by an uppercase letter (e.g., $X$) and its realization is denoted by a lowercase letter (e.g., $x$). An index (subscript) denotes discrete time. A discrete-time sequence of $n$ random variables, for example, $X_1, X_2, \ldots, X_n$, is shortly denoted by $X^n = (X_1, X_2, \ldots, X_n)$. Since our channel has synchronization errors, we need to distinguish strings from sequences. We denote a random string (indexed by discrete time $t$) as $Y(t)$. The string $Y(t)$ may not have a fixed length, and we denote its length (which is a random variable if the string itself is a random variable) as $L(Y(t))$. A concatenation of two strings $A$ and $B$ is denoted by $A \| B$. As short notation, we denote the concatenation of $n$ strings $Y(1)$ through $Y(n)$ as $Y^{(n)} = Y(1) \| Y(2) \| \cdots \| Y(n)$. The entropy of a random object $X$ is denoted by $H(X)$, and the mutual information between two random objects $X$ and $Y$ is denoted by $I(X;Y)$. The binary entropy function is denoted by $h(p) = -p \log_2 p - (1-p) \log_2 (1-p)$.
Let the channel input $X_t$ be a binary random variable drawn from the alphabet $\mathcal{X} = \{0, 1\}$. The vector of all channel inputs up to time $n$ is denoted by $X^n \triangleq (X_1, X_2, \ldots, X_n)$. The transmitter (Alice) observes the pseudo-random sequence $G^n \triangleq (G_1, G_2, \ldots, G_n)$ provided by a shared source of randomness (shared with Bob) and uses it to create a channel output (ciphertext) $Y^{(n)}$. Even though $G^n$ is a pseudorandom sequence, we assume that the variables $G_t$ are statistically indistinguishable from independent and identically distributed (iid) geometric random variables with parameter $i$; that is, for any integer $\ell \ge 0$, we have
$$\Pr(G_t = \ell) = (1 - i)\, i^{\ell}.$$
Here, the parameter $i$ denotes the insertion probability. Namely, between any two symbols $X_t$ and $X_{t+1}$, Alice inserts a string $B(t)$ that consists of Bernoulli-1/2 random variables, such that the length of $B(t)$ equals $L(B(t)) = G_t$. Since $G^n$ is a sequence of iid geometric random variables with parameter $i$, it is clear that Alice's transmission scheme is equivalent to randomly inserting a Bernoulli-1/2 random variable at any point of time during the communication. Formally, we state that Alice creates a string $Y^{(n)}$ obtained as a concatenation of individual strings $Y(1), Y(2), \ldots, Y(n)$, that is,
$$Y^{(n)} = Y(1) \| Y(2) \| \cdots \| Y(n),$$
where each individual string $Y(t)$ is obtained as
$$Y(t) = X_t \| B(t).$$
The length of the string $Y^{(n)}$ equals
$$L(Y^{(n)}) = n + \sum_{t=1}^{n} G_t;$$
that is, on average, Alice inserts $i/(1-i)$ Bernoulli-1/2 random variables between any two symbols $X_t$ and $X_{t+1}$.
Eve (the eavesdropper) and Bob (the intended receiver) both receive the string $Y^{(n)}$ containing the randomly inserted symbols. The eavesdropper, not having access to the shared source of randomness $G^n$, cannot easily parse the string $Y^{(n)}$ to recover $X^n$. The intended receiver, on the other hand, has access to $G^n$, and since $G_t$ represents the length of the inserted string between any two symbols $X_t$ and $X_{t+1}$, the intended receiver (Bob) can easily remove the inserted strings $B(t)$ from $Y^{(n)}$ (i.e., decimate $Y^{(n)}$) to recover $X^n$. In other words, by sharing the source of randomness $G^n$, Bob can resynchronize himself with Alice; see Figure 1.
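The channel model above can be simulated in a few lines. The following is a minimal sketch under the stated statistical model: the lengths $G_t$ are drawn as true iid geometric variables with $\Pr(G_t = \ell) = (1-i)\,i^{\ell}$ rather than from a keystream generator, and Python's `random` module is used only as a convenient stand-in source of randomness (all function names are illustrative).

```python
import random


def geometric(rng, i):
    """Sample G with Pr(G = l) = (1 - i) * i**l for l = 0, 1, 2, ..."""
    count = 0
    while rng.random() < i:
        count += 1
    return count


def insertion_channel(x, g, rng):
    """Return Y = X_1 || B(1) || ... || X_n || B(n), where B(t) has g[t] random bits."""
    y = []
    for xt, gt in zip(x, g):
        y.append(xt)
        y.extend(rng.getrandbits(1) for _ in range(gt))
    return y


def decimate(y, g):
    """Bob's side: knowing the lengths g, skip every inserted string B(t)."""
    x, position = [], 0
    for gt in g:
        x.append(y[position])
        position += 1 + gt
    return x
```

For a long input, the ratio of output length to input length should be close to $(1-i)^{-1}$, which is the transmission overhead paid by the powerful party; decimation itself is a single pass over the received string.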
The sequence $C^n$ is a pseudo-random sequence, but for the purpose of computing information-theoretic quantities, we assume that $C^n$ is modeled to be statistically indistinguishable from a sequence of iid Bernoulli-1/2 random variables. (It should not be understood that $C^n$ implements a one-time pad. The variables $C_t$ are only statistically modeled as Bernoulli-1/2 for the purposes of deriving (and computing) some information-theoretic quantities that we later use to derive a cryptographic security measure.) Here, no assumptions are made on the statistical properties of the message $M^n$, but because $C^n$ is iid Bernoulli-1/2, we have that $X^n$ is also iid Bernoulli-1/2. Hence, the information-theoretic quantity of interest is the iud information rate, defined as the information rate between $X^n$ and $Y^{(n)}$ when the symbols $X_t$ are independent and uniformly distributed (iud):
$$\mathcal{I}_{\mathrm{iud}}(X;Y) \triangleq \lim_{n \to \infty} \frac{1}{n} I\big(X^n; Y^{(n)}\big).$$
The information rate $\mathcal{I}_{\mathrm{iud}}(X;Y)$ represents the amount of information that the eavesdropper can "learn," on average, about $X$ after observing $Y$.
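For large $n$, the correction term $(1/n) H(L(Y^{(n)}))$ in (7) is given by (9). If our desired accuracy of computing (bounding) $\mathcal{I}_{\mathrm{iud}}(X;Y)$ is $10^{-4}$ and if $i = 0.95$, considerations of (7)-(9) dictate that $n \ge 1.5 \cdot 10^5$. For details on how to compute $\mathcal{I}_{\mathrm{iud}}(X;Y)$ using "rhomboidal" trellis techniques such that both the correction term (9) and the confidence interval are kept under a predetermined accuracy (e.g., $10^{-4}$), see [10]. Here, we only give numerical results in Figure 2, which reveal that the information rate $\mathcal{I}_{\mathrm{iud}}(X;Y)$ is only a small fraction of the entropy rate $H(X_t) = 1$, especially when $i > 0.5$. These results are very favorable for secret communication because only a small fraction of the uncertainty in $X^n$ can be learned from observing $Y^{(n)}$, as the next section demonstrates.
We have already established that learning $X$ after observing $Y$ is extremely unfavorable for the eavesdropper because the information rate $\mathcal{I}_{\mathrm{iud}}(X;Y)$ is low for large insertion probabilities $i$. However, the eavesdropper may adopt a strategy in which she first attempts to learn the sequence $G^n$ and then attempts to crack $X^n$. To study the effects of this strategy, let us define the following quantities:
$$\begin{aligned}
\mathcal{I}_{\mathrm{iud}}(G;Y) &\triangleq \lim_{n\to\infty} \tfrac{1}{n} I\big(G^n; Y^{(n)}\big), &
\mathcal{I}_{\mathrm{iud}}(X;Y \mid G) &\triangleq \lim_{n\to\infty} \tfrac{1}{n} I\big(X^n; Y^{(n)} \mid G^n\big), \\
\mathcal{I}_{\mathrm{iud}}(X,G;Y) &\triangleq \lim_{n\to\infty} \tfrac{1}{n} I\big(X^n, G^n; Y^{(n)}\big), &
\mathcal{I}_{\mathrm{iud}}(G;Y \mid X) &\triangleq \lim_{n\to\infty} \tfrac{1}{n} I\big(G^n; Y^{(n)} \mid X^n\big).
\end{aligned} \tag{10}$$
Proposition 1. The quantities defined in (10) satisfy
$$\begin{aligned}
\mathcal{I}_{\mathrm{iud}}(G;Y) &= 0, && (11)\\
\mathcal{I}_{\mathrm{iud}}(X;Y \mid G) &= 1, && (12)\\
\mathcal{I}_{\mathrm{iud}}(X,G;Y) &= 1, && (13)\\
\mathcal{I}_{\mathrm{iud}}(G;Y \mid X) &= 1 - \mathcal{I}_{\mathrm{iud}}(X;Y). && (14)
\end{aligned}$$
Proof. First, notice that because $Y^{(n)}$ is a string of Bernoulli-1/2 random variables whose length is $L(Y^{(n)})$, relation (15) holds as $n \to \infty$. Next, relation (17) also holds, and (11) is now a direct consequence of (15) and (17). Equality (12) follows from the fact that $X^n$ is uniquely determined (by decimation) if $G^n$ and $Y^{(n)}$ are known; that is, $H(X^n \mid G^n, Y^{(n)}) = 0$. Finally, (13) follows by adding (11) to (12) and applying the chain rule for mutual information, and (14) follows from (13), also using the chain rule.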
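The sequence-length requirement $n \ge 1.5 \cdot 10^5$ quoted above for $i = 0.95$ can be checked numerically under one explicit assumption, which is ours and is not taken from expression (9): the length $L(Y^{(n)})$ is $n$ plus a sum of $n$ iid geometric variables, each of variance $i/(1-i)^2$, and for large $n$ its entropy is approximated by that of a Gaussian with the same variance, $\tfrac{1}{2}\log_2\!\big(2\pi e\, n\, i/(1-i)^2\big)$ bits. The sketch below evaluates this approximation per input symbol.

```python
import math


def length_entropy_rate(n, i):
    """Approximate (1/n) * H(L(Y)) in bits, using a Gaussian approximation (assumption)."""
    variance = n * i / (1.0 - i) ** 2   # variance of the sum of n iid geometric lengths
    return math.log2(2.0 * math.pi * math.e * variance) / (2.0 * n)


def smallest_n_for_accuracy(i, accuracy, step=1000):
    """Smallest multiple of `step` at which the approximate correction term drops below `accuracy`."""
    n = step
    while length_entropy_rate(n, i) >= accuracy:
        n += step
    return n
```

Under this approximation, the correction term for $i = 0.95$ falls below $10^{-4}$ between $n = 1.4 \cdot 10^5$ and $n = 1.5 \cdot 10^5$, which is consistent with the threshold stated in the text.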
By equality (11) of Proposition 1, it is clear that the eavesdropper cannot learn $G^n$ simply by observing $Y^{(n)}$. Also, from Figure 2, it is clear that, from the eavesdropper's perspective, learning $X^n$ from $Y^{(n)}$ is extremely unfavorable because she can only learn a small fraction $\mathcal{I}_{\mathrm{iud}}(X;Y)$ of $H(X) \triangleq H(X_t) = 1$ by observing $Y^{(n)}$. However, equality (12) of Proposition 1 reveals a potential vulnerability: if the eavesdropper were to somehow learn $G^n$, then secrecy would be lost because $\mathcal{I}_{\mathrm{iud}}(X;Y \mid G) = H(X) = 1$. Since learning either $X^n$ or $G^n$ individually is not favorable to the eavesdropper, the eavesdropper's strategy could be to go after the pair $(X, G)$. Indeed, equality (13) of Proposition 1 reveals that, theoretically, the eavesdropper could gain substantial knowledge of the pair $(X, G)$ by observing $Y^{(n)}$. Even for large $i$, this posterior knowledge of the pair $(X, G)$, quantified as $\mathcal{I}_{\mathrm{iud}}(X,G;Y)$, is not a negligible fraction of the entropy of the pair. In the next section, we further explore the cryptographic implications by studying the connection between computational complexity and the information-theoretic quantities.
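To give a rough sense of scale for the last remark, the following sketch compares the information rate about the pair with the entropy rate of the pair. It relies on assumptions that we state explicitly: the rate about the pair is taken to be 1 bit per input symbol (the sum of the values 0 and 1 discussed above for (11) and (12), via the chain rule), $X_t$ and $G_t$ are treated as independent, and $G_t$ is exactly geometric with $\Pr(G_t = \ell) = (1-i)\,i^{\ell}$, whose entropy is $h(i)/(1-i)$ bits.

```python
import math


def binary_entropy(p):
    """h(p) = -p log2 p - (1 - p) log2 (1 - p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def geometric_entropy(i):
    """Entropy in bits of G with Pr(G = l) = (1 - i) * i**l."""
    return binary_entropy(i) / (1.0 - i)


def leaked_fraction_of_pair(i):
    """Ratio 1 / (H(X) + H(G)) with H(X) = 1, assuming independent X_t and G_t."""
    return 1.0 / (1.0 + geometric_entropy(i))
```

Under these assumptions, the ratio is 1/3 for $i = 0.5$ and still roughly 0.15 for $i = 0.95$, which illustrates in what sense the knowledge about the pair is not negligible even at high insertion probabilities.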

Generic Framework for the Security Evaluation
Note that the above information-theoretic analysis is based on modeling the shared pseudo-random sequence as a random sequence. In this section, we take into account the fact that the sequence is indeed pseudo-random. We show that the considered encryption (see Figure 1), based on employing the binary insertion channel $[X^n \to Y^{(n)}]$, provides enhanced security compared to the basic scheme that outputs only $X^n$.

Preliminaries: Security Notation. A definition of security consists of two distinct components: a specification of the assumed power of the adversary and a description of what constitutes a "break" of the scheme. Generally speaking, a cryptographic scheme is secure in a computational sense if, for every probabilistic polynomial-time adversary $\mathcal{A}$ carrying out an attack of some specified type and for every polynomial $p(\cdot)$, there exists an integer $N$ such that the probability that $\mathcal{A}$ succeeds in this attack (where success is also well defined) is less than $1/p(n)$ for every $n > N$. Accordingly, the following two definitions specify a security evaluation scenario and a security statement.
Definition 2. The adversarial indistinguishability experiment consists of the following steps:
(1) The adversary $\mathcal{A}$ chooses a pair of messages $(\mathbf{m}_0; \mathbf{m}_1)$ of the same length $n$ and passes them on to the encryption system for encrypting.
(2) A bit $b \in \{0, 1\}$ is chosen uniformly at random, and only one of the two messages $(\mathbf{m}_0; \mathbf{m}_1)$, precisely $\mathbf{m}_b$, is encrypted into the ciphertext $\mathrm{Enc}(\mathbf{m}_b)$ and returned to $\mathcal{A}$.
(3) Upon observing $\mathrm{Enc}(\mathbf{m}_b)$, and without knowledge of $b$, the adversary $\mathcal{A}$ outputs a bit $b'$.
(4) The experiment output is defined to be 1 if $b' = b$, and 0 otherwise; if the experiment output is 1, denoted shortly as the event $(\mathcal{A} \to 1)$, one says that $\mathcal{A}$ has succeeded.
Definition 3. An encryption scheme provides indistinguishable encryptions in the presence of an eavesdropper if, for all probabilistic polynomial-time adversaries $\mathcal{A}$,
$$\Pr(\mathcal{A} \to 1) \le \frac{1}{2} + \epsilon,$$
where $\epsilon = \mathrm{negl}(n)$ is a negligibly small function.
Definitions 2 and 3 are more precisely discussed in [11].
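The experiment of Definition 2 can be written as a small test harness. The sketch below is illustrative only: the adversary is split into two hypothetical callbacks (`choose` and `guess`), and the two toy encryption functions (an identity mapping and a one-time pad) are not the scheme of this paper; they merely show the two extremes of the empirical success rate.

```python
import secrets


def indistinguishability_experiment(encrypt, choose, guess, n):
    """Run the experiment once; return 1 if the adversary's guess equals b."""
    m0, m1 = choose(n)                      # step (1): two messages of length n
    assert len(m0) == len(m1) == n
    b = secrets.randbits(1)                 # step (2): uniformly random bit
    ciphertext = encrypt(m1 if b else m0)
    b_guess = guess(ciphertext)             # step (3): adversary outputs a bit
    return 1 if b_guess == b else 0         # step (4): experiment output


def success_rate(encrypt, choose, guess, n, trials):
    """Empirical estimate of Pr(A -> 1) over independent runs."""
    wins = sum(indistinguishability_experiment(encrypt, choose, guess, n)
               for _ in range(trials))
    return wins / trials


# Toy adversary: all-zero versus all-one message, guess by majority of ones.
def choose_constant_messages(n):
    return [0] * n, [1] * n


def guess_by_majority(ciphertext):
    return 1 if 2 * sum(ciphertext) > len(ciphertext) else 0


# Toy "schemes" used only to exercise the harness.
def identity_scheme(message):
    return list(message)


def one_time_pad(message):
    return [bit ^ secrets.randbits(1) for bit in message]
```

With the identity mapping the adversary always succeeds, while with a fresh one-time pad the success rate stays near 1/2, which is the behaviour that Definition 3 requires up to a negligible $\epsilon$.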

Evaluation of the Security Gain Based on the Mutual Information. We consider the encryption system displayed in Figure 1, taking into account the fact that the legitimate parties share pseudo-random secret sequences instead of random ones. Our goal is to estimate the advantage of $\mathcal{A}$ in the indistinguishability game specified by Definition 2 when $\mathbf{y} \leftarrow \mathrm{Enc}(\mathbf{m}_b)$, where $\mathbf{y}$ is a particular realization of $Y^{(n)}$, assuming that the advantage of $\mathcal{A}$ is known when $\mathbf{m}_0$ and $\mathbf{m}_1$ are two chosen realizations of $M^n$ and the corresponding realization of $X^n$ is known.
Proposition 4. Let the encryption mapping of $M^n$ into $X^n$ be such that $1/2 + \epsilon$ equals the advantage of the adversary $\mathcal{A}$ (specified by Definition 3) to win the indistinguishability game (specified by Definition 2), and let the mutual information $I(X;Y)$ be known. Under these assumptions, for large $n$,

Proof. Note that, for simplicity of the proof, Proposition 4 addresses a restricted case where it is assumed that $1/2 + \epsilon$ equals the advantage of the adversary $\mathcal{A}$ (specified by Definition 3) to win the indistinguishability game. Let the index $b$ of the selected message be a realization of the random variable $B$ whose distribution reflects that of the output of the adversary $\mathcal{A}$. The probability $\Pr(B = b \mid Y^{(n)} = \mathbf{y})$ that $\mathcal{A}$ wins the game is determined by the following:

According to the assumption of the proposition, we have

where $\mathbf{x}_b$ corresponds to the selected $\mathbf{m}_b$. Consequently,

Next, we have the following general upper bound on the entropy (see, e.g., [12] or [13]):

where $h(\cdot) \le 1$ is the binary entropy function and $\Pr(X^n = \mathbf{x}^n) = 2^{-n}$. (26)

Evaluation of the Security Gain Based on Numerical Estimation of the Mutual Information
Theorem 5. Let the encryption mapping of $M^n$ into $X^n$ be such that $1/2 + \epsilon$ equals the advantage of the adversary $\mathcal{A}$ (specified by Definition 3) to win the indistinguishability game (specified by Definition 2), and let the mutual information $I(X;Y)$ be known (see, e.g., Figure 2). Under these assumptions, for large $n$,

Proof. Substitution of (7) and (9) into (28) finalizes the proof.
Accordingly, the encryption mapping $M^n \to Y^{(n)}$ enhances security, by the factor established in Theorem 5, in comparison to the encryption mapping $M^n \to X^n$, because the probability that $\mathcal{A}$ wins the game becomes closer to 1/2, which corresponds to random guessing.

Evaluation of the Security Enhancement Employing Enumeration of Channel Input Candidates for the Given Output
6.1. Preliminaries. Let $\mathbf{Z} \in \{0, 1\}^{\ell}$ be a binary string of length $\ell$, and let $t \le \ell$ be a parameter. Recently, in [9], improved bounds on the number of subsequences obtained from a binary string $\mathbf{Z}$ of length $\ell$ under $t$ deletions have been reported. It is known that the number of subsequences in this setting strongly depends on the number of runs in the string $\mathbf{Z}$, where a run is a maximal substring of the same character. The improved bounds are obtained by a structural analysis of the family of $r$-run strings $\mathbf{Z}$, an analysis in which the extremal strings with respect to the number of subsequences have been identified. Specifically, for every $r$, $r$-run strings with the minimum (resp., maximum) number of subsequences under any $t$ deletions have been considered, an exact analysis of the number of subsequences of these extremal strings has been presented, and it has been shown that this number can be calculated in polynomial time.
Let $D_t(\mathbf{Z})$ be the set of subsequences of $\mathbf{Z}$ that can be obtained from $\mathbf{Z}$ after $t$ deletions. The analysis of $D_t(\mathbf{Z})$ and its size is challenging, as the number of subsequences of a string $\mathbf{Z}$ obtained by deletions depends not only on its length $\ell$ and the number $t$ of deletions, but also strongly on its structure. For example, $D_t(0^{\ell})$ is of size 1 and consists of the single string $0^{\ell - t}$. Clearly, $|D_t(\mathbf{Z})|$ is at most $2^{\ell - t}$ (as after $t$ deletions we remain with a binary string of length $\ell - t$). It has been shown that the number of subsequences $|D_t(\mathbf{Z})|$ strongly depends on the number of runs $r$ in the string. Here, a run is a maximal substring of the same character, and the number of runs in a given string $\mathbf{Z}$ is denoted by $r(\mathbf{Z})$.
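For short strings, the set $D_t(\mathbf{Z})$ and the number of runs can be obtained by exhaustive enumeration. The following brute-force sketch (exponential in the string length, intended only for small examples; the function names are illustrative) makes the dependence on the run structure concrete.

```python
from itertools import combinations


def deletion_set(z, t):
    """All distinct strings obtained from the string z by deleting exactly t symbols."""
    keep = len(z) - t
    return {"".join(z[k] for k in kept)
            for kept in combinations(range(len(z)), keep)}


def number_of_runs(z):
    """Number of maximal substrings of identical symbols in z."""
    return sum(1 for k in range(len(z)) if k == 0 or z[k] != z[k - 1])
```

For instance, `deletion_set("00000", 2)` contains only `"000"`, while the two-run string `"000111"` and the six-run string `"010101"` of the same length yield different numbers of subsequences under two deletions, the latter being larger.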
where $\mathbf{Z}_n^r$ is a string of length $n$ with $r$ runs. In [9], a family of strings named unbalanced strings has also been defined. A string is called unbalanced if all of the runs of symbols in the string are of length 1, except for one run. Let $U^{(j)}_{\ell, r}$ be a binary string of length $\ell$ with $r$ runs, in which all runs are of length 1, except for the $j$th run, which is of length $\ell - r + 1$. Due to symmetry, $|D_t(U^{(1)}_{\ell, r})| = |D_t(U^{(r)}_{\ell, r})|$. It has been shown in [9] that these extreme cases have the least number of subsequences among the unbalanced strings and also that they have the least number of subsequences among all strings. The following theorem has been proven in [9].
Theorem 6 (Theorem 3 [9]: closed-form formula for $\delta(\ell, r, t)$). For all $t < \ell$, $2 < r \le \ell$,

A numerical illustration of Theorem 6 is displayed in Figure 3.
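The unbalanced strings and the symmetry property mentioned above can be checked by brute force on small instances. In the sketch below, the choice that the first run consists of zeros is arbitrary (an assumption made for concreteness), and the counting routine enumerates all index sets, so it is suitable only for short strings; it does not implement the closed-form formula of Theorem 6.

```python
from itertools import combinations


def unbalanced_string(length, runs, long_run):
    """String with `runs` runs, all of length 1 except run `long_run` (1-based),
    which has length `length - runs + 1`; the first run is a run of zeros."""
    pieces = []
    for j in range(1, runs + 1):
        symbol = "01"[(j - 1) % 2]
        pieces.append(symbol * (length - runs + 1 if j == long_run else 1))
    return "".join(pieces)


def count_after_deletions(z, t):
    """Number of distinct strings obtained from z by exactly t deletions."""
    keep = len(z) - t
    return len({"".join(z[k] for k in kept)
                for kept in combinations(range(len(z)), keep)})
```

For length 7 and 4 runs, the strings with the long run placed first and last are `"0000101"` and `"0101111"`; one is the reversed complement of the other, which is why they give the same number of subsequences for every number of deletions.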

Estimation of the Security Enhancement. Traditionally, as introduced in [14], the main information-theoretic security metric is the average information leaked, that is, the mutual information $I(\mathbf{M}; \mathbf{Y})$ between the message $\mathbf{M}$ and the related sample $\mathbf{Y}$, or, equivalently, the uncertainty, that is, the equivocation $H(\mathbf{M} \mid \mathbf{Y})$. Recently, certain information-theoretic security measures have been considered in [15], implying that, in our case, the average mutual information $I(\mathbf{M}; \mathbf{Y})$ should be addressed as a strong security metric and $(1/n) I(\mathbf{M}; \mathbf{Y})$ as the corresponding weak one.
where $\delta(n+m, r, m)$ is the number of certain equally likely subsequences.
Sketch of the Proof. The uncertainty about the input (the argument) into a binary channel with random insertions, given its output (the image), depends on the number of equally likely candidate arguments which can generate the given image. A lower bound on the number of these candidates can be obtained from the lower bound on the number of subsequences which can be obtained from the given sequence, employing Theorem 6 (i.e., Theorem 3 from [9]). By adapting this result to the considered particular case, we have the following. A lower bound on the number of the argument candidates $\delta(n+m, r, m)$, where $r$ is a parameter, is given by (39) and (40), with the conventions that the auxiliary quantity in these expressions equals 1 when its second argument is 0 and equals 0 when its second argument is negative. Particularly note that the enumerated subsequences are obtained from a sequence where all of the runs of symbols are of length 1, except for one run, and that the assumed decimation is a random one; in addition, for simplicity of the evaluation, we assume that the subsequences appear equally likely. Consequently, the uncertainty $H(\mathbf{X} \mid \mathbf{Y})$ is lower-bounded as follows:

Note that, in order to achieve a desired high enhancement of the security, the insertion rate should be high enough, as illustrated in Figure 4. When the insertion rate is low, the security enhancement is low as well, and this is analytically shown in the next corollary.
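The counting argument in the proof sketch can be illustrated for one particular output. The following sketch counts the distinct length-$n$ subsequences of a given string by dynamic programming and, under the same equal-likelihood simplification used above, takes the base-2 logarithm of that count as the uncertainty in bits. It computes the exact number of candidates for one specific output, not the closed-form lower bound of Theorem 6, and the function names are illustrative.

```python
import math


def count_distinct_subsequences(y, k):
    """Number of distinct subsequences of length k of the string (or list) y."""
    size = len(y)
    # table[p][j] = number of distinct length-j subsequences of the prefix y[:p]
    table = [[0] * (k + 1) for _ in range(size + 1)]
    for p in range(size + 1):
        table[p][0] = 1                      # the empty subsequence
    last_seen = {}                           # symbol -> last 1-based position
    for p in range(1, size + 1):
        symbol = y[p - 1]
        for j in range(1, k + 1):
            table[p][j] = table[p - 1][j] + table[p - 1][j - 1]
            if symbol in last_seen:
                # remove subsequences already counted at the previous occurrence
                table[p][j] -= table[last_seen[symbol] - 1][j - 1]
        last_seen[symbol] = p
    return table[size][k]


def candidate_uncertainty_bits(y, n):
    """log2 of the number of length-n candidates, assuming they are equally likely."""
    return math.log2(count_distinct_subsequences(y, n))
```

The table has $(n+m+1)(n+1)$ entries, so the count is obtained in polynomial time even when the number of candidates itself is very large.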

Figure 1: Encryption/decryption technique for scenarios with one-way communications between the entities with high performance computing capabilities and the very tiny ones.

Figure 3: Number (No) of different subsequences of length $\ell$ which can be obtained by deletions from a longer binary sequence: a numerical illustration of the statement of Theorem 3 [9].

Figure 4: Numerical examples related to Theorem 7: illustration of the security gain implied by a binary channel with embedding of random bits, noting that a smaller coefficient means higher security enhancement.

Figure 4 yields numerical illustrations of the coefficient which determines the security gain.