Using the Fano inequality for the generalized R-norm entropy and the Bayes probability of error, a generalized random-cipher result is proved for the generalized R-norm entropy of type β.

1. Introduction

It is known that a good cryptosystem can be built provided that the key rate is greater than the message redundancy [1]. Shannon obtained this result by considering the equivocation of the key over a random cipher. By counting the average number of spurious decipherments over a restricted class of random ciphers, Hellman [2] obtained the same result. A similar result was proved by Lu [3] by using the average probability of correct decryption of a message digit as a measure of performance and the Fano inequality for a class of cryptosystems. The analysis done by Lu is precise, whereas in [1] approximations are used. All of these results are obtained by taking into account the Shannon entropy.

Sahoo [4] generalized the results of Lu by considering Rényi's entropy and the Bayes probability of error. But in the literature of information theory, there exist various generalizations of Shannon's entropy. One of these is the R-norm information, which was introduced by Arimoto [5] and extensively studied by Boekee and Van der Lubbe [6]. The objective of this paper is to generalize the results of Lu by considering the generalized R-norm entropy of type β and the Bayes probability of error.

2. Generalization of Shannon’s Random-Cipher Result

Consider a discrete random variable X, which takes values x1,x2,…,xn, having the complete probability distribution {p(xi);i=1,2,…,n}. Also consider the set R* of positive real numbers not equal to 1; that is, R*={R:R>0,R≠1}. Then the R-norm information [5] is defined as
(1) H_R(X) = \frac{R}{R-1}\left[1-\left(\sum_{i=1}^{n} p^{R}(x_i)\right)^{1/R}\right], \quad R \in R^*.
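As a numeric illustration (my own, not from the paper; the distribution, parameter values, and function names are illustrative), the following sketch evaluates (1): for R near 1 the value is close to the Shannon entropy, and for large R it approaches 1 - max_i p(x_i).

```python
import math

def r_norm_information(p, R):
    """R-norm information of Eq. (1): (R/(R-1)) * [1 - (sum_i p_i^R)^(1/R)]."""
    s = sum(pi ** R for pi in p) ** (1.0 / R)
    return (R / (R - 1.0)) * (1.0 - s)

def shannon_entropy(p):
    """Shannon entropy in nats, for comparison with the R -> 1 limit."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = [0.5, 0.3, 0.2]
print(r_norm_information(p, 1.0001))   # close to the Shannon entropy
print(shannon_entropy(p))
print(r_norm_information(p, 200.0))    # close to 1 - max(p) = 0.5
```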

This measure differs from the entropies of Shannon [1], Rényi [7], Havrda and Charvát [8], and Daroczy [9]. Its most interesting property is that as R → 1 it approaches Shannon's entropy [1], and as R → ∞, H_R(X) → 1 - \max_i p(x_i), i = 1,2,…,n. The measure (1) can be generalized in many ways; Hooda and Ram [10] studied the following parametric generalization:
(2) H_R^{\beta}(X) = \frac{R}{R+\beta-2}\left[1-\left(\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i)\right)^{(2-\beta)/R}\right],
where 0 < \beta < 2, R+\beta \neq 2.
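The β = 1 reduction can be checked numerically; this is my own sketch of (2), with illustrative parameter values.

```python
def gen_r_norm_entropy(p, R, beta):
    """Generalized R-norm entropy of type beta, Eq. (2)."""
    e = R / (2.0 - beta)                                # inner exponent R/(2-beta)
    s = sum(pi ** e for pi in p) ** ((2.0 - beta) / R)  # outer exponent (2-beta)/R
    return (R / (R + beta - 2.0)) * (1.0 - s)

def r_norm_information(p, R):
    """R-norm information of Eq. (1), for comparison."""
    s = sum(pi ** R for pi in p) ** (1.0 / R)
    return (R / (R - 1.0)) * (1.0 - s)

p = [0.5, 0.3, 0.2]
# With beta = 1, Eq. (2) coincides with Eq. (1).
print(gen_r_norm_entropy(p, 2.5, 1.0), r_norm_information(p, 2.5))
```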

The measure (2) may be called the generalized R-norm entropy of type β; it reduces to (1) when β = 1. In the case R = 1, (2) reduces to
(3) H_1^{\beta}(X) = \frac{1}{\beta-1}\left[1-\left(\sum_{i=1}^{n} p^{1/(2-\beta)}(x_i)\right)^{2-\beta}\right], \quad 0 < \beta < 2,\ \beta \neq 1.

Setting γ = 1/(2-β) in (3), we get
(4) H_{\gamma}(X) = \frac{\gamma}{\gamma-1}\left[1-\left(\sum_{i=1}^{n} p^{\gamma}(x_i)\right)^{1/\gamma}\right], \quad \gamma > \frac{1}{2},\ \gamma \neq 1.
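The substitution γ = 1/(2-β) can also be verified numerically; the sketch below (my own, taking β = 1.5 and hence γ = 2) checks that (3) and (4) agree.

```python
def h_r1_beta(p, beta):
    """Eq. (3): the R = 1 case of the type-beta entropy."""
    e = 1.0 / (2.0 - beta)
    return (1.0 / (beta - 1.0)) * (1.0 - sum(pi ** e for pi in p) ** (2.0 - beta))

def h_gamma(p, gamma):
    """Eq. (4): Arimoto's form, with gamma = 1/(2 - beta)."""
    return (gamma / (gamma - 1.0)) * (1.0 - sum(pi ** gamma for pi in p) ** (1.0 / gamma))

p = [0.5, 0.3, 0.2]
beta = 1.5
gamma = 1.0 / (2.0 - beta)   # = 2.0
print(h_r1_beta(p, beta), h_gamma(p, gamma))   # the two values coincide
```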

The information measure (4) has also been mentioned by Arimoto [5] as an example of a generalized class of information measures. Although (4) and (1) have the same form, they differ because the ranges of R and γ are different. However, (2) is a joint representation of (1) and (4), so it is of interest to study the applications of the generalized R-norm entropy of type β.

Let us now consider another discrete random variable Y, with values y_1, y_2, …, y_m, having the complete probability distribution {p(y_j); j = 1,2,…,m}, and the two-dimensional discrete random variable (X,Y) taking the values (x_1,y_1), (x_2,y_2), …, (x_n,y_m) with probabilities p(x_1,y_1), p(x_2,y_2), …, p(x_n,y_m), respectively. If p(x_i|y_j) is the conditional probability of x_i given y_j, then
(5) p(x_i|y_j) = \frac{p(x_i, y_j)}{p(y_j)}.

Definition 1.

The joint R-norm information measure of type β for R∈R* and 0<β<2,R+β≠2 is given by
(6) H_R^{\beta}(X,Y) = \frac{R}{R+\beta-2}\left[1-\left\{\sum_{i=1}^{n}\sum_{j=1}^{m} p^{R/(2-\beta)}(x_i,y_j)\right\}^{(2-\beta)/R}\right].
It is easy to see that H_R^{\beta}(X,Y) is symmetric in X and Y. Due to the nonadditivity property, it follows at once that if X and Y are stochastically independent,
(7) H_R^{\beta}(X,Y) = H_R^{\beta}(X \times Y) = H_R^{\beta}(X) + H_R^{\beta}(Y) - \frac{R+\beta-2}{R}\, H_R^{\beta}(X)\, H_R^{\beta}(Y)
holds.
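Identity (7) can be checked numerically for a product distribution; the sketch below (my own, with illustrative R and β) builds the joint distribution of independent X and Y and compares both sides.

```python
def gen_r_norm_entropy(p, R, beta):
    """Generalized R-norm entropy of type beta, Eqs. (2)/(6)."""
    e = R / (2.0 - beta)
    s = sum(pi ** e for pi in p) ** ((2.0 - beta) / R)
    return (R / (R + beta - 2.0)) * (1.0 - s)

px = [0.6, 0.4]
py = [0.5, 0.3, 0.2]
pxy = [a * b for a in px for b in py]   # independence: p(x_i, y_j) = p(x_i) p(y_j)

R, beta = 3.0, 1.5
hx = gen_r_norm_entropy(px, R, beta)
hy = gen_r_norm_entropy(py, R, beta)
lhs = gen_r_norm_entropy(pxy, R, beta)
rhs = hx + hy - ((R + beta - 2.0) / R) * hx * hy   # nonadditivity term of Eq. (7)
print(lhs, rhs)   # the two sides agree
```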

To construct a conditional R-norm information measure of type β, we can use a direct and an indirect method. The indirect method leads to the following definition.

Definition 2.

The average subtractive conditional R-norm information of X given Y is defined, for R ∈ R^* and 0 < β < 2, R+β ≠ 2, as
(8) {}^1H_R^{\beta}(X|Y) = H_R^{\beta}(X,Y) - H_R^{\beta}(Y) = \frac{R}{R+\beta-2}\left[\left\{\sum_{j=1}^{m} p^{R/(2-\beta)}(y_j)\right\}^{(2-\beta)/R} - \left\{\sum_{i=1}^{n}\sum_{j=1}^{m} p^{R/(2-\beta)}(x_i,y_j)\right\}^{(2-\beta)/R}\right].
Note that by choosing this definition, we have assumed additivity in the sense that
(9) H_R^{\beta}(X,Y) = H_R^{\beta}(Y) + {}^1H_R^{\beta}(X|Y).
If X and Y are statistically independent, then we have
(10) H_R^{\beta}(X,Y) = H_R^{\beta}(X \times Y) = H_R^{\beta}(X) + H_R^{\beta}(Y).
A direct way to construct a conditional R-norm information measure of type β is the following.

Definition 3.

The average conditional R-norm information measure of type β of X given Y is defined, for R ∈ R^* and 0 < β < 2, R+β ≠ 2, as
(11) {}^2H_R^{\beta}(X|Y) = \frac{R}{R+\beta-2}\left[1-\sum_{j=1}^{m} p(y_j)\left\{\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i|y_j)\right\}^{(2-\beta)/R}\right].
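As a small numeric sketch (my own; the joint distribution and parameters are illustrative), Definition 3 can be evaluated from a joint probability table; for a dependent pair the value falls below the marginal entropy, anticipating Theorem 5.

```python
def gen_r_norm_entropy(p, R, beta):
    """Generalized R-norm entropy of type beta, Eq. (2)."""
    e = R / (2.0 - beta)
    s = sum(pi ** e for pi in p) ** ((2.0 - beta) / R)
    return (R / (R + beta - 2.0)) * (1.0 - s)

def cond_r_norm_entropy(pxy, R, beta):
    """Eq. (11): average conditional R-norm entropy of X given Y.
    pxy[i][j] = p(x_i, y_j)."""
    e = R / (2.0 - beta)
    f = (2.0 - beta) / R
    m = len(pxy[0])
    py = [sum(row[j] for row in pxy) for j in range(m)]          # marginal of Y
    t = sum(py[j] * sum((row[j] / py[j]) ** e for row in pxy) ** f
            for j in range(m))
    return (R / (R + beta - 2.0)) * (1.0 - t)

pxy = [[0.3, 0.1],    # rows: x_1, x_2; columns: y_1, y_2 (a dependent pair)
       [0.1, 0.5]]
px = [sum(row) for row in pxy]
R, beta = 3.0, 1.5
print(cond_r_norm_entropy(pxy, R, beta), gen_r_norm_entropy(px, R, beta))
```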
To discuss the two forms of the conditional R-norm information measure of type β, we introduce two requirements which can be imposed on conditional information measures:

(a) if X and Y are independent, then
(12) H_R^{\beta}(X|Y) = H_R^{\beta}(X);

(b)
(13) H_R^{\beta}(X|Y) \leq H_R^{\beta}(X),

with equality if and only if X and Y are independent.

It is clear that (b) includes (a) and is therefore the stronger restriction; it is also a basic property, being of fundamental importance in applications. In the next two theorems, we describe the behavior of the two conditional measures with respect to requirements (a) and (b).

Theorem 4.

If X and Y are statistically independent random variables, then for R∈R* and 0<β<2,R+β≠2,
(14) {}^1H_R^{\beta}(X|Y) = \frac{R}{R+\beta-2}\left[\left\{\sum_{j=1}^{m} p^{R/(2-\beta)}(y_j)\right\}^{(2-\beta)/R} - \left\{\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i)\right\}^{(2-\beta)/R}\left\{\sum_{j=1}^{m} p^{R/(2-\beta)}(y_j)\right\}^{(2-\beta)/R}\right]
(15) = H_R^{\beta}(X) - \frac{R+\beta-2}{R}\, H_R^{\beta}(X)\, H_R^{\beta}(Y),
(16) {}^2H_R^{\beta}(X|Y) = H_R^{\beta}(X).

Proof.

The proof of (14) and (16) follows from the expressions (8) and (11). From (7), we obtain (15).

From this theorem, we may conclude that the measure {}^1H_R^{\beta}(X|Y), obtained as the formal difference between the joint and the marginal information measures, does not satisfy requirement (a). It is therefore less attractive than the measure {}^2H_R^{\beta}(X|Y).

In the next theorem, we consider requirement (b) for the conditional information measure {}^2H_R^{\beta}(X|Y).

Theorem 5.

If X and Y are discrete random variables, then for R∈R* and 0<β<2,R+β≠2,
(17) {}^2H_R^{\beta}(X|Y) \leq H_R^{\beta}(X)
holds.

The equality sign holds if and only if X and Y are independent.

Proof.

From [11] we know that for R+β > 2,
(18) \left[\sum_{i=1}^{n}\left[\sum_{j=1}^{m} x_{ij}\right]^{R/(2-\beta)}\right]^{(2-\beta)/R} \leq \sum_{j=1}^{m}\left[\sum_{i=1}^{n} x_{ij}^{R/(2-\beta)}\right]^{(2-\beta)/R}.
Setting x_{ij} = p(x_i, y_j) ≥ 0 in (18), we have
(19) \left[\sum_{i=1}^{n}\left[\sum_{j=1}^{m} p(x_i,y_j)\right]^{R/(2-\beta)}\right]^{(2-\beta)/R} \leq \sum_{j=1}^{m}\left[\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i,y_j)\right]^{(2-\beta)/R}
or
(20) \left[\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i)\right]^{(2-\beta)/R} \leq \sum_{j=1}^{m} p(y_j)\left[\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i|y_j)\right]^{(2-\beta)/R},
which implies
(21) 1 - \sum_{j=1}^{m} p(y_j)\left[\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i|y_j)\right]^{(2-\beta)/R} \leq 1 - \left[\sum_{i=1}^{n} p^{R/(2-\beta)}(x_i)\right]^{(2-\beta)/R}.
Using R/(R+β-2)>0 as R+β>2, we find that
(22) {}^2H_R^{\beta}(X|Y) \leq H_R^{\beta}(X).
Along the same lines, we can prove that (22) also holds for R+β < 2. Hence, (17) holds for all R ∈ R^* and 0 < β < 2, R+β ≠ 2. The equality sign holds if and only if x_{ij} is separable in the sense that x_{ij} = p(x_i)p(y_j), which is precisely the independence requirement for the probabilities.
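The key inequality (18) is a Minkowski-type inequality with exponent R/(2-β) > 1 when R+β > 2; the sketch below (my own, with random nonnegative entries and illustrative R, β) checks it numerically.

```python
import random

random.seed(0)
R, beta = 3.0, 1.5
e = R / (2.0 - beta)      # = 6 > 1, since R + beta > 2
f = (2.0 - beta) / R      # = 1/e

n, m = 3, 4
x = [[random.random() for _ in range(m)] for _ in range(n)]   # x_ij >= 0

# Left and right sides of inequality (18).
lhs = sum(sum(row) ** e for row in x) ** f
rhs = sum(sum(x[i][j] ** e for i in range(n)) ** f for j in range(m))
print(lhs, rhs)   # lhs <= rhs
```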

It follows from Theorem 5 that {}^2H_R^{\beta}(X|Y) fulfills requirements (a) and (b).

Let us consider the mathematical model as in [3]. For the sake of clarity, we will repeat some definitions of terms from [3]. In this model, the output {M_i} of the message source is a stationary random sequence. Each component M_i takes a value from a finite set A_M. The incoming message sequence {M_i} is converted by the instantaneous block encipherer into message words of length n. Let us denote a message word by M = {M_1, M_2, \ldots, M_n}. The key source output is a random variable K, K ∈ A_K, that is statistically independent of M and takes equiprobable values in a finite set. The key rate r_{n,R,β} is defined as
(23) r_{n,R,\beta} = \frac{1}{n} H_R^{\beta}\!\left(\frac{1}{|A_K|}, \frac{1}{|A_K|}, \ldots, \frac{1}{|A_K|}\right) = \frac{R}{n(R+\beta-2)}\left[1 - |A_K|^{(2-\beta-R)/R}\right].
It is easy to see that
(24) \lim_{\beta \to 1,\, R \to 1} r_{n,R,\beta} = \frac{1}{n}\log|A_K|,
which is the key rate as defined in [3].
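Limit (24) can be checked numerically; in this sketch (my own, with illustrative n and |A_K|) the key rate at R, β near 1 is close to (1/n) log|A_K|, with log taken as the natural logarithm.

```python
import math

def key_rate(n, R, beta, K):
    """Eq. (23): key rate for K = |A_K| equiprobable keys."""
    return (R / (n * (R + beta - 2.0))) * (1.0 - K ** ((2.0 - beta - R) / R))

n, K = 4, 16
print(key_rate(n, 1.0001, 1.0001, K))   # close to log(16)/4
print(math.log(K) / n)
```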

Also, the generalized R-norm entropy of type β redundancy of the message block M is defined as the difference between the maximal value of the generalized R-norm entropy of type β and the actual generalized R-norm entropy of type β; that is,
(25) d_{n,R,\beta} = -\frac{H_R^{\beta}(M)}{n} + \frac{R}{n(R+\beta-2)}\left[1 - |A_M|^{n(2-\beta-R)/R}\right].
It is easy to see that
(26) \lim_{\beta \to 1,\, R \to 1} d_{n,R,\beta} = -\frac{H(M)}{n} + \log|A_M|,
which is the redundancy of the message block M in the case of the Shannon entropy.

The key is input to the keyword generator; when the key takes the value K, the keyword-generator output is a keyword of length n, c_K = (c_{K,1}, \ldots, c_{K,n}), where c_{K,i} is an element of the key alphabet A_c. For each digit i, an instantaneous block encipherer produces the ith digit of the cryptogram word X using the following combiner:
(27) X_i = f(M_i, c_{K,i}), \quad i = 1,2,\ldots,n,
where f is a bijective mapping from A_M × A_c to A_X, the set of cryptogram letters.

It is assumed in this paper that the sets A_M, A_c, and A_X are of finite cardinality and that the combiner f is one to one in each variable. The instantaneous decipherer uses the key to generate the keyword and applies the inverse g(·,·) of f to each letter X_i of the cryptogram word X to recover the message word M; that is, M_i = g(f(M_i, c_{K,i}), c_{K,i}), i = 1,2,…,n. The cryptanalyst intercepts the cryptogram words and attempts to decrypt them by using his or her knowledge of the a priori message and key probabilities, the combiner f, the block length n, the key rate r_{n,R,β}, and the cryptogram. That is, for the cryptogram, he or she assumes a key K^* and decrypts the message as M^* = {M_1^*, M_2^*, \ldots, M_n^*}, where
(28) M_i^* = g(X_i, c_{K^*,i}), \quad i = 1,2,\ldots,n.
The decrypted message word M* is assumed to be one of the possible message words, given the cryptogram word X and the cipher.
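A minimal toy instantiation (my own illustration, not from the paper): take A_M = A_c = A_X = Z_q with the combiner f(m, c) = (m + c) mod q, which is one to one in each variable, and inverse g(x, c) = (x - c) mod q. Encryption and decryption then round-trip digit by digit.

```python
q = 26   # common alphabet size for message, key, and cryptogram letters

def f(m, c):
    """Combiner of Eq. (27): bijective in each argument."""
    return (m + c) % q

def g(x, c):
    """Inverse of f in its first argument: g(f(m, c), c) = m."""
    return (x - c) % q

def encrypt(msg, keyword):
    return [f(m, c) for m, c in zip(msg, keyword)]

def decrypt(cryptogram, keyword):
    return [g(x, c) for x, c in zip(cryptogram, keyword)]

msg = [7, 4, 11, 11, 14]
keyword = [3, 1, 4, 1, 5]
x = encrypt(msg, keyword)
print(x, decrypt(x, keyword))   # decryption recovers msg
```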

Let M ∈ A_M^n be a message word of length n and M^* the decrypted word for the message word M. If p_e(i,n) denotes the probability of an error in the correct decryption of the ith digit according to the Bayes decision scheme, then
(29) p_e(i,n) = 1 - \sum_{k^*=1}^{|A_K|} p(c_{k^*,i}) \max_{k} p(c_{k,i} \mid c_{k^*,i}),
where |AK| is the cardinality of the set AK. Thus, the average probability of error per letter for a message word of length n is defined as
(30) p_e(n) = \frac{1}{n}\sum_{i=1}^{n} p_e(i,n).
In the following theorem a random-cipher result is proved in the case of the generalized R-norm entropy of type β.

Theorem 6.

Consider the stationary random discrete source with alphabet A_M. Let d_{n,R,β} be the generalized R-norm entropy of type β redundancy, r_{n,R,β} the key rate, and p_e the average probability of error per letter. Then
(31) \frac{R}{R+\beta-2}\left[1 - \left\{(1-p_e)^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\, p_e^{R/(2-\beta)}\right\}^{(2-\beta)/R}\right] \geq r_{n,R,\beta} - d_{n,R,\beta},
where R∈R*.

Proof.

Since the message and the key are statistically independent, we have
(32) H_R^{\beta}(M,K) = H_R^{\beta}(M) + H_R^{\beta}(K).
Now f is one to one in each variable and X = f(M, c_K), so
(33) H_R^{\beta}(M,K) = H_R^{\beta}(X,K),
where X is the cryptogram. Then from (32) and (33), we have
(34) H_R^{\beta}(X,K) = H_R^{\beta}(M) + H_R^{\beta}(K).
Also we have
(35) H_R^{\beta}(X,K) = H_R^{\beta}(X) + H_R^{\beta}(K|X).
Using (35) in (34), we have
(36) H_R^{\beta}(K|X) = H_R^{\beta}(M) + H_R^{\beta}(K) - H_R^{\beta}(X).
Since the total number of cryptograms equals |A_M|^n,
(37) H_R^{\beta}(X) \leq \frac{R}{R+\beta-2}\left[1 - |A_M|^{n(2-\beta-R)/R}\right].
Using (37), (36) becomes
(38) H_R^{\beta}(K|X) \geq H_R^{\beta}(M) + H_R^{\beta}(K) - \frac{R}{R+\beta-2}\left[1 - |A_M|^{n(2-\beta-R)/R}\right].
We assume that the keys are chosen randomly, so that each key is equiprobable; hence
(39) H_R^{\beta}(K) = \frac{R}{R+\beta-2}\left[1 - |A_K|^{(2-\beta-R)/R}\right].
Using (39), the inequality (38) becomes
(40) H_R^{\beta}(K|X) \geq H_R^{\beta}(M) + \frac{R}{R+\beta-2}\left[|A_M|^{n(2-\beta-R)/R} - |A_K|^{(2-\beta-R)/R}\right].
Since K^* is a function of X only, we have
(41) H_R^{\beta}(K|K^*) \geq H_R^{\beta}(K|X).
Thus, using (41), the inequality (40) becomes
(42) H_R^{\beta}(K|K^*) \geq H_R^{\beta}(M) + \frac{R}{R+\beta-2}\left[|A_M|^{n(2-\beta-R)/R} - |A_K|^{(2-\beta-R)/R}\right].
Associated with each key there is a keyword c_K, and the keywords are distinct. Therefore
(43) H_R^{\beta}(c_K|c_{K^*}) = H_R^{\beta}(K|K^*),
where c_K = (c_{K,1}, c_{K,2}, \ldots, c_{K,n}) and c_{K^*} = (c_{K^*,1}, c_{K^*,2}, \ldots, c_{K^*,n}). We also have
(44) H_R^{\beta}(c_K|c_{K^*}) = H_R^{\beta}(c_{K,1}|c_{K^*}) + H_R^{\beta}(c_{K,2}|c_{K,1}, c_{K^*}) + \cdots + H_R^{\beta}(c_{K,n}|c_{K,1}, c_{K,2}, \ldots, c_{K,n-1}, c_{K^*}).
From (23), we can prove in a simple way that
(45) H_R^{\beta}(X|Y,Z) \leq H_R^{\beta}(X|Y),
where X, Y, and Z are three random variables. Using this result in (44), we get
(46) \sum_{i=1}^{n} H_R^{\beta}(c_{K,i}|c_{K^*,i}) \geq H_R^{\beta}(c_K|c_{K^*}).
Thus, using (46) and (43), the inequality (42) becomes
(47) \sum_{i=1}^{n} H_R^{\beta}(c_{K,i}|c_{K^*,i}) \geq H_R^{\beta}(M) + \frac{R}{R+\beta-2}\left[|A_M|^{n(2-\beta-R)/R} - |A_K|^{(2-\beta-R)/R}\right].
Using the Fano inequality for the R-norm information [6], we have
(48) H_R^{\beta}(c_{K,i}|c_{K^*,i}) \leq \frac{R}{R+\beta-2}\left[1 - \left\{[1-p_e(i,n)]^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\,[p_e(i,n)]^{R/(2-\beta)}\right\}^{(2-\beta)/R}\right].
From (47) and (48), we have
(49) \frac{1}{n}\left[H_R^{\beta}(M) + \frac{R}{R+\beta-2}\left(|A_M|^{n(2-\beta-R)/R} - |A_K|^{(2-\beta-R)/R}\right)\right] \leq \frac{R}{R+\beta-2}\left[1 - \frac{1}{n}\sum_{i=1}^{n}\left\{[1-p_e(i,n)]^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\,[p_e(i,n)]^{R/(2-\beta)}\right\}^{(2-\beta)/R}\right].
Since the function \left(\sum_i x_i^{R/(2-\beta)}\right)^{(2-\beta)/R} with x_i \geq 0 is convex for R+β > 2, it follows from Jensen's inequality that for R+β > 2,
(50) \frac{1}{n}\sum_{i=1}^{n}\left\{[1-p_e(i,n)]^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\,[p_e(i,n)]^{R/(2-\beta)}\right\}^{(2-\beta)/R} \geq \left\{\left[1-\frac{1}{n}\sum_{i=1}^{n} p_e(i,n)\right]^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\left[\frac{1}{n}\sum_{i=1}^{n} p_e(i,n)\right]^{R/(2-\beta)}\right\}^{(2-\beta)/R}.
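The Jensen step rests on the convexity in p of the per-digit Fano argument {(1-p)^{R/(2-β)} + c p^{R/(2-β)}}^{(2-β)/R}, an ℓ-norm of an affine map of p; the sketch below (my own, with hypothetical per-digit error rates and illustrative R, β, |A_M|) checks it numerically for R+β > 2.

```python
R, beta, AM = 3.0, 1.5, 4
e = R / (2.0 - beta)                         # inner exponent, = 6
f = (2.0 - beta) / R                         # outer exponent, = 1/6
c = (AM - 1) ** ((2.0 - beta - R) / (2.0 - beta))

def phi(p):
    """Per-digit argument of the Fano bound in Eqs. (48)-(50)."""
    return ((1.0 - p) ** e + c * p ** e) ** f

pes = [0.1, 0.4, 0.7]                        # hypothetical per-digit error rates
mean_pe = sum(pes) / len(pes)
avg_phi = sum(phi(p) for p in pes) / len(pes)
print(avg_phi, phi(mean_pe))                 # Jensen: avg_phi >= phi(mean_pe)
```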
Now using (50) and writing p_e = p_e(n) as in (30), the inequality (49) becomes
(51) \frac{1}{n}\left[H_R^{\beta}(M) + \frac{R}{R+\beta-2}\left(|A_M|^{n(2-\beta-R)/R} - |A_K|^{(2-\beta-R)/R}\right)\right] \leq \frac{R}{R+\beta-2}\left[1 - \left\{(1-p_e)^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\, p_e^{R/(2-\beta)}\right\}^{(2-\beta)/R}\right].
It is easy to see that (51) is valid also for R+β < 2. Thus, using (23) and (25), we have
(52) \frac{R}{R+\beta-2}\left[1 - \left\{(1-p_e)^{R/(2-\beta)} + (|A_M|-1)^{(2-\beta-R)/(2-\beta)}\, p_e^{R/(2-\beta)}\right\}^{(2-\beta)/R}\right] \geq r_{n,R,\beta} - d_{n,R,\beta},
which proves the result.

Remarks.

The result in the theorem is valid for all ciphers considered and for any cryptanalyst performing feasible decryption. The theorem states that, in order to design a good cipher, it is sufficient to use a key rate r_{n,R,β} greater than the generalized R-norm entropy of type β redundancy of the message.

Particular Case. If β = 1 and R → 1, the inequality (31) reduces to
(53) H(p_e) + p_e \log(|A_M|-1) \geq r_{n,1,1} - d_{n,1,1},
where
(54) r_{n,1,1} = \frac{1}{n}\log|A_K|, \qquad d_{n,1,1} = \log|A_M| - \frac{1}{n}H(M).
This result was derived by Lu [3]. However, instead of the Bayes probability of error as defined in this paper, Lu [3] took as his definition
(55) p_e = \frac{1}{n}\sum_{i=1}^{n} p(c_{K^*,i} = c_{K,i}).
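The reduction of (31) to Lu's bound (53) can be checked numerically; in this sketch (my own, with illustrative p_e and |A_M|) the left-hand side of (31) at R, β near 1 is close to H(p_e) + p_e log(|A_M| - 1), with log the natural logarithm.

```python
import math

def fano_lhs(pe, R, beta, AM):
    """Left-hand side of inequality (31)."""
    e = R / (2.0 - beta)
    f = (2.0 - beta) / R
    c = (AM - 1) ** ((2.0 - beta - R) / (2.0 - beta))
    return (R / (R + beta - 2.0)) * (1.0 - ((1.0 - pe) ** e + c * pe ** e) ** f)

pe, AM = 0.1, 5
# Shannon-entropy form of Eq. (53): H(pe) + pe * log(|A_M| - 1), in nats.
shannon_bound = (-pe * math.log(pe) - (1 - pe) * math.log(1 - pe)
                 + pe * math.log(AM - 1))
print(fano_lhs(pe, 1.001, 1.001, AM))   # close to shannon_bound
print(shannon_bound)
```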

References

[1] C. E. Shannon, "Communication theory of secrecy systems."
[2] M. E. Hellman, "An extension of the Shannon theory approach to cryptography."
[3] S. C. Lu, "The existence of good cryptosystems for key rates greater than the message redundancy."
[4] P. K. Sahoo, "Renyi's entropy of order α and Shannon's random cipher result."
[5] S. Arimoto, "Information-theoretical considerations on estimation problems."
[6] D. E. Boekee and J. C. A. Van der Lubbe, "The R-norm information measure."
[7] A. Rényi, "On measures of entropy and information," in Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547-561, Berkeley, Calif, USA, June 1961.
[8] J. Havrda and F. Charvát, "Quantification method of classification processes. Concept of structural α-entropy."
[9] Z. Daroczy, "Generalized information functions."
[10] D. S. Hooda and A. Ram, "Characterization of the generalized R-norm entropy."
[11] E. F. Beckenbach and R. Bellman.