The Unifying Frameworks of Information Measures

Information measures provide fundamental tools for analyzing uncertainty and revealing the substantive characteristics of random variables. In this paper, we treat different types of entropies through q-generalized Kolmogorov-Nagumo averages, which leads to the proposal of the survival Rényi entropy and the survival Tsallis entropy. We then make an inventory of eight types of entropies and classify them into two categories: the density entropies, defined on density functions, and the survival entropies, defined on survival functions. This study demonstrates that, for each type of density entropy, there exists a corresponding survival entropy. Furthermore, similarity measures and normalized similarity measures are proposed for each type of entropy. Generally, the functionals of the different information-theoretic metrics are quite diverse, yet they also exhibit unifying features in all their manifestations. We present unifying frameworks for entropies, similarity measures, and normalized similarity measures, which allow the available information measures to be treated as a whole and permit moving from one functional to another in harmony with various applications.


Introduction
Measures of probabilistic uncertainty and information have attracted growing attention since Hartley introduced a practical measure of information as the logarithm of the amount of uncertainty associated with finite possible symbol sequences, where the events are assumed to be equally probable [1]. Today, entropy plays a basic role in the definition of information measures, with various applications in different areas. It has been recognized as a fundamentally important field intersecting with mathematics, communication, physics, computer science, economics, and so forth [2][3][4][5].
Generalized information theory, arising from the study of complex systems, was intended to expand classical information theory based on probability. The additive probability measures inherent in classical information theory are extended to various types of nonadditive measures, resulting in different types of functionals that generalize Shannon entropy [6][7][8]. Generally, the formalization of uncertainty functions involves considerable diversity. However, it also exhibits some unifying features [9].
Let f(x) be the density function of an r.v. X with ∫ f(x) dx = 1. The Khinchin axioms [10] yield the Shannon entropy in a unique way. However, this may be too restrictive if one wants to describe complex systems. Therefore, a generalized measure of an r.v. X with respect to Kolmogorov-Nagumo (KN) averages [11] can be deduced as

H_φ(X) = φ^(−1) ( ∫ f(x) φ(log(1/f(x))) dx ),

where φ is a continuous and strictly monotonic KN function [12] and hence has an inverse φ^(−1).
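As an illustration of how the choice of KN function recovers familiar entropies, the following sketch (our own numerical check, not code from the paper; the function and variable names are ours) evaluates the KN average of the self-information for a discrete distribution, confirming that φ(x) = x recovers Shannon entropy while φ(x) = e^((1−α)x) recovers Rényi entropy:

```python
import numpy as np

def kn_entropy(p, phi, phi_inv):
    """KN (quasi-arithmetic) average of the self-information -log p."""
    p = np.asarray(p, dtype=float)
    return phi_inv(np.sum(p * phi(-np.log(p))))

p = np.array([0.5, 0.25, 0.25])

# phi(x) = x gives the ordinary (linear) average: Shannon entropy in nats.
shannon = kn_entropy(p, lambda x: x, lambda x: x)

# phi(x) = exp((1 - alpha) x) gives the Renyi entropy of order alpha.
alpha = 2.0
renyi = kn_entropy(p,
                   lambda x: np.exp((1 - alpha) * x),
                   lambda y: np.log(y) / (1 - alpha))

# Cross-check against the closed forms.
assert np.isclose(shannon, -np.sum(p * np.log(p)))
assert np.isclose(renyi, np.log(np.sum(p ** alpha)) / (1 - alpha))
```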

Entropies Defined on Survival Functions.
As narrated in [19], information measures defined on the density function suffer from several drawbacks, since the distribution function is more regular than the density function. Therefore, the cumulative residual entropy, defined on the cumulative distribution function or, equivalently, the survival function, was proposed as an alternative measure of uncertainty.
If the density function is replaced by the survival function, q is set to 1, and φ(x) = x in (3), this yields the survival Shannon entropy (SSE) [19], defined as

H(X) = −∫ F̄(x) log F̄(x) dx,

where F̄ denotes the survival function of X. Since eight different types of entropies and their corresponding similarity measures will be discussed subsequently, it is worth pointing out that some notations and names of the existing information measures will be changed in harmony with the unifying frameworks throughout this paper.
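As a sanity check on this definition (our own illustration, not taken from [19]): for an exponential r.v. with rate λ, the survival function is F̄(x) = e^(−λx), and −∫ F̄ log F̄ dx evaluates in closed form to 1/λ. A direct numerical integration agrees:

```python
import numpy as np

lam = 2.0                                  # rate of an Exp(lam) r.v.
x = np.linspace(0.0, 40.0 / lam, 400_001)  # truncated tail is negligible
sf = np.exp(-lam * x)                      # survival function F_bar(x)

# Survival Shannon entropy: -integral of F_bar(x) log F_bar(x) over [0, inf).
integrand = -sf * np.log(sf)               # equals lam * x * exp(-lam * x)
sse = np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(x))  # trapezoid rule

assert abs(sse - 1.0 / lam) < 1e-4         # closed form: 1 / lam
```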
To consider the conditional survival entropy, we denote by F_{X|Y}(x | y) the conditional distribution function of X given Y = y and by F̄_{X|Y}(x | y) the corresponding conditional survival function.
The cross survival Shannon entropy (CSSE) of r.v.s (X, Y) was given in [19] as H(X) − H(X | Y), where H(X | Y) is the conditional survival Shannon entropy of X given Y, defined as [19] H(X | Y) = E_Y[H(X | Y = y)], and E_Y(⋅) is the expectation with respect to the r.v. Y. The nonnegativity of CSSE was proven in [19], and thus CSSE has been used as a similarity measure in image registration [20].
The generalized versions of SSE in dynamic systems were discussed in [21,22].
If the density function in (9) is replaced by the survival function, this yields the survival exponential entropy (SEE) [23] of an r.v. X ∈ ℝ₊^N with order α, given by

E_α(X) = ( ∫ F̄(x)^α dx )^(1/(1−α)),

where α > 0 and α ≠ 1. As an ongoing research program, generalized information measures offer us a steadily growing inventory of distinct entropy theories. Diversity and unity are two significant features of these theories. The growing diversity of information measures makes it increasingly realistic to find an information measure suitable for a given condition. The unity allows us to view all available information measures as a whole and to move from one measure to another as needed.
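To make the SEE concrete (our own illustration, not from [23]): for an Exp(λ) r.v., ∫ F̄(x)^α dx = 1/(αλ), so E_α(X) = (1/(αλ))^(1/(1−α)); with λ = 1 and α = 2 this equals 2, which a direct numerical integration reproduces:

```python
import numpy as np

lam, alpha = 1.0, 2.0
x = np.linspace(0.0, 40.0, 400_001)   # truncated tail is negligible
sf = np.exp(-lam * x)                 # survival function of Exp(lam)

# SEE: ( integral of F_bar(x)^alpha dx ) ** (1 / (1 - alpha)).
g = sf ** alpha
integral = np.sum((g[1:] + g[:-1]) / 2 * np.diff(x))  # trapezoid rule
see = integral ** (1.0 / (1.0 - alpha))

# Closed form: (1 / (alpha * lam)) ** (1 / (1 - alpha)) = 2 here.
assert abs(see - 2.0) < 1e-3
```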
To that end, motivated by the research approaches on Shannon entropy, Shannon mutual information [2], SSE [19], and SEE [23], we attempt to study information-theoretic metrics in their various manifestations. On one hand, we propose several new types of entropies and their similarity measures; on the other hand, for each type of existing entropy, except for Shannon entropy, we give definitions of similarity measures (see Tables 1 and 2). Finally, we deduce unifying frameworks for the information measures emerging from the study of complex systems based on probability.
The remainder of this paper is organized as follows. Section 2 proposes the similarity measures defined on the density function. In Section 3, the survival Rényi entropy and survival Tsallis entropy are presented. In Section 4, we address the similarity measures defined on the survival function. The unifying frameworks of information measures and examples are provided in Section 5. Finally, we conclude the paper in Section 6.

Similarity Measures Defined on the Density Function
Shannon mutual information (SMI) measures the information that one r.v. conveys about another. It has been widely used in image registration [24,25] and pattern recognition [26,27]. Just as SMI is defined through Shannon entropy, each type of entropy leads to a corresponding similarity measure. In applications, an ideal similarity measure should be nonnegative. To that end, we follow [15,19,23] and define the similarity measures by the linear expectation operator rather than by the KN average operator weighted by the escort distribution [28]. This section presents the similarity measures defined on the density function corresponding to the Rényi entropy, Tsallis entropy, and exponential entropy, respectively.

Proof. For a real convex function g(x), using Jensen's inequality [29], we obtain the desired inequality.
Lemma 1 plays an important role in proving the nonnegativity of the similarity measures to be introduced, which are defined on the density function.
Definition 2. Let X and Y be r.v.s; the conditional Rényi entropy of X given Y with order α is defined as the expectation, with respect to Y, of the Rényi entropy of X given Y = y, where α > 0 and α ≠ 1. Motivated by the parallel definitions of the joint Shannon entropy and the joint survival Shannon entropy, H(X, Y) = H(X) + H(X | Y) in both cases, the joint Rényi entropy can be similarly introduced.
Definition 5. The Rényi mutual information (RMI) of r.v.s X and Y with order α is defined as the difference between the Rényi entropy of X and the conditional Rényi entropy of X given Y, where α > 0 and α ≠ 1. It is worth pointing out that this definition of RMI parallels the definitions of SMI (5) and CSSE (13). The nonnegativity of RMI is ensured by Theorem 4. Considering Theorem 4, which parallels (5), another equivalent form of the definition of RMI can be given. There are no essential differences between these two forms; we only consider definitions analogous to (24) for the similarity measures throughout this paper.
The normalized Shannon mutual information (NSMI) [16] of r.v.s X and Y was given as

NI(X, Y) = (H(X) + H(Y)) / H(X, Y),

with the Shannon entropies on the right-hand side. NSMI often acts as a robust similarity measure in image registration [16,30], attribute abstraction [31], and clustering [32]. Note that 1 ≤ NI(X, Y) ≤ 2. In a similar way, different forms of normalized mutual information will be deduced in this work.

Definition 6. The normalized Rényi mutual information (NRMI) of r.v.s X and Y with order α is defined analogously, with the Rényi entropies in place of the Shannon entropies, where α > 0 and α ≠ 1. We immediately obtain NI(X, Y) = lim_{α→1} NI_α(X, Y) by L'Hôpital's rule.
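A quick numerical illustration (our own toy example) of the nonnegativity of SMI and of the bound 1 ≤ NI(X, Y) ≤ 2 for the normalized form NI(X, Y) = (H(X) + H(Y))/H(X, Y):

```python
import numpy as np

# Joint pmf of (X, Y) on {0,1} x {0,1}: a correlated toy example.
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

def H(p):
    """Shannon entropy (nats) of a pmf given as an any-shaped array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

hx, hy, hxy = H(px), H(py), H(pxy)
smi = hx + hy - hxy        # Shannon mutual information I(X, Y)
nsmi = (hx + hy) / hxy     # normalized version NI(X, Y)

assert smi >= 0.0
assert 1.0 <= nsmi <= 2.0
```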

Exponential Mutual Information
Definition 12. The conditional exponential entropy of an r.v. X given Y with order α is defined by taking the expectation, with respect to Y, of the exponential entropy of X given Y = y, where α > 0 and α ≠ 1.
Definition 13. The joint exponential entropy of r.v.s X and Y with order α is defined, analogously to the joint Shannon entropy, as the sum of the exponential entropy of X and the conditional exponential entropy of X given Y, where α > 0 and α ≠ 1.
Theorem 14. For two r.v.s X and Y, the joint exponential entropy is bounded above by the sum of the marginal exponential entropies, for all α > 0 and α ≠ 1.

Entropies Defined on the Survival Function
The existing survival Shannon entropy and survival exponential entropy extend the corresponding functionals from the density function to the survival function. In this section, we propose the survival Rényi entropy and the survival Tsallis entropy, defined on the survival function, which, respectively, parallel the classical Rényi entropy and Tsallis entropy defined on the density function.

Survival Rényi Entropy.
If the density function is replaced by the survival function, q is set to 1, and φ is chosen as φ(x) = e^((1−α)x) in (3), this yields the survival Rényi entropy.
Definition 18. The survival Rényi entropy (SRE) of an r.v. X with order α is defined as

R_α(X) = (1/(1−α)) log ∫ F̄(x)^α dx,

where α > 0 and α ≠ 1.
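As an illustration (ours, not from the paper): for an Exp(λ) r.v., ∫ F̄(x)^α dx = 1/(αλ), so R_α(X) = log(1/(αλ))/(1−α); with λ = 1 and α = 2 this equals log 2:

```python
import numpy as np

lam, alpha = 1.0, 2.0
x = np.linspace(0.0, 40.0, 400_001)   # truncated tail is negligible
sf = np.exp(-lam * x)                 # survival function of Exp(lam)

g = sf ** alpha
integral = np.sum((g[1:] + g[:-1]) / 2 * np.diff(x))  # = 1 / (alpha * lam)
sre = np.log(integral) / (1.0 - alpha)

# Closed form: log(1 / (alpha * lam)) / (1 - alpha) = log 2 here.
assert abs(sre - np.log(2.0)) < 1e-4
```

Note that, under these definitions, R_α(X) = log E_α(X); that is, the SRE is the logarithm of the SEE.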
We complete the proof in a similar way, by using Jensen's inequality and noting that 1 − α < 0 and that t^α is convex in t > 0 for all α > 1.

Survival Tsallis Entropy.
If the density function is replaced by the survival function, q = α, and φ(x) = x in (3), or, equivalently, the density function is replaced by the survival function and the logarithm is replaced by the q-logarithm in (7), this yields the survival Tsallis entropy.
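For reference, the q-logarithm underlying the Tsallis family is the standard deformation of the logarithm (our restatement of a well-known definition), which recovers the ordinary logarithm as q → 1:

```latex
\ln_q(t) \;=\; \frac{t^{\,1-q} - 1}{1-q}, \qquad t > 0,\; q \neq 1,
\qquad \lim_{q \to 1} \ln_q(t) = \log t .
```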
Definition 30. The survival Tsallis entropy (STE) of an r.v. X with order α is defined as

T_α(X) = (1/(α−1)) ∫ ( F̄(x) − F̄(x)^α ) dx,

where α > 0 and α ≠ 1.
Note that E(X) = ∫₀^∞ F̄(x) dx. Using the integration-by-parts formula, we obtain this identity, and hence the survival Tsallis entropy can also be written in an equivalent form.

Definition 31. The conditional survival Tsallis entropy of an r.v. X given Y with order α is defined as the expectation, with respect to Y, of the survival Tsallis entropy of the conditional survival function.

Definition 32. The joint survival Tsallis entropy of r.v.s X and Y with order α is defined as T_α(X, Y) = T_α(X) + T_α(X | Y), where α > 0 and α ≠ 1. It is easy to see that H(X) = lim_{α→1} T_α(X) and H(X, Y) = lim_{α→1} T_α(X, Y) using L'Hôpital's rule.
Theorem 33. Let X be an r.v.; then the stated identity holds. Proof. It is easy to verify using (51) and Theorem 22.
Since T_α(X), R_α(X), and E_α(X) are related through (51), some properties of STE can be deduced from the theorems for SEE and SRE. We only list these properties as propositions and provide the necessary explanations for their proofs.
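The exact form of (51) is not recoverable from this copy, but under the definitions above the three survival entropies are all functions of the common integral of F̄^α, so one plausible reading of the relation is the following (our reconstruction, to be checked against the original):

```latex
% Let I_\alpha = \int \bar{F}(x)^{\alpha}\,dx. Then, under the
% definitions above (our reconstruction),
R_\alpha(X) = \frac{\log I_\alpha}{1-\alpha}, \qquad
E_\alpha(X) = I_\alpha^{1/(1-\alpha)} = e^{R_\alpha(X)}, \qquad
T_\alpha(X) = \frac{E(X) - I_\alpha}{\alpha - 1},
```

so that each of T_α, R_α, and E_α determines the others once E(X) is known.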
Proof. It can be proven in a similar way to Proposition 26 by considering that α − 1 < 0 and t − t^α is convex in t > 0 for all 0 < α < 1, while α − 1 > 0 and t − t^α is concave in t > 0 for all α > 1.

Similarity Measures Defined on the Survival Function
Paralleling the similarity measures and normalized similarity measures defined on the density function in Section 2, this section focuses on the corresponding similarity measures and normalized similarity measures defined on the survival function. As before, the key point is proving the nonnegativity of the similarity measures to be introduced.

Cross Survival Rényi Entropy and Cross Survival Exponential Entropy
Lemma 38. Let X and Y be r.v.s and let g(x) be a real convex function; then the inequality (54) holds. If, moreover, g(x) is strictly convex, then equality holds in (54) if and only if X and Y are independent. The inequality is reversed if g(x) is concave.
Lemma 38 was proven in [23]. It is the cornerstone for proving the nonnegativity of each similarity measure introduced on the survival function.
Proof. Since t^α is concave in t > 0 and 1/(1 − α) > 0 for all 0 < α < 1, and log t is strictly concave in t, using Lemma 38 and Jensen's inequality we obtain the result. Similarly, considering that t^α is convex in t > 0 and 1/(1 − α) < 0 for all α > 1, the conclusion is the same.
Definition 42. The normalized cross survival Rényi entropy (NCSRE) of r.v.s X and Y with order α is defined by normalizing the cross survival Rényi entropy in the same manner as NRMI, where α > 0 and α ≠ 1.
Definition 43. The normalized cross survival exponential entropy (NCSEE) of r.v.s X and Y with order α is defined by normalizing the cross survival exponential entropy in the same manner as NEMI, where α > 0 and α ≠ 1.
Proof. Since the conditional entropy is the expectation, with respect to Y, of the entropy of the conditional survival function F̄_{X|Y}(x | y), using Lemma 38 and considering that 1/(α − 1) < 0 and t^α is concave in t > 0 for all 0 < α < 1, we obtain (62).

The same approach can be used to define the symmetric similarity measures and normalized similarity measures for those defined on the density function.

Unifying Frameworks and Examples
In this section, based on the generalized notation for the entropies discussed previously, we classify the fourteen types of entropies into two categories and then deduce unifying presentations for entropies, similarity measures, and normalized similarity measures. Examples are also provided to unveil some properties of the information measures.
As enumerated in Table 1, different types of entropies have been discussed in this paper. There are three components in each item: entropy, conditional entropy, and joint entropy. In general, the entropies in Table 1 can be classified into two categories: those defined on the density function and those defined on the survival function. For simplicity, we refer to them as the density entropy and the survival entropy, respectively. It is demonstrated that, for each type of density entropy in Column 2, there is a survival entropy in Column 4 corresponding to it.

The Unifying Frameworks of Information Measures.
For convenience, we view Shannon entropy, Rényi entropy, Tsallis entropy, and the exponential entropy as the classical density entropies and view their corresponding survival entropies as the classical survival entropies. We can see that the classical density entropies, the classical survival entropies, and their conditional and joint entropies share similar presentations.
Let H_α(X) be one type of generalized density entropy or generalized survival entropy of an r.v. X with order α. If α = 1, then H_1(X) = H(X) denotes the Shannon entropy or the survival Shannon entropy. In this notation, H_α(X | Y) is the conditional entropy with order α > 0. The corresponding joint entropy of r.v.s X and Y with order α > 0 can be introduced as H_α(X, Y) = H_α(X) + H_α(X | Y). For r.v.s X and Y, one has H_α(X, Y) ≤ H_α(X) + H_α(Y) for all α > 0.
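For the Shannon case (α = 1), the subadditivity bound H(X, Y) ≤ H(X) + H(Y) can be checked directly on a toy joint distribution (our own illustration):

```python
import numpy as np

# Toy joint pmf of (X, Y); rows index X, columns index Y.
pxy = np.array([[0.30, 0.20],
                [0.05, 0.45]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

def H(p):
    """Shannon entropy (nats) of a pmf given as an any-shaped array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

hx, hy, hxy = H(px), H(py), H(pxy)

# Subadditivity: H(X, Y) <= H(X) + H(Y), with equality iff independent.
assert hxy <= hx + hy + 1e-12
```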
Entropies, conditional entropies, and joint entropies are listed in Table 1. The similarity measures and normalized similarity measures are shown in detail in Table 2, where the similarity measure is followed by the normalized one in each item. In a similar way, the similarity measures can be classified into density similarity measures, defined on the density function, and survival similarity measures, defined on the survival function, and so can the normalized similarity measures. Therefore, the unifying presentations for a similarity measure and a normalized similarity measure associated with a type of entropy can be deduced as

I_α(X, Y) = H_α(X) − H_α(X | Y),  NI_α(X, Y) = (H_α(X) + H_α(Y)) / H_α(X, Y).

Note that I_1(X, Y) = I(X, Y) and NI_1(X, Y) = NI(X, Y).
We obtain I_α(X, Y) ≥ 0 for all α > 0. Their symmetric versions can, respectively, be given accordingly. The unifying frameworks make it possible to view all the available entropies listed in Table 1 as a whole and to move from one to another as necessary. Subsequently, the similarity measures and the normalized similarity measures are simultaneously obtained.
If λ = 1, then f(x) = e^(−x) = F̄(x); hence the survival entropies coincide with the corresponding density entropies. Figure 1(b) shows the Shannon entropy, Rényi entropy, Tsallis entropy, and exponential entropy of an exponential r.v. with 0.5 ≤ λ ≤ 1.5. We can see that some properties, such as concavity and monotonicity, may change when we generalize entropies by extending their definitions from the density function to the survival function.
Example 2. Much literature has pointed out that Shannon entropy has drawbacks: each occurrence frequency contributes equally to the summation or integral of its functional and, simultaneously, the spatial information is neglected. In fact, the entropies defined on density functions share these drawbacks with Shannon entropy. However, the survival entropy can overcome them. For instance, as shown in Figure 2, Lena is used as a fixed image of size 256 × 256. We exchange those pixels with the same occurrence frequency in Figure 2, and the resulting entropy values are listed in Table 3, where α = 0.8. We can see that the density entropies provide the same values, whereas the survival entropies provide different values for the two images. In other words, the two images contain the same information from the viewpoint of the density entropy but different amounts of information from the viewpoint of the survival entropy. This demonstrates that the survival entropy is capable of distinguishing the two images, whereas the density entropy is not. The reason is that spatial information is taken into account in the survival entropy formulas.

Calculated with the same parameter settings, α = 0.8, as in Figure 3(a), Figure 3 demonstrates that when the rotation angle is zero or, equivalently, the template and the floating image are perfectly aligned, all the values of the considered similarity metrics reach their maximums. In other words, they can all be adopted as similarity metrics in image processing.
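The effect can be reproduced on a toy example (our own, much smaller than the Lena experiment): two r.v.s whose probability masses are permutations of each other have identical density (Shannon) entropy, but different survival Shannon entropy, because the survival function depends on which values carry which mass:

```python
import numpy as np

# Two r.v.s whose masses are permutations of each other: identical
# histograms (hence identical density entropies), but the masses sit
# on different values, so the survival functions differ.
vals = np.array([1, 2])
pX = np.array([0.7, 0.3])   # P(X = 1) = 0.7, P(X = 2) = 0.3
pY = np.array([0.3, 0.7])   # masses swapped between the two values

def shannon(p):
    return -np.sum(p * np.log(p))

def survival_shannon(vals, p):
    """Discrete analogue of -integral F_bar log F_bar."""
    sse = 0.0
    for k in range(int(vals.max())):
        fbar = p[vals > k].sum()   # survival function at k
        if 0.0 < fbar < 1.0:
            sse -= fbar * np.log(fbar)
    return sse

assert np.isclose(shannon(pX), shannon(pY))          # density entropy: equal
assert not np.isclose(survival_shannon(vals, pX),
                      survival_shannon(vals, pY))    # survival entropy: differs
```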
Generally, the similarity measures defined on the survival function are more regular than those defined on the density function, since the density is computed as the derivative of the survival function [19]. Therefore, the similarity measures defined on the survival function are more robust in matching problems.
A key problem arising in matching is the choice of similarity metric. In Table 2, sixteen different types of similarity measures and normalized similarity measures are provided, with the unifying presentations (64) and (65). Their symmetric versions are given by (66) and (67). The proposed similarity measures are capable of advancing matching methods in signal processing to some extent.

Conclusions
This research is conducted along two dimensions. On one hand, we extend KN averages to q-generalized KN averages to redefine the existing classical entropies and to define the survival Rényi entropy and the survival Tsallis entropy on the survival function. On the other hand, for each type of entropy discussed, a similarity measure and a normalized similarity measure are proposed accordingly. Some properties of these information measures are investigated.
We make an inventory of sixteen types of entropies, similarity measures, and normalized similarity measures, which exhibit both diversity and unity. This leads to the proposal of unifying frameworks for information measures. Therefore, our work addresses a broad spectrum of information measures as a whole through the unifying frameworks. Undoubtedly, some of them will remain conceptual unless they are adapted for applications. More applications of the proposed information measures will dominate our research in the near future.

Definition 16.
The exponential mutual information (EMI) of r.v.s X and Y with order α is defined as E_α(X, Y) = E_α(X) − E_α(X | Y), where α > 0 and α ≠ 1. Using Theorem 15, we have E_α(X, Y) ≥ 0.

Definition 17. The normalized exponential mutual information (NEMI) of r.v.s X and Y with order α is defined by normalizing EMI in the same manner as NRMI.
(a) and then generate the negative image as shown in Figure 2(b). Images (c) and (d) are their histograms, respectively. Thereby, images (a) and (b) share the same occurrence frequency but specified to

Example 3.
Figure 3: The values of the similarity measures (a) and the normalized similarity measures (b) for image rotation transformations with angle range [−π, π].

Table 1: Entropies, conditional entropies, and joint entropies defined on density functions and survival functions.

Table 2: The similarity measures and normalized similarity measures defined on density functions and survival functions.

Table 3: The values acquired from different types of entropies for the testing images.