A Numerical Study on the Regularity of d-Primes via Informational Entropy and Visibility Algorithms

Let a d-prime be a positive integer with exactly d divisors. Under this definition, the usual prime numbers correspond to the particular case d = 2. Here, the seemingly random sequence of gaps between consecutive d-primes is numerically investigated. First, the variability of the gap sequences for d ∈ {2, 3, ..., 11} is evaluated by calculating the informational entropy. Then, these sequences are mapped into graphs by employing two visibility algorithms. Computer simulations reveal that the degree distribution of most of these graphs follows a power law. Conjectures on how some topological features of these graphs depend on d are proposed.


Introduction
Prime numbers, the building blocks of any positive integer, fascinate math lovers [1,2]. From a purely theoretical perspective, primes are crucial for understanding the properties of numbers [1,2]. From an applied science perspective, primes have been used in cryptographic keys [3], can be found in the life cycles of cicadas [4], and characterize the energy spectrum of chaotic quantum systems [5].
Since 300 BC, the irregular distribution of primes throughout the sequence of natural numbers has been extensively investigated. Giants such as Chebyshev, Dirichlet, Eratosthenes, Erdős, Euclid, Euler, Fermat, Gauss, Legendre, and Riemann analyzed this matter [1,2]. To gain some insight into this distribution, the statistical properties of the gaps between consecutive primes [6,7], second-order gaps (the gaps between these gaps), and higher-order gaps [8,9] have been examined. The distribution of primes has also been studied by using graphs. For instance, consider that primes are the nodes of a graph. Assuming, as Goldbach's conjecture states, that any even number greater than 2 can be written as the sum of two primes, a pair of nodes is linked in this graph if the two primes add up to a given even number [10]. In another graph-based approach, the natural numbers are the nodes, and there is a connection between two nodes if they share a common prime divisor [11]. Graphs are also used in this work.
Assume that a d-prime is a positive integer with d divisors. Therefore, the usual prime numbers correspond to the case d = 2. In this work, the frequency of the gaps between d-primes is computationally analyzed. First, the difference of successive d-primes is calculated for d ∈ {2, 3, ..., 11}, and the gap sequences thereby obtained are considered as discrete time series. To evaluate their variability, the informational entropy [12] of these series is computed. In addition, these series are transformed into undirected graphs by applying two visibility algorithms [13,14]. The degree distribution and the average degree of these graphs are determined and compared. These are the main contributions of this work, which are described in the next section. The aim of this study based on d-primes is to understand how the gap sequences depend on d. The proposed approach can be used to analyze other sequences of numbers found in nature [15], such as the energy levels of atomic nuclei and the quantum space-time structure [16].

Methodology and Numerical Results
Let a d-prime be defined as a positive integer $p_d$ greater than 1 that is divisible by 1, by itself, and by d − 2 other (smaller) positive integers.
Thus, a d-prime has exactly d distinct divisors. For instance, the number 12 is a 6-prime because its six divisors are 1, 2, 3, 4, 6, and 12. A d-prime with d > 2 is usually called a composite number [1,2]. Note that the 2-primes are just the usual primes. Thus, the d-prime is a naive generalization of the concept of prime number.
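As an illustration of the definition above, the following minimal sketch lists d-primes by brute force. The function names `count_divisors` and `first_d_primes` are hypothetical and are not taken from the original study, whose implementation is not described here.

```python
def count_divisors(m):
    """Return the number of positive divisors of m (trial division up to sqrt(m))."""
    count = 0
    i = 1
    while i * i <= m:
        if m % i == 0:
            # i and m // i are both divisors; count them once if they coincide
            count += 1 if i * i == m else 2
        i += 1
    return count


def first_d_primes(d, how_many):
    """Return the first `how_many` integers greater than 1 with exactly d divisors.

    Hypothetical helper: a plain scan over the integers, assumed here only
    for illustration.
    """
    found = []
    m = 2
    while len(found) < how_many:
        if count_divisors(m) == d:
            found.append(m)
        m += 1
    return found


print(first_d_primes(6, 1))  # the text cites 12 as a 6-prime
print(first_d_primes(3, 4))
```

A scan of this kind is practical only for short lists: for odd d the d-primes are sparse and grow quickly, so a direct search becomes slow long before 10,000 terms are reached.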
Let $p_d(n)$ be the n-th d-prime, with $n \in \mathbb{N}^*$. For instance, $p_3(4)$ is the fourth 3-prime, which is equal to 49. Table 1 presents a list of the first d-primes for d ∈ {2, 3, ..., 11}. Let $x_d(n) = p_d(n+1) - p_d(n)$ be the gap between consecutive d-primes. For instance, $x_5(1) = 65$ because $p_5(2) = 81$ and $p_5(1) = 16$. Note that the sequence $x_d(n)$ can be taken as a time series, in which n corresponds to the time variable. Table 2 shows the first numbers of the series $x_d(n)$ for d ∈ {2, 3, ..., 11}. In order to evaluate the variability of the series $x_d$ for d ∈ {2, 3, ..., 11}, the informational entropy H [12] was computed. This entropy has been calculated, for instance, in investigations on the dynamics of biological [17] and social systems [18]. Its normalized value, denoted by Δ, is given by
$$\Delta = \frac{H}{H_{\max}}. \tag{1}$$
The entropy $H = -\sum_{i=1}^{q} p_i \ln p_i$ and its maximum value $H_{\max} = \ln q$ (obtained in the case of $p_i = 1/q$ for i = 1, ..., q [12]) are calculated by taking $p_i$ as the relative frequency of occurrence of each distinct value of $x_d$. In these expressions, q is the number of distinct values of $x_d$.
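The gap series and the normalized entropy Δ = H / H_max described above can be sketched as follows. The function names `gaps` and `normalized_entropy` are hypothetical, and returning 0 when only one distinct value occurs (H_max = ln 1 = 0) is a convention assumed here, not one stated in the text.

```python
import math
from collections import Counter


def gaps(seq):
    """Differences between consecutive terms: x(n) = p(n + 1) - p(n)."""
    return [b - a for a, b in zip(seq, seq[1:])]


def normalized_entropy(series):
    """Normalized entropy H / ln(q), with q the number of distinct values."""
    counts = Counter(series)
    q = len(counts)
    if q < 2:
        return 0.0  # assumed convention: H_max = ln(1) = 0, so define the ratio as 0
    total = len(series)
    # p_i is the relative frequency of each distinct value
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(q)


x3 = gaps([4, 9, 25, 49, 121])   # first 3-primes, i.e., squares of primes
print(x3)                        # [5, 16, 24, 72]
print(normalized_entropy(x3))    # all values distinct, so the ratio is 1 up to rounding
```

When every gap value occurs exactly once, each p_i equals 1/q and the ratio reaches its maximum; repeated values lower it.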
As the next step, the time series $x_d(n)$ were converted into undirected graphs by using two visibility algorithms [13,14]. These algorithms have been employed, for instance, in the analysis of stock indices [19] and electroencephalography recordings [20]. Here, in the visibility graphs, each node represents a distinct value of $x_d$.
Consider that $n_a < n_i < n_b$. In the natural visibility (NV) graph [13], the nodes corresponding to $x_d(n_a)$ and $x_d(n_b)$ are connected if every intermediate point $(n_i, x_d(n_i))$ in the time series satisfies the inequality
$$x_d(n_i) < x_d(n_b) + \left(x_d(n_a) - x_d(n_b)\right)\frac{n_b - n_i}{n_b - n_a}. \tag{2}$$
Thus, these nodes are connected if there is a straight line joining $(n_a, x_d(n_a))$ and $(n_b, x_d(n_b))$ in the plot $x_d(n) \times n$, provided that every intermediate point $(n_i, x_d(n_i))$ is below such a line. For instance, the first four points of the sequence $x_3(n)$ shown in Table 2 are $x_3(1) = 5$, $x_3(2) = 16$, $x_3(3) = 24$, and $x_3(4) = 72$. There is a connection between the nodes corresponding to $x_3 = 16$ and $x_3 = 72$ because the intermediate datum $x_3 = 24$ is below the straight line linking these points in the plot $x_3(n) \times n$, that is, because $24 < 72 + (16 - 72)(4 - 3)/(4 - 2) = 44$. Also, note that there is no connection between the nodes $x_3 = 5$ and $x_3 = 24$ because $16 > 24 + (5 - 24)(3 - 2)/(3 - 1) = 14.5$. In this case, the datum (2, 16) is high enough to prevent the data (1, 5) and (3, 24) from seeing each other.
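The natural visibility test described above can be sketched as a direct pairwise check. This is a minimal illustration with hypothetical function names; it indexes nodes by their position in the series (0-based) rather than by distinct gap value, which is a simplification of the node convention used in the text.

```python
def nv_visible(x, a, b):
    """True if positions a < b (0-based) of series x see each other (natural visibility).

    Every intermediate point must lie strictly below the straight line
    joining (a, x[a]) and (b, x[b]).
    """
    for i in range(a + 1, b):
        line_height = x[b] + (x[a] - x[b]) * (b - i) / (b - a)
        if x[i] >= line_height:
            return False
    return True


def nv_edges(x):
    """All visible pairs (a, b) with a < b, found by brute force."""
    n = len(x)
    return {(a, b) for a in range(n) for b in range(a + 1, n) if nv_visible(x, a, b)}


x3 = [5, 16, 24, 72]          # first four gaps between 3-primes
print(nv_visible(x3, 1, 3))   # 16 and 72: True, since 24 < 44
print(nv_visible(x3, 0, 2))   # 5 and 24: False, since 16 > 14.5
```

Adjacent points are always connected because there is no intermediate point to block them. The brute-force pair scan is chosen for clarity; faster constructions exist but are not needed for short examples.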
In the horizontal visibility (HV) graph [14], the nodes associated with $x_d(n_a)$ and $x_d(n_b)$ are connected if
$$x_d(n_a),\; x_d(n_b) > x_d(n_i) \tag{3}$$
for every $n_i$ with $n_a < n_i < n_b$. Thus, these nodes are connected if every intermediate point $(n_i, x_d(n_i))$ in the plot $x_d(n) \times n$ lies below a horizontal line drawn between the data at $n_a$ and $n_b$. In the HV graph built from the beginning of the sequence $x_3(n)$, the nodes corresponding to $x_3 = 16$ and $x_3 = 72$ are not connected because $x_3(2) = 16 < x_3(3) = 24$; that is, (2, 16) and (4, 72) cannot see each other because (3, 24) is high enough to block their horizontal visibility. Thus, $x_3 = 16$ and $x_3 = 72$ are connected in the NV graph, but they are not connected in the HV graph.
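The horizontal visibility test can be sketched in the same way: two data see each other when every intermediate value is smaller than both of them. The function names are hypothetical, and nodes are again indexed by position (0-based) rather than by distinct gap value, which is a simplifying assumption.

```python
def hv_visible(x, a, b):
    """True if positions a < b (0-based) of series x see each other horizontally.

    Every intermediate value must be strictly smaller than both x[a] and x[b].
    """
    threshold = min(x[a], x[b])
    return all(x[i] < threshold for i in range(a + 1, b))


def hv_edges(x):
    """All horizontally visible pairs (a, b) with a < b, found by brute force."""
    n = len(x)
    return {(a, b) for a in range(n) for b in range(a + 1, n) if hv_visible(x, a, b)}


x3 = [5, 16, 24, 72]          # first four gaps between 3-primes
print(hv_visible(x3, 1, 3))   # 16 and 72: False, blocked by 24
print(sorted(hv_edges(x3)))   # only consecutive points are linked in this example
```

Because this series is increasing, each intermediate value exceeds the left endpoint, so only neighboring points remain connected.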
Numerical experiments were performed by taking n = 1, 2, ..., 10000, that is, the first 10,000 d-primes for each value of d. From these 10,000 d-primes, 9999 values of $x_d(n)$ were computed.
The values corresponding to n = 1, 2, 9998, 9999 were discarded in order to neglect the initial transient (n = 1, 2) and the effects of truncation (n = 9998, 9999). The normalized entropy Δ was computed by using equation (1). As shown in Table 3, Δ = 1 for d = 5, 7, and 11 because there is no repeated value of $x_d$ in the corresponding sequences. For d = 3 and d = 9, Δ ≃ 1; for d even, Δ ≈ 0.7.
From the gap sequences $x_d(n)$ for d ∈ {2, 3, ..., 11}, undirected visibility graphs were built by using equations (2) and (3). Then, the corresponding degree distributions P(k) and the average degree 〈k〉 were determined. Recall that the degree of a node is the number of edges connected to this node [21]. Recall also that the degree distribution expresses how the fraction P(k) of nodes with degree k varies with k [21]. Usually, P(k) is interpreted as the probability of randomly picking a node with degree k. Figures 1 and 2 exhibit the log-log plot of P(k) for NV and HV graphs, respectively, for d ∈ {2, 3, ..., 11}. Observe that most P(k) decay as a power law in these plots. Power laws have been found in the distribution of prime gaps [22] and in a myriad of contexts, such as a psychiatric ward [23] and financial crashes [24]. Here, P(k) is proportional to k taken to the power −c, that is, $P(k) = A k^{-c}$, in which A is the proportionality constant. In a log-log plot, $\log P(k) = \log A - c \log k$; thus, the power-law dependence is transformed into a linear relation between log P(k) and log k. At least as a first approximation, this relation can be taken as linear in most plots shown in both figures; that is, a power-law form for P(k) can be considered a plausible model for these plots. Possible exceptions are the NV plots for d ∈ {7, 11}.
Tables 4 and 5 present A, c, and 〈k〉 for the 20 graphs. The average degree 〈k〉 for the HV graphs is smaller than that for the NV graphs because HV graphs are subgraphs of the corresponding NV graphs. For the NV graphs, 〈k〉 ≈ 6, with the exception of d ∈ {5, 7, 11}, which present higher values. For the HV graphs, 〈k〉 > 3.90 for d odd and 〈k〉 < 3.90 for d even. Note also that the values of A and c, determined from least-squares fitting [25], present smaller variability for the HV graphs than for the NV graphs. In addition, for the NV graphs, c ≃ 1.1; for the HV graphs, c ≃ 1.7.
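The degree statistics and the straight-line fit in the log-log plot can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, natural logarithms are used, isolated nodes (absent from the edge list) are ignored, and the fit is an ordinary least-squares line through the points (log k, log P(k)).

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the fraction P(k) of nodes having that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    histogram = Counter(degree.values())
    n_nodes = len(degree)
    return {k: histogram[k] / n_nodes for k in sorted(histogram)}


def average_degree(edges):
    """Mean degree: each edge contributes to the degree of two nodes."""
    nodes = {node for edge in edges for node in edge}
    return 2 * len(edges) / len(nodes)


def fit_power_law(pk):
    """Least-squares fit of log P(k) = log A - c log k; returns (A, c)."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic check on an exact power law (assumed values, not data from the study)
synthetic = {k: 2.0 * k ** -1.5 for k in (1, 2, 4, 8)}
print(fit_power_law(synthetic))
```

Fitting in log-log coordinates weights all degrees equally regardless of how many nodes contribute to each point, which is one reason fitted values of A and c can be sensitive to the tail of P(k).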
By reducing the quantity of d-primes used in this computational study from n = 10,000 to n = 5000, the values of Δ, A, c, and 〈k〉 vary by about 1% on average as compared to the numbers presented in Tables 3 and 5. For the results shown in Table 4, larger variations are found for A and c, the parameters of the fitted straight lines; the variations related to 〈k〉 are also about 1% on average, with the exception of d = 11.

Discussion and Conclusion
The absence of a discernible pattern in the sequence of prime numbers has historically hampered the derivation of a formula for correctly generating such a sequence [1,2]. The analytical generation of d-primes, with d > 2, can be an even more challenging task, especially if d is not prime. Here, the distribution of the gaps between d-primes was examined by using a variability measure and two visibility graphs, which were deterministically built.
As shown in Table 3, Δ ≃ 1 for d ∈ {3, 5, 7, 9, 11} because repetitions in the sequences of $x_d$ are absent or rare; that is, these sequences are aperiodic. In addition, Δ ≈ 0.7 for d ∈ {2, 4, 6, 8, 10}; thus, the variability of the gap lengths for d even is lower than for d odd. Therefore, a possible conjecture is: the normalized entropy Δ distinguishes d even (Δ ≈ 0.7) from d odd (Δ ≃ 1). In other words, the parity of d determines the value of Δ.
As shown in Figures 1 and 2 and Tables 4 and 5, most degree distributions P(k) of the visibility graphs built from the differences between successive d-primes for d ∈ {2, 3, ..., 11} approximately follow a power law written as $P(k) = A k^{-c}$. Fluctuations observed around the fitted straight lines shown in both figures can be effects of the finite size of the gap sequences $x_d(n)$ used in the numerical experiments [26]. Recall that power-law distributions in the connectivity are associated with complex networks known as scale-free [26-28]. Scale invariance in the degree distribution implies self-similarity; that is, P(k) of renormalized networks, obtained by a coarse-graining procedure [29], also follows a power law. Note that the value of c in the NV and HV plots is not a good parameter to show the influence of the value of d on the networks derived from the gap sequences; however, 〈k〉 can highlight this influence: in NV plots, for d that is prime, 〈k〉 increases with d; for d that is not prime, 〈k〉 ≈ 6; in HV plots, 〈k〉 > 3.90 for d odd and 〈k〉 < 3.90 for d even. Therefore, another possible conjecture is: 〈k〉 in NV plots distinguishes d that is prime from d that is not prime; 〈k〉 in HV plots distinguishes d odd from d even.
It is well known that, for HV graphs obtained for periodic sequences of period T (without repeated values within a period), the average degree is given by $\langle k \rangle = 4\left[1 - 1/(2T)\right]$ [30]. As a consequence, for aperiodic series, 〈k〉 = 4 (because T → ∞). According to Table 5, this is the approximate value of 〈k〉 found for d-primes with d odd, which is in agreement with the value of Δ presented in Table 3. It is also well known that, for an uncorrelated random sequence, the HV graph has $P_{\mathrm{rand}}(k) = (1/3)(2/3)^{k-2}$ [30]. Therefore, deviations from this degree distribution reveal that the studied sequence was not generated by an uncorrelated random process. The straight line corresponding to $P_{\mathrm{rand}}(k)$ is shown as a dotted line in Figure 2. Note that P(k) for d-primes has a smaller slope than the slope of $P_{\mathrm{rand}}(k)$. This smaller slope and 〈k〉 = 4 can be features of chaotic sequences [30].
A final comment about the generation of d-primes by deterministic equations: when d is prime, then $p_d(n) = (p_2(n))^{d-1}$; however, the way of obtaining $p_d(n)$ when d is not prime is not yet evident. Also, a formula to generate $p_2(n)$ (the exact sequence of primes) remains to be found.
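The relation for prime d follows from the divisor-counting formula: an integer with prime factorization $q_1^{e_1} \cdots q_r^{e_r}$ has $(e_1 + 1) \cdots (e_r + 1)$ divisors, and this product can equal a prime d only if there is a single prime factor with exponent d − 1. A minimal sketch of the resulting shortcut is given below; the function names are hypothetical, and the caller is assumed to pass a prime value of d.

```python
def first_primes(how_many):
    """Return the first `how_many` primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < how_many:
        # candidate is prime if no earlier prime up to its square root divides it
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def d_primes_for_prime_d(d, how_many):
    """First d-primes when d is prime (assumed, not checked): the (d - 1)-th powers of the primes."""
    return [p ** (d - 1) for p in first_primes(how_many)]


print(d_primes_for_prime_d(3, 4))  # [4, 9, 25, 49], so the fourth 3-prime is 49
print(d_primes_for_prime_d(5, 2))  # [16, 81], giving the gap 65 quoted in the text
```

For composite d, several exponent patterns yield d divisors (for example, both $q^5$ and $q_1^2 q_2$ have six divisors), so no single-power shortcut of this kind applies.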

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.