Calculation of Precise Constants in a Probability Model of Zipf's Law Generation and Asymptotics of Sums of Multinomial Coefficients

Let $\omega_0, \omega_1, \ldots, \omega_n$ be a full set of outcomes (symbols) and let positive $p_i$, $i = 0, \ldots, n$, be their probabilities ($\sum_{i=0}^{n} p_i = 1$). Let us treat $\omega_0$ as a stop symbol; it can occur in sequences of symbols (we call them words) only once, at the very end. The probability of a word is defined as the product of probabilities of its symbols. We consider the list of all possible words sorted in the nonincreasing order of their probabilities. Let $p(r)$ be the probability of the $r$th word in this list. We prove that if at least one of the ratios $\log p_i/\log p_j$, $i, j \in \{1, \ldots, n\}$, is irrational, then the limit $\lim_{r\to\infty} p(r)/r^{-1/\gamma}$ exists and differs from zero; here $\gamma$ is the root of the equation $\sum_{i=1}^{n} p_i^{\gamma} = 1$. The limit constant can be expressed (rather easily) in terms of the entropy of the distribution $(p_1^{\gamma}, \ldots, p_n^{\gamma})$.


Introduction: The Statement of the Main Theorem
1.1. Brief Literature Overview. The wide presence of power laws in real networks, biology, economics, and linguistics can be explained in the framework of various mathematical models (see, e.g., [1,2]). According to Zipf's law [3], in a list of word forms ordered by the frequency of occurrence, the frequency of the $r$th word form obeys a power function of $r$ (the value $r$ is called the rank of the word form). One can easily explain this law with the help of the so-called monkey model. Recall that the word forms "the", "of", and "and" are used most frequently in English texts. According to Zipf's law, the word "the" is used in texts twice as often as "of" and three times as often as "and"; in other words, the occurrence frequency of a word form obeys a power function of the rank $r$ (the position number of the word form in an ordered frequency list) whose exponent is approximately −1. It should be noted that further surveys showed that Zipf's law holds only roughly, and only for the most frequent words. At present, researchers try to describe the main part of the lexicon using a power law whose exponent exceeds 1 in absolute value. Zipf explained his law on the basis of the principle of least effort. In accordance with this principle, authors aim to minimize the length of the text required to convey their thoughts, even if this introduces ambiguities. On the other hand, readers want to minimize the effort required to understand the text [4].
Another explanation of Zipf's law was suggested by Mandelbrot, who slightly modified the law by introducing a translation constant [5] into the argument of the power function. What matters for our case is that he later hypothesized the existence of a simpler explanation of Zipf's law, associated with a simple probability model in which all symbols in the text (including the white space) appear independently of each other with certain probabilities. Moreover, he analyzed the Markovian dependence between these symbols and, on the basis of special cases, wrote out a formula (correct in a typical case) that determines the parameter of the power law from the transition probability matrix of the Markov model [6].
First, we will consider the model thoroughly described by Miller [7] and Li [8] for a special case of Mandelbrot's experiment in which the monkey types the keys with uniform probability. For other important references on the monkey model, we recommend the recent article by Richard Perline and Ron Perline [9] (see also references in the next subsection).
2 International Journal of Mathematics and Mathematical Sciences

1.2. Statement of the Main Theorem and Its Connection with Other Results. Assume that a monkey types any of 26 Latin letters or the space on a keyboard with the same probability of 1/27. We understand a word as a sequence of symbols typed by the monkey before the space. Let us sort the list of possible words with respect to probabilities of their occurrence (the empty word, whose probability equals 1/27, will go first in this list, followed by 26 one-letter words whose probabilities equal $1/27^2$, then by $26^2$ possible two-letter words, and so on). We can prove (see [7,8]) that the probability $p(r)$ of a word with the rank of $r$ satisfies the inequality
$$C_1 r^{-\alpha} \le p(r) \le C_2 r^{-\alpha}, \tag{1}$$
where $\alpha = \log 27/\log 26$ and $C_1, C_2 > 0$ (here and below we use the symbol log if the base of the logarithm is not significant; but for the natural logarithm we use the symbol ln).
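The two-sided bound (1) can be checked numerically. The following is a minimal sketch, not taken from the cited works: the function name `word_probability` and the sample ranks are illustrative choices. It uses the fact stated above that words of length $k$ have probability $27^{-(k+1)}$ and that there are $26^k$ of them.

```python
import math

def word_probability(rank, letters=26):
    """p(r) in the equiprobable monkey model: `letters` letters plus the space,
    each typed with probability 1 / (letters + 1); ranks are 1-based."""
    symbols = letters + 1
    length, last_rank = 0, 1      # ranks 1..last_rank hold words of length <= `length`
    while rank > last_rank:
        length += 1
        last_rank += letters ** length
    return 1.0 / symbols ** (length + 1)

alpha = math.log(27) / math.log(26)

# p(r) * r**alpha should stay between two positive constants for all ranks.
scaled = {r: word_probability(r) * r ** alpha
          for r in (1, 2, 27, 28, 703, 704, 10**6)}
```

The scaled values oscillate between the two constants instead of converging, because $p(r)$ is constant on long blocks of ranks; this is why the equiprobable case gives only an inequality and not an exact power asymptotics.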
Relatively recently inequality (1) was generalized to the case of nonequiprobable letters. Let $p_0$ be the probability that the monkey types the space, let $p_i$, $i = 1, \ldots, n$, denote the probabilities of choosing the $i$th letter from the set of $n$ letters ($p_i > 0$, $\sum_{i=0}^{n} p_i = 1$), and let $p(r)$ be, as above, the probability of a word with a rank of $r$. Then, as is proved in [10,11], the following inequality analogous to (1) takes place; namely, $\exists C_1, C_2$: $0 < C_1 < C_2$, such that
$$C_1 r^{-1/\gamma} \le p(r) \le C_2 r^{-1/\gamma}, \tag{2}$$
where $\gamma$ is the root of the equation $\sum_{i=1}^{n} p_i^{\gamma} = 1$ (evidently, $0 < \gamma < 1$). Note that inequality (2) is equivalent to the boundedness of the difference $-\log p(r) - \gamma^{-1} \log r$.
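Since $\sum_{i=1}^{n} p_i^{\gamma}$ is strictly decreasing in $\gamma$, the root can be found by bisection. The sketch below is illustrative only; the name `solve_gamma` and the tolerance are assumptions, and it requires $n \ge 2$ letters and a positive space probability.

```python
import math

def solve_gamma(letter_probs, tol=1e-12):
    """Root gamma in (0, 1) of sum(p_i ** gamma) = 1 for the letter
    probabilities p_1..p_n (the space probability p_0 is excluded)."""
    if len(letter_probs) < 2 or not 0 < sum(letter_probs) < 1:
        raise ValueError("need n >= 2 letters and a positive space probability")
    lo, hi = 0.0, 1.0             # f(0) = n > 1 and f(1) = 1 - p_0 < 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(p ** mid for p in letter_probs) > 1:
            lo = mid              # f is decreasing, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

# Equiprobable case: 26 letters and the space, each with probability 1/27.
gamma_uniform = solve_gamma([1 / 27] * 26)
```

In the equiprobable case the root is $\log 26/\log 27$, so $1/\gamma = \log 27/\log 26$, the exponent in inequality (1).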
In the case when the probability of each letter is not fixed but depends on the previous one, words represent trajectories of a Markov chain with the absorbing state $\omega_0$ and transient states $\omega_1, \ldots, \omega_n$. Then the value $p(r)$ is the probability of the $r$th trajectory in the list of possible trajectories sorted in the nonincreasing order of probabilities. In this case, the asymptotic behavior of $p(r)$ does not necessarily have a power order. Namely, one of two alternatives takes place [12,13]. The first variant is that there exists a limit (formula (3)) involving some positive integer constant that depends on the structure of the transition probability matrix and on the states where the initial distribution of the Markov chain is concentrated. The second variant is that, independently of the initial distribution, there exists the following nonzero limit (the so-called weak power law):
$$\lim_{r\to\infty} \frac{-\log p(r)}{\log r}. \tag{4}$$
This limit equals $1/\gamma$, where $\gamma$ is now defined with the help of the substochastic matrix of transition probabilities in which the row and the column that correspond to the absorbing state $\omega_0$ are deleted. Namely, raising all elements of this matrix to the power $\gamma$ makes its spectral radius equal to 1.
These results were obtained independently in [12,14] and later refined in [13]. Namely, as it turned out, the first alternative means the subexponential order of the asymptotics; that is, in this case $\exists C_1, C_2$: $0 < C_1 < C_2$, such that the two-sided estimate (5) holds. The case of the second alternative is much more difficult. If the matrix does not have a block-diagonal structure with coinciding powers such that raising the elements of the blocks to these powers makes the spectral radius equal to 1, then one can replace the weak power law with a strong one. Namely, in this case the asymptotic behavior of $p(r)$ has the power order; that is, inequality (2) is valid (with the "matrix" $\gamma$ defined above). Therefore, inequality (2) takes place in a "typical" case of letter probabilities.
However, one more natural question still remains without an answer.
Inequality (2) means that the asymptotics has a power order but does not imply the exact power asymptotics. In a general case, as follows from the first example given in this section, the exact power asymptotics need not hold, whether letters in words are Markov-dependent or independent. However, as we prove later in this paper, in a "typical" case, for words composed of independent letters, the asymptotic behavior of the function $p(r)$ is an exact power law. The following theorem is valid. Here and below we always write the function under consideration in the numerator and the norming (analytically defined) function in the denominator of the fraction whose limit is to be calculated. In intermediate calculations it may be more convenient to do the opposite, but since this results only in the trivial raising of the limit constant to the power of −1, we sacrifice the convenience of calculations for the clarity of statements of results. Evidently, the theorem asserts that under certain assumptions there exists the nonzero limit of $p(r)/r^{-1/\gamma}$ as $r \to \infty$.

Theorem 1 (main
Let us describe the structure of the remaining part of the paper. In Section 2 we state the main theorem in terms of multinomial coefficients (of the Pascal pyramid). The proof of the theorem is reduced to the estimation of the limit behavior of the sum of these coefficients over some simplex. In Section 3 we prove an analog of this theorem with an integral in place of the sum. In this section we essentially use the Stirling formula, which allows us to reduce calculations to the evaluation of a multivariate Gaussian integral. We establish an explicit formula for the determinant of the matrix of this Gaussian integral.

Theorem 2 (the case of $\gamma = 1$). Let $p_i > 0$ be the probability of the symbol $\omega_i$, $i = 1, \ldots, n$, while $\sum_{i=1}^{n} p_i = 1$ (there is no stop symbol). Assume that at least one of the ratios $\log p_i/\log p_j$, $i, j \in \{1, \ldots, n\}$, is irrational. Let us consider all possible finite words (including the empty one) and sort them in the nonincreasing order of probabilities (we equate the probability of the empty word to 1 and calculate the probability of any other word as the product of probabilities of its letters). Let $p(r)$ be the probability of the $r$th word in the list (the word with the rank of $r$). Then the limit $\lim_{r\to\infty} p(r)/r^{-1}$ exists and equals $H^{-1}(\mathbf{p})$, where $H(\mathbf{p})$ is the entropy of the vector $\mathbf{p} = (p_1, \ldots, p_n)$; that is,
$$H(\mathbf{p}) = -\sum_{i=1}^{n} p_i \ln p_i.$$

In the statement of Theorem 2, as well as in Theorem 1, we use the bold font for the vector whose components are denoted by the same letter with the index ranging from 1 to $n$. In what follows we use the bold font for analogous notation without mentioning this fact.
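The statement of Theorem 2 can be explored numerically by listing words in the nonincreasing order of probability with a priority queue. This is only an illustrative sketch: the function name `rank_probabilities`, the letter probabilities $(1/2, 1/3, 1/6)$ (for which $\ln 2/\ln 3$ is irrational), and the number of ranks are assumptions made here, not part of the original argument.

```python
import heapq
import math

def rank_probabilities(probs, max_rank):
    """Return [p(1), ..., p(max_rank)] for words built from independent letters.
    Each word is represented only by its cost -ln(probability)."""
    costs = [-math.log(p) for p in probs]
    heap = [0.0]                          # the empty word has probability 1
    ranked = []
    while len(ranked) < max_rank:
        cost = heapq.heappop(heap)        # most probable word not yet listed
        ranked.append(math.exp(-cost))
        for a in costs:                   # every word has one child per letter
            heapq.heappush(heap, cost + a)
    return ranked

probs = (1 / 2, 1 / 3, 1 / 6)
entropy = -sum(p * math.log(p) for p in probs)
ranked = rank_probabilities(probs, 20000)
scaled_tail = 20000 * ranked[-1]          # r * p(r); Theorem 2 predicts the limit 1 / entropy
```

Each word is pushed exactly once (as a child of its unique prefix), and children are always less probable than their parent, so popping from the heap yields the ranked list. At moderate ranks `scaled_tail` still fluctuates around `1 / entropy`; the theorem concerns the limit only.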
One can easily see that Theorem 2 is not just a particular case of Theorem 1, but these theorems are equivalent. Namely, the replacement of probabilities $p_i^{\gamma}$ with new ones $p_i$ turns the general case into the particular one. Therefore, in what follows we neglect $p_0$, assuming (without loss of generality) that $\sum_{i=1}^{n} p_i = 1$. Fix some probability $p \in (0, 1]$ and denote by $r(p)$ the rank of the last word whose probability is not less than $p$ in the list of all words sorted in the nonincreasing order of their probabilities. Let us redefine the function $p(r)$ for noninteger $r$ as $p(r) = p(\lfloor r \rfloor)$ (here $\lfloor \cdot \rfloor$ is the integer part of a number). Evidently, functions $p = p(r)$ and $r = r(p)$ ($p \in (0, 1]$, $r \ge 1$) are inverse (more exactly, quasi-inverse); namely, the graph of one of these hyperbola-shaped, decreasing step functions turns into the other one when the axes $p$ and $r$ switch roles (in the first case, $r$ is the argument and $p$ is the value, and vice versa in the second case).
It can be clearly seen that $\lim_{r\to\infty} p(r)/r^{-1} = 1$ is equivalent to $\lim_{p\to 0} r(p)/p^{-1} = 1$. Therefore the equality in the assertion of Theorem 2 is equivalent to $\lim_{p\to 0} r(p)/p^{-1} = H^{-1}(\mathbf{p})$, where $H(\mathbf{p})$ is the entropy of the vector $\mathbf{p}$. Denote the logarithm of the denominator in the last fraction by $z = -\ln p$ (i.e., $p = e^{-z}$) and let $Q(z) = r(e^{-z})$. In view of the considerations in the above paragraph the equality in the assertion of Theorem 2 is equivalent to
$$\lim_{z\to\infty} \frac{Q(z)}{\exp(z)} = H^{-1}(\mathbf{p}). \tag{9}$$
Recall the proof of inequality (2) in [11]. It is reduced to the proof of the boundedness of the difference $\ln Q(z) - z$ for the introduced function $Q(z)$ with $z \ge 0$. Nonnegative values of $z$ form the domain of the function $Q(z)$ because $p \le 1 \Leftrightarrow z \ge 0$. For convenience we extend the function $Q(z)$ by putting $Q(z) = 0$ for $z < 0$.
Let $a_i = -\ln p_i$. Considering all possible variants of the last letters in words, whose quantity equals the value of the function $Q$, we obtain the functional equation $Q(z) = \sum_{i=1}^{n} Q(z - a_i) + \theta(z)$, where $\theta$ is the Heaviside step (i.e., the function that vanishes with negative values of the argument and equals 1 with nonnegative values). For $z \ge a_{\max} = \max\{a_1, \ldots, a_n\}$ we get the following recurrence relation:
$$Q_1(z) = \sum_{i=1}^{n} Q_1(z - a_i), \tag{10}$$
where $Q_1(z) = Q(z) + 1/(n - 1)$. The equality $\sum_{i=1}^{n} p_i = 1$ implies that the function $\mathrm{const} \cdot \exp z$ satisfies (10). Since the function $Q_1(z)$ takes a finite number of positive values within the interval $[0, a_{\max}]$, there exist positive $C_1$ and $C_2$ such that
$$C_1 \exp z \le Q_1(z) \le C_2 \exp z \tag{11}$$
for all $0 \le z \le a_{\max}$.
Replacing terms in the right-hand side of the recurrence relation (10) with their lower (upper) bounds, we extend the validity of inequality (11) to the domain $0 \le z \le a_{\max} + a_{\min}$, where $a_{\min} = \min\{a_1, \ldots, a_n\}$. Repeating this procedure several times, in a finite number of steps we prove that the inequality is valid for any arbitrarily large $z$. Taking logarithms of the inequality, we conclude that $\ln Q_1(z) - z$ is bounded, and then so is the difference $\ln Q(z) - z$.
Let us return to Theorem 2. As was mentioned above, Theorem 2 asserts (under certain assumptions) not only the boundedness of $\ln Q(z) - z$ but also the validity of equality (9). Let us recall the combinatorial meaning of the function $Q$; it is mentioned in [11]. Evidently, all words that contain $k_1$ letters of the 1st kind, $k_2$ letters of the 2nd kind, ..., and $k_n$ letters of the $n$th kind have one and the same probability $\prod_{i=1}^{n} p_i^{k_i}$; ranks of these words are consecutive. The quantity of such words is defined by the multinomial coefficient
$$M(k_1, \ldots, k_n) = \frac{(k_1 + \cdots + k_n)!}{k_1! \cdots k_n!}.$$
Considering the nonnegative part of the $n$-dimensional integer grid and associating the point $(k_1, \ldots, k_n)$ with the number $M(k_1, \ldots, k_n)$, we get one of the variants of the Pascal pyramid. By the definition of the function $Q$, the value $Q(z)$ equals the sum of multinomial coefficients $M(\mathbf{k})$ over all integer vectors $\mathbf{k}$ that lie inside the $n$-dimensional simplex $\Omega(z) = \{\mathbf{x} : x_i \ge 0,\ \sum_{i=1}^{n} a_i x_i \le z\}$, where $a_i = -\ln p_i$. As a result, we obtain one more equivalent statement of the main theorem, which we are going to prove.
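The combinatorial meaning of $Q$ can be illustrated by computing it in two ways: as a sum of multinomial coefficients over the integer points of the simplex, and by directly counting words. The sketch below is illustrative; the function names and the sample probabilities $(1/2, 1/3, 1/6)$ are assumptions made for this example.

```python
import math

def multinomial_sum(z, probs):
    """Sum of multinomial coefficients M(k) over integer k >= 0 with sum(a_i * k_i) <= z,
    where a_i = -ln p_i."""
    costs = [-math.log(p) for p in probs]

    def rec(i, budget, counts):
        if i == len(costs):
            coeff = math.factorial(sum(counts))
            for k in counts:
                coeff //= math.factorial(k)   # each division is exact
            return coeff
        total, k = 0, 0
        while k * costs[i] <= budget:
            total += rec(i + 1, budget - k * costs[i], counts + [k])
            k += 1
        return total

    return rec(0, z, []) if z >= 0 else 0

def count_words(z, probs):
    """Directly count the words whose probability is at least exp(-z)."""
    costs = [-math.log(p) for p in probs]

    def rec(cost):
        # the word itself plus all of its admissible extensions
        return 1 + sum(rec(cost + a) for a in costs if cost + a <= z)

    return rec(0.0) if z >= 0 else 0

probs = (1 / 2, 1 / 3, 1 / 6)
ratio = multinomial_sum(12.0, probs) * math.exp(-12.0)   # compare with 1 / H(p) in (9)
```

Both functions return the same integer; the first form is the one whose asymptotics is studied in the rest of the paper.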

The Proof of an Analog of Theorem 3 with Integration instead of Summation
Proof. Let us first recall some evident properties of the integrand $M(\mathbf{x}) = \Gamma(x_1 + \cdots + x_n + 1)/\prod_{i=1}^{n} \Gamma(x_i + 1)$. Note that the existence of the (Riemann) integral of $M(\mathbf{x})$ over the compact set $\Omega(z) = \{\mathbf{x} : x_i \ge 0,\ \langle \mathbf{a}, \mathbf{x} \rangle \le z\}$ evidently follows from the continuity of $M(\mathbf{x})$ in the domain under consideration. If all components of the vector $(x_1, \ldots, x_n)$, possibly except one component $x_i$, equal zero, then by definition we have $M(x_1, \ldots, x_n) \equiv 1$. Let us prove that otherwise the function $M(x_1, \ldots, x_n)$ is strictly increasing in $x_i$. Since the gamma function is positive, it suffices to prove that in this case the partial derivative of $\ln M(x_1, \ldots, x_n)$ with respect to $x_i$ is positive. It equals $(\ln \Gamma)'(x_1 + \cdots + x_n + 1) - (\ln \Gamma)'(x_i + 1)$. The positiveness of this difference follows from the fact that the function $(\ln \Gamma)'$ is increasing; this property, in turn, follows from the logarithmic convexity of the gamma function (it is well known [15] that $(\ln \Gamma)''(x) = \sum_{k=0}^{\infty} (x + k)^{-2} > 0$). The proved assertion implies that the function $M(\mathbf{x})$ attains its maximum in the domain $\Omega(z)$ at the boundary $\langle \mathbf{a}, \mathbf{x} \rangle = z$, where $\langle \mathbf{a}, \mathbf{x} \rangle = \sum_{i=1}^{n} a_i x_i$. Let us calculate the exact asymptotics of the maximal value of the function $M(\mathbf{x})$ in the domain $\Omega(z)$ with $z \to \infty$. For the vector $\mathbf{x}$ we denote by $t$ the sum of its components and parameterize $\mathbf{x}$ by the value $t$ and ratios $q_i = x_i/t$, so that $\mathbf{x} = t\mathbf{q}$ and $\sum_{i=1}^{n} q_i = 1$. Let us use one simplest corollary of the Stirling formula [15], namely, the fact that with a nonnegative argument the value of the difference $\ln \Gamma(t + 1) - (t \ln(t) - t + \ln(t + 1)/2)$ is bounded. We obtain that, with any $t > 0$,
$$\ln M(x_1, \ldots, x_n) = t H(\mathbf{q}) + O(\ln(t + 1)), \tag{18}$$
where $H(\mathbf{q}) = -\sum_{i=1}^{n} q_i \ln q_i$ (this relation is closely connected with the so-called entropy inequality for multinomial coefficients).
We seek the maximum of this function with $z \to \infty$ under one additional condition (namely, the requirement that the maximum is attained at the boundary) $\langle \mathbf{a}, \mathbf{x} \rangle = z$, where $a_i = -\ln p_i$, $0 < p_i < 1$, and $\sum_{i=1}^{n} p_i = 1$. Since $a_i > 0$, we get $O(\ln(t + 1)) = O(\ln z)$. Moreover, the condition $\langle \mathbf{a}, \mathbf{x} \rangle = z$ with the mentioned $a_i$ gives the relation
$$t = \frac{z}{H(\mathbf{q}; \mathbf{p})}, \tag{19}$$
where $H(\mathbf{q}; \mathbf{p}) = \sum_{i=1}^{n} q_i a_i = -\sum_{i=1}^{n} q_i \ln p_i$. Substituting this expression in (18), we conclude that the maximum of $\ln M$ (accurate to $O(\ln z)$) is attained at a vector $\mathbf{q}$ such that the fraction $H(\mathbf{q})/H(\mathbf{q}; \mathbf{p})$ takes on the maximal value. Recall that the difference $H(\mathbf{q}; \mathbf{p}) - H(\mathbf{q})$ takes on only nonnegative values and is called the Kullback-Leibler distance (divergence) $D(\mathbf{q} \mid \mathbf{p})$ between distributions $\mathbf{q}$ and $\mathbf{p}$ (see [16]). The minimum of this difference is attained at only one value $\mathbf{q} = \mathbf{p}$; evidently, an analogous assertion is also true for $H(\mathbf{q}; \mathbf{p})/H(\mathbf{q})$: if $\mathbf{q} \ne \mathbf{p}$, then
$$\frac{H(\mathbf{q})}{H(\mathbf{q}; \mathbf{p})} < 1. \tag{20}$$
Consequently, the maximum of the function $\ln M(\mathbf{x})$ in the domain $\Omega(z)$ is attained (accurate to $O(\ln z)$) at the intersection of the hyperplane $\langle \mathbf{a}, \mathbf{x} \rangle = z$ with the straight line $x_i = t p_i$, $i = 1, \ldots, n$, where it equals $z + O(\ln z)$.
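The inequality between the entropy and the cross entropy used above is easy to probe numerically. The following sketch is illustrative only; the function names and the sample distribution $\mathbf{p} = (0.5, 0.3, 0.2)$ are assumptions made for this example.

```python
import math

def entropy(q):
    """H(q) = -sum q_i ln q_i, with the convention 0 * ln 0 = 0."""
    return -sum(x * math.log(x) for x in q if x > 0)

def cross_entropy(q, p):
    """H(q; p) = -sum q_i ln p_i."""
    return -sum(x * math.log(y) for x, y in zip(q, p) if x > 0)

def exponent_rate(q, p):
    """H(q) / H(q; p): the leading coefficient of ln M per unit of z
    along the direction q on the hyperplane <a, x> = z."""
    return entropy(q) / cross_entropy(q, p)

p = (0.5, 0.3, 0.2)
grid = [(i / 20, j / 20, 1 - i / 20 - j / 20)
        for i in range(1, 19) for j in range(1, 20 - i)]
best = max(grid, key=lambda q: exponent_rate(q, p))   # lands at (or next to) q = p
```

On this grid the rate never exceeds 1 and is largest at the grid point that coincides with $\mathbf{p}$, in agreement with the nonnegativity of the Kullback-Leibler divergence.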
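Let us now proceed directly to the proof of Theorem 4. Note first that by using the L'Hopital rule we can reduce the proof to that of the formula obtained by differentiating with respect to $z$ the numerator and the denominator of the ratio of the integral of $M(\mathbf{x})$ over the domain $\{\mathbf{x} \ge 0,\ \langle \mathbf{a}, \mathbf{x} \rangle \le z\}$ to $\exp(z)$, that is, to the proof of the equality
$$\lim_{z\to\infty} \frac{f(z)}{\exp(z)} = H^{-1}(\mathbf{p}), \tag{21}$$
where $f(z) = \int_{\mathbf{x} \ge 0} M(\mathbf{x})\, \delta(z - \langle \mathbf{a}, \mathbf{x} \rangle)\, d\mathbf{x}$ and $\delta(\cdot)$ is the delta function.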
Let  be a real arbitrarily small positive value.Denote by Λ  the sector consisting of points x,   =   , and ∑    = 1, such that With fixed  on the hyperplane ⟨a, x⟩ =  correlations ( 18) and ( 19) take the form ln  (x) =  (q)  (q; p) +  (ln ()) .
Let us now strengthen inequality (20); namely, let us prove that if for q correlations (22) are violated, then where  1 (p) is a positive constant independent of q.Since (q; p) is a convex combination of − ln   , it evidently is bounded: Consequently, formula (24) is equivalent to the inequality The latter correlation follows from the well-known property of the Kullback-Leibler divergence (see, e.g., lemma 3.6.10 in [16]).
The proved inequality (24) (in view of formula (23)) implies that outside the domain Λ  the function (x) is exponentially small in comparison to the maximal value inside the domain which equals exp().More precisely, with  ∉ Λ  and ⟨a, x⟩ = , we get  (x) < exp {(1 −  2 ) } for some  > 0. (28) Note that the condition of the exponential smallness in comparison to exp  remains valid, even if  depends on  and tends to 0 as  increases, though not too fast.In what follows we assume that where  > 0 is sufficiently small. ( One can easily see that the same exponential upper bound as in (28) also takes place not only for  function but also for its integral over the domain whose volume grows according to a power law: with  → ∞.Therefore in limit (21) we can treat f() as the integral Let us define the asymptotics (18) of the function (x) in the domain Λ () more precisely.Let us use the standard Stirling formula, namely, the fact that with  → ∞ it holds that ln Γ(+1) =  ln()−+ln()/2+ln(2)/2+(), where 0 < () < 1/(12).We obtain that, in the domain Λ () , Here, as usual,  = ∑  =1   ;   =   /.Therefore, we conclude that when considering the asymptotics of function (31) we can treat (x) as follows: In the latter formula we can write the exponent as Let us write the Taylor expansion up to second-order terms near the maximum point in the plane ⟨a, x⟩ = , that is, near the point x  = p/(p) (in what follows we denote by    coordinates of the point x  and do by   the sum of these coordinates which evidently equals (p) −1 ).First of all, note that One can easily calculate second derivatives of expression (34): (note that we do not use first derivatives in the Taylor expansion near the maximum point).
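If $\mathbf{x} \in \Lambda_\varepsilon$, then by formula (19) the sum $t$ of the coordinates of $\mathbf{x}$ and the sum $t^{0} = z/H(\mathbf{p})$ of the coordinates $x_i^{0}$ of the maximum point satisfy $t - t^{0} = z\,(H(\mathbf{q}; \mathbf{p})^{-1} - H(\mathbf{p})^{-1}) = z\,O(\varepsilon)$ (in the latter equality we use the continuity of the function $H(\mathbf{q}; \mathbf{p})^{-1}$). Consequently, $|x_i - x_i^{0}| = z\,O(\varepsilon)$. In particular, with the chosen $\varepsilon = \varepsilon(z)$ we have $|x_i - x_i^{0}| = O(z^{1/2+\delta})$. We obtain that, in the domain $\Lambda_{\varepsilon(z)}$, the exponent coincides with its second-order Taylor polynomial at the maximum point up to a term $O(z^{-1/2+3\delta})$. Here the term $O(z^{-1/2+3\delta})$ contains both the remainder of terms of the series whose order exceeds 2 and the value of $O(z^{-1+2\delta})$ added by some omitted second-order terms. With $z \to \infty$ we can neglect the term $O(z^{-1/2+3\delta})$. Therefore, in integral (31) in place of $M(\mathbf{x})$ we should substitute a modified function that differs from the approximation of $M(\mathbf{x})$ obtained above only in that its exponent does not contain the term $O(z^{-1/2+3\delta})$.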
Let us change variables in the integral as follows:   = (  −    )/ √   .Since the degree of homogeneity of the deltafunction equals −1, we obtain that limit (21) coincides with where B is  ×  matrix, whose all elements equal −1, except diagonal components which are greater by 1/  .

Calculation of the Determinant
Lemma 5. Let $n \ge 2$. Consider an $n \times n$ matrix $C$, where all nondiagonal elements equal 1, while $c_{ii} = 1 + c_i$. Then (1) the determinant of this matrix equals $\prod_{\ell=1}^{n} c_\ell + \sum_{i=1}^{n} \prod_{\ell \ne i} c_\ell$. With $n = 2$ in the formula in item (2) we get the product over the empty set; it is accepted that this product equals 1. The formula in item (1) remains valid with $n = 1$. In the induction step we assume that the formula in item (1) is proved for all dimensions less than $n$ and has to be proved for the case when the dimension equals $n$, while the formula in item (2) is proved for all dimensions not greater than $n$ and has to be proved for an $(n + 1) \times (n + 1)$ matrix.
For proving item (1) we can use the expansion by the last row.Multiplying the algebraic complement by the diagonal element   + 1, we get the sum The expansion by the entire last row, taking into account the induction hypothesis for item (2), make the third part in row (42) vanish.First two terms in formula (42) together give the desired sum.
In order to prove item (2), let us expand the determinant considered in this item (algebraic complement of the element with (, ) indices of the matrix  with ( + 1) × ( + 1) dimension) by the row whose number in the initial matrix of  was equal to .Generally speaking, for clarity, we use the same indices as in the numeration of the initial matrix.Since the algebraic complement considered in this item and the occurring algebraic complement for the element with indices (, ) (obtained by the expansion by a row of the determinant under consideration) have opposite signs, the value added by the element with indices (, ) equals (here we have used the induction hypothesis for item (1)).The difference from the desired formula consists in the last term which equals (taking into account the first multiplier) Proof of Lemma 7. By the differentiation rule for determinants, the derivative of the determinant of  ×  matrix equals the sum of determinants of  matrices such that in the th one all elements of the th row are replaced with their derivatives.We obtain that  2 det( 2 − 1 )/ 2 is the sum of determinants of matrices each one of which contains either the zero row or two various rows of the matrix  2 .Since rank  2 = 1, we get  2 det( 2 −  1 )/ 2 = 0. 
Thus, det( 2 −  1 ) is a linear function of , whose free term evidently equals det(− 1 ).It is clear that for calculating the coefficient det( 2 −  1 ) at  it suffices to summate products of each element of the matrix  2 by the algebraic complement of the corresponding element of the matrix − 1 .If an element has indices (, ),  ̸ = , then by item (2) of Lemma 5 this product equals         /∏  ℓ=1  ℓ .Let us explain the positive sign in the last formula.We calculate an algebraic complement of the − 1 matrix element.The matrix has  ×  dimension, and therefore the found algebraic complement differs from the algebraic complement of the corresponding  1 matrix element for (−1) −1 times.According to item (2) of Lemma 5, the algebraic complement of the corresponding  1 matrix element is a "minus" product of  − 2 multipliers   .In the given case each of   factors is negative (equals −1/  ) which results in positive sign of the last formula in the above paragraph.
Assume that this formula is valid for all (, ).Then we get the sum However by item (1) of Lemma 5 the algebraic complement of the diagonal element   of the matrix − 1 equals (here and below we omit the evident requirement that values of all indices belong to the set []).
Multiplying the first term in parentheses, that is, ∏ : ̸ = (−1/  ), by (−1) −1  2  and summing over all , we get ∑  =1  2    /∏  ℓ=1  ℓ .Let us multiply the resting term in parentheses (48) by (−1) −1  2  , sum over all , and subtract the value from the obtained result (note that the subtrahend was "illegally" included in formula (47)).It gives the overall contribution of the second term in formula (48), which equals Taking into account all the calculation elements of the determinant det( 2 −  1 ) allows completing the proof of Lemma 7.
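The determinant and cofactor formulas of Lemma 5 can be checked with exact rational arithmetic. The sketch below is illustrative and makes some assumptions: the determinant in item (1) is written in the standard form $\prod_\ell c_\ell + \sum_i \prod_{\ell \ne i} c_\ell$ (the matrix determinant lemma for a diagonal matrix plus the all-ones matrix), the helper names are invented here, and the sample values of $c_i$ are arbitrary.

```python
from fractions import Fraction
from math import prod

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """Algebraic complement of the element with indices (i, j), 0-based."""
    return (-1) ** (i + j) * det(minor(m, i, j))

def lemma5_matrix(c):
    """All off-diagonal elements equal 1; diagonal elements equal 1 + c_i."""
    n = len(c)
    return [[Fraction(1) + (c[i] if i == j else 0) for j in range(n)] for i in range(n)]

c = [Fraction(2), Fraction(-3), Fraction(5, 7), Fraction(1, 3)]
m = lemma5_matrix(c)
n = len(c)
expected_det = prod(c) + sum(prod(c[l] for l in range(n) if l != i) for i in range(n))

# Degenerate case: c_i = -1 / p_i with probabilities p_i summing to 1.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
degenerate = lemma5_matrix([-1 / x for x in p])
```

With $c_i = -1/p_i$ and $\sum_i p_i = 1$ the determinant vanishes, which is the degeneracy stated in Corollary 6 for the matrix with off-diagonal elements $-1$ and diagonal elements $-1 + 1/p_i$ (up to the overall sign of the matrix).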
For completing the proof of Theorem 4 let us use Corollary 8. Let us replace the -function in integral (39) (as was proved earlier, this integral equals the limit considered in Theorem 4): () = lim →0 (1/ √ 2)exp{− 2 /2 2 }.Treating the limit multiplied by the coefficient at the exponent as a multiplier in the integral, we come to the limit of the Gaussian integral Immediately applying Corollary 8, we get desired  −1 (p).This completes the proof.

The Ratio between the Sum and the Integral
What remains is to prove that, under assumptions of Theorem 3, the ratio of the integral of the function $M$ calculated over the domain $\Omega(z) = \{\mathbf{x} \ge 0,\ \langle \mathbf{a}, \mathbf{x} \rangle \le z\}$ to the sum of values of this function at integer points of this domain tends to 1 as $z \to \infty$.
For comparing the integral of a function with the sum of its values in the same domain one usually applies the Koksma-Hlawka inequality (see [17]). Note that usually one considers the integral over a fixed domain (as a rule, the cube $[0, 1]^n$), whereas the domain in the case under consideration is varying. However, we intend only to prove the convergence of the ratio to 1 and do not need to estimate the asymptotic difference between the integral and the sum, which simplifies the task. Evidently, it suffices to calculate the limit of the ratio for an arbitrary infinite increasing sequence $z_1, z_2, \ldots$ such that $z_k \to \infty$.

Theorem 9.
Let Ω 1 , Ω 2 , . . .be a sequence of Jordan measurable sets such that Ω  ⊂ Ω +1 for all  = 1, 2, . ... Assume that (),  ∈ Ω, where Ω = ⋃  Ω  , is an integrable and bounded on each of the domains Ω  function such that () ≥ 0 and ∫ ()Ω  → ∞ as  → ∞.Assume also that  is a countable set of points from Ω such that each of the sets   =  ∩ Ω  is finite.Then if for any sufficiently small  > 0 there exists a partition of Ω onto a countable number of Jordan measurable Representing this correlation as a double inequality and summing it over all  from  ℓ + 1 to   , we obtain with  > ℓ.
Note that by condition the numerator in the latter fraction (different from the integral ∫ ()Ω  by a constant value) tends to infinity.Then the same is true for the denominator.Note that the denominator differs from ∑ ∈  () by a constant value.Therefore we conclude that all limit points of the sequence ∫ ()Ω  / ∑ ∈  () lie inside the interval ((1 − ) 2 , (1 + ) 2 ).Due to the arbitrariness of the choice of positive  Theorem 9 is proved.
Corollary 10 (completion of the proof of Theorem 1).Let () be the function mentioned in assumptions of Theorem 4 and let Q() obey formula (13).Then if at least one of the ratios   /  , ,  ∈ {1, . . ., },  ̸ = , is irrational, then Proof.For clarity we denote by  the parameter that defines the boundary of the considered domain, and do by  the corresponding parameter of the hyperplane that contains a certain interior point x of this domain; that is, (x) = ⟨a, x⟩.First of all, note that considerations in Section 3.1 imply that both in the sum and in the integral we can replace () with the domain and replace the function (x) with M(x) defined by formula (34).Therefore, we need to prove that (or that the difference of logarithms of the numerator and denominator tends to zero).
In view of Theorem 4 the logarithm of the numerator in the latter fraction is a uniformly continuous function of , while the logarithm of the denominator evidently is a nondecreasing function.Therefore for proving the existence of the limit with  → ∞ it suffices to prove the existence of the limit for a sequence in the form   = ,  = 1, 2, . .., where  is an arbitrarily small positive value (as the difference between the numerator and denominator of the logarithms in an arbitrary point slightly differs from the value of difference in the nearest points   in this sequence).Namely, just for this fixed sequence we consider the ratio from the right-hand side of (62).
In order to apply Theorem 9, for an arbitrary sufficiently small positive  we construct a partition of Λ () onto domains   satisfying assumptions of the theorem.Namely, we construct this partition by dividing of an infinite quantity of "flapjacks" located between neighboring hyperplanes in the forms (x) =   and (x) =  +1 ,  = 1, 2, . .., where  +1 =   + Const, onto a finite number of domains   .
To this end, it suffices to put   = Const , where Const = /⌈2/⌉ (here ⌈⋅⌉ is an upward rounding to the nearest integer).
Let   = {x :   ≤ (x) <  +1 }.Denote by   the th "flapjack"   ∩ Λ () .We are going to "cut"   onto a finite number of domains   .We numerate the countable number of domains   ,  = 1, 2, . .., so as to make domains   obtained by "cutting"   with the least  have lesser numbers, while the order of numbering inside the partition of   plays no role.
By formula (33), where We get grad  = ln q = (ln As a result, we obtain that with sufficiently small , starting with some , it holds that Therefore, dividing   onto domains   so as to fulfill correlation (67) for all points x, y that belong to one domain, we guarantee the validity of assumption (53) in Theorem 9. Note that it suffices to fulfill condition (67) for all indices  except one, because the validity of this condition for the remaining index follows from the fact that x, y ∈   .
Finally, let us use the irrationality of   * /  * for some  * ̸ =  * .Let us denote by   * the set {1, . . ., } \ { * } and do by   *  * the set {1, . . ., }\{ * ,  * }.We are going to prove that, defining domains   by inequalities we fulfill condition (54) (with  = Z  ).Here, as usual,  is a sufficiently small real positive value, though in this case we can choose  as any number in the interval (0, 1/2) (roughly speaking, it is sufficient that the radius of the pieces   used to divide "flapjacks"   tends to infinity at  → ∞).
Evidently, we can divide "almost all"   onto domains   so as to simultaneously fulfill inequalities (67) and conditions (71) on  and  (the remaining "cuttings" on the edges of the domain   which occur due to the inconsistency between the inequality   ≤   <   ,  ∈   * and the definition of the boundary of the domain Λ () are asymptotically small).
Evidently,   = ∏ ∈  * (  −   ) × ( +1 −   )/  * .Since the difference (  −   ) grows as  → ∞, the asymptotics of the number of ways for choosing integer   such that   ≤   <   for  ∈   *  * coincide with ∏ ∈  *  * (  −   ).Here and below we understand the asymptotics as a function of  such that the ratio of the considered quantity to this function tends to 1 as  → ∞.In order to complete the proof of Corollary 10, what remains is to prove the following lemma.
If the difference   −  (it equals ( +1 −  )/  * ) is less than 1 (this inequality obviously holds for sufficiently small ) then with fixed   * the integer value   * satisfying condition (72) is defined uniquely, provided that it exists.Therefore, we need to estimate the quantity of values   * in the interval [  * ,   * ) such that {  * } ∈ [{  }, {  }); here the latter correlation is understood in the sense of an interval on the unit circle, and the length of the considered interval is independent of .
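The counting problem above, namely how many consecutive terms of a sequence have fractional parts in a fixed arc of the unit circle, can be illustrated for multiples of an irrational number. The sketch below is only an illustration: the function name `fraction_in_arc`, the choice $\theta = \sqrt{2}$, and the arc $[0.2, 0.5)$ are assumptions made here and do not come from the proof.

```python
import math

def fraction_in_arc(theta, start, count, a, b):
    """Share of k in [start, start + count) whose fractional part {k * theta}
    falls into the arc [a, b) of the unit circle (0 <= a < b <= 1)."""
    hits = sum(1 for k in range(start, start + count) if a <= (k * theta) % 1.0 < b)
    return hits / count

theta = math.sqrt(2)          # any irrational rotation number would do
shares = [fraction_in_arc(theta, start, 10000, 0.2, 0.5)
          for start in (0, 12345, 10**6)]
```

For an irrational $\theta$ every share approaches the arc length $b - a$ regardless of where the block of indices starts; this uniformity in the starting index is the property that distinguishes well-distributed sequences.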
Recall the definition of a well-distributed sequence [17, section 1.5].
The sequence (  )  = 1, 2, . .., is said to be well-distributed mod 1 if for all pairs ,  of real numbers with 0 ≤  <  ≤ 1 we have lim

(2) the algebraic complement of the element with indices $(i, j)$, $i \ne j$, equals $-\prod_{\ell \in [n] \setminus \{i, j\}} c_\ell$, where $[n] = \{1, \ldots, n\}$.

Corollary 6. The matrix B in formula (39) is degenerate.

Proof of Lemma 5. Note that the first item of Lemma 5 defines the value of the algebraic complement of the diagonal element of such a matrix. Let us prove the lemma by induction.