$n$-digit Benford converges to Benford

Using the sum invariance property of Benford random variables, we prove that an $n$-digit Benford variable converges to a Benford variable as $n$ approaches infinity.

Let $\mathcal{A}$ be the smallest sigma algebra generated by the significant digit functions $D_i$. Then $D_i^{-1}(d) \in \mathcal{A}$ for all $i$ and $d$. Within this framework, a random variable $Y$ is Benford [1,2,3] if for all $m \in \mathbb{N}$, $d_1 \in \{1, \dots, 9\}$ and $d_i \in \{0, 1, \dots, 9\}$ for $i > 1$, the probability that the first $m$ digits of $Y$ are $d_1 d_2 \cdots d_m$ is given by
$$P(D_1 = d_1, \dots, D_m = d_m) = \log\left(1 + \left(\sum_{i=1}^{m} 10^{m-i} d_i\right)^{-1}\right),$$
where $\log$ denotes the decimal logarithm.
While Benford variables have logarithmic distributions in all of their digits, the Benford literature has often focused only on the distribution of the first digit. Such a limitation may obscure the true nature of the quantity investigated. There are data sets which exhibit a perfect "Benford" distribution in the first digit, but fail to do so in the second. Nigrini [7] provided such an example, and consequently recommended the first-two-digits test in order to improve the recognition of Benford data sets, and thus to identify financial fraud. He also recommended this approach for other accounting-related analyses.
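The first-$m$-digits law can be evaluated directly: for a Benford variable, the probability of the leading block $d_1 \cdots d_m$ is $\log(1 + 1/x)$, where $x$ is the integer with decimal digits $d_1 \cdots d_m$. The following minimal sketch computes it and checks that the probabilities of all $m$-digit blocks sum to 1. The function names are illustrative and not taken from the cited literature.

```python
import math
from itertools import product


def benford_block_probability(digits):
    """P(D_1 = d_1, ..., D_m = d_m) = log10(1 + 1/x) for a Benford variable,
    where x is the integer whose decimal digits are d_1 ... d_m."""
    if not digits or not 1 <= digits[0] <= 9:
        raise ValueError("the first digit must lie in 1..9")
    if any(not 0 <= d <= 9 for d in digits[1:]):
        raise ValueError("later digits must lie in 0..9")
    x = 0
    for d in digits:
        x = 10 * x + d
    return math.log10(1 + 1 / x)


def total_probability(m):
    """Sum of the block probabilities over all admissible m-digit blocks."""
    return sum(
        benford_block_probability((d1,) + rest)
        for d1 in range(1, 10)
        for rest in product(range(10), repeat=m - 1)
    )
```

For instance, `benford_block_probability((1,))` returns $\log 2 \approx 0.301$, the familiar first-digit probability, and `total_probability(m)` equals 1 up to rounding because the sum telescopes.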
Such cases were generalized in [4], where a new class of random variables, called $n$-digit Benford variables, was introduced. These variables exhibit a logarithmic digit distribution only in their first $n$ digits, but are not guaranteed to be logarithmically distributed beyond the $n$-th digit. Unlike Benford variables, whose decimal logarithm is uniformly distributed mod 1, the decimal logarithm of an $n$-digit Benford random variable is subject to less stringent constraints; its density need only enclose prescribed areas over a given partition of the unit interval. This provides us with a collection of random variables that contains the Benford variables as a subset.
It is intuitive to assume that when $n$ goes to infinity, an $n$-digit Benford variable converges to Benford. The purpose of this paper is to prove that this is indeed the case.
This paper is structured as follows: in the next section we introduce the $n$-digit Benford variables together with some of their properties. In Section 3 we briefly discuss sum invariance, which is fundamental for our main result. Finally, using sum invariance, in Section 4 we show that an $n$-digit Benford variable converges to Benford as $n \to \infty$.

n-digit Benford
An $n$-digit Benford random variable behaves as a Benford variable only in its first $n$ digits, and may not have a logarithmic digit distribution beyond the $n$-th digit [4].
Note that a Benford variable is an $n$-digit Benford variable for any $n$.
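To illustrate the distinction, consider one simple construction, chosen here for illustration and not taken from [4]: draw the first digit $d$ with the Benford probability $\log(1 + 1/d)$, then draw the significand uniformly on $[d, d+1)$. The result is 1-digit Benford, yet its second digit is uniform on $\{0, \dots, 9\}$ and not logarithmic. The sketch below compares the two second-digit laws exactly; all names are hypothetical.

```python
import math
import random


def benford_first_digit(d):
    """First-digit probability log10(1 + 1/d) of a Benford variable."""
    return math.log10(1 + 1 / d)


def benford_second_digit(j):
    """Second-digit probability of a Benford variable: sum over first digits d."""
    return sum(math.log10(1 + 1 / (10 * d + j)) for d in range(1, 10))


def one_digit_benford_second_digit(j):
    """Second-digit probability for the illustrative construction: given the
    first digit d, the significand is uniform on [d, d + 1), so each second
    digit j has conditional probability 1/10 regardless of d."""
    return sum(benford_first_digit(d) * 0.1 for d in range(1, 10))


def sample_one_digit_benford(rng):
    """Draw one significand from the illustrative 1-digit Benford construction."""
    weights = [benford_first_digit(d) for d in range(1, 10)]
    d = rng.choices(range(1, 10), weights=weights)[0]
    return d + rng.random()
```

Under these assumptions the second digit 0 has probability 0.1 for the constructed variable, whereas `benford_second_digit(0)` is about 0.12, so a first-digit test alone cannot tell the two apart.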

Sum invariance
To define sum invariance, we first define the significand function, also known as the mantissa function. For $x > 0$,
$$S(x) = 10^{\log x - \lfloor \log x \rfloor},$$
so that $S(x) \in [1, 10)$. Let us consider a finite collection of positive real numbers $K$, and define $S_{d_1 \cdots d_n}$ to be the sum of the significands of the numbers in $K$ starting with the sequence of digits $d_1 \cdots d_n$. Sum invariance means that $S_{d_1 \cdots d_n}$ is digit independent. For instance, consider the Fibonacci sequence, which is known to be Benford [5]. The sums obtained for the first 50000 Fibonacci numbers are reported in Table 1.
Table 1. Sum invariance illustration for the first 50000 Fibonacci numbers, where $S_1$ denotes the sum of the significands of all numbers starting with 1, etc.
Nigrini was the first to notice sum invariance in some large collections of data [6]. Allaart [8] refined this concept by defining it in connection with continuous random variables. Specifically, a distribution is sum invariant if the expected value of the significands of all entries starting with a fixed $n$-tuple of leading significant digits is the same as for any other $n$-tuple:
$$E[S_{d_1 \cdots d_n} Y] = E[S_{d'_1 \cdots d'_n} Y].$$
Allaart showed that a random variable is sum invariant if and only if it is Benford. Berger [3] proved that for sum invariant random variables
$$E[S_{d_1 \cdots d_n} Y] = \frac{10^{1-n}}{\ln 10}. \qquad (4)$$
For example, for a Benford sequence with 50000 elements, formula (4) yields $S_1 = \cdots = S_9 = 50000/\ln 10 \approx 21714.7$, rounded to the tenths, which is very close to the actual values for the Fibonacci numbers illustrated in Table 1. Naturally, the more numbers are taken from the sequence, the closer one gets to the theoretical sum.
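The Fibonacci illustration can be reproduced with a short computation. The sketch below sums the significands of the first `count` Fibonacci numbers by leading digit. It assumes the sequence starts 1, 1, 2, ..., and it obtains the significand of a large integer from its decimal logarithm in floating point, so the sums are approximate. The function names are illustrative.

```python
import math


def significand(x):
    """Decimal significand S(x) in [1, 10) of a positive integer x."""
    if x < 10**15:
        # Small values: exact power of ten taken from the digit count.
        return x / 10 ** (len(str(x)) - 1)
    # Large values: math.log10 accepts arbitrarily large Python ints.
    lg = math.log10(x)
    return 10.0 ** (lg - math.floor(lg))


def fibonacci_significand_sums(count):
    """Sum of significands of the first `count` Fibonacci numbers,
    grouped by leading digit 1..9."""
    sums = {d: 0.0 for d in range(1, 10)}
    a, b = 1, 1
    for _ in range(count):
        s = significand(a)
        digit = min(int(s), 9)  # guard against rounding up to 10.0
        sums[digit] += s
        a, b = b, a + b
    return sums
```

Calling `fibonacci_significand_sums(50000)` and comparing each of the nine sums with `50000 / math.log(10)` gives a direct check of sum invariance for the first digit.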

Main result
A random variable is sum invariant if and only if it is Benford [8,3]. Using this result, we will prove that an n-digit Benford variable converges to Benford as n approaches infinity by calculating the bounds for the expected value of its significand.
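Lemma 4.1. Let $Y$ and $X = \log Y$ be two random variables with the probability density functions $f$ and $g$, respectively. Then
$$E[S_{d_1 \cdots d_n} Y] = \int_{\log x_n + 1 - n}^{\log(x_n + 1) + 1 - n} 10^{s}\, g^{\dagger}(s)\, ds,$$
where $x_n = 10^{n-1} d_1 + \cdots + d_n$ and $g^{\dagger}(s) = \sum_{k \in \mathbb{Z}} g(s + k)$ is the density of $X$ mod 1.
Proof. Using $f(y) = g(\log y)/(y \ln 10)$, we get
$$E[S_{d_1 \cdots d_n} Y] = \sum_{k \in \mathbb{Z}} \int_{10^{k+1-n} x_n}^{10^{k+1-n}(x_n + 1)} y\, 10^{-k}\, \frac{g(\log y)}{y \ln 10}\, dy = \sum_{k \in \mathbb{Z}} \int_{\log x_n + 1 - n}^{\log(x_n + 1) + 1 - n} 10^{s}\, g(s + k)\, ds,$$
where the second equality follows from the substitution $s = \log y - k$. Interchanging the sum and the integral gives the claim.
It is known that a necessary and sufficient condition for a random variable to be Benford is that $g^{\dagger} = 1$ [9,3]. Consequently, equation (4) follows immediately from Lemma 4.1.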
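As a worked instance, assume $Y$ is Benford, so that the density of $\log Y$ mod 1 is identically 1, and take $n = 2$ with leading digits $d_1 d_2 = 25$ (an arbitrary choice for illustration). Numbers with these leading digits have significands in $[2.5, 2.6)$, so integrating the significand $10^{s}$ against the uniform density gives
$$E[S_{25}\, Y] = \int_{\log 2.5}^{\log 2.6} 10^{s}\, ds = \frac{2.6 - 2.5}{\ln 10} = \frac{10^{-1}}{\ln 10} \approx 0.0434,$$
which agrees with the value $10^{1-n}/\ln 10$ for $n = 2$ and does not depend on the chosen digits.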
There are arbitrarily many ways in which we can build an $n$-digit Benford variable. Let $B_n$ be the infinite collection of all $n$-digit Benford variables. We use $E[S_{d_1 \cdots d_n} B_n]$ to denote the collection of the expected values of the significands of the elements of $B_n$. The next theorem leads to the main result of our paper. It provides bounds for the expected value $E[S_{d_1 \cdots d_n} Y]$ for $Y \in B_n$, expressed in terms of $x_n = 10^{n-1} d_1 + \cdots + d_n$.
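Bounds of this type can be sketched as follows, using only the defining property of an $n$-digit Benford variable. This is one plausible derivation under that assumption and is not necessarily the exact form of the theorem. If $Y \in B_n$, the leading block $d_1 \cdots d_n$ occurs with probability $\log(1 + 1/x_n)$, and on that event the significand lies in $[10^{1-n} x_n,\, 10^{1-n}(x_n + 1))$. Hence
$$10^{1-n}\, x_n \log\left(1 + \frac{1}{x_n}\right) \le E[S_{d_1 \cdots d_n} Y] \le 10^{1-n}\, (x_n + 1) \log\left(1 + \frac{1}{x_n}\right).$$
Since $x \ln(1 + 1/x) \to 1$ and $(x + 1) \ln(1 + 1/x) \to 1$ as $x \to \infty$, and $x_n \ge 10^{n-1}$, both sides are asymptotically equal to $10^{1-n}/\ln 10$.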
As $n \to \infty$, both the lower and upper bounds of $E[S_{d_1 \cdots d_n} B_n]$ approach $\frac{10^{1-n}}{\ln 10}$, the value given by formula (4), proving the sum invariance [3]. Consequently, the $n$-digit Benford variable converges to Benford.
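The rate at which such bounds close in on $10^{1-n}/\ln 10$ can be checked numerically. The sketch below assumes bounds of the form $10^{1-n} x_n \log(1 + 1/x_n)$ and $10^{1-n}(x_n + 1)\log(1 + 1/x_n)$, which follow from the range of the significand on the event that the leading digits are $d_1 \cdots d_n$. This form is an assumption made for illustration and may differ from the bounds stated in the theorem. The function returns both bounds divided by the sum-invariant value.

```python
import math


def bound_ratios(digits):
    """Return (lower, upper) bounds on E[S_{d_1...d_n} Y], each divided by
    10**(1 - n) / ln(10), for the leading digit block `digits`.

    Assumed bounds (illustrative): the block probability log10(1 + 1/x_n)
    times the smallest and largest significand compatible with the block."""
    if not digits or not 1 <= digits[0] <= 9:
        raise ValueError("the first digit must lie in 1..9")
    x = 0
    for d in digits:
        x = 10 * x + d
    n = len(digits)
    block_probability = math.log1p(1 / x) / math.log(10)  # log10(1 + 1/x)
    scale = 10.0 ** (1 - n)
    target = scale / math.log(10)
    lower = x * scale * block_probability
    upper = (x + 1) * scale * block_probability
    return lower / target, upper / target
```

For a single digit the two ratios are far apart, while for six leading digits both lie within about $10^{-5}$ of 1, consistent with the limiting argument.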