A sequence by David Sycamore, this page written by Michael Thomas De Vlieger, St. Louis, Missouri, 2020 0826.
We examine a simple self-referential sequence that generates novel primes taken from the decimal digit sums of the immediate previous term, beginning with the first term 2. The examination regards the nature of features observed in the plot of this sequence, as well as computational approaches.
Contents:
Introduction.
Generation of a(n).
Computation.
Computation Alternatives and Optimization.
Analysis of a(n).
The Appearance of the Plot of a(n).
Conjectures.
Alternatives.
Appendix.
The sequence a(n) = A337099(n) starts with a(1) = 2, then, using s = decimal digit sum of a(n − 1), if s composite or if s already in the sequence, we append to the digits of prefix s a suffix t, that is, a term of A045572 possibly with a number of infixed zeros, so as to form the least prime p beginning with the digits of s not already in the sequence.
There are two routes by which a prime p enters a(n):
1. Prefix s enters a(n) as s is prime and a(n) is free of s, i.e., s is a novel prime p.
2.
The concatenation of the digits of s and those of t (here noted s<>t) enters a(n) as the least prime not already in a(n) whose decimal expansion begins with the digits of s.
There are two ways that s<>t enters a(n):
1.
via composite s;
2.
via prime s where s already occurs in a(n) for m < n.
In both cases
there always shall be a suffix t that consists of digits of A045572(k), possibly left-padded with zeros so as to have j decimal digits, since all primes q outside of 2 and 5 appear in A045572.
Observe that, if s is not a novel prime p, then we append at least one decimal digit (comprising the suffix t). Therefore, single-digit primes p, i.e., 2, 5, or 7, may enter a(n) if and only p is the sum of digits of the preceding term a(n − 1). Because of this, we expect these numbers to appear in the sequence rather early.
The number 3 will not appear in a(n) since no single-digit prime may appear unless it is the sum of digits of the previous term, a prime, and the sum of digits of a prime written in decimal, outside of 3, does not occur. Therefore, the question of all primes appearing in a(n) must take into account the absence of 3.
What we are after in the case of either composite s or prime s already in a(n), is the formation of a novel prime p beginning with the digits of s through the addition of at least one digit in suffix t. This implies that t is not a number in standard notation, that is, t may feature one or more leading zeros.
We look to A045572(k), integer k > 0, to furnish a series of its decimal digits (at times, preceded as necessary by a number of zeros), producing a suffix t so as to spell out a novel prime number p = s<>t in a(n). We know that, with regard to base b = 10, primes q such that gcd(q, 10) = 1 (i.e., nondivisor primes q necessarily coprime to 10) may only belong to certain residues r (mod 10). These are the reduced residues r, one of {1, 3, 7, 9} (mod 10). Any decimal digit may appear in higher ranks, with the highest rank precluding 0. Since at least 1 decimal digit of highest rank is furnished via s, we need not concern ourselves with ensuring t does not have leading zeros, in fact, it is at times essential that leading zeros are taken into consideration.
We might approach the suffix using two variables j = 1 and k = 1. The variable j governs the number of suffix digits, and advances when log10 k/4, while at same time k is reset to 1. Therefore the suffix t is A045572(k) padded left with zeros so as to have j digits.
We may compute A045572(k) (Formula 1):
A045572(k) = ½(5k + (3k + 2 mod 4) − 4).
Conversely we may find any term m in a(n) that is neither 2 nor 5 in A045572 via index k (Formula 2):
k = 4Q + (r + 1)/2 – [r > 5]
where Q is the quotient and r the remainder of m/10, and brackets are Iverson, the enclosed statement yielding 1 if true and 0 if false. In other words, Q = ⌊m/10⌋ and r = m mod 10. We might instead define Iverson [r > 5] as 1 – ((r mod 5) mod 2), if we dislike Iverson brackets.
a(2) = 23 since the digit sum of a(1) is 2. Though 2 is prime, a(1) = 2, therefore we try appending the digits of A045572(1) = 1, but 21 is not prime; we try appending the digits of A045572(2) = 3; indeed 23 is prime thus, a(2) = 23.
a(3) = 5 since the digit sum of a(2) is 5, and 5 is prime and not in the sequence.
a(10) = 149 since the digit sum of a(9) = 95 is composite 14; we thus attempt to find the smallest single-digit term of A045572 that when concatenated with 14 produces a prime not in the sequence; this is A045572(4) = 9. Therefore a(10) = 149.
a(11) = 1409 since the digit sum of a(10) is again 14; having not found a prime for A045572(k) with k < 4, we thus attempt to find the smallest double-digit term of A045572 that when concatenated with 14 produces a prime not in the sequence; this is 0 prepended to A045572(4) = 9. Therefore a(10) = 1409.
We note that a(1) and a(3) are not found in A045572, but a(2) = A045572(10), a(10) = A045572(60), and a(11) = A045572(504) using Formula 2.
We compute the digit sum of a(n − 1) and store it in s; if s is a novel prime p, a(n) = p.
If s is composite or not a novel prime p, then we use the decimal digits of s as a prefix, i.e., the initial or highest-rank digits of the prime to be stored in a(n). We initialize two increment variables j = k = 1, the former governing the number of decimal digits in a suffix t and the latter the index of A045572(k) which furnishes nonzero digits so as to construct the least prime p not already in the sequence, via concatenation of the digits of s and those of t. Therefore both the primality test on s<>t and a determination of the presence of s<>t in a(n) is required to be satisfied, the least nature of the prime is guaranteed by induction on k and j in a special way.
We note that we start with j = k = 1, and that A045572(1) = 1. We test s<>1 to determine if this constructs a decimal prime number not in the sequence and if not, we increment k. Thus we move on to A045572(2) = 3, etc. We note that in A045572, there are 4 terms with 1 decimal digit, {1, 3, 7, 9}, then 40 with less than or equal to 2 digits {(0)1, (0)3, (0)7, (0)9, 11, 13, 17, …, 97, 99}, and 400 with less than or equal to 3 digits {(00)1, (00)3, (00)7, (00)9, (0)11, …, 997, 999}. This is quite simple to explain, therefore we use the pair of increment variables (j, k) in tandem. Upon an invalid s<>t, we set j = j + [j == log10 k/4] and k = k × [j < log10 k/4] + 1. Thus, for j = 1, k may advance to 4, but the next increment mst bring j to 2 and reset k to 1. For j = 2, k may advance to 40, then continue to j = 3 (resetting k = 1) and k as high as 400, etc. This way we prepend a sufficient number of zeros ahead of the digits of A045572(k) so as to compose a suffix t that when concatenated behind s yields the least novel prime p. (See Code 1.)
We might have also simply used k as an increment on its own, without modification using A045572 and running to 10 j − 1. This would start with t = 1, but if s<>1 fails, would needlessly test s<>2 (forming an even number). It would be nearly as efficient to increment k = 1 by 2 and avoid even numbers. The method described above regarding A045572 is slightly more efficient than this approach.
We might reduce time associated with generation of the sequence if we were able to store the last (j, k) for a given s already in the sequence. This way, instead of running primality and element testing on s<>t for all initial (j, k), we could pick up where we left off. For instance, we see that for a(10) = 149 = 14<>9, with (j, k) = (1, 4), we could have picked up these iterators when we re-encounter s = 14 for a(10). Instead of re-running (j, k) = (1, 1), appending 1 and finding 141 composite (again), then (j, k) = (1, 2) and appending 2 to find 143 composite (again), etc., we could merely increment k and pick up where we left off. Unfortunately, this would expand memory and lookup-time requirements, which might conspire to render the program too burdensome on memory and processor time for the optimization to be successful.
Another method is to put up the sequence in two parts, s(n) and t(n), where s(n) contains the prefixes, and t(n) the reversed suffixes (so as to preserve leading zeros). Let rev(x) be the function that reverses the decimal digits of integer x. Instead of searching for absence of s, we would determine the index i of the last occasion of s, then look up rev(t(i)), apply j = ⌊1 + log10 rev(t(i))⌋ and k = 4Q + (r + 1)/2 – [r > 5] to rev(t(i)) and initialize (j = j + [j == log10 k/4], k = k × [j < log10 k/4] + 1). This advances (j, k) to the next coordinate such that we pick up where we left off. Unfortunately, these gymnastics complicate the ordinary generation in the early stages of the sequence and necessitate the storage of additional data for each term such that it might only come in handy as n is large.
We have programmed an algorithm (see Code 2) that notes the term, s, and (j, k) so as to reduce generation time in a middle phase of generation, despite the fact we will suffer lookup delays as the record increases in length. Using the twin iterator approach (Code 1), it took about an hour to generate 10000 terms, but this was reduced to only 5 using this middle-phase generator. Code 2 required 5:20 hours to generate 100000 terms.
A third approach (see Code 3) involves storing prefix s as an index, and the last successful iterator pair (j, k) to reduce lookup time. Given 100,000 terms of a(n), there are only 37 prefixes s:
2, 4, 5, 7, 8,
10, 11, 13, 14, 16, 17, 19,
20, 22, 23, 25, 26, 28, 29,
31, 32, 34, 35, 37, 38,
40, 41, 43, 44, 46, 47, 49,
50, 52, 53, 55, 56
This approach generates 1000 terms in .28 s and 10000 terms in 21.34 s. Code 3 generates 100,000 terms in 3278 s. Pre-iteration of (j, k) complicates and delays execution; it is more efficient (as regards Wolfram language and this particular code) to allow the first suffix to fail for repeated prefixes. Code 3 is most effective even for dozens or hundreds of terms. The generation of 100,000 terms in under an hour perhaps puts a quarter million in striking distance (using Mathematica).
A fourth approach involves generation and storage of u = A045572, then simply looking up the terms using a simple iterator k and dispensing with j, in place of concocting A045572(k) and padding left so as to have j digits. (See Code 4.1, requires Code 4.2) A simple lookup operation would replace the management of induction on (j, k) and the custom computation and recomputation of A045572(k). The values required may be reversed, i.e.,
1, 3, 7, 9
10, 30, 70, 90, 11, 31, 71, ..., 99;
100, 300, 700, 900, 110, 310, 710, ..., 999;
...
so as to “grab and go” rather than produce the terms during generation. (See Code 4.2) The associated lookup times and memory requirements come into play in any delays associated with this approach. However in this method, we merely iterate to the k-th term and find the suffix t in reverse. Relevant to this thoughtline, the greatest value of k in the generation of a(n) to 100,000 terms is 40000 at n = 93351, which would have instead been k = 44444 according to this method. Therefore the lookup table u would need to be nearly as large as the desired number of terms in order to suffice.
This approach should still be used in combination with prefix-iterator memoization so as to reduce testing time. A variant of this method stores the digits of the suffix to reduce reversal steps.
Oddly this method seems to require more time to generate a thousand terms than Code 3.
Since a(n) is self-referential, the principal barrier to generating a large dataset is the need to constantly check a(n) for n large for the presence of s or s<>t. Reducing the searches via memoization prove helpful. It may prove helpful to first determine primality then, given a prime s or s<>t, we proceed to check if free. It remains necessary indeed to check if free since, very infrequently, p enters the sequence via concatenation fairly early.
A way to perhaps alleviate the burdensome non-membership test is to create a sort of annotation file.
Note that, for 10i ≤ p < 10(i + 1), dividing the interval into sub-intervals as to initial decimal digits d, for those prefixes s (which may be one or more digits) such that s is indivisible by 3, we find that the sub-intervals fill from least to greatest prime. There are limited lone primes produced early, but most primes in a(n) occur in a contiguous block. For instance, for n ≤10000, writing prime indices, we might delimit run lengths rather than list primes. Here, however we will list run lengths as starting and ending primes with a dash between them:
1, 3-59, 71-89, 101-113, 131-149, 163, 179, 191-139, 251-269, 281-293, 281-293, 311-317, 347-359, 373-389, 401-479, 491-599, 701-887, 1009-1193, 1301-1499, 1601-1789, 1901-2099, 2203-2399, 2503-2699, 2801-2999, etc.
This list is more compact. If we list primes that begin a run of ins and outs, noting the parity of the index, we can succinctly determine whether prime p is in a(n) without searching, say, hundreds of thousands of terms. When we enter a new p into a(n), we check the next and previous prime to p to see if we add it to an existing range, or break an even run (i.e., a run of not-ins) into two ranges. This annotation is relatively complex, but it would seem to be preferable to searching a long list of jumbled primes for membership, thereby speeding up the program. This has not yet been put into effect.
Let’s try to disentangle a(n) from its self-referential nature, at least partly alleviating the need to pour through a large list of values to find the next value.
We note some of the properties that will be explained later regarding the plot of a(n), most significantly, the “striation”, that is, the escalation of the primes p of the form that starts with the digits of prefix s (which is not divisible by 3) and a suffix t that consists of A045572(k) padded left by zeros so as to have j digits. Suppose s = 14. The striation associated with s = 14 is the infinite sequence {149, 1409, 1423, 1427, 1451, 1453, 1459, 1471, 1481, 1483, 1487, 1489, 1493, 1499, 14009, 14011, …}. Therefore it is a sort of NextPrime operation sensitive to the fact that, if we generate a prime that does not have the first two decimal digits “1, 4”, we need to run NextPrime not on, say, 149, but 1400, or 14000 vs. 1499.
How do we know the trajectory of the striation associated with a given primitive s can proceed unobstructed? This is a paramount question; if it isn’t true, generation of the sequence without a full-sequence novelty test is not possible.
Indeed, certain primes appear early via a novel prefix (sum of digits). For instance, through composite first digits, we get 23 and 53 quite early. We have a(2) = 23 ← 2, but through a digit sum of 23, a(133) = 233 ← 1787; after the first ten thousand terms, s = 23 has appeared more than 900 times, more often than any other prefix. For a(4) = 53 ← 5, the prefix s = 53 first appears at a(75712) = 5303 ← 2978999. In both these cases, the prefix 23 derives itself from the prefix 2, which later generates prefix 29 as well. We know that the smallest prime q that sums to p is in A054750, but generally so as to simplify, we observe the largest decimal digit is 9, thus, division of s by 9 can serve as a starting point for how far we need to go in taking into account suffixes. It is reasonable to assume we might only need to consider single- and double-digit prefixes. Thus, we consider prefixes A001651 = {2, 4, 5, 7, 8, 10, 11, 13, 14, …} and need to take into account when a suffix like 23 arises from the striation associated with 2.
Outside of the generation of prime s2 through concatenation s1<>t for a prime s1 with fewer digits, nothing would seem to obstruct the striation associated with s2 outside of the (repeat) appearance of s2 itself, Therefore, it seems we may write a program in a completely different and non-self-referential way. When we have prefix s, we simply look up the last prime p generated by s and use a modified NextPrime function f(p) to produce the next term. This ought to vastly speed generation of the sequence even for relatively small n.
In having pursued this computation avenue, we find that we can generate 100,000 terms in .23 s and a million in 23. The problem is that we do indeed experience conflict between the prefixes s, according to their provenance according to digits. For instance, the suffixes 20, 22, 23, 25, 26, 28, and 29 are subsets of s = 2. Therefore the program must be modified to seek the smallest term that exceeds the greatest value stored in s. Thus far the algorithms have yet to be adjusted to incorporate this subroutine.
In having adjusted algorithms, we find that we might generate a(n) for 1 ≤ n ≤ 6185 error-free, but we have a conflict thereafter. This algorithm sums a(6185) = 100003 → s = 4, where the register s(4) = 433. However, s(40) has already been set to 4003 (a(5638) = 4003 ← 194989), thus it decides 4007 is the next term rather than the correct 439. Hitherto the assumption was that the program would reliably select the appropriate novel prime if it could create a larger prime than in all the related registers, i.e., those that were initial-digit parents or children of s. For s = 4, these would include {40, 41, 43, 44, 46, 47, 49} and {400, 401, 403, …, 497, 499}, etc., but since we had surmised we’d only need to calculate 2-decimal-digit prefixes, we merely consider the first range. The trouble is that s(40) = 4007 cannot truly interfere with s(4) = 433, since, according to its lexical interpretation, s(4) has already progressed beyond the former; the appropriate move would be to nextprime(433) → 439. Therefore a simple greater-than test is insufficient; we would need to account for the prefix of the number under consideration.
The application of so many constraints in this endeavor complicates the algorithm, slowing it. It may be more efficient, once corrected, to generate n > 105, and yet might be the only practical way to generate a million terms of a(n).
We might examine the first emergence of primes p and prefixes s in a(n).
Table 1 lists the prefix s, the number of instances of s, the first position m of s, value of a(m), and progenitor a(m − 1):
s pop m a(m) a(m-1)
----------------------------------
2 4 1 2 Given
4 11 15 41 211
5 33 3 5 23
7 109 28 7 43
8 146 5 83 53
10 331 13 101 1423
11 610 6 11 83
13 1045 40 13 823
14 1301 10 149 59
16 2202 25 163 1429
17 2919 18 17 89
19 4121 65 19 829
20 4519 86 2003 839
22 5910 119 223 1759
23 6799 133 233 1787
25 7690 159 251 1789
26 7695 258 263 1979
28 8014 303 281 1999
29 8101 553 293 19289
31 7589 722 31 17599
32 6905 912 3203 17789
34 5702 966 347 19699
35 5257 1138 353 19889
37 3822 2832 37 191899
38 3108 2901 383 191999
40 2014 5522 4001 194899
41 1646 7655 4111 196799
43 985 10601 4327 286999
44 658 10713 443 198899
46 338 12095 461 199999
47 237 30954 479 2919899
49 102 46369 491 2939899
50 53 46448 50021 2939999
52 13 68528 5209 2969899
53 11 75712 5303 2978999
55 1 92403 5501 2998999
56 1 93352 5623 2999999
...
We observe that the prefix s = 23 appears most often, 906 times, with n ≥ 10000, but for n ≥ 100,000, s = 29 has the maximum population of 8101.
Table 2 shows general data for a(n), 1 ≤ n ≤ 120. We note that m is the index of a(n) in A045572, that, since 2 and 5 do not appear in A045572, a(1) and a(3) is not found there:
n a(n) s < > t j k m
------------------------------
1 2 2 . . .
2 23 2.....3 1 2 10
3 5 5 . . .
4 53 5.....3 1 2 22
5 83 8.....3 1 2 34
6 11 11 . . 5
7 29 2.....9 1 4 12
8 113 11....3 1 2 46
9 59 5.....9 1 4 24
10 149 14....9 1 4 60
11 1409 14...09 2 4 564
12 1423 14...23 2 10 570
13 101 10....1 1 1 41
14 211 2....11 1 5 85
15 41 4.....1 1 1 17
16 503 5....03 2 2 202
17 89 8.....9 1 4 36
18 17 17 . . 7
19 809 8....09 2 4 324
20 173 17....3 1 2 70
21 1103 11...03 2 2 442
22 509 5....09 2 4 204
23 1427 14...27 2 11 571
24 1429 14...29 2 12 572
25 163 16....3 1 2 66
26 103 10....3 1 2 42
27 43 4.....3 1 2 18
28 7 7 . . 3
29 71 7.....1 1 1 29
30 811 8....11 1 5 325
31 107 10....7 1 3 43
32 821 8....21 1 9 329
33 1109 11...09 2 4 444
34 1117 11...17 1 7 447
35 109 10....9 1 4 44
36 1009 10...09 2 4 404
37 1013 10...13 1 6 406
38 521 5....21 1 9 209
39 823 8....23 2 10 330
40 13 13 . . 6
41 47 4.....7 1 3 19
42 1123 11...23 2 10 450
43 73 7.....3 1 2 30
44 1019 10...19 1 8 408
45 1129 11...29 2 12 452
46 131 13....1 1 1 53
47 523 5....23 2 10 210
48 1021 10...21 1 9 409
49 401 4....01 2 1 161
50 541 5....41 2 17 217
51 1031 10...31 2 13 413
52 547 5....47 2 19 219
53 167 16....7 1 3 67
54 1433 14...33 2 14 574
55 1151 11...51 2 21 461
56 827 8....27 2 11 331
57 179 17....9 1 4 72
58 1709 17...09 2 4 684
59 1721 17...21 1 9 689
60 1153 11...53 2 22 462
61 1033 10...33 2 14 414
62 79 7 9 1 4 32
63 1601 16...01 2 1 641
64 829 8....29 2 12 332
65 19 19 . . 8
66 1039 10...39 2 16 416
67 137 13....7 1 3 55
68 1163 11...63 2 26 466
69 1171 11...71 2 29 469
70 1049 10...49 2 20 420
71 1439 14...39 2 16 576
72 1723 17...23 2 10 690
73 139 13....9 1 4 56
74 1301 13...01 2 1 521
75 557 5....57 2 23 223
76 1733 17...33 2 14 694
77 1447 14...47 2 19 579
78 1607 16...07 2 3 643
79 1451 14...51 2 21 581
80 1181 11...81 2 33 473
81 1187 11...87 2 35 475
82 1741 17...41 2 17 697
83 1303 13...03 2 2 522
84 701 7....01 2 1 281
85 839 8....39 2 16 336
86 2003 20...03 2 2 802
87 563 5....63 2 26 226
88 1453 14...53 2 22 582
89 1307 13...07 2 3 523
90 1193 11...93 2 38 478
91 1459 14...59 2 24 584
92 191 19....1 1 1 77
93 11003 11..003 3 2 4402
94 569 5....69 2 28 228
95 2011 20...11 1 5 805
96 409 4....09 2 4 164
97 1319 13...19 1 8 528
98 1471 14...71 2 29 589
99 1321 13...21 1 9 529
100 709 7....09 2 4 284
101 1609 16...09 2 4 644
102 1613 16...13 1 6 646
103 11027 11..027 3 11 4411
104 11047 11..047 3 19 4419
105 1327 13...27 2 11 531
106 1361 13...61 2 25 545
107 11057 11..057 3 23 4423
108 1481 14...81 2 33 593
109 1483 14...83 2 34 594
110 1619 16...19 1 8 648
111 1747 17...47 2 19 699
112 193 19....3 1 2 78
113 1367 13...67 2 27 547
114 1753 17...53 2 22 702
115 1621 16...21 1 9 649
116 1051 10...51 2 21 421
117 719 7....19 1 8 288
118 1759 17...59 2 24 704
119 223 22....3 1 2 90
120 727 7....27 2 11 291
Furthermore we observe a pattern in s. Let the term “trine” refer to a number m that is congruent to 0 (mod 3), i.e., 3 | m, and “nontrine” a number that is congruent to 1 or 2 (mod 3), thus indivisible by 3; we define 0 as “trine”. We find that s such that s ≡ 0 (mod 3) (outside of 3 itself) does not occur, since the sum of decimal digits of prime q is never divisible by 3 (outside of 3 itself). This is because a trine sum of digits of q indicates divisibility of q by 3, contradicting the notion of prime q outside of 3 itself. This is well-attested in the Rules of 3 and 9 (decimal divisibility tests). This impacts the appearance of terms in a(n) that have initial digit combinations that would form a number s such that 3 | s. These terms are delayed in their appearance in the sequence because we need a digit sum s of a(n – 1) sufficiently large enough to produce the trine prefix s. This implies that for certain small s, we require very many terms of a(n) to see them, as s increases.
We note that primes that begin with 3-, 6-, or 9- are quite delayed in their appearance in a(n), and that 3 is not in a(n). For example, a(722) = 31, with progenitor 17599; a(2832) = 37 ← 191899. In order for prime 61 to appear, we must have progenitor whose digit sum is 61. These terms remain to appear in a(n) for n ≤ 100,000.
Table 3 shows the first position m of prime pi , along with progenitor a(m − 1), given the first 100,000 terms of a(n). We know 3 does not occur so we place a dash (−) in the third and fourth columns. We have yet to see primes 61, 67, and 97, etc., primes whose initial digit is indivisible by 3:
i p_i m a(m-1)
---------------------
1 2 1 .
2 3 - -
3 5 3 23
4 7 28 43
5 11 6 83
6 13 40 823
7 17 18 89
8 19 65 829
9 23 2 2
10 29 7 11
11 31 722 17599
12 37 2832 191899
13 41 15 211
14 43 27 103
15 47 41 13
16 53 4 5
17 59 9 113
18 61 . .
19 67 . .
20 71 29 7
21 73 43 1123
22 79 62 1033
23 83 5 53
24 89 17 503
25 97 . .
26 101 13 1423
27 103 26 163
28 107 31 811
29 109 35 1117
30 113 8 29
...
Figure 1: The plot of a(n) for 1 ≤ n ≤ 1000 features “clouds” associated with the first appearance of D-digit numbers in the sequence. The sequence D(n) lists the index of the first appearance of a d-digit number in a(n): {1, 2, 8, 11, 93, 1163, 10000, …}. These are respectively {2, 23, 113, 1409, 11003, 170003, 2300003, …}.
The horizontal bands or “striations” are associated with the re-emergence of prefix s with a new suffix t that produces a strictly increasing new prime p = s<>t that is generally the next prime that begins with the digits of s. For instance, the rough row of points that begins at about y = 11000 (precisely, (x, y) = (n, a(n)) = (93, 11003)) pertains to the prefix s = 11. This striation begins with 11, then moves to 3 digits with 113, then to four with {1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171, 1181, 1187, 1193}, and finally, prominently in Figure 1, but parenthetically, beyond to n = 2194, a(n) = 11987:
{11003, 11027, 11047, 11057, 11059, 11069, 11071, 11083, 11087, 11093, 11113, 11117, 11119, 11131, 11149, 11159, 11161, 11171, 11173, 11177, 11197, 11213, 11239, 11243, 11251, 11257, 11261, 11273, 11279, 11287, 11299, 11311, 11317, 11321, 11329, 11351, 11353, 11369, 11383, 11393, 11399, 11411, 11423, 11437, 11443, 11447, 11467, 11471, 11483, 11489, 11491, 11497, 11503, 11519, 11527, 11549, 11551, 11579, 11587, 11593, 11597, 11617, 11621, 11633, (11657, 11677, 11681, 11689, 11699, 11701, 11717, 11719, 11731, 11743, 11777, 11779, 11783, 11789, 11801, 11807, 11813, 11821, 11827, 11831, 11833, 11839, 11863, 11867, 11887, 11897, 11903, 11909, 11923, 11927, 11933, 11939, 11941, 11953, 11959, 11969, 11971, 11981, 11987)}
These striations persist in a roughly linear fashion, gradually increasing as the digit which follows those of s in p proceeds from 0 to 9, after which point, the striation terminates, resuming in a level roughly an order of decimal magnitude higher. The more often the prefix s arises (certainly via the digit sum of the last term) the faster the striation exhausts itself. We see that the striation, and thus the cloud structure, roughly arranges itself according to n, but it needs not heed the decimal logarithm of n.
Ultimately we will see terms with 6 digits beginning with 110017, and 7 digits with 1100009, etc. This implies a clustering in the shape of a cloud, at a level gauged by the integer length of a(n), therefore an ever higher series of such clouds.
The striations appear to be paired since these pertain to prefixes s that are not divisible by 3.
Therefore, we observe clouds of striations generally gauged (but not strictly so) to the decimal logarithm of n.
Figure 2: A log plot or a(n) for 1 ≤ n ≤ 10000:
Figure 3: We colorize Figure 2 according to the first decimal digit of a(n) using a spectrum such that the digits 3, 6, and 9 correspond respectively with the primaries red, green, and blue. Furthermore, we accentuate the primaries at full saturation, yet the other digits appear at ¼ saturation.
Thus, though s such that 3 | s does not occur, we see the clouds ever higher gradually edging into the “decimal forbidden digits” of 3, later certainly 6 and 9, because the digit sums s begin to enter the thirties, sixties, nineties, etc.
Table 4: We note that numbers that begin with 3 appear, but those that start with 6 or 9 have not (yet) appeared. Indeed, the first appearance of digits for 1 ≤ n ≤ 100,000:
D n a(n)
--------------
1 6 11
2 1 2
3 722 31
4 15 41
5 3 5
6 . .
7 28 7
8 5 83
9 . .
Figures that start with 6 or 9 have not appeared, since the prime digit sum s = 6 or s = 9 is impossible; we have to wait till we have a prime digit sum 61 for the former, and 91 for the latter. Having generated 100,000 terms of a(n), we estimate that 6 will appear within the range 100,000 ≤ n ≤ 200,000.
Figure 4 is a log-log plot of 100,000 terms of a(n) showing the cloud pattern continuing:
Table 5 shows the first position n of d-digit a(n), along with progenitor. We separate s from t with ".", though the number is not a decimal.
n a(n) a(n-1)
--------------------------
1 2 Given
2 2.3 2
8 11.3 29
11 14.09 149
93 11.003 191
1163 17.0003 29033
10000 23.00003 313637
93359 29.000039 3280961
...
Let sequence d(n) = a(n)/10^⌊log10 a(n)⌋, that is, a(n) reduced to 1 ≤ d(n) < 10. Figure 5 plots d(n) with 1 ≤ n ≤ 1000, coloring points with 1 decimal digit blue, 2 cyan, 3 green, 4 gold, 5 red, etc. We see that the “clouds” indeed do fit into one another somewhat, not without limited intermingling.
Table 6: We accentuate the only numbers that begin with digit 3 in a darkened color at right. These points pertain to the following:
n a(n) a(n − 1)
----------------------
722 31 17599
825 31...1 19489
892 31...3 16699
912 32..03 17789
926 31...7 19597
952 31..09 19687
957 32..09 19697
966 34...7 19699
These points belie the fact that such numbers result from the entry of a prime digit sum s (i.e., 31 = 1+7+5+9+9), then the occasion of the same prime already in the sequence necessitating concatenation with the smallest single-digit (in this case) term of A045572, until the single-digit possibilities are exhausted, then double-digit instances, etc. For a(912) = 3203, we see the composite digit sum s (32 = 1+9+4+8+9) with the necessary concatenation with t. Indeed this proves to be the source of the horizontal striations in the plots. We note that numbers beginning with “33” are missing as expected. These shall not show until there are preceding terms a(n − 1) whose digit sum s = 330.
Figure 6 is an extended plot of Figure 5 in the range 1 ≤ n ≤ 10000, based on Sycamore noting that higher clouds appear to fit within those pertaining to terms in a(n) with one fewer digit:
Again, we accentuate the terms in a(n) that begin with trine digits (here, solely 3). Remember that color in a reverse spectral order relates to increasing decimal integer length; we observe the striations continue across the boundaries of the “clouds”.
Outside of the bulk of the “clouds”, we observe the occasion of high first-digit terms generated primarily early and scarcely, which we shall term “high flyers” for the sake of ease of reference in this work. Table 7 below shows such terms with 1 ≤ n ≤ 1000, generated on account of digit sums s = 5, 8, or 9:
n a(n) a(n - 1)
--------------------
3 5 23
4 5....3 5
5 8....3 53
9 5....9 113
16 5...03 41
17 8....9 503
19 8...09 17
22 5...09 1103
28 7 43
29 7....1 7
30 8...11 71
32 8...21 107
38 5...21 1013
39 8...23 521
43 7....3 1123
47 5...23 131
50 5...41 401
52 5...47 1031
56 8...27 1151
62 7....9 1033
64 8...29 1601
75 5...57 1301
84 7...01 1303
85 8...39 701
87 5...63 2003
94 5...69 11003
100 7...09 1321
117 7...19 1051
120 7...27 223
134 8...53 233
139 8...57 1061
152 7...33 2203
154 5...71 13001
156 7...39 13003
160 8...59 251
171 8...63 2213
182 7...43 14011
188 7...51 11113
197 7...57 11131
207 8...77 16001
209 7...61 2221
225 8...81 13103
229 7...69 2311
256 8...83 13121
266 8...87 10007
274 8..009 11213
312 8..011 10061
346 5...77 20021
350 7...73 20023
366 7...87 11311
389 8..017 20051
393 8..039 11321
399 7...97 22003
409 8..053 22013
420 7..001 421
421 8..059 7001
431 7..013 20113
439 8..069 22031
443 8..081 20123
462 5...87 10103
469 8..087 431
482 8..089 10133
498 8..093 23003
508 8..101 11411
510 7..019 10141
529 7..027 23011
569 5...93 20201
576 8..111 23021
580 7..039 22111
588 8..117 20231
592 8..123 10151
725 5...99 10211
728 8..147 10223
771 8..161 20411
774 8..167 23201
826 5..003 311
827 8..171 5003
893 7..043 313
913 8..179 3203
We see that eventually, j-digit suffixes t become exhausted in combination with the prefix s, then we turn to (j + 1)-digit suffixes, increasing the potential a(n) by an approximate order of decimal magnitude.
The clouds have a shape similar to a lunette or the well-known “swoosh” logo oriented so its smaller cusp is at bottom and the tail at top, slightly rising. We have noted in Table 1 the frequencies of prefixes s, that as n increases, the most frequent prefix s also increases (Table 8):
i s pop
3 17 103
4 23 906
5 29 8101
With a greater frequency comes a more intensive iteration of (j, k), making for a larger term, hence we see the most frequent prefixes responsible for the lunette shape. The lower cusp is shorter because these prefixes have generated over several clouds, while the cloud is only beginning to make headway among the larger prefixes s, making for a tail. The lower cusp tapers, since subsequent clouds can produce ever finer mantissas (according to logarithm base 10) via s = 10 concatenated with (j, k) with increasing j and low k. This said, we see prefix 10 infrequently and ever less frequently.
Figure 7 is diagrammatic, based on the premise of Figures 5 and 6, and accentuates the exponentially scaled clouds that are nested and in a lunette shape, where warmer colors are clouds of a higher number of decimal digits. The points are the high-flying primes that begin with digits 4, 5, 7, or 8. (the digit 9 is not shown for concision, since no term begins with 9 and will not appear for some time.)
Furthermore it appears that the top surface of the clouds makes headway as n increases, so that eventually we shall see terms beginning with 4 and 5 more commonly as s increases over 39, and that there doesn't seem to be a barrier to the entry of the clouds into terms that begin with 6 or 9, though it may require a great number of terms to support their appearance. We have observed the invasion of numbers beginning with digit 3. Per A054750, we see that the smallest prime that generates a digit sum of 61 is 898999, but we observe values beginning with 8 relatively rarely. See Table 5 for the smallest d-digit terms in a(n).
The apparent rising horizontal striations are the entries of numbers associated with the few prefixes s. In this chart, there are only 27; since s cannot be divisible by 3, we see a void or gap. This notion leads us to consider circumventing the self-referential nature of a(n) as a way of rapidly calculating its terms.
What happens when there are enough terms to fill the entire range of first-digits conveyed by Figure 6 and 7? It would seem that we would require more terms than we could reasonably generate. However, if we set the number base from b = 10 to b = 6 or b = 3, we would perhaps achieve a similar end.
Before we change bases, we would need to be conscious of the fact that nondivisor primes p only occur at the reduced residues mod b. Therefore, all primes p outside 2 and 5 for b = 10 are congruent to {1, 3, 7, 9} (mod 10), i.e., written in decimal, they end in one of those φ(b) = φ(10) = 4 digits, hence the relevance of A045572 to the case b = 10. For b = 6 (i.e., senary), there are only φ(6) = 2 possible end digits that harbor primes p outside 2 and 3, and those are 1 and 5. For b = 3 (i.e., ternary), there are only φ(3) = 2 possible end digits that harbor primes p ≠ 3, and those are 1 and 2. We see that the ternary case leaves us no “unforbidden” digits, since ternary numbers written conventionally (i.e., without leading zeros) must start with either 1 or 2.
Thus we define a6(n) analogously to a(n), except that the base we use is b = 6. We might start with any prime, but in this case we simply start with the first prime, as we had with b = 10. These are the decimal values of the first terms:
2, 13, 3, 19, 29, 59, 331, 37, 17, 7, 73, 23, 53, 293, 307, 43, 109, 149, 337, 47, 257, 263, 311, ...
but more usefully in terms of summing digits, we write them in senary (base 6):
2, 21, 3, 31, 45, 135, 1311, 101, 25, 11, 201, 35, 125, 1205, 1231, 111, 301, 405, 1321, ...
Figure 8 is a plot of a6(n) for 1 ≤ n ≤ 66 (i.e., 46656) in the character of Figures 6 and 7, exhibiting striations associated with prefixes that form clouds in a similar lunette shape. Since this image exhibits two-senary-digit prefixes s, we see that there are 6² = 36 combinations of the first two senary digits (less s such that 5 | s), therefore the striations for b = 6 appear grouped in fours. We see the smallest integer lengths exhibiting greater coarseness, and as the senary order of magnitude e increases, the striations seem to straighten out. We see where a striation jumps to the next order of magnitude, we reset the vertical position closest to s × 6e. These things are found in finer grain in the decimal plot, but are beautifully clarified here.
Figure 8 lends us a clue as to what we would see once the digit range is saturated. Indeed, the “digit range” has already been saturated once we have s > b. Indeed we are seeing a filling of the range b ≤ s < b² rather than 1 ≤ s < b.
Examining Figure 7, we see at left and most prominently at upper left, the “high-flyer” terms associated with single digit s, and we see these continue into the range wherein we see two-digit s come into play (i.e., the clouds).
Therefore we can expect the plot to intercalate triple-digit s horizontally and incorporating the longer-standing striations, and quadruple-digit s intercalated similarly, as n increases sufficiently. When three-digit s emerges with sufficiently large n, we should see such plotted between the lines of s = “10” (of any base) and s = “11” (of any base), i.e., s = “101” (written in any base). Therefore, if it were possible to include more terms in Figure 8, or, graphically, to “pan” or “page” right, we would see a group of four striations fitting between each of the striations (but two paired side by side in the “blank” area) in Figure 8. For a(n) (decimal), we would see two pairs of striations fitted between each two-digit-space, making for a very dense plot.
We define a3(n) analogously to a(n), except that the base we use is b = 3. We might start with any prime, but in this case we simply start with the first prime, as we had with b = 10. These are the decimal values of the first terms:
2, 13, 3, 7, 37, 223, 79, 2851, 1297, 5, 229, 2857, 1303, 2887, 11, 1381, 16921, 8209, 16927, 1549, ...
but more usefully in terms of summing digits, we write them in ternary (base 3):
2, 111, 10, 21, 1101, 22021, 2221, 10220121, 1210001, 12, 22111, 10220211, ...
Figure 9 is a plot of a3(n) for 1 ≤ n ≤ 310 (i.e., 59049) in the character of Figures 6 and 7. Due to the small nature of the number 3 and its perfect powers, the plot doesn’t seem very similar to that of a6(n) or a(n) (decimal). The striations do not seem to form clouds nor do they seem paired as they are in decimal, or in a group of 4 as in senary. The lunette shape appears to be absent, except that we do see a very shallow lunette whose center (associated with the highest-population prefix s) transitions so swiftly “upward” that subsequent lunettes appear to be cleaved in half.
The thing to note about Figure 9 is that we have saturated one, two, and are working on 3 digit prefixes. Hence we might regard “(digit) saturation” merely as the transition between one-, two-, three-, or m-digit s expressed in base b.
Therefore, the 2-digit saturation of the decimal plot will be achieved when we see the first triple-digit prefix (i.e., digit sum) s, perhaps at around n = 1010. As n increases, the plot will begin to fill in starting from the bottom as three-digit prefixes emerge.
Thus, we can summarize all the major aspects of the plot of a(n).
1. The digit sums s = a(n − 1) such that q | (b − 1) does not divide s dare the font of the “striations”. These striations represent a subsequence of primes p (apart from p | b) whose base-b expansions begin with the digits d of s.
2. The striations beginning with s represent chains of primes that can be generated by a modified recursive nextprime(s × be) function that, whereupon the next prime would fail to begin with the digits d of s, would reset to nextprime(s × b(e + 1)). The chains with single digit s are generally suppressed by chains whose multiple digits d begin with the single digit s. The apparent length of the striations is governed by the frequency with which the sum s appears in a(n) for n across a given magnitude. Once all k-digit terms in the chain of s are exhausted, the striation “pops” to the next order of magnitude, seeming to form part of a cloud.
3. The striations seem to form “clouds”, especially given a sufficient number of D-digit s; their pairing or grouping pertains to the arrangement of reduced residues (mod b) since nondivisor primes p expressed in base-b may only end in digits belonging to those residues (mod b).
4. The “saturation” of the plot gives rise to an even denser series of clouds wherein the striations occur in a tighter formation related to the dendritic reduction according to the number of digits in s.
5. “Forbidden” D-digit prefixes s do not serve as a barrier to (D + 1)-digit s that begin with that forbidden digit, therefore the plot of the first digits of s will be filled.
It seems assured (without rigor) that eventually we will have all primes p ≠ 3 in A337099, on account of the prime digit sums s of immediately preceding terms eventually entering the sequence. It is key to observe the smallest sums (on account of the digit sums of smaller numbers early in the sequence not fully covering), but in the production of many combinations of decimal digits as the sequence accumulates many terms, we should have all p in A337099 via attrition. Indeed some of these sums enter “early” via the concatenation function s <> t. These will be primes that begin with nontrine digits (1, 2, 4, 5, 7, 8). It seems, with 10000 terms of a(n), we are beyond the point where it might be that we wouldn’t fully cover the primes, say, 5 through 37. At this point it seems assured we will have 61 et al. “eventually”, i.e., given enough terms.
We have examined ab(n), that is, a base b analog to a(n), considering the sums of base-b digits, using base-b arithmetic, when we examined “digit saturation”. The behavior of these base-b (b ≠ 10) sequences is similar to that of a(n) (i.e., the decimal variety originally considered). We had . Setting these alternatives
If a(1) is set to 3 instead of the least prime 2, we might be able to construe the sequence as a permutation of the primes, if the above conjecture proves true. Instead of reusing a(n), we deem this sequence b(n). The other properties of the sequence (a cloud structure in the plot, computational approaches, etc.) are expected to be largely the same. b(n) begins:
3, 31, 41, 5, 53, 83, 11, 2, 23, 59, 149, 1409, 1423, 101, 29, 113, 503, 89, 17, 809, 173, 1103, 509, 1427, 1429, 163, 103, 43, 7, 71, 811, 107, 821, 1109, 1117, 109, 1009, 1013, 521, 823, 13, 47, 1123, 73, 1019, 1129, 131, 523, 1021, 401, 541, 1031, ...
We observe the production of all four single-decimal-digit primes and overall, similar behavior to a(n). It may be interesting to compare the emergence of primes p in a(n) and in b(n). This paper may be extended to examine b(n) and compare it to a(n).
Table 1b lists the prefix s, the number of instances of s, the first position of s and associated value, and progenitor in both sequences:
s pop m a(m) a(n-1) s pop n b(n) b(n-1)
2 4 1 2 Given 2 3 8 2 11
4 11 15 41 211 4 10 3 41 31
5 33 3 5 23 5 33 4 5 41
7 109 28 7 43 7 109 29 7 43
8 146 5 83 53 8 146 6 83 53
10 331 13 101 1423 10 331 14 101 1423
11 610 6 11 83 11 610 7 11 83
13 1045 40 13 823 13 1045 41 13 823
14 1301 10 149 59 14 1301 11 149 59
16 2202 25 163 1429 16 2202 26 163 1429
17 2919 18 17 89 17 2918 19 17 89
19 4121 65 19 829 19 4121 66 19 829
20 4519 86 2003 839 20 4519 87 2003 839
22 5910 119 223 1759 22 5910 120 223 1759
23 6799 133 233 1787 23 6799 134 233 1787
25 7690 159 251 1789 25 7691 160 251 1789
26 7695 258 263 1979 26 7695 259 263 1979
28 8014 303 281 1999 28 8014 304 281 1999
29 8101 553 293 19289 29 8101 554 293 19289
31 7589 722 31 17599 31 7588 723 311 17599
32 6905 912 3203 17789 32 6905 910 3203 17789
34 5702 966 347 19699 34 5703 964 347 19699
35 5257 1138 353 19889 35 5257 1132 353 19889
37 3822 2832 37 191899 37 3822 2820 37 191899
38 3108 2901 383 191999 38 3108 2901 383 191999
40 2014 5522 4001 194899 40 2014 5521 4001 194899
41 1646 7655 4111 196799 41 1646 7658 4111 196799
43 985 10601 4327 286999 43 985 10601 439 286999
44 658 10713 443 198899 44 658 10694 443 198899
46 338 12095 461 199999 46 338 12104 461 199999
47 237 30954 479 2919899 47 237 30951 479 2919899
49 102 46369 491 2939899 49 102 46379 491 2939899
50 53 46448 50021 2939999 50 53 46453 50021 2939999
52 13 68528 5209 2969899 52 13 68522 5209 2969899
53 11 75712 5303 2978999 53 11 75711 5303 2978999
55 1 92403 5501 2998999 55 1 92403 5501 2998999
56 1 93352 5623 2999999 56 1 93352 5623 2999999
This table demonstrates the similarity of the sequences at least as far as populations of s and their progenitors are concerned. Compare to Table 1.
Table 3b shows the first position m of prime pi , along with progenitor a(m − 1), given the first 100,000 terms of a(n). We know 3 does not occur in a(n), so we place a dash (−) in the third and fourth columns. We compare these to the first position n of prime pi , along with progenitor b(n − 1), given the first 100,000 terms of b(n).
i p_i m a(m) a(m-1) n b(n) b(n-1)
-----------------------------------------------------
1 2 1 2 Given 8 2 11
2 3 . . . 1 3 Given
3 5 3 5 23 4 5 41
4 7 28 7 43 29 7 43
5 11 6 11 83 7 11 83
6 13 40 13 823 41 13 823
7 17 18 17 89 19 17 89
8 19 65 19 829 66 19 829
9 23 2 23 2 9 23 2
10 29 7 29 11 15 29 101
11 31 722 31 17599 2 31 3
12 37 2832 37 191899 2820 37 191899
13 41 15 41 211 3 41 31
14 43 27 43 103 28 43 103
15 47 41 47 13 42 47 13
16 53 4 53 5 5 53 5
17 59 9 59 113 10 59 23
18 61 . . . . . .
19 67 . . . . . .
20 71 29 71 7 30 71 7
21 73 43 73 1123 44 73 1123
22 79 62 79 1033 63 79 1033
23 83 5 83 53 6 83 53
24 89 17 89 503 18 89 503
25 97 . . . . . .
26 101 13 101 1423 14 101 1423
27 103 26 103 163 27 103 163
28 107 31 107 811 32 107 811
29 109 35 109 1117 36 109 1117
30 113 8 113 29 16 113 29
Note that the progenitors of pi are generally the same in both sequences. This may lead us to consider that there may be a way to surmise how these numbers might be efficiently predicted. We know that the list of prefixes s remains small, threading through the one- and two-digit numbers. Therefore it may serve us to identify the smallest prime that exceeds, say, s × (10 s/9 − 1). This may give us a useful reference for the progenitor of 61 in either sequence. Compare to Table 3.
Table 4b: Comparison of first position m of a(m) beginning with digit d, and n of b(n) beginning with same, for sequences with 100,000 terms each:
d m a(m) n b(n)
-----------------------
1 6 11 7 11
2 1 2 8 2
3 722 31 1 3
4 15 41 3 41
5 3 5 4 5
6 . . . .
7 28 7 29 7
8 5 83 6 83
9 . . . .
Compare this to Table 4.
Let R(n) signify the decimal digital root of n. (Here we concern ourselves only with decimal or base b = 10.) A digital root is simply the number resulting from repeatedly adding the digits of a number again and again till we arrive at a single digit. Thus, for the number 1048576, the sum of digits is 31; the sum of the digits of 31 is 4. So 4 is the digital root of 1048576.
This sequence r(n) has a clearer and less-contested set of lineages or striations, that is, in the case of r(n), the single-digit prefix s.
The digit-sum version a(n) admits multiple decimal digit s, for instance, the prime 4999 → 31 = s, but the digital-root version r(n) requires 4999 → 31 → 4 = s, noting that any further iteration of the latter still results in the fixed point 4.
We know that we cannot have s = R(n) such that 3 | s for the same reason as a(n), namely that decimal numbers m whose sum-of-digits is divisible by 3 are likewise themselves divisible by 3. Thus, we will never see s = 3, 6, or 9, except in the explicit case where R(n) = prime p = 3. We are left then with only 6 possible values of s: 1, 2, 4, 5, 7, or 8. Generalizing to any base b, s is forbidden from any multiple kd such that 1 < d | (b − 1). Therefore, hexadecimally (i.e., for b = 16), we may only have 8 possible values of s; 1, 2, 4, 7, 8, 11, 13, 14.
Therefore, given the constriction of all the action that makes for a(n) to just 6 lineages in the case of r(n), there is much less intercalation of striations apropos to 10e and those to 10(e + 1). These exhaust themselves much more quickly and are more “connected” with the decimal magnitude m of n, i.e., m = log10 n. There are only so many D-digit decimal prime numbers p that begin with 1, 2, 4, 5, 7, or 8. Once we are out of them, the function pops to the next order of magnitude, beginning with the smallest prime greater than d × 10(e + 1).
We then have a plot (Figure 10 below) of a scaling up of six lineages, only seemingly paired, pertaining to r(n) = 1, 2, 4, 5, 7, or 8. The graph is pretty different; it is a simplification of the behavior of a(n), wherein we have six fairly linear striations to a “cloud”.
Figure 11 is a log-log plot of r(n):
The “clouds” grow ever more distinctly resolved and the lineages with less intercalated delay as the magnitude increases. Like that of a(n), the plot of r(n) doesn’t paint the striations such that they are consecutively constructed line by line, but instead, skips between and among them, even between magnitudes, based on the reserve of d-initial digit primes that occur in the interval. We see the advent of large d primes (i.e., those beginning with 8) for a given magnitude precede those of small d primes (i.e., those beginning with 1). A question: would there be a case where the large-d primes of a given magnitude exhaust before the small-d primes? If so, this would seem possible only for very large n at a threshold that lies perhaps given several dozen or a hundred orders of decimal magnitude.
We can align the lineages attributable to s, which in the case of r(n), is rendered very simple in comparison to a(n). Thus, we plot r(n)/ ⌈log10 r(n)⌉ in order to organize the chart in terms of their initial single-digit s. This plot (Figure 12 below) clarifies the modus operanda of the function, showing that indeed there are 6 lineages that exhaust themselves before “graduating” to the next magnitude M. In the plot, we color-code according to M, so that r(n) plotted red has 5, gold 4, green 3, cyan 2, and dark blue 1 decimal digit:
Perhaps more clear is a plot where the x axis (i.e., the index n) is plotted as to log n. This plot (Figure 13 below) allows the smaller n to scale and exhibit the effect of the magnitude of the index on the value of r(n):
In Figure 13, we see the lineages growing ever fuller and purer, the curves (an artifact of the log-index plot) becoming more “complete” and closer to a pure curve as n increases.
We have allowed the number of decimal digits to govern both the color and size of the points shown in Figure 13. Therefore we can easily pick out the first several dozen terms {2, 23, 5, 53, 83, 29, 211, 41, 59, 503, 89, 809, 811, 11, 223, 7, 71, 821, 227, 229, 43, 73, 13, 47, 233, 823, 401, 509, 521, 827, …}. We note the delayed start of the lineage of 7, something in common r(n) suffers in common with a(n).
Indigo large dots represent single-digit terms 2, 5, and 7 (we are missing 3, since it begins with a forbidden digit). Cyan medium dots represent double-digit terms 23, 53, 83, 29, 41, 59, etc. Dark green small dots represent three-digit terms, etc.
Indeed, looking only at the lineage of one of the six primitive values of s, we see a clear induction on the prime index π(p) for p starting with digit s, constrained by having the first digit s. For instance, for s = 4, there is no prime with 1 digit starting with 4, but we have 41, 43, 47 in cyan, then {401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499}, then {4001, …, 4999}, etc., in this order, again only examining the portion of the plot between 4 ≤ y < 5. Therefore, this sequence is clearly not a permutation of the primes, clearly missing primes that start with 3, 6, or 9 in base ten.
What is missing from the behavior of r(n) vs. a(n) is the burgeoning of two-digit s, then three-digit s, which tends to sweep out across the initial digits 3, 6, and 9 so as to cover them in a(n) as n approaches infinity. Therefore r(n) is definitely not a permutation of the primes. We will never see, for example, primes 31, 37, 61, 67, 97, not to mention larger primes beginning with 3, 6, or 9.
The sequence r(n) lends credence to our analysis of a(n).
We surmise the chart, being markedly simple, would prove similar in other bases so as to render examining them relatively uninteresting.
This concludes our examination of OEIS A337099.
Code 1: Generate A337099 using suffix length j and A045572(k) (twin iterator approach):
Nest[Block[{a = #, d = IntegerDigits[Total@ IntegerDigits[#[[-1]] ] ], k = 0, j = 1, n},
Append[#,
While[Nand[FreeQ[a, #], PrimeQ@ #] &@
Set[n, If[k == 0, FromDigits@ d,
FromDigits[Join[d, PadLeft[IntegerDigits[(5 k + Mod[3 k + 2, 4] - 4)/2], j]]]]],
If[# == j, Set[{j, k}, {# + 1, 1}], k++] &[Log10[k/4]]]; n]] &, {2}, 120]
Code 2: Generate A337099 using iterator memoization approach:
Nest[Block[{a = #, d = IntegerDigits[Total@ IntegerDigits[#[[-1, 1]] ] ], k = 0, j = 1, n},
If[FreeQ[a[[All, 2]], d], Set[{j, k}, {0, 1}],
Set[{j, k}, a[[Position[a[[All, 2]], d][[-1, 1]], -1]]]];
Append[a,
While[Nand[FreeQ[a[[All, 1]], #], PrimeQ@ #] &@
Set[n, If[k == 0, FromDigits@ d,
FromDigits[
Join[d, PadLeft[IntegerDigits[(5 k + Mod[3 k + 2, 4] - 4)/2], j]]]]],
If[# == j, Set[{j, k}, {# + 1, 1}], k++] &[Log10[k/4]]]; {n, d, {j, k}}]] &,
{{2, {2}, {0, 1}}}, 10^5][[All, 1]]
Code 3: Generate A337099 using external prefix memoization approach:
Block[{t},
Nest[Block[{a = #, d = IntegerDigits[Total@IntegerDigits[#[[-1]] ] ], k = 0, j = 1, s, n},
s = FromDigits@ d;
If[ListQ[t[s]], Set[{j, k}, t[s]] ];
Append[#,
While[Nand[FreeQ[a, #], PrimeQ@ #] &@
Set[n, If[k == 0, s,
FromDigits[
Join[d, PadLeft[IntegerDigits[(5 k + Mod[3 k + 2, 4] - 4)/2], j]]]]],
If[# == j, Set[{j, k}, {# + 1, 1}], k++] &[Log10[k/4]]];
Set[t[s], {j, k}]; n]] &, {2}, 10^5] ]
Code 4.1: Generate A337099 using external prefix memoization approach:
Block[{t},
r = r045572[[1 ;; 4*10^Floor@ Log10[nn]]];
Nest[Block[{a = #, d = IntegerDigits[Total@IntegerDigits[#[[-1]] ] ], k = 0, j = 1, s, n},
s = FromDigits@ d;
If[ListQ[t[s]], Set[k, t[s]] ];
Append[#,
While[Nand[FreeQ[a, #], PrimeQ@ #] &@
Set[n, If[k == 0, s, FromDigits@ Join[d, Reverse@ IntegerDigits@ r[[k]] ]]], k++];
Set[t[s], k]; n]] &, {2}, 10^5] ]
Code 4.2: Pre-generate suffixes t in reverse (i.e., the “r045572” in Code 4.1 above):
Flatten@ Table[#*10^(n - IntegerLength@ # + 1) &@
IntegerReverse[(5 k + Mod[3 k + 2, 4] - 4)/2], {n, 0, 6}, {k, 4*10^n}]
Figure 10. Base 3, log-log plot of a3(n) for 1 ≤ n ≤ 310 (i.e., 59049).
Figure 11: Base 6, log-log plot of a6(n) for 1 ≤ n ≤ 66 (i.e., 46656).
A045572: Odd numbers indivisible by 5.
A054750: Smallest prime number whose digits sum to n-th prime.
A337099: a(n) (see above).
2020 0826 1730 Created.
2020 0827 1930 Conjecture at end.
2020 0828 1200 Figure 3.
2020 0828 1545 Figures 4 & 5.
2020 0829 0915 Extension of a(n) to 100,000 terms, associated observations and tables.
2020 0829 1100 Sequence b(n) to 100,000 terms and associated observations and tables.
2020 0830 0830 Nature of the shape of the clouds, continual increase in terms of a(n) to 300,000.
2020 0830 1515 Figure 7 and description of the nature of the plot of a(n).
2020 0830 1800 Attempted disentanglement of the self-referential nature of a(n).
2020 0902 2215 Studies in bases 3 and 6.
2020 0904 1615
Contents, computation experimentation.
2020 0910 1600 r(n) a digital-root version of a(n), Figures 10-13.