Sunday, May 24, 2015

Enranda Demo Output


Even if you cannot build Enranda on your system, the demo output is a tutorial in its own right. Here is what it printed out on my 64-bit Linux system ("make demo"):

---------------------------------------------------------------------------
Enranda Demo
Copyright 2016 Russell Leidich
http://enranda.blogspot.com
build_id_in_hex=00000029

Follow along with main() in demo.c, so you can learn how to use Enranda.
The math details are explained at the webpage linked above.

Before doing anything else, we need to call enranda_init() to ensure that
all required features, bug fixes, and performance improvements are present.

OK that worked. So let's make some noise! Start simple. Let's generate 16
random bytes using enranda_entropy_u8_list_get(). We'll store them to
*random_u8_list_base, which is a preallocated array. Here we go...

random_u8_list_base[0000000000000010]={
  95,4A,B2,61,A5,AD,1B,03,13,C7,51,C4,71,A3,C2,24
}
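
For reference, the calling code looks roughly like this. Treat it as a sketch
rather than a copy of demo.c: the header name, context type, and argument
order are my assumptions, so consult demo.c and README.txt for the real
prototypes.

  #include <stdio.h>
  #include "enranda.h"  /* assumed header name; provides u8/u32 etc. */

  int main(void) {
    u8 random_u8_list_base[16];
    u32 i;
    /* enranda_init() returns NULL on failure, so always check it. */
    enranda_t *enranda_base = enranda_init();  /* type name assumed */
    if (!enranda_base) return 1;
    /* Fill indexes [0, 15] with entropy (hypothetical argument order). */
    enranda_entropy_u8_list_get(enranda_base, random_u8_list_base, 0, 15);
    for (i = 0; i < 16; i++) printf("%02X,", random_u8_list_base[i]);
    enranda_free(enranda_base);  /* free the context when done */
    return 0;
  }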

By the way, we actually generated (2^19) bits of entropy behind the scenes,
which is Enranda's private granularity. As explained on the webpage, Enranda
outputs less entropy than it inputs, so there is no pseudorandomness involved,
apart from trapdooring big entropy into small entropy. But is the result
really convincingly random? With only 16 bytes, who knows...

To help answer this question, we can create (2^16) samples, each of which is
16 bits in size. If the source is truly random, the logfreedom of the result
should be as likely to land at or below the median logfreedom as at or above
it. dyspoissometer_logfreedom_median_get() will provide us
with the median logfreedom. Unfortunately, it takes too long for demo purposes,
but if you run it yourself, you'll find that the median is close to
7.267915800940217E+05; the exact median is combinatorially hard to find, but
polynomially easy to approximate (and more accurately so than we require).
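
Schematically, the 10-sample measurement loop coming up just below looks like
this; demo_logfreedom_get() is a hypothetical stand-in for the actual
Dyspoissometer calls in demo.c, whose prototypes I won't guess here:

  u16 mask_list_base[1U << 16];
  u8 i;
  for (i = 0; i < 10; i++) {
    /* Get (2^16) 16-bit masks, then measure their logfreedom. */
    enranda_entropy_u16_list_get(enranda_base, mask_list_base, 0, 0xFFFF);
    printf("logfreedom=%+.34E\n", demo_logfreedom_get(mask_list_base));
  }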

By the way, logfreedom is most useful as a randomness test when the mask count
(2^16, in this case) equals the mask span (which it does). So let's get 10 such
random mask sets, and measure the logfreedom of each:

logfreedom_median=7.267915800940217E+05
logfreedom=+7.2678731861577597803887273532641950E+05
logfreedom=+7.2678993443417500137163254689272729E+05
logfreedom=+7.2679226382915216262420277489021370E+05
logfreedom=+7.2679102351412089924108049017367813E+05
logfreedom=+7.2679310420115307514812895393644069E+05
logfreedom=+7.2678748279634064662311687400376423E+05
logfreedom=+7.2679271483546965416836363664072744E+05
logfreedom=+7.2678897461051454301246910040808018E+05
logfreedom=+7.2679333110324673231410638914464449E+05
logfreedom=+7.2678646993628213448377263688990068E+05
timestamp_delta=0000000001F99D66

As to performance, this ^ hexadecimal value is the number of CPU clock ticks
that it took to generate (2^19) bits of entropy. Based on your CPU frequency,
you can accurately predict how long it would take to generate some larger
number of bits; smaller requests take the same amount of time, because entropy
is always generated one (2^19)-bit block at a time.
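
For example, assuming a 3GHz clock (an assumption for illustration only;
substitute your own), the timestamp_delta above is 0x1F99D66=33135974 ticks,
or about 11 milliseconds for (2^19)=524288 bits -- on the order of 50 megabits
of entropy per second.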

Granted, 10 samples is not enough to really see whether we're straddling the
median with equal probability to the left and right, but you can easily run
this demo several times and tabulate the results for yourself (or just modify
the code). In any event, it should be obvious from the results above that
Enranda outputs very hard randomness which dances around the median, just like
an ideal TRNG would. A weak TRNG would tend to stay below the median, and
perhaps move up to it over time, or alternatively remain stuck at maximum
logfreedom all the time because it was contrived to do so pseudorandomly.

But how does this logfreedom compare to that of a "good" PRNG? For some
insights on this matter, we turn to the Mobius function.

The Mobius function is defined as follows, for all positive integers N:

  +1 for N=1
  -1 for N=2
  else 0 if N contains a prime factor with an exponent of at least 2
  else +1 if N contains an even number of prime factors
  else -1
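
For concreteness, here is a minimal C sketch of this definition via trial
division. (It's my own illustration for this writeup, not the code in demo.c.)

  #include <stdio.h>

  /* Return the Mobius function of n: 0 if n has a squared prime factor,
     else +1/-1 according as n has an even/odd number of prime factors. */
  static int mobius(unsigned long n) {
    int factor_count = 0;
    unsigned long p;
    for (p = 2; (p * p) <= n; p++) {
      if (!(n % p)) {
        n /= p;
        if (!(n % p)) return 0;  /* squared prime factor */
        factor_count++;
      }
    }
    if (n > 1) factor_count++;  /* one prime factor left over */
    return (factor_count & 1) ? -1 : 1;
  }

  int main(void) {
    unsigned long n;
    for (n = 1; n <= 23; n++) printf("%d ", mobius(n));
    printf("\n");  /* should match the 23 values quoted below */
    return 0;
  }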

According to Wikipedia, the first 23 values starting with N=1 are:

{1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0, -1, 1, 1, 0, -1, 0, -1, 0, 1, 1, -1}

Informally, the famous Riemann Hypothesis says that Mobius is random.
Specifically, +/-1 occur with asymptotically equal probability; the zeroes,
however, occur with probability (1-(6/(pi^2))), about 0.39 -- more than the
1/3 of an unbiased ternary source. Thus we have in Mobius a ternary PRNG in
which the middle value exhibits bias. Because of this bias, we must instead
turn to what I call the "bimobius" (binary Mobius) function as a randomness
standard, which is replicated in both demo.c and mathematica.txt for your
convenience.

To understand bimobius, we first remove the zeroes from the Mobius function:

{1, -1, -1, -1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1}

Then we convert +/-1 to their sign bits -- zero and one, respectively. Hang on
a moment while I generate a big blob of bimobius...

{0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1}

Done! By the way, the above text was generated by demo_bimobius_u16_list_get().
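
If you just want the gist without reading demo.c, the construction amounts to
a few lines on top of the mobius() sketch above (again, my illustration, not
the demo's actual code):

  /* Print the first bit_count bimobius bits: skip N where mobius(N) is
     zero, then map +1 to 0 and -1 to 1. */
  static void bimobius_print(unsigned long bit_count) {
    unsigned long n = 1;
    int mu;
    while (bit_count--) {
      while (!(mu = mobius(n++)));  /* skip the zeroes */
      putchar((mu < 0) ? '1' : '0');
    }
    putchar('\n');
  }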

It may also be of interest to you that if we have N terms of Mobius, then this
will asymptotically reduce to (6N/(pi^2)) terms of bimobius. The reason is
that the squarefree integers -- the N on which Mobius is nonzero -- have
asymptotic density (6/(pi^2)), which is also the probability that 2 random
positive integers will be coprime.
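
As a sanity check, 16 of the 23 Mobius values quoted above are nonzero, a
ratio of about 0.70, already in the rough neighborhood of (6/(pi^2))=0.6079...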

So what's the logfreedom of a bitstring containing (2^20) sequential bimobius
bits, parsed as (2^16) samples of 16 bits each? Let's see...

bimobius_logfreedom=+7.2678872718252676957666371703746903E+05

As you should be able to see from above, Enranda's logfreedom is similar to
that of bimobius. (Of course, bimobius has infinite domain, so "similar" is the
best we can say.) Now, because bimobius is actually more random than Mobius for
the reasons stated above, Enranda is noisier than Mobius, which is in turn
more random than any polynomial boolean function if the Riemann Hypothesis is
true (http://arxiv.org/pdf/1103.4991v4.pdf).

Now we should take some time to explain how to use Enranda. It all starts with
enranda_init(), a call to which you can see in main(). (Watch out for a NULL
return value!) Then you have 2 options for generating entropy, which you can
change anytime: (1) low-latency, low-bandwidth call-by-call accrual, until
(2^19) bits are ready to issue or (2) high-bandwidth single-call accrual,
wherein entropy is generated (2^19) bits at a time until the caller's
buffer has been filled. Each option buffers pregenerated entropy for future
calls, so once (2^19) bits are generated, they can be served out as needed,
whereupon they are automatically refreshed when depleted. For the first
option, see enranda_entropy_accrue(); for the other, see
enranda_entropy_u16/u32/u64/u8_list_get().
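
In code, option #2 is a single call once enranda_init() has succeeded (same
hedged sketch conventions as before):

  static u64 random_u64_list_base[1U << 16];  /* 512KiB, so off the stack */
  /* Option #2: fill indexes [0, 0xFFFF] in one high-bandwidth call;
     Enranda refills its private (2^19)-bit pool behind the scenes as
     often as needed. */
  enranda_entropy_u64_list_get(enranda_base, random_u64_list_base, 0, 0xFFFF);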

Let's practice calling enranda_entropy_accrue(), which you can follow along
with in main(). The whole point of this function is to allow a caller to
accrue a bit or so of entropy per call, which keeps latency low and lets the
caller soak up randomness while otherwise idle, for example, when waiting for
a network connection. (Just make sure not to call it so often that your
process stays awake when it should actually go to sleep and save on energy
consumption; ultimately option #2 may be better, depending on the situation.)
Here we go:

call_count=0000000000076C95

enranda_entropy_accrue() finally returned zero, after call_count accrual calls.
In other words, it took that many calls to produce (2^20) bits of protoentropy,
which will provide for (2^19) bits of output entropy. Typically, each call
provides about 3 bits of protoentropy, which is why option #2 is faster if you
can tolerate the latency required to fill the entire protoentropy buffer.
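
In code, the accrual loop in main() boils down to something like this
(assuming, per the behavior just described, that a nonzero return means the
protoentropy buffer isn't full yet):

  u64 call_count = 0;
  /* Option #1: call once per idle moment; the return value drops to
     zero once (2^20) bits of protoentropy have accrued. */
  do {
    call_count++;
  } while (enranda_entropy_accrue(enranda_base));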

Now it's time to call enranda_entropy_u16/u32/u64/u8_list_get() to get a string
of entropy out of the (2^19) random bits that we just generated. Let's get it
in u64 form because that's the most efficient. We'll send them to
*random_u64_list_base. But first, we'll zero that memory (totally unnecessary)
just so you can see the entropy appear:

random_u64_list_base[0000000000000007]={
  0000000000000000,0000000000000000,0000000000000000,0000000000000000,
  0000000000000000,0000000000000000,0000000000000000
}

OK now let's fill it with 3 (u64)s of entropy, starting at index 2 (just
because it makes for a good demo):

random_u64_list_base[0000000000000007]={
  0000000000000000,0000000000000000,F057FA1CD9128473,B60C640900A50250,
  860A4465C12E4876,0000000000000000,0000000000000000
}
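
The call behind the output above is roughly (argument order assumed, as
before):

  /* Fill only indexes [2, 4] of the 7-item u64 array, leaving the
     zeroed items on either side untouched. */
  enranda_entropy_u64_list_get(enranda_base, random_u64_list_base, 2, 4);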

Now let's say that for some crazy reason, we need 1234567 (u32)s of entropy
immediately. We can call enranda_entropy_u32_list_get() to fill a buffer in
memory, regardless of the previous output -- if any -- from
enranda_entropy_accrue(). Enranda will internally generate entropy in units of
(2^19) bits, as always, but the caller need not care about that. Here we go:

timestamp_delta=0000000049AB391A

Done! That ^ is how many CPU ticks it took for (1234567*32)=39506144 bits (a
few more, actually, if we account for the block size).

For sake of brevity, here are the first 4 (u32)s:

random_u32_list_base[0000000000000004]={
  34A946DC,AB9B1365,958F5DED,4CE8FD3E
}

...and the last 4:

random_u32_list_base[1234563][0000000000000004]={
  0A094068,0E4C1926,8D83A55F,0800D726
}
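
The corresponding code is again a single call after allocating the buffer (a
sketch; malloc() needs stdlib.h):

  #define U32_IDX_MAX 1234566
  u32 *random_u32_list_base = malloc((U32_IDX_MAX + 1) << 2);
  if (random_u32_list_base) {
    /* Fill indexes [0, 1234566] in one shot. */
    enranda_entropy_u32_list_get(enranda_base, random_u32_list_base, 0,
      U32_IDX_MAX);
  }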

Now let's look at Enranda's mean order-(2^16) kernel density, as measured over
1000 samples (of (2^16) 16-bit random values, per sample). Kernel density is
explained at:

http://cognomicon.blogspot.com/2014/12/the-kernel-density-randomness-metric.html

This might take a few minutes because kernel density is extremely noisy and
therefore requires long-term averaging. I will display 10 intermediate results
so you can see the progress. But first, I will compute the expected kernel
density, to which Enranda should in theory converge...

kernel_density_expected=+4.8906783042517839115244667290812965E-03
kernel_density_mean=+5.4905957514696782178217821782178216E-03
kernel_density_mean=+5.2196445749766791044776119402985078E-03
kernel_density_mean=+5.0478710288621262458471760797342195E-03
kernel_density_mean=+5.1573945994389027431421446384039900E-03
kernel_density_mean=+5.1274404316367265469061876247504988E-03
kernel_density_mean=+5.0637852927412645590682196339434278E-03
kernel_density_mean=+5.0302633375312054208273894436519261E-03
kernel_density_mean=+4.9986535690250468164794007490636707E-03
kernel_density_mean=+4.9735184647266925638179800221975583E-03
kernel_density_mean=+4.9296569824218749999999999999999999E-03

Typically, you should see kernel_density_mean oscillating within about 5% of
kernel_density_expected, with the deviation shrinking as more samples accrue.
We can measure the quality of
their mutual compliance with kernel skew, as discussed at:

http://cognomicon.blogspot.com/2014/12/the-kernel-density-randomness-metric.html

enranda_kernel_skew=+5.0395348787036105112565782716273278E-01

This ^ value is always on [0.0, 1.0]. It should be as close to 0.5 as
possible, although lesser and greater values are actually asymmetrical in
their meaning. In truth, kernel skew is like dyspoissonism: it's useful for
comparing 2 mask lists, but is not particularly informative in and of itself.

Therefore, for sake of comparison, here is the kernel skew of the first (2^20)
bits of bimobius:

bimobius_kernel_skew=+2.4335797057849715773640258925882887E-01

Hmm... bimobius doesn't look very random. Let's try the _next_ (2^20) bits...

bimobius_kernel_skew_next=+6.5829904760400329378926710921420918E-01

That's a bit better, but still quite skewed away from randomness. Maybe the
prime numbers are more orderly than we thought, which would not be surprising
because computing bimobius is (merely) subexponentially expensive.

This concludes the demo. Now deallocate *enranda_base using enranda_free()...

Done! Have fun using Enranda. And don't forget to try out otpenranda,
timedeltaprofile, and timedeltasave as well (see README.txt).

Scroll up or redirect to a file in order to see what you missed!
---------------------------------------------------------------------------
