Sunday, May 24, 2015

Enranda Demo Output


Even if you cannot build Enranda on your system, the demo output is a tutorial in its own right. Here is what it printed out on my 64-bit Linux system ("make demo"):

---------------------------------------------------------------------------
Enranda Demo
Copyright 2016 Russell Leidich
http://enranda.blogspot.com
build_id_in_hex=00000029

Follow along with main() in demo.c, so you can learn how to use Enranda.
The math details are explained at the webpage linked above.

Before doing anything else, we need to call enranda_init() to ensure that
all required features, bug fixes, and performance improvements are present.

OK that worked. So let's make some noise! Start simple. Let's generate 16
random bytes using enranda_entropy_u8_list_get(). We'll store them to
*random_u8_list_base, which is a preallocated array. Here we go...

random_u8_list_base[0000000000000010]={
  95,4A,B2,61,A5,AD,1B,03,13,C7,51,C4,71,A3,C2,24
}

By the way, we actually generated (2^19) bits of entropy behind the scenes,
which is Enranda's private granularity. As explained on the webpage, Enranda
outputs less entropy than it inputs, so there is no pseudorandomness involved,
apart from trapdooring big entropy into small entropy. But is the result
really convincingly random? With only 16 bytes, who knows...

To help answer this question, we can create (2^16) samples, each of which is
16 bits in size. If the data is truly random, the logfreedom of the result
should be as likely to fall at or below the median logfreedom as at or above
it. dyspoissometer_logfreedom_median_get() will provide us
with the median logfreedom. Unfortunately, it takes too long for demo purposes,
but if you run it yourself, you'll find that the median is close to
7.267915800940217E+05; the exact median is combinatorially hard to find, but
polynomially easy to approximate (and more accurately so than we require).

By the way, logfreedom is most useful as a randomness test when the mask count
(2^16, in this case) equals the mask span (which it does). So let's get 10 such
random mask sets, and measure the logfreedom of each:

logfreedom_median=7.267915800940217E+05
logfreedom=+7.2678731861577597803887273532641950E+05
logfreedom=+7.2678993443417500137163254689272729E+05
logfreedom=+7.2679226382915216262420277489021370E+05
logfreedom=+7.2679102351412089924108049017367813E+05
logfreedom=+7.2679310420115307514812895393644069E+05
logfreedom=+7.2678748279634064662311687400376423E+05
logfreedom=+7.2679271483546965416836363664072744E+05
logfreedom=+7.2678897461051454301246910040808018E+05
logfreedom=+7.2679333110324673231410638914464449E+05
logfreedom=+7.2678646993628213448377263688990068E+05
timestamp_delta=0000000001F99D66

As to performance, this ^ hexadecimal value is the number of CPU clock ticks
that it took to generate (2^19) bits of entropy. Based on CPU frequency, you
can accurately predict how long it would take to generate some larger number
of bits; requests for fewer bits take the same amount of time, because
entropy is always generated one (2^19)-bit block at a time.

Granted, 10 samples is not enough to really see whether we're straddling the
median with equal probability to the left and right, but you can easily run
this demo several times and tabulate the results for yourself (or just modify
the code). In any event, it should be obvious from the results above that
Enranda outputs very hard randomness which dances around the median, just like
an ideal TRNG would. A weak TRNG would tend to stay below the median, and
perhaps move up to it over time, or alternatively remain stuck at maximum
logfreedom all the time because it was contrived to do so pseudorandomly.

But how does this logfreedom compare to that of a "good" PRNG? For some
insights on this matter, we turn to the Mobius function.

The Mobius function is defined as follows, for all positive integers N:

  +1 for N=1
  -1 for N=2
  else 0 if N contains a prime factor with an exponent of at least 2
  else +1 if N contains an even number of prime factors
  else -1

According to Wikipedia, the first 23 values starting with N=1 are:

{1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0, -1, 1, 1, 0, -1, 0, -1, 0, 1, 1, -1}
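
If you would like to verify those values yourself, here is a minimal
standalone sketch (mine, not part of the Enranda sources) which computes
Mobius by trial factorization:

  #include <stdint.h>
  #include <stdio.h>

  int mobius(uint64_t n) {
    int factor_count = 0;
    for (uint64_t p = 2; (p * p) <= n; p++) {
      if (!(n % p)) {
        n /= p;
        if (!(n % p)) return 0;  /* prime factor with exponent of at least 2 */
        factor_count++;
      }
    }
    if (n > 1) factor_count++;  /* one leftover prime factor */
    return (factor_count & 1) ? -1 : 1;  /* odd count: -1; even count: +1 */
  }

  int main(void) {
    for (uint64_t n = 1; n <= 23; n++) printf("%d, ", mobius(n));
    printf("\n");
    return 0;
  }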

Informally, the famous Riemann Hypothesis says that Mobius is random.
Specifically, +/-1 occur with asymptotically equal probability (about 30.4%
each, i.e. half of (6/(pi^2))); zeroes account for the remaining 39.2%. Thus
we have in Mobius a ternary PRNG in which the middle value exhibits bias.
Because of this bias, we must instead turn to what I call the "bimobius"
(binary Mobius) function as a randomness standard, which is replicated in
both demo.c and mathematica.txt for your convenience.

To understand bimobius, we first remove the zeroes from the Mobius function:

{1, -1, -1, -1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1}

Then we convert +/-1 to their sign bits -- zero and one, respectively. Hang on
a moment while I generate a big blob of bimobius...

{0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1}

Done! By the way, the above text was generated by demo_bimobius_u16_list_get().
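
In case you would rather read than run demo.c, here is a sketch of the same
conversion, reusing the mobius() function above (my illustration, not the
actual internals of demo_bimobius_u16_list_get()):

  /* Emit the first bit_count bimobius bits: skip the zeroes, then map
     +1 to 0 and -1 to 1. */
  void bimobius_bit_list_get(uint8_t *bit_list_base, uint32_t bit_count) {
    uint32_t bit_idx = 0;
    for (uint64_t n = 1; bit_idx < bit_count; n++) {
      int mu = mobius(n);
      if (mu) bit_list_base[bit_idx++] = (uint8_t)(mu < 0);
    }
  }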

It may also be of interest to you that if we have N terms of Mobius, then this
will asymptotically reduce to (6N/(pi^2)) terms of bimobius. The reason relates
to the fact that the bimobius function is essentially a mapping of the
least-common-multiples of coprimes to their number of prime factors mod 2.
(6/(pi^2)) is the probability that 2 random positive integers will be coprime.

So what's the logfreedom of a bitstring containing (2^20) ((2^16) samples of 16
bits each) sequential bimobius bits? Let's see...

bimobius_logfreedom=+7.2678872718252676957666371703746903E+05

As you should be able to see from above, Enranda's logfreedom is similar to
that of bimobius. (Of course, bimobius has infinite domain, so "similar" is the
best we can say.) Now, because bimobius is actually more random than Mobius for
the reasons stated above, Enranda is thus noisier than Mobius, which is in turn
more random than any polynomial boolean function if the Riemann Hypothesis is
true (http://arxiv.org/pdf/1103.4991v4.pdf).

Now we should take some time to explain how to use Enranda. It all starts with
enranda_init(), a call to which you can see in main(). (Watch out for a NULL
return value!) Then you have 2 options for generating entropy, which you can
change anytime: (1) low-latency, low-bandwidth call-by-call accrual, until
(2^19) bits are ready to issue or (2) high-bandwidth single-call accrual,
wherein entropy is generated (2^19) bits at a time until the caller's
buffer has been filled. Each option buffers pregenerated entropy for future
calls, so once (2^19) bits are generated, they can be served out as needed,
whereupon they are automatically refreshed when depleted. For the first
option, see enranda_entropy_accrue(); for the other, see
enranda_entropy_u16/u32/u64/u8_list_get().
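
By way of orientation, a bulk (option 2) call sequence looks roughly like the
sketch below. I say "roughly" because the header name and the parameter lists
shown here are my assumptions for illustration; demo.c is the authoritative
reference:

  #include <stdint.h>
  #include <stdio.h>
  #include "enranda.h"  /* header name assumed */

  int main(void) {
    uint8_t random_u8_list[16];
    /* enranda_init() checks for required features, bug fixes, and
       performance improvements; NULL means failure. Parameters assumed. */
    void *enranda_base = enranda_init();
    if (!enranda_base) return 1;
    /* Fill 16 bytes with entropy. Parameter order assumed:
       (context, output base, byte count). */
    enranda_entropy_u8_list_get(enranda_base, random_u8_list, 16);
    for (uint32_t i = 0; i < 16; i++) {
      printf("%02X,", (unsigned)random_u8_list[i]);
    }
    printf("\n");
    enranda_free(enranda_base);
    return 0;
  }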

Let's practice calling enranda_entropy_accrue(), which you can follow along
with in main(). The whole point of this function is to allow a caller to
accrue a bit or so of entropy with each call, which keeps latency low
but allows the caller to accrue randomness while idle, for example, while
waiting for a network connection. (Just make sure not to call it so often that
your process stays awake when it should actually go to sleep and save on energy
consumption; ultimately option #2 may be better, depending on the situation.)
Here we go:

call_count=0000000000076C95

enranda_entropy_accrue() finally returned zero, after call_count accrual calls.
In other words, it took that many calls to produce (2^20) bits of protoentropy,
which will provide for (2^19) bits of output entropy. Typically, each call
provides about 3 bits of protoentropy, which is why option #2 is faster if you
can tolerate the latency required to fill the entire protoentropy buffer.
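
In other words, option #1 looks something like this sketch (the return
semantics are inferred from the demo text above, so treat the details as
assumptions):

  #include <stdint.h>
  /* Assumes enranda.h (or equivalent) has been included. */

  /* Accrue a bit or so of entropy per call while otherwise idle; a zero
     return (per the demo text above) means the protoentropy buffer is full. */
  uint64_t accrue_until_full(void *enranda_base) {
    uint64_t call_count = 0;
    do {
      call_count++;
      /* Interleave other low-priority work here, e.g. while waiting for a
         network connection. */
    } while (enranda_entropy_accrue(enranda_base));
    return call_count;
  }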

Now it's time to call enranda_entropy_u16/u32/u64/u8_list_get() to get a string
of entropy out of the (2^19) random bits that we just generated. Let's get it
in u64 form because that's the most efficient. We'll send them to
*random_u64_list_base. But first, we'll zero that memory (totally unnecessary)
just so you can see the entropy appear:

random_u64_list_base[0000000000000007]={
  0000000000000000,0000000000000000,0000000000000000,0000000000000000,
  0000000000000000,0000000000000000,0000000000000000
}

OK now let's fill it with 3 (u64)s of entropy, starting at index 2 (just
because it makes for a good demo):

random_u64_list_base[0000000000000007]={
  0000000000000000,0000000000000000,F057FA1CD9128473,B60C640900A50250,
  860A4465C12E4876,0000000000000000,0000000000000000
}

Now let's say that for some crazy reason, we need 1234567 (u32)s of entropy
immediately. We can call enranda_entropy_u32_list_get() to fill a buffer in
memory, regardless of the previous output -- if any -- from
enranda_entropy_accrue(). Enranda will internally generate entropy in units of
(2^19) bits, as always, but the caller need not care about that. Here we go:

timestamp_delta=0000000049AB391A

Done! That ^ is how many CPU ticks it took to generate (1234567*32)=39506144
bits (a few more, actually, if we account for the block size).

For sake of brevity, here are the first 4 (u32)s:

random_u32_list_base[0000000000000004]={
  34A946DC,AB9B1365,958F5DED,4CE8FD3E
}

...and the last 4:

random_u32_list_base[1234563][0000000000000004]={
  0A094068,0E4C1926,8D83A55F,0800D726
}

Now let's look at Enranda's mean order-(2^16) kernel density, as measured over
1000 samples (of (2^16) 16-bit random values, per sample). Kernel density is
explained at:

http://cognomicon.blogspot.com/2014/12/the-kernel-density-randomness-metric.html

This might take a few minutes because kernel density is extremely noisy and
therefore requires longterm averaging. I will display 10 intermediate results
so you can see the progress. But first, I will compute the expected kernel
density, to which Enranda should in theory converge...

kernel_density_expected=+4.8906783042517839115244667290812965E-03
kernel_density_mean=+5.4905957514696782178217821782178216E-03
kernel_density_mean=+5.2196445749766791044776119402985078E-03
kernel_density_mean=+5.0478710288621262458471760797342195E-03
kernel_density_mean=+5.1573945994389027431421446384039900E-03
kernel_density_mean=+5.1274404316367265469061876247504988E-03
kernel_density_mean=+5.0637852927412645590682196339434278E-03
kernel_density_mean=+5.0302633375312054208273894436519261E-03
kernel_density_mean=+4.9986535690250468164794007490636707E-03
kernel_density_mean=+4.9735184647266925638179800221975583E-03
kernel_density_mean=+4.9296569824218749999999999999999999E-03

Typically, you should see kernel_density_mean oscillating within about 5% of
kernel_density_expected, or less at later values. We can measure the quality of
their mutual compliance with kernel skew, as discussed at:

http://cognomicon.blogspot.com/2014/12/the-kernel-density-randomness-metric.html

enranda_kernel_skew=+5.0395348787036105112565782716273278E-01

This ^ value is always on [0.0, 1.0]. It should be as close to 0.5 as
possible, although lesser and greater values are actually asymmetrical in
their meaning. In truth, kernel skew is like dyspoissonism: it's useful for
comparing 2 mask lists, but is not particularly informative in and of itself.

Therefore, for sake of comparison, here is the kernel skew of the first (2^20)
bits of bimobius:

bimobius_kernel_skew=+2.4335797057849715773640258925882887E-01

Hmm... bimobius doesn't look very random. Let's try the _next_ (2^20) bits...

bimobius_kernel_skew_next=+6.5829904760400329378926710921420918E-01

That's a bit better, but still quite skewed away from randomness. Maybe the
prime numbers are more orderly than we thought, which would not be surprising
because computing bimobius is (merely) subexponentially expensive.

This concludes the demo. Now deallocate *enranda_base using enranda_free()...

Done! Have fun using Enranda. And don't forget to try out otpenranda,
timedeltaprofile, and timedeltasave as well (see README.txt).

Scroll up or redirect to a file in order to see what you missed!
---------------------------------------------------------------------------

Enranda Source Code


The latest stable release is here.

This is the source code for Enranda for Linux (32-bit and 64-bit), Mac (64-bit), and Windows (32-bit), written in C. We include a Mathematica implementation of the bimobius function just for fun.

README.txt has everything you need to get started. It will tell you how to build and run the demo, which contains code for several commonplace purposes. It also explains some useful bonus utilities included in the package. For your convenience, you can see the actual text of the demo output here because I think it has educational value (even though it might be somewhat out-of-sync with the most recent source code).

Statistical Analysis of Enranda's Output


It would be pointless for me to share my measurements because you wouldn't (and shouldn't) trust them. I based my "ultrahard" assessment on iterative median logfreedom and kernel skew measurements with Dyspoissometer, which turned out to be of comparable hardness to the bimobius function. (The bimobius function is asymptotically indistinct from binary noise, if the Riemann Hypothesis is true.) You can see this for yourself in the Enranda demo. But knock yourself out with your own statistical test suite. If you find anything suspicious, which would most likely be caused by a bug as opposed to a design flaw, email me.

For your convenience, Enranda comes with otpenranda (One-Time-Pad Generator with Enranda), a command line utility which allows you to create power-of-2-sized files comprised of unmodified Enranda output. You can then analyze the resulting file with dyspoissofile or an application of your own choosing.

Permutative Trapdoors and History Hashes


A practical answer is to introduce a second corkscrew hash, with a rotation coprime to both 16 and (the rotation of sequence_hash). In the code, this is known as "history_hash". history_hash is simply a hash of every unique sequence_hash in the order in which it was observed. It is for all practical purposes utterly unpredictable, but we can tolerate the first (2^15) history hashes being predictable. (In reality, the first one is virtually indeterminate to begin with, let alone the (2^15)th.)

There is the obvious temptation to use the history hash as entropy directly. I don't like this idea because it slightly facilitates differential prediction, whereby an attacker might be able to predict something about the next 16 bits of entropy from the previous 16. Instead, we invoke the titanic entropy power of the permutation. It works like this:

enranda_rewind() (which need never be called directly, as enranda_init() invokes it automatically) sets up a permutation consisting of every whole number on [0, (2^16)-1] at unique_list_base (where "unique" reflects the fact that each such whole number appears exactly once). After every update to history_hash, it becomes a swap index into unique_list_base. So after the first update, we swap the whole number at index zero with the whole number at index history_hash. After the next update, we swap the whole number at index one with the whole number at index history_hash, etc. The result, after (2^16) updates of history_hash, is a random permutation of all possible 16-bit values. This permutation constitutes Enranda's protoentropy. It can assume any of about (2^954,036) possible states (2^(log2((2^16)!))). Eventually, the low half of the protoentropy will be added (more or less, as explained below) to the high half in order to create entropy for output.
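
Schematically, the swap machinery looks like the following sketch. It uses the names from the discussion above, but the details (types, index masking) are my assumptions, not the literal Enranda code:

  #include <stdint.h>

  #define U16_SPAN (1UL << 16)

  /* enranda_rewind() first builds the identity permutation:
     for (uint32_t i = 0; i < U16_SPAN; i++) unique_list_base[i] = (uint16_t)i; */

  /* One history_hash update: swap the item at the running index with the
     item at index history_hash. After (2^16) updates, unique_list_base is a
     random permutation of all 16-bit values. */
  void unique_list_swap(uint16_t *unique_list_base, uint32_t update_idx,
    uint16_t history_hash) {
    uint16_t swap = unique_list_base[update_idx & (U16_SPAN - 1)];
    unique_list_base[update_idx & (U16_SPAN - 1)] = unique_list_base[history_hash];
    unique_list_base[history_hash] = swap;
  }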

Now perhaps you can understand why we can "only" tolerate roughly (2^15) predictable (history_hash)es: the output entropy is half as large as the protoentropy. Granted, half of the protoentropy doesn't quite equate to the entropy of (2^15) 16-bit values because they are all unique; but for all practical purposes, this rule of thumb is valid.

It's important to point out that while the permutation is created during entropy accrual, the trapdoor addition which "folds" the protoentropy in half occurs at the moment when entropy is requested, via the enranda_entropy_u16/u32/u64/u8_list_get() family of functions. For its part, the addition occurs with carry propagation across at least 16 successive bits, even if the output buffer is byte-granular. The reason for this is that (1) we don't want to propagate the carry continuously, because that would serialize the output process, which would be bad because the whole idea is that entropy needs to be delivered immediately and (2) 16-bit propagation is sufficient, because although it induces a bias in the output, such bias is statistically invisible to even theoretically acquirable quantum computing resources. This is due to the very slightly constrained number of possible outputs which one can obtain by adding the low half of a permutation to its high half. For example, it's not possible to generate (2^15) 16-bit zeroes in a row, but ((2^15)-1) is indeed possible. I conjecture that the bias reduces the number of possible outputs from (2^(2^19)) to roughly (R*(2^(2^19))), where R is the reflexive density constant. Either way, such a bias is negligible.
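
And the fold itself amounts to something like this sketch (mine; the real get functions additionally manage buffering and output granularity):

  #include <stdint.h>

  /* Trapdoor the (2^20)-bit protoentropy permutation into (2^19) bits of
     entropy: add the low half to the high half, with carries propagating
     within each 16-bit lane only. */
  void protoentropy_fold(const uint16_t *unique_list_base,
    uint16_t *entropy_u16_list_base) {
    for (uint32_t i = 0; i < (1UL << 15); i++) {
      entropy_u16_list_base[i] =
        (uint16_t)(unique_list_base[i] + unique_list_base[i + (1UL << 15)]);
    }
  }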

Antiperiodic Countermeasures: Sequence Hashes and Populations Thereof


Plausible periodic timedelta sequences must not be allowed to enable an attacker to predict any part of the output entropy. For safety's sake, we should endeavor to ignore them entirely. Let me say up front that, while Enranda cannot reject every periodic signal, it does reject them successfully up to period 255 (with caveats as explained below). Period 255 is utterly implausible even in the most quiescent systems, let alone those interesting enough to warrant attack, unless they have been rooted in order to construct such a protracted period via malicious virtualization (a blue pill attack), in which case the TRNG output is irrelevant anyway. So in practice, Enranda does indeed ignore all periodic signals. But without performing a computationally expensive operation such as treewise string compression, how can we accomplish this?

Enter the "sequence hash". Here's how it works. Suppose we have a period-5 timedelta stream:

{8, 9, 8, 3, 3, 8, 9, 8, 3, 3, 8, 9, 8, 3, 3, 8, 9, 8, 3, 3...}

We then identify, one at a time, each unique sequence in the order of discovery. In the periodic example above, this process would produce the following unique sequences separated by semicolons:

{8; 9; 8, 3; 3; 8, 9; 8, 3, 3; 8, 9, 8; 3, 3; 8, 9, 8, 3; 3...}

We don't actually search backwards, which would be computationally expensive. Instead, Enranda implements the sequence hash ("sequence_hash" in the code) as a 16-bit "corkscrew hash" of successive 16-bit timedeltas until said hash is found to be "unique" (meaning that it has not occurred in the past (2^16) such hashes, as ensured by a ring buffer of sequence_hashes at sequence_hash_list_base, and their populations at sequence_hash_count_list_base). Just like its metal counterpart, a corkscrew hash rotates on each iteration in order to make it as sensitive as possible to the order of particular timedeltas, yet symmetric with respect to each iteration in that all bits of all timedeltas influence the corkscrew hash, regardless of order. The simplest corkscrew hashes meeting these criteria are rotate-and-add and rotate-and-xor; we use the former for the sake of maximum collision avoidance. (See enranda_entropy_accrue() for an example.) The reason for the rotation (which is coprime to 16 (bits)) is to achieve maximum dispersion despite most timedeltas being of similar small magnitude.
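
For concreteness, a rotate-and-add corkscrew hash iteration looks like the following sketch. The rotation count must merely be coprime to 16; the 7 below is my example, not necessarily Enranda's actual constant:

  #include <stdint.h>

  /* Absorb one 16-bit timedelta: rotate the hash left by an amount coprime
     to 16, then add. Every timedelta affects every hash bit, but order
     still matters. */
  uint16_t corkscrew_hash_iterate(uint16_t sequence_hash, uint16_t timedelta) {
    sequence_hash = (uint16_t)((sequence_hash << 7) | (sequence_hash >> 9));
    return (uint16_t)(sequence_hash + timedelta);
  }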

As soon as a unique sequence_hash is discovered, it's stored to sequence_hash_list_base and its population at sequence_hash_count_list_base is set to one, whereupon sequence_hash itself is reset to zero. But if a previously encountered hash is seen again (as evidenced by its population being nonzero), its population is incremented (such that wrapping back to zero is prohibited). The only way in which population can be decremented is by virtue of a sequence_hash expiring from sequence_hash_list_base, which contains (2^16) items.
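
Here is how that bookkeeping fits together, as a sketch under the semantics just described (not the literal Enranda code; in particular, the initialization trick discussed next is omitted):

  #include <stdint.h>

  #define HASH_SPAN (1UL << 16)

  static uint16_t sequence_hash_list_base[HASH_SPAN];       /* ring buffer */
  static uint16_t sequence_hash_count_list_base[HASH_SPAN]; /* populations */
  static uint32_t ring_idx;

  /* Returns one if sequence_hash is unique, i.e. unseen among the last
     (2^16) unique hashes, else zero. */
  int sequence_hash_unique_check(uint16_t sequence_hash) {
    if (sequence_hash_count_list_base[sequence_hash]) {
      /* Seen before: increment the population, but never wrap back to zero. */
      if (sequence_hash_count_list_base[sequence_hash] != 0xFFFF) {
        sequence_hash_count_list_base[sequence_hash]++;
      }
      return 0;
    }
    /* Unique: expire the oldest hash (the only decrement path), then record
       the new one. */
    uint16_t expired_hash = sequence_hash_list_base[ring_idx];
    if (sequence_hash_count_list_base[expired_hash]) {
      sequence_hash_count_list_base[expired_hash]--;
    }
    sequence_hash_list_base[ring_idx] = sequence_hash;
    sequence_hash_count_list_base[sequence_hash] = 1;
    ring_idx = (uint32_t)((ring_idx + 1) & (HASH_SPAN - 1));
    return 1;
  }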

Critically, the populations of all (2^16) possible (sequence_hash)es at sequence_hash_count_list_base are initially set to one by enranda_rewind(). This almost guarantees that the first (2^16) timedelta observations will be ignored. A few such observations pass through this filter, but it doesn't matter because of the way that protoentropy is hashed into entropy, as explained later. And of course, there are hash collisions from time to time, meaning that we fail to detect a small fraction of unique sequences. This is a minor tax on efficiency, which is well worth the speed advantages of keeping the ring buffer and population list small enough to fit in most CPU caches.

So how many hashes, in the worst case, would be required to detect a sequence with period 255? In the worst case, if all the timedeltas were unique, then we would have 255 sequences of length one, and likewise sequences of lengths 2, 3... 255. If every such sequence had a unique hash, then we would need (1+2+...+255), or 32,640 (because sum(n)=(n(n+1)/2)), items in the ring buffer, which is slightly less than half of the (2^16) items actually present. The reason for the factor of (1/2) will become clear in the next lesson, but essentially, we need to ensure that at least half of the unique (sequence_hash)es were generated in a truly unpredictable manner. So in other words, we are making the conservative assumption that the first 32,640 hashes are all different but actually predictable due to a period-255 timedelta sequence. In reality, tiny periods seldom exceeding 4, which are much harder to forget (or to regard as unique in the first place) on account of requiring quadratically fewer unique (sequence_hash)es to circumscribe, will dominate.

So conservatively, a unique sequence_hash is worth at least 16 bits, on average, because by definition it occurs less than once in (2^16) such hashes. But we cannot directly use them as protoentropy, let alone entropy! The reason is that, while unique (sequence_hash)es are mostly unpredictable, we can at least say with confidence that popular (sequence_hash)es will not be among them. This constitutes a weak but potentially exploitable bias. What to do? See the next lesson.

The Trouble with Timestamps


It's obviously tempting to read the TSC in a very tight loop, feeding each read into a pseudorandom number generator (PRNG), in order to produce what passes for the output of an ideal TRNG. The problems with such an approach are many, which we will investigate in detail below.

First of all, we need to introduce the concept of a "timedelta". A timedelta is the difference between 2 successive TSC reads. Usually, timedeltas are small because the reads occur much closer to one another in time than the full TSC span (usually 64 but sometimes 32 bits). Nonetheless, some rare events can result in massive timedelta observations: (1) performing the pair of reads across a protracted environmental interrupt, such as a system suspend event; (2) core relocation due to OS task rescheduling, occasionally resulting in quasinegative timedeltas; or (3) bad OS or firmware code which improperly saves and restores the TSC, for example, across a CPU clock halt event required for power savings.
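
To make this concrete, here is a timedelta in miniature (GCC or Clang on X86/X64, where __rdtsc() is declared in x86intrin.h):

  #include <stdint.h>
  #include <stdio.h>
  #include <x86intrin.h>  /* __rdtsc() */

  int main(void) {
    /* Two successive TSC reads; their difference is one timedelta. */
    uint64_t timestamp_first = __rdtsc();
    uint64_t timestamp_second = __rdtsc();
    printf("timedelta=%llu\n",
      (unsigned long long)(timestamp_second - timestamp_first));
    return 0;
  }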

The timedelta stream nonetheless serves as the exclusive source of "protoentropy", the highly biased but slightly unpredictable "soup" which gets blended into bona fide entropy, for Enranda. At least, this saves us the embarrassment of claiming that the initial TSC state represents any particular entropy content; Enranda assumes it to be completely predictable, which is part of its coldboot attack immune system.

Now, reading the TSC in the manner described above is overwhelmingly biased. After all, a CPU is a moderately predictable beast, executing tight loops in identical time, almost every time. So at first glance, this bias appears to be an asset: we could simply ignore all the "usual" timedeltas, retaining the oddballs as a source of entropy. For example, we might have a timedelta stream that looks like:

{5, 5, 5, 15}

in which case, we would ignore the 5s and keep the 15. But what, exactly, is that 15 worth in terms of bits? It occurred once out of 4 samples, so logically, it must be worth 2 bits. But there are some problems with that naive assumption. For example, the timedelta stream might continue like this:

{5, 5, 5, 15, 15, 15, 15, 15, 15}

in which case, the most reasonable conclusion would be that CPU frequency throttling by a factor of 3 simply expanded the usual timedelta by the same factor. But surely this cannot be true because the CPU should be able to execute each TSC read loop in the same number of clock cycles, regardless of the wall clock equivalent thereof. Yet, in fact, this counterargument turns out to be false on X86 and X64 CPUs (and probably others) due to a phenomenon called "TSC invariance".

TSC invariance is the scaling process by which the TSC tick rate is made to be constant relative to wall clock time, regardless of the current CPU clock frequency. This is easily accomplished by adding N (instead of one) on each TSC incrementation after dividing the clock frequency by N. (Noninteger scaling factors actually induce periodic cycles of addends which differ by one, for example {2, 2, 2, 3}, to make things look even more noisy despite maintaining absolute predictability.) I personally consider TSC invariance to be a design flaw because the TSC was never intended to measure wall clock time to begin with, but my contention is obviously moot.
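
As a toy illustration (my own, unrelated to the Enranda sources), here is a noninteger scaling factor of 2.25 realized as the periodic addend cycle {2, 2, 2, 3}:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
    /* With the core clock divided such that 2.25 invariant ticks elapse per
       core cycle, the TSC can add {2, 2, 2, 3} cyclically, because
       (2+2+2+3)/4 = 2.25: absolutely predictable, yet noisy-looking. */
    static const uint8_t addend_cycle[4] = {2, 2, 2, 3};
    uint64_t tsc = 0;
    for (uint32_t i = 0; i < 8; i++) {
      tsc += addend_cycle[i & 3];
      printf("read %u: tsc=%llu\n", (unsigned)i, (unsigned long long)tsc);
    }
    return 0;
  }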

Nevertheless we should be able to recognize scaled usual timedeltas in order to discard them as well. For instance, we know that this particular CPU can throttle by a factor of 3, so we can just ignore the 15s in addition to the 5s. Unfortunately, even if we resort to longterm statistical analysis, there is still no concrete means by which to discern CPU throttling factors, not least because continuous scaling across hundreds of different frequencies is likely to become the norm in the future because it maximizes power and thermal efficiency, not unlike the replacement of gearwise transmission in cars with continuously variable transmission. Moreover, different parts of the CPU operate in different clock domains. So for instance, despite the frequency throttling of the CPU by a factor of 2, the memory bus might be kept at the same speed because it consumes so little power relative to the CPU. In that case, we might find that a loop increases from 5 to 7 clocks. The 7 would look unique and remarkable from a naive integer ratio throttling perspective, but it would in fact be mundanely predictable and in no way representative of the notional scaling factor of 2.

It gets worse. We have what I call "autopseudorandomness", which is the tendency of software TRNGs to observe unique timedeltas due not to truly unpredictable environmental interrupts, but rather the pseudorandom effects of the read loop itself on CPU execution timing. For example, there might be memory accesses inside the loop which may or may not hit some particular level of the CPU cache, resulting in read stalls. The stalls might occur, or not, based on the pseudorandomness of cache contents. Instruction alignment might also play a role: because the CPU is constantly prefetching memory into the L1 (most local) cache, it might encounter delays due to speculatively prefetching memory addresses implied by instructions which exist beyond the end of the loop. Or perhaps the top of the loop is aligned on an odd byte, resulting in similar prefetching phenomena. Or worse, perhaps we are unknowingly executing inside a virtual machine, in which case the complicated but perhaps predictable behavior of the hypervisor (virtual machine manager) would fan the flames of autopseudorandomness.

Yet, when we read a sufficiently large number of timedeltas, even hypervisor autopseudorandomness can be filtered out to the extent to which it is truly predictable; any residue thereof would indeed contain entropy. (But what about the first loop? Is that not cold and predictable, yet speciously entropic? Yes, it is, which is why Enranda ignores most of the first (2^16) TSC reads (as explained later).) Ultimately, like all PRNGs, autopseudorandom sequences must by definition be periodic. (This is not theoretical. I have seen period 4 myself, for example.) While we cannot filter all periodic signals, Enranda has a robust filter for all periodic signals of realistic length in uncompromised machines, virtualized or not. We will describe this filter later.

For now, suppose for example that we have the following periodic timedelta stream:

{5, 5, 7, 8, 7, 7, 5, 5, 7, 8, 7, 7...}

So above we have period 6. Although 8 happens to occur only once per period, 5 and 7 occur more often, making it very difficult to discern the period without doing computationally expensive transforms (such as treewise string compression or a Fourier transform). Worse, what if the period is 20, just because some hypervisor resets something every 20 TSC reads? How would we detect such repetition efficiently? And moreover, how would we extract the residual entropy from quasiperiodic timedelta streams, which by definition are only mostly periodic but are peppered with rare unexpected values?

Notwithstanding these questions, let us assume that we somehow manage to delete all the periodic and predictable timedeltas from the stream, leaving a residual stream which is aperiodic and presumably unpredictable. When we analyze this stream, we notice that timedelta 29 occurs only about every 1000 filtered reads. We conclude that it must be worth about 10 bits because ((2^10)=1024). We need only hash the exact timestamp upon reading it, resulting in 10 white (uniformly unbiased) bits -- and voila! -- we have 10 bits of bona fide entropy.

Only, we don't! What we failed to realize is that the timedelta of 29 is due to some periodic hardware event, such as the audio device reading another line of data from memory once per millisecond. While it does not occur exactly every 1000 TSC reads, it occurs approximately every 1000, say 980 to 1020. So in fact, the jitter in the timedelta 29 observation is only worth about 5 bits, not 10, because the jitter window spans only about 40 possible arrival times and (log2(40)=~5.3)! The only way to see this clearly would be to observe millions if not billions of timedeltas, gathering statistics all the while.

So I tried this. But it doesn't work, either, because the world is unpredictable to us (yet potentially predictable to attackers) in ways which defeat all the best statistical engines -- never mind the appalling latency that we might induce. For example, it might be that a millisecond after we gather these timedelta statistics, a hard drive suddenly comes online and starts causing predictable yet unforeseen timing disruption. We mistake such disruptions for novelty, resulting in the output of predictable coldboot entropy. Game over.

Now perhaps you can appreciate why Enranda has been under development, on and off, for the past 3 years!

After discarding so many flawed designs, I experienced a rather obtuse epiphany: we must regard the entire timedelta stream as pseudorandom, as though it were entirely predictable to the hypervisor which pulls the strings controlling the TSC puppet. This is an obnoxiously pessimistic assumption, because, after all, noise happens, but it is nonetheless the safest assumption. How, though, does it leave any room to produce a TRNG at all? Technically, it doesn't. But there is such a thing as a PRNG of such complexity as to defy analysis to anyone not in direct control of the machine. Enranda regards the timedelta stream as a biased version of exactly this sort of byzantine PRNG, and uses it as its exclusive source of protoentropy.

An Informal Taxonomy of True Random Number Generators


Briefly, there are a few broad classes of TRNGs, enumerated in the following table. Enranda belongs to the "ultrahard" category.


Class: Broken
Typical Embodiment: A physical system which produces biased bits and outputs them directly as entropy, or perhaps a PRNG implemented in hardware and sold as a TRNG.
Typical Vulns: The TRNG itself may be hardened, but it doesn't matter because its output is already biased.

Class: Trust-Me-TRNG
Typical Embodiment: CPU "black magic" randomness register, open-source USB device, or a device which measures a nonlinear physical phenomenon (e.g. cosmic microwave background).
Typical Vulns: Hardware poisoning on the manufacturing floor. Furthermore, the hardware cannot be virtualized (or aliased to alternative hardware) so as to preempt poisoning. Output poisoning in the path connecting the physical randomness source to the output entropy.

Class: Quasiquantum
Typical Embodiment: Comes in a box with a "quantum" sticker on the side. Or perhaps this takes the form of an app which simply extracts quantum noise from the quantum dots that comprise your camera array (CCD). In any event, it claims to produce true random numbers from the fundamental uncertainty of the universe, which is certainly within the realm of what is physically possible.
Typical Vulns: "It looks quantum to me" does not mean that it is not merely a simulation thereof. Even if we trust the manufacturer, qubits are easily biased by all manner of interference, some of which might be predictable (or worse, inducible) by an attacker within physical proximity. For example, is the quantum dot CCD noise dependent upon lighting conditions? Can the pixel data be read via device emissions as it traverses the IO bus to the CPU? What happens if the camera has hot pixels, which are stuck in a particular configuration? Does that make the output occasionally predictable? Do we really want our cameras turning on at just the moment we need cryptographically hard randomness to protect our privacy anyway? How can this app be trusted to dispose of the captured image frame, rather than send it to some database on the Internet?

Class: Ultrahard
Typical Embodiment: Statistically indistinct from an ideal TRNG by every computable metric. Coldboot attack is futile. Completely open-source code with minimal dependency on hardware, which may be entirely virtualized in such a manner that the implementer has full control. Cannot be defeated other than by subverting the machine and taking control of its outputs via (possibly nested) virtualization (which even then would be immensely difficult and sometimes detectable); at which point, there is no reason to care about the TRNG because the machine is rooted and exposed to complete exploitation. Obtaining an exact copy of the machine would be completely useless for the purposes of predicting the output, even if the attacker had a quantum computer of planetary dimensions. In other words, forget it.
Typical Vulns: The only material vuln here, apart from a lazy security guard who left the door open to the server room, would be a bug in the implementation. For its part, Enranda has been carefully combed for such bugs. Nonetheless bug reports are highly encouraged from the security community, especially when they create security vulns. If you can demonstrate weaknesses in Enranda without subverting the OS, we will regard you as a hero, not a criminal!

Class: Ideal
Typical Embodiment: A theoretical unbiased source of bits.
Typical Vulns: None.

Intro

For the impatient, source code is here. Nevertheless reading this documentation is strongly encouraged.

Enranda is a true random number generator (TRNG) without any external hardware requirements. If that sounds like a contradiction in terms, recall that my previous such product, Jytter, was exactly that: a TRNG which produced cryptographically hard true random numbers using only the CPU timestamp counter (TSC) as a source of entropy, which can be read from unprivileged (userspace) apps under various modern operating systems (OSes).

But that was 2012. In the intervening 3 years, the generous and incredibly patient sponsorship of Tigerspike has allowed me to pursue, among other things, a much faster and cryptographically simpler TRNG. It is my hope that you will use this product to create better security products for all of us, so that we need not trust third party black boxes ("trust-me-TRNGs") -- or even open source hardware, for that matter, which can suffer from poisoned electronic components at the time of manufacture.

Before getting into the details, you should understand some of the criticisms that have been leveled against Enranda. As with anything else in life, it's a tradeoff between convenience and security. It may or may not be suitable for your particular application. That's your decision and your responsibility to evaluate. Here is the thread on the Randombit cryptography mailing list.

The features of Enranda are outlined below and thoroughly expounded in the documentation that follows.

1. Ultrahard (yes, we actually explain that!).

2. Roughly 4 megabytes per second on a commodity laptop.

3. Robust against mistaking autopseudorandomness for genuine physical entropy.

4. Employs realtime filtering of periodic signals which appear entropic but by definition have zero information content.

5. Originally developed for X86 and X64 architectures under Linux, MacOS, and Windows, but portable anywhere that the TSC or its equivalent can be read. (For example, search the source for ABSTIME, which is intended to support ARM but has not been tested.) See README.txt for a comprehensive summary of which environments have been tested.

Following is the table of contents for our comprehensive tutorial. You will find other articles posted to this blog from time to time, but here are the basics: