To collect this data, Cloudflare has arranged about 100 lava lamps on one of the walls in the lobby of the Cloudflare headquarters and mounted a camera pointing at the lamps. The camera takes photos of the lamps at regular intervals and sends the images to Cloudflare servers. All digital images are really stored by computers as a series of numbers, with each pixel having its own numerical value, and so each image becomes a string of totally random numbers that the Cloudflare servers can then use as a starting point for creating secure encryption keys.
But they’re only using it as a seed, so the output would still be impossible to predict. It does feel like a checksum or something similar would approach a Gaussian distribution, though: by the central limit theorem, the more numbers you add up, the more Gaussian the sum gets, since we know an image will have a mean and finite variance.
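To make the Gaussian worry concrete, here is a rough sketch that builds a deliberately naive additive checksum over simulated images and checks how much of the result falls within one standard deviation of the mean (about 68% for a normal distribution). The pixel model (independent, uniformly distributed 8-bit values) and the names `additive_checksum` and `simulate_checksums` are assumptions for illustration only; real camera frames have correlated pixels.

```python
import random
import statistics

def additive_checksum(pixels):
    """Sum of pixel values: a deliberately naive 'checksum'."""
    return sum(pixels)

def simulate_checksums(num_images=1000, pixels_per_image=500, seed=0):
    """Checksums of simulated images (assumed: independent uniform 8-bit pixels)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        additive_checksum([rng.randrange(256) for _ in range(pixels_per_image)])
        for _ in range(num_images)
    ]

sums = simulate_checksums()
mean = statistics.fmean(sums)
stdev = statistics.pstdev(sums)

# Fraction of checksums within one standard deviation of the mean.
within_one_sigma = sum(abs(s - mean) <= stdev for s in sums) / len(sums)
print(f"mean={mean:.0f} stdev={stdev:.0f} within one sigma={within_one_sigma:.2f}")
```

The sums cluster around the mean instead of spreading evenly over their possible range, which is why a plain sum would make a poor seed on its own.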
There are ways to extract entropy from non-uniform data so that the output approaches, if not reaches, a uniform distribution.
A naïve, but surprisingly effective way to do this would be to put the data through a hashing algorithm of some sort.
Good hashing algorithms are specifically designed to make similar but non-identical inputs hash to values that appear unrelated.
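As a rough illustration of the hashing approach, the sketch below feeds heavily skewed input through SHA-256 from Python's standard `hashlib` and compares how many distinct byte values show up before and after. The skewed source (every byte is one of four values) and the helper name `condition` are made up for the example. Note that hashing only spreads out whatever unpredictability the input already has; the digest cannot contain more entropy than the input did.

```python
import hashlib
import random
from collections import Counter

def condition(raw: bytes) -> bytes:
    """Compress arbitrary input into a 32-byte SHA-256 digest."""
    return hashlib.sha256(raw).digest()

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical skewed source: every byte is one of four dark pixel values.
frames = [
    bytes(rng.choice((10, 11, 12, 13)) for _ in range(64))
    for _ in range(2000)
]

# Count which byte values occur in the raw frames and in their digests.
raw_counts = Counter(b for frame in frames for b in frame)
hashed_counts = Counter(b for frame in frames for b in condition(frame))

print("distinct byte values in raw frames:", len(raw_counts))
print("distinct byte values in digests:   ", len(hashed_counts))
```

The raw frames only ever use four byte values, while the digests cover the whole byte range fairly evenly.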
Depending on the data source, there may be more efficient ways of getting an unpredictable sequence of bits out of it. For image data, for example, the difference between each image and an average image may be more appealing than the plain image, but I’m not sure whether that’s legitimately “more random” or whether it just feels that way.
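A minimal sketch of the image-difference idea, using tiny four-pixel "frames" as stand-ins for real photos. The helper names (`mean_frame`, `difference_from_mean`, `seed_from_difference`) are hypothetical. On the "more random" question: subtracting a known average is a deterministic step, so it cannot add entropy; at best it strips out the static, predictable part of the scene before hashing.

```python
import hashlib

def mean_frame(frames):
    """Per-pixel average over a list of equally sized frames."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]

def difference_from_mean(frame, average):
    """Signed per-pixel deviation of one frame from the average frame."""
    return [round(pixel - avg) for pixel, avg in zip(frame, average)]

def seed_from_difference(frame, average):
    """Hash the deviations (wrapped into 0..255) into a hex seed."""
    diff = difference_from_mean(frame, average)
    payload = bytes(d % 256 for d in diff)
    return hashlib.sha256(payload).hexdigest()

# Three toy frames: mostly static scene with small per-frame changes.
frames = [
    [100, 150, 200, 50],
    [102, 149, 198, 50],
    [98, 151, 202, 50],
]

average = mean_frame(frames)
diff = difference_from_mean(frames[1], average)
seed = seed_from_difference(frames[1], average)
print(diff)
print(seed)
```

Only the pixels that moved contribute nonzero values to `diff`; the unchanged pixel contributes zero.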
Cloudflare uses a wall of lava lamps for some of their RNG needs
If you hash the image with a strong algorithm, even a single different pixel should produce a wildly different result.
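This is easy to check. The sketch below hashes a stand-in "image" (just 1024 bytes, not a real photo) with SHA-256, changes one pixel by a single brightness step, and counts how many of the 256 digest bits differ. The helper `hamming_distance` is defined here for the example.

```python
import hashlib

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Stand-in image: 1024 one-byte "pixels".
image = bytearray(range(256)) * 4

# Copy it and change one pixel by a single brightness step.
tweaked = bytearray(image)
tweaked[500] ^= 1

d1 = hashlib.sha256(image).digest()
d2 = hashlib.sha256(tweaked).digest()

flipped = hamming_distance(d1, d2)
print(f"{flipped} of 256 digest bits differ")
```

For a good hash, roughly half of the output bits should flip, which is what "wildly different" means in practice.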
Surely that can’t be uniform random though