
Basic Electronics » Source of randomness for the hardware RNG

December 27, 2011
by bodangly

On the projects section, a hardware random number generator project is mentioned, but there is no link. I assume this means it's still being worked on. I am just wondering how to pick up on noise in the voltage to use as a source of randomness. I'm a programmer, so I am not worried about how to write the code, but I'm just not sure how to pick up on the noise to use it in my algorithm. Thanks!

December 27, 2011
by pcbolt

@bodangly

The first thing that came to my mind was using the "tempsensor" project. The ADC readings can be summed up and hashed any number of different ways. Good luck!

December 28, 2011
by bodangly

@pcbolt

I thought of that as well, but I'm not sure sampling the air temperature is acceptable in a secure application when any device can in theory sample that same air using an LM34. The thing is, when doing that project it looked to me like the lower bits appeared more random than the higher bits. I'd consistently get a reading of, say, 70 degrees, but the fractional part would vary more randomly. Will the noise in the ADC reading be secure enough? Maybe I can strip out the higher-order bits and gather a pool of entropy from the lower bits? I'm trying to come up with a device that can function as a source of entropy for a cryptographically secure RNG, feeding results via USB to a PC, possibly as a source of additional entropy for /dev/random.

December 28, 2011
by pcbolt

It would make for an interesting experiment. Take the lowest bit of each reading, say 1024 times, to generate a number. You could vary the sampling intervals and maybe even the reference voltage to distinguish it from an identical LM34 sensor. You would have to have a pretty secure USB connection, since that can be hijacked with software and/or hardware. Pretty cool concept, though.

December 28, 2011
by hevans
(NerdKits Staff)


Hi bodangly,

That sounds like a really cool project you have in mind. Taking the lower-order bits of an ADC conversion is probably an OK source to use as a seed, but as you mentioned, the higher-order bits are going to be pretty consistent from reading to reading, and across several sensors next to each other. Even if they are not the same from one reading to the next, they are certainly correlated with what the last sample was.

What I would probably do is rely on only the single lowest-order bit of your ADC reading. Given the noise in the system, I'm willing to say that last bit is pretty randomly 1 or 0 (you should probably run a few tests to convince yourself of that). You can then take 8 successive readings and have 8 random bits.
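Something along these lines is all I mean (a rough, untested sketch meant to drop into a program set up like the tempsensor project; it assumes an adc_read() style function that returns the raw 10-bit conversion result):

// rough, untested sketch: build one byte from the least significant bit
// of 8 successive ADC conversions
// assumes an adc_read() function (as in the tempsensor project) returning the raw result
uint8_t random_byte() {
    uint8_t b = 0;
    uint8_t i;
    for(i = 0; i < 8; i++) {
        b |= (uint8_t)(adc_read() & 0x01) << i;   //keep only the lowest bit
    }
    return b;
}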

Humberto

December 30, 2011
by bodangly

It seems this is a very viable source of randomness, with a little work. There is an inherent bias to the readings when they are taken close together; for example, if you just do a simple for loop to gather 8 bits and then return that byte, ad infinitum, you get long sequences of identical bits. But by taking many successive readings XOR'd together and then using the simple von Neumann algorithm (take two bits; if they are equal, discard them; if the pair is (1,0) return a 1; if it's (0,1) return a 0) I was able to whiten the bits and achieve what seems to be highly random output, though I still need to write some code to test it and verify. I'm just piping the output from USB into a file, and I'm putting together some Tcl code to test the randomness of that stream, but my initial results look like I may not even need a hashing function to distill the entropy further.

Next I'd like to AES encrypt the USB output to protect the stream from malicious software. However, this whole process is already a little taxing on the chip; I'm only getting a few bytes a second. I was looking at the Atmel site and it looks like the xmega line has a hardware AES module. They also look to have USB pins. I'd need to put together my own equivalent to uart.c (though in the case of one of the chips with USB pins I guess it would be usb.c) and lcd.c for my chip, but they look pretty easy to port. If I purchased my own programmer, would I need the NerdKits bootloader for anything at that point to use the chip? What kind of sucks is that the smallest one I can find appears to have 44 pins in a square configuration. Is there a way to make that work with a breadboard (either the supplied NerdKits one or another)? And lastly, I understand these will have their own registers and pinouts, but aside from that is there something I'm missing that would make using these substantially harder than what I've described? Thanks for all the help! Buying the NerdKit was one of the best choices I've made with money in a while =D

December 30, 2011
by bodangly

Er, let me clarify: I know the NerdKits bootloader probably won't just run on one of these chips since they have different pinouts and registers; what I meant was, would I need to port anything from the NerdKits bootloader? Thanks again.

December 31, 2011
by Rick_S

If you were able to get your hands on an ATxmega32A4U-AU, those come from the factory with a bootloader already installed. From what I can see, though, they are pretty new and not readily available at this time. They use Atmel's FLIP software for loading programs onto the device via USB. Those devices are USB native and as such would not require a USB serial cable. However, keep in mind that, as you have already noticed, they are all SMD. The device I listed above is a TQFP-44 package. These aren't too bad to solder to adapter boards that can break out the pins for breadboarding. You would need to build the USB-side circuit to allow it to communicate properly with your computer (I haven't looked, but most likely it would consist of a couple of resistors and maybe some diodes). Once that was done, you could write the program, compile it, then use FLIP to transfer it over.

It would be some extra steps beyond what is done with the NerdKit, but it would be a fun project if you need encryption.

Now, if you want to have these devices act as a serial device to a PC, you would have to program them to be USB-serial devices. This is similar to what the Arduino folks do with their newer Arduinos. They went with a second microcontroller (an ATmega16U2 on the newer boards) that is programmed to act as a serial port when connected to a computer.

The other xmegas would be more difficult to set up out of the box, as they require a JTAG programmer. You can install serial bootloaders on them, but you would have to buy a JTAG programmer to send one over to the chip. There are some development boards out there that have pre-installed bootloaders, and those would most likely be the easiest starting point if you wanted to get into the xmega series chips.

One last note: I just checked the avr-gcc website and it doesn't appear that the newer USB-based xmegas are supported yet. Just a little more food for thought.

Rick

December 31, 2011
by bodangly

I think I found the chip at mouser.com:

http://www.mouser.com/ProductDetail/Atmel/ATXMEGA32A4U-AU/?qs=HbI%2fMOA3e14zr9I2HFi4TQ%3d%3d

Unfortunately it doesn't look like they'll be in stock until the end of January or early February, but that's not too far off at least. So it sounds like my biggest hurdles will be building the side circuit and then building the serial support. I see the code for how Arduino does it looks to be in their source under hardware/firmwares/arduino-usbserial; according to their readme they base it off of LUFA's UsbToSerial project (http://www.fourwalledcubicle.com/LUFA.php), so at least there is something for me to work off of. Of course, with no avr-gcc support I am pretty SoL as far as using anything from that LUFA project. Maybe I'll put the encryption side of this project on the shelf for 6 months and let the support around those chips mature. I'm also ahead of myself as far as my EE knowledge goes; I doubt I'd be able to figure out the USB side circuit on my own at this point.

The good news is that the raw bits ALMOST pass Maurer's statistical test; I'm getting a value of L=7.1755384 when I should be getting 7.1836656. I'm pretty sure if I hash the pool on the PC side I'll have a really nice source of entropy; maybe I'll use the NIST test suite to check, since Maurer isn't really the be-all and end-all. It's amazing how simple it really is to get a source of entropy just using this sensor. I'm thinking if I used two sensors and XOR'd their output I might be able to pass Maurer without a hashing function. Would that be as simple as using pins 21, 22, and 24, so I am using pin PC1 (ADC1) for the second sensor?
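For what it's worth, here is roughly how I picture reading the two channels by rewriting the MUX bits in ADMUX before each conversion (an untested sketch, assuming the ADC is already enabled as in the tempsensor project and that the second sensor really does go on PC1/ADC1):

// untested sketch: pick ADC0 (PC0) or ADC1 (PC1) before each conversion
// by changing the MUX bits in ADMUX, then grab the lowest bit as before
uint8_t adc_read_bit(uint8_t channel) {
    ADMUX = (ADMUX & 0xF0) | (channel & 0x0F);   //channel 0 = PC0, channel 1 = PC1
    ADCSRA |= (1<<ADSC);                         //start a conversion on that channel
    while(ADCSRA & (1<<ADSC)) {
    }
    uint8_t low = ADCL;                          //ADCL must be read before ADCH
    uint8_t drop = ADCH;                         //reading ADCH completes the 10-bit read
    return low & 0x01;
}

//for example, one bit combined from the two sensors:
//uint8_t bit = adc_read_bit(0) ^ adc_read_bit(1);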

December 31, 2011
by Ralphxyz

bodangly, would you please define:

a source of entropy

My definition does not fit with how you are using the term "entropy".

Some of the viewers of your post (like myself) are of limited knowledge and probably could not help you with your question directly, but we use these postings to learn.

Thanks,

Ralph

December 31, 2011
by bodangly

@Ralph,

Sure thing. I'm very new when it comes to the electrical engineering aspect of all this myself, so I can't really explain WHY this is a source of entropy beyond the fact that there is a certain degree of noise in the voltage which manifests itself in the ADC reading. As for what entropy is, I am not referring to entropy as a thermodynamics term, but rather as it relates to information theory. You may have heard entropy defined as the amount of chaos in a system. This is what we are talking about here. When I grab the lowest bit of an ADC conversion, which I do as

uint_fast8_t result = ADCL & 0x01;

if this were a perfect source of entropy, then whether that bit is a 1 or 0 would be entirely random. This would mean each bit has an entropy of 1. If there is a bias, the entropy will be lower. Having a source of highly random bits is important for things like generating keys for cryptography, passwords, lotteries or gambling involving real money, etc. The statistical test I was referring to purports (I say "purports" because it's really not a perfect test, and there are more stringent tests such as the NIST tests: http://csrc.nist.gov/groups/ST/toolkit/rng/index.html) to measure the entropy of a block, in this case 8 bits (1 byte). I do a little bit more than simply grab the bits off of the conversion and use them directly, in order to try and "whiten" out any bias in them. I'm happy to comment my code and post it if you are interested to see what I do. The circuit is identical to the tempsensor, though I removed the LCD since I don't use it; I just output directly to USB.
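To put a number on it: for a single bit that comes up 1 with probability p, the entropy is H(p) = -p*log2(p) - (1-p)*log2(1-p) bits. At p = 0.5 that is exactly 1 bit; at p = 0.6 it drops to about 0.97 bits. So the more biased the source, the less entropy each raw bit really carries, which is why the whitening step matters.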

December 31, 2011
by Ralphxyz

The tempsensor project uses averaging; doesn't this reduce the noise? Isn't that what the averaging is there to do?

I think of entropy as a thermodynamics term, so whenever someone uses it in place of "noise" I draw a blank.

This isn't the first time I have had to ask for clarification about the use of "entropy"; hopefully it will sink in, but I do suffer from the AGE virus.

Ralph

December 31, 2011
by bodangly

Hopefully my code will help clarify some of what I was talking about. As you'll see I don't average the bits, but I do use successive readings to affect the last. If you set your board up for the tempsensor project you can compile and download this right to your chip. I also have my board wired to use USB for power, which is described in the appendix of the NerdKits guide.

Since I am not outputting to the LCD, you must access your USB serial port on your PC, or you can put back in the statement to include lcd.h and write last_byte to your LCD. This will SEVERELY slow down the generation of bits, but it's useful to test at a glance that random bytes are being generated. Otherwise, if you are on Linux you can simply type "cat /dev/ttyUSB0 > results" and let that run a while to generate a file filled with random data. You can then open that file up in a hex editor to see for yourself. If you aren't using Linux (it may work the same way in Mac OS X too, but I'm not sure) you can use HyperTerminal to read the data off your USB serial device and then save it into a file. Once you have a file filled with the random data you can use it to generate random numbers in a range (there's a rough sketch of that step after the code below), or you can read however many bits you need to create a specific crypto key, one-time pad, etc.

// entropypool.c
// for NerdKits with ATmega168
// based on tempsensor by mrobbins@mit.edu
// bodangly@gmail.com

#define F_CPU 14745600

#include <stdio.h>
#include <math.h>

#include <avr/io.h>
#include <avr/interrupt.h>
#include <avr/pgmspace.h>
#include <inttypes.h>

#include "../libnerdkits/delay.h"
#include "../libnerdkits/uart.h"

// PIN DEFINITIONS:
//
// PC0 -- temperature sensor analog input

void adc_init() {
    ADMUX = 0;
    ADCSRA = (1 << ADEN) | (1<<ADPS2) | (1<<ADPS1) | (1<<ADPS0);
    ADCSRA |= (1<<ADSC);
}

//Using type uint_fast8_t to favor speed over memory
uint_fast8_t adc_read() {

    while(ADCSRA & (1<<ADSC)) {
    }

    //We bitwise & by 1 here to isolate the least significant bit in the lower bits of the ADC reading
    //This is because the upper bits are strongly correlated - for example watch your tempsensor project and you will see you may consistently get 70 degrees, but the fractional part will vary
    uint_fast8_t result = ADCL & 0x01;
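    //Reading ADCH after ADCL completes the 10-bit read and frees the data registers for the next conversion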
    uint_fast8_t drop = ADCH;
    ADCSRA |= (1<<ADSC);
    return result;
}

uint_fast8_t neumannBit() {

    uint_fast8_t tBit1 = 0;
    uint_fast8_t tBit2 = 0;
    uint_fast8_t flag = 2;

    while (flag == 2) {
        //Read two fresh bits (plain assignment, so a leftover 1 from a discarded pair can't stick around)
        tBit1 = adc_read();
        tBit2 = adc_read();

        //Read another two bits and XOR them with the previous reading
        //The reason we do this is in a way the opposite of why we would average a reading
        //When we average we are reducing the effect of noise, whereas here we are actually using the noise from an additional reading to affect our bits
        //I plan on adding a second temperature sensor and instead XOR these bits with a reading from that sensor

        tBit1 ^= adc_read();
        tBit2 ^= adc_read();

        //This is an implementation of John von Neumann's whitening algorithm
        //https://en.wikipedia.org/wiki/Hardware_random_number_generator#Dealing_with_bias
        //Basically, if the two successive bits are equal we throw them out and take a new set of readings
        //If the ordered pair is (1,0) we return a 1, if it's (0,1) we return a 0
        flag = (tBit1 == tBit2) ? 2 : ((tBit1 == 1) && (tBit2 == 0) ? 1 : 0);
    }
    return flag;
}

int main() {
    adc_init();
    uart_init();
    FILE uart_stream = FDEV_SETUP_STREAM(uart_putchar, uart_getchar, _FDEV_SETUP_RW);
    stdin = stdout = &uart_stream;
    uint_fast8_t last_byte = 0;

    while(1) {

        //Creates a byte out of 8 bits from our neumannBit function
        //I don't use a for loop here for speed considerations, I know it will run 8 times, the overhead of the for loop is a waste so I "unroll" it
        //To be honest calling the neumannBit function 8 times is also wasted overhead and in the code downloaded to my device I actually expand the function each time rather than have a function call for it
        last_byte |= (neumannBit() << 0);   
        last_byte |= (neumannBit() << 1);   
        last_byte |= (neumannBit() << 2);
        last_byte |= (neumannBit() << 3);   
        last_byte |= (neumannBit() << 4);       
        last_byte |= (neumannBit() << 5);       
        last_byte |= (neumannBit() << 6);
        last_byte |= (neumannBit() << 7);

        // write byte to serial port
        printf_P(PSTR("%c"), last_byte);

        //Set byte back to 0 the fast way 
        //Should take fewer clock cycles as it equates to one XOR operation as opposed to moving a 0 into memory and then into the variable
        last_byte ^= last_byte;
    }    
  return 0;
}
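As a side note, once you have the results file on the PC, here is roughly how I'd turn those raw bytes into numbers in a range (an untested host-side sketch; the retry loop is there to avoid the slight modulo bias a plain % would introduce):

// untested host-side sketch: turn captured random bytes into numbers in [0, range)
// rejects bytes above the largest multiple of range so the result is unbiased
// assumes 0 < range <= 256
#include <stdio.h>

int next_in_range(FILE *pool, unsigned int range) {
    unsigned int limit = 256 - (256 % range);   //largest multiple of range that fits in a byte
    int c;
    while((c = getc(pool)) != EOF) {
        if((unsigned int)c < limit) {
            return (int)((unsigned int)c % range);
        }
        //otherwise reject this byte and read another one
    }
    return -1;   //ran out of data
}

//for example:
//FILE *pool = fopen("results", "rb");
//int roll = next_in_range(pool, 6) + 1;   //a die roll from the captured entropy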

January 01, 2012
by bodangly

I am thinking an uncorrelated sensor would be better as my second input to the ADC than simply another temperature sensor. I was looking at these piezo-film vibration sensors (http://www.parallax.com/Store/Sensors/PressureFlexRPM/tabid/177/CategoryID/52/List/0/SortField/0/Level/a/ProductID/89/Default.aspx). Apparently the voltage scales with how much the tab is flexing. I'm thinking the thing will probably always be picking something up, and the lowest-order bits of any ADC reading will probably be as random as the temperature sensor's. The only problem is the voltage scales very high; the datasheet says

"This device can generate voltages of ~70 volts. Always be sure to clamp, buffer or filter the signal going to the I/O pin to keep it within acceptable voltage/current limits."

Could I use a voltage regulator for that?

January 01, 2012
by Rick_S

Why not just leave the ADC pin you check floating? I think you'll find it generates randomness pretty well. Also, not to sound too dumb, but what does all the manipulation of a single bit do for you? I mean, it's either going to be zero or one. I could see XORing two somewhat random bytes to get a unique random byte, but bits? Seems like a bit (no pun intended) of overkill.

From what I see, you are getting two random bits, then either leaving them alone or toggling them (still two random bits), then if they are the same you throw them out and try again. So what is your end result? You either get 01 or 10. Two possibilities. Wouldn't your byte be just as random if you let equal pairs through? For that matter, why take two? Why not just take one bit 8 times, twice, then XOR the two full bytes?

Just thinking out loud. Another source of "randomness": when user input is required, you can grab a timer value. The odds of the timer being at the same value each time a user presses a button are very slim.
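Something like this is all I have in mind (a rough, untested sketch; it assumes a button wired from PC3 to ground with the internal pull-up enabled, and Timer0 left free-running):

// rough, untested sketch: grab a free-running timer value at the moment of a button press
// assumes a button from PC3 to ground and the internal pull-up turned on
void timer_button_init() {
    TCCR0B = (1<<CS00);      //Timer0 free-running at the full clock rate
    DDRC &= ~(1<<PC3);       //PC3 as input
    PORTC |= (1<<PC3);       //enable the pull-up
}

uint8_t byte_from_button() {
    while(PINC & (1<<PC3)) {
    }                        //wait for the pin to go low (button pressed)
    return TCNT0;            //whatever the timer happens to hold at that instant
}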

Rick

January 01, 2012
by bodangly

The reasoning behind throwing out bits that are the same is not to guarantee randomness. It is instead to help smooth out any bias, so to speak. We may end up throwing out a lot of bits in the end, but we get a much more useful stream. I've tried without doing any of this whitening, and what you end up with is long sequences of very low-value bytes and many identical bytes in a row. As for the two successive readings, that actually is not doing much here with only one sensor connected, because the readings are strongly correlated; they would probably need to be more than two readings apart for there to be no correlation, or at least less. In that case, I suppose a bunch of NOP instructions followed by a single reading would be better than what I am doing here, from the standpoint of its purported goal. If we have two biased streams and XOR them together, our bias (call it e) changes to 2e^2 (https://en.wikipedia.org/wiki/Piling-up_lemma explains it in detail). This is my reasoning behind the piezo-film vibration sensor as a second reading on another ADC pin. Then, instead of two readings from the same sensor XOR'd together, it would be two uncorrelated streams. I can drop the second reading XOR'd with the first and instead XOR a reading from the temp sensor with a reading from the vibration sensor.
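To put some quick numbers on both pieces: if each raw bit comes up 1 with probability p and successive bits really were independent, then P(1,0) = p(1-p) and P(0,1) = (1-p)p are equal, so the pairs von Neumann keeps map to 1 and 0 exactly 50/50 no matter what p is; the real catch is correlation between readings, not the bias itself. And for the XOR step, if each stream has bias e (say e = 0.05, so P(1) = 0.55), the XOR of two independent such bits is 1 with probability 0.5 - 2e^2 = 0.495, so the bias shrinks from 0.05 to 0.005.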

January 01, 2012
by bodangly

I also hash the stream with SHA-1, 160 bits at a time, salted with a few bytes from /dev/random. I ran some statistical analysis tools on the pools. Without hashing the output, my pool returned this from ent (http://www.fourmilab.ch/random/):

Entropy = 7.993634 bits per byte.

Optimum compression would reduce the size
of this 10604361 byte file by 0 percent.

Chi square distribution for 10604361 samples is 94124.68, and randomly
would exceed this value less than 0.01 percent of the times.

Arithmetic mean value of data bytes is 127.9251 (127.5 = random).
Monte Carlo value for Pi is 3.147981236 (error 0.20 percent).
Serial correlation coefficient is -0.002839 (totally uncorrelated = 0.0).

The output does not pass this test. The arithmetic mean, Monte Carlo, and serial correlation aren't horrible, but the p-value of 0.01 in the chi-square distribution is unacceptable. It also fails Diehard (http://stat.fsu.edu/pub/diehard/) and Ueli Maurer's test (though that one just barely). Not good for cryptographic or secure purposes. However, after hashing the output 160 bits at a time I got much better results:

Entropy = 7.999982 bits per byte.

Optimum compression would reduce the size
of this 10813380 byte file by 0 percent.

Chi square distribution for 10813380 samples is 268.95, and randomly
would exceed this value 26.24 percent of the times.

Arithmetic mean value of data bytes is 127.5237 (127.5 = random).
Monte Carlo value for Pi is 3.140318383 (error 0.04 percent).
Serial correlation coefficient is -0.000111 (totally uncorrelated = 0.0).

This is far, far better and is probably good enough for cryptographic purposes. But I think I can do much better with an additional stream, maybe good enough to not need the hashing functions.

January 01, 2012
by bodangly

Oh, and the bytes from /dev/random are there not to actually increase the randomness (it's the same couple of bytes hashed with the entire pool) but as a replacement for my idea of hardware crypto. I plan on salting the pool and then encrypting it in software on the PC, with a key displayed on the LCD of the device. So, any rogue process can read the stream as it comes off of /dev/ttyUSB0, but it will not have access to the same pool of entropy being used to generate keys or random numbers.

Sorry for all the posting in a row; I probably should have put this all in one!
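For anyone curious what the host-side hashing step looks like, here is roughly the shape of it (a simplified, untested sketch using OpenSSL's SHA1(); the file names are just placeholders, and the real version also feeds into the encryption step):

// simplified, untested host-side sketch: hash the raw pool 20 bytes (160 bits) at a time
// together with a small salt read from /dev/random, writing the SHA-1 digests out
// as the whitened pool (compile with -lcrypto for OpenSSL)
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void) {
    unsigned char salt[8];
    unsigned char block[8 + 20];
    unsigned char digest[SHA_DIGEST_LENGTH];

    FILE *rnd = fopen("/dev/random", "rb");
    FILE *in  = fopen("results", "rb");   //raw bytes captured from /dev/ttyUSB0
    FILE *out = fopen("pool", "wb");
    if(!rnd || !in || !out) return 1;
    if(fread(salt, 1, sizeof(salt), rnd) != sizeof(salt)) return 1;
    fclose(rnd);

    memcpy(block, salt, sizeof(salt));
    while(fread(block + sizeof(salt), 1, 20, in) == 20) {
        SHA1(block, sizeof(block), digest);
        fwrite(digest, 1, sizeof(digest), out);
    }

    fclose(in);
    fclose(out);
    return 0;
}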

January 02, 2012
by sask55

bodangly

Just a thought: could you use the length of the interval between changes in the ADC reading as your source of randomness? Since there are long (and, I assume, also some short) sequences of identical bits, is that an indication that the ADC is returning identical values for multiple sequential reads? You could read the ADC at as short an interval as possible and increment a counter with each identical read; using the least significant bit of the counter when the ADC reading changes may produce a source of randomness. Of course this could be implemented on any changing data source by using the length of time between changes, not the actual change in the data, as the source of randomness. My thinking is that the LSB of this time interval, when measured in small increments, would be very nearly random.
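Roughly what I am picturing, in case it helps (an untested sketch; it assumes an adc_read() style function that returns the full conversion result rather than just the lowest bit):

// untested sketch: count how many back-to-back ADC reads return the same value,
// then use the least significant bit of that count as the random bit
// assumes an adc_read() style function returning the full conversion result
uint8_t interval_bit() {
    uint16_t last = adc_read();
    uint16_t count = 0;
    while(adc_read() == last) {
        count++;
    }
    return (uint8_t)(count & 0x01);
}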

January 02, 2012
by bodangly

Sask55,

That is an interesting idea. Perhaps I can implement this with very few changes by just starting a timer when I throw out bits. I don't think this would be a good enough stream all on its own, though. The reason is that if we assume we are whitening a bias by throwing out bits, then the timer itself will be biased towards whichever bit is biased. If the length of the bit sequences we are throwing out approaches a constant value, then the timer will not be random over time. The thing is, I get a pretty constant rate of bytes being generated right now, which leads me to believe the number of bits being thrown out is not all that random. It is worth measuring for sure, though!
