<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Way of the exploding head</title>
	<atom:link href="http://warmcat.com/_wp/feed/" rel="self" type="application/rss+xml" />
	<link>http://warmcat.com/_wp</link>
	<description>Embedded and desktop Linux</description>
	<lastBuildDate>Fri, 12 Feb 2010 23:49:41 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Don&#8217;t let Production Test Be Special</title>
		<link>http://warmcat.com/_wp/2010/02/12/dont-let-production-test-be-special/</link>
		<comments>http://warmcat.com/_wp/2010/02/12/dont-let-production-test-be-special/#comments</comments>
		<pubDate>Fri, 12 Feb 2010 23:49:41 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Embedded Linux]]></category>
		<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Openmoko Lessons]]></category>
		<category><![CDATA[Software design]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=75</guid>
		<description><![CDATA[Lesson 3: Test is not special
Commonly in embedded work test is the &#8220;red-haired stepchild&#8221;, nobody wants to take care of it and by common, silent consent it is always left until last.  Eventually the need for a test plan becomes overwhelming as the date to go to the factory nears, and the task is assigned [...]]]></description>
			<content:encoded><![CDATA[<h2>Lesson 3: Test is not special</h2>
<p>Commonly in embedded work, test is the &#8220;red-haired stepchild&#8221;: nobody wants to take care of it, and by common, silent consent it is always left until last.  Eventually the need for a test plan becomes overwhelming as the date to go to the factory nears, and the task is assigned to the most junior engineers available, since everybody knows that test is the death knell of your career.</p>
<p>Coming in cold to an already-existing project, and excluded from its inner workings, these engineers try to create some kind of test coverage the best way they can.  At Openmoko two giant test suites were created, DM1 and DM2, written by people who were learning C for the first time.  I got the job of modernizing this code, so I know from experience that it was already truly terrible and bitrotted at an alarming rate.  However, I had to admire the guys who wrote it: with everything against them and little experience, they did manage to create something that provided test coverage at the factory, however much it was on life support.</p>
<h2>Totentanz</h2>
<p>Similarly, Openmoko used production test jigs, special additional PCBs that formed a kind of custom test environment for the PCB under test.  At one version of GTA03 there were so many test points added it was a serious concern that the board would break down under the overall pressure needed to mate the spring-loaded test probes to the test points.</p>
<p>Jigs and test points have an obvious advantage in terms of test throughput, but there are some big disadvantages.</p>
<p>First, you have to design and build the jig, and track changes to the actual device with it.  This effort is completely disconnected from moving your actual product on, except that it&#8217;s meant to help in production.</p>
<p>Second, test points don&#8217;t test your connectors; the test point may be connected OK but not the connector pin the user actually accesses.</p>
<p>Third, you need something else outside the device to assess what is happening on the test points, and the code for that also has to be written and maintained against changes in the actual product.  It also means the tests can&#8217;t be casually performed outside the factory, except perhaps by the original engineers if they have access to the ATE gear themselves.</p>
<h2>Pain into torture</h2>
<p>Additionally the bringup of GTA02 required special versions of U-Boot and kernel which had added &#8220;test magic&#8221; created by the test guys and unknown to anyone else.  These versions were seldom uplevelled.</p>
<p>Since GTA02 had raw NAND, it needed filling up at the factory with the rootfs.  The way to do this was via a very fragile OpenOCD using a custom USB &#8211; serial based device that was bitbanged.  It only worked with certain versions of the usb library needed to talk to it.</p>
<p>All of these quirks and requirements at the factory made production runs difficult and expensive to get right.</p>
<h2>I only hurt you because I love you</h2>
<p>I spent a lot of time thinking about how to avoid this end result the next time I designed something.  The mistakes started, I concluded, with having anything special for test.  The jig: special, and so evil.  Test kernels or bootloader: special -&gt; evil.  Test rootfs -&gt; evil.  Test software, like Openmoko&#8217;s DM1 and DM2: evil.  The device should naturally be able to test itself with the arrangements that already exist inside it to operate at all.</p>
<p>The answer to the problem of &#8220;production test&#8221; is to completely subsume it into the rest of the design.  So it is the responsibility of Linux drivers to provide enough functionality by probe errors, or sysfs features, that one can perform test and diagnosis.  The &#8220;test suite&#8221; should boil down to a bash script that is using features exposed in a normal shipping rootfs and kernel.  Bash is ideal because most of the test action will be calling existing commandline tools like ifconfig, ping, l2ping and grepping or looking at their return code, this is what bash is best at.  It&#8217;s also easily understood and edited by anyone who has worked with Linux for a while.</p>
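<p>As a rough illustration of how small such a suite can be: each check is just an ordinary command from the shipping rootfs, and its exit status decides pass or fail.  Bash is the natural fit for this; the sketch below uses Python purely to show the structure, and the interface name and address in the example list are placeholders, not anything from a real test plan.</p>
<pre>
import subprocess

def run_check(name, argv):
    """Run one command from the rootfs; exit status 0 means the check passed."""
    try:
        status = subprocess.call(argv, stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL)
    except OSError:
        status = 127   # a tool missing from the rootfs counts as a failure
    passed = (status == 0)
    print("PASS" if passed else "FAIL", name)
    return passed

def run_suite(checks):
    """Run every check, even after a failure, and return the failed names."""
    return [name for name, argv in checks if not run_check(name, argv)]

# Hypothetical example checks; eth0 and the address are placeholders.
EXAMPLE_CHECKS = [
    ("ethernet interface present", ["ifconfig", "eth0"]),
    ("gateway answers ping", ["ping", "-c", "1", "192.168.0.1"]),
]
</pre>
<p>Calling run_suite(EXAMPLE_CHECKS) on the device prints one PASS or FAIL line per check and returns the names of the checks that failed, so the same list can be run at the factory or by an end user.</p>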
<p>The bootloader is required for test in only one capacity: it is the only part of the system capable of running the SDRAM tests, because once you enter Linux you can&#8217;t perform a full SDRAM test any more.  But even that should be done by the one shipping bootloader image.</p>
<p>In many cases, device interfaces can be tested by external loopback connectors, this proves connectivity through the connectors and it leaves open the possibility of end-users being able to run the same tests on the shipping rootfs.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2010/02/12/dont-let-production-test-be-special/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bootloader Envy</title>
		<link>http://warmcat.com/_wp/2010/02/08/bootloader_envy/</link>
		<comments>http://warmcat.com/_wp/2010/02/08/bootloader_envy/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 20:14:54 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Embedded Linux]]></category>
		<category><![CDATA[Linux peripherals]]></category>
		<category><![CDATA[Openmoko Lessons]]></category>
		<category><![CDATA[Software design]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=67</guid>
		<description><![CDATA[Lesson #2:  A bootloader is to load and boot Linux
On the first day of FOSDEM I sat through a presentation on what could be called another &#8220;U-Boot derivative&#8221;.  One of the greatest asspains at Openmoko was the various kinds of Hell caused by the U-Boot bootloader and its philosophy, which can be summed up as [...]]]></description>
			<content:encoded><![CDATA[<h2>Lesson #2:  A bootloader is to load and boot Linux</h2>
<p><img class="alignleft" title="Qi" src="http://warmcat.com/qi.png" alt="" width="126" height="183" />On the first day of FOSDEM I sat through a presentation on what could be called another &#8220;U-Boot derivative&#8221;.  One of the greatest asspains at Openmoko was the various kinds of Hell caused by the U-Boot bootloader and its philosophy, which can be summed up as &#8220;I wanna be Linux when I grow up&#8221;.</p>
<h2>Configure system is a bad alternative to good bootloader design</h2>
<p>First, it has a config system.  That should be good though, right?  The problem with the config system is that if anything differs from your current config, you must build another incompatible binary with another config and take care of that.  When you have more than a handful of different boards, you are in a maze of incompatible bootloaders.  Openmoko took it one step further, they mandated a different bootloader binary per PCB revision, so left unchecked there would have been a continuous proliferation of incompatible bootloaders, all basically the same.</p>
<h2>All persistent bootloader private state is EVIL</h2>
<p>Second, U-Boot thinks it&#8217;s a good idea to have these environment &#8220;scripts&#8221;, because it&#8217;s &#8220;configurable&#8221;.  Actually, the job of a bootloader is to Load, then Boot Linux.  You don&#8217;t need any configurability for that if the bootloader can figure out what it&#8217;s running on and therefore where the memory is and how much there is.  These scripts expose a really deadly trap I call &#8220;private bootloader state&#8221;.  It means the bootloader stores stuff in nonvolatile memory on the PCB and acts different according to what it hides there.  The end result is that two boards from the same factory may act totally different even with the same rootfs due to &#8220;bootloader secrets&#8221;.  This is totally needless and ALL private bootloader state can be eliminated by correct design of the bootloader leading to completely deterministic boot action per rootfs.</p>
<p>A good example of how that leads you down the path to hell is hardcoding, in the U-Boot environment, the amount of kernel image you will copy from somewhere.  People commonly set it to 2MBytes, forget about it, and one day they generate a 2.1MB kernel image and wonder why decompression blows up.  Actually, that whole procedure is insane: the kernels are uImages that report their length in a header.  The bootloader should examine the header and compute the length of image to pull.  But that doesn&#8217;t fit with this &#8220;environment&#8221; nonsense.</p>
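<p>To make the header point concrete, here is a minimal sketch of reading the length out of a legacy uImage header.  It assumes the usual 64-byte layout (seven big-endian 32-bit words, four single-byte fields and a 32-byte name, with the payload size in the fourth word); the function name is made up for illustration and this is not code from any bootloader.</p>
<pre>
import struct

UIMAGE_MAGIC = 0x27051956                   # legacy uImage magic number
UIMAGE_HEADER = struct.Struct("!7I4B32s")   # 64 bytes, big-endian fields

def uimage_total_length(header_bytes):
    """Return header plus payload length in bytes for a legacy uImage."""
    fields = UIMAGE_HEADER.unpack(header_bytes[:UIMAGE_HEADER.size])
    magic, data_size = fields[0], fields[3]
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a uImage: bad magic")
    return UIMAGE_HEADER.size + data_size
</pre>
<p>With the length taken from the image itself, a 2.1MB kernel gets copied in full and there is no stale 2MByte constant to forget about.</p>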
<h2>Do Linux Stuff In Linux</h2>
<p>In any of these bloated U-Boot style bootloaders, is there even one feature they do better than the same feature in Linux?  The startup time should be better by a few 100ms.  Other than that, no, every single bloated &#8220;I will add it to the bootloader because I can&#8221; feature is shittier than you get in Linux.  Every single feature!</p>
<p>If you need some advanced capability or backup / recovery boot action, check for a button held down at boot-time in the bootloader and go fetch a different Linux partition + kernel.  Use standard Linux tools and shells.  In return, get really high quality network stack, proper USB support, NAND access that&#8217;s compatible to your main Linux system access in BBT / ECC terms, and all the other advantages of Linux.</p>
<h2>Do your peripheral bringup in drivers in Linux</h2>
<p>Typically you do not need ANY bringup in the bootloader except SDRAM controller and chip init, since the RAM has to be initialized before Linux can be put into it.</p>
<p>That&#8217;s right, all the megabytes of source spent in U-Boot providing support for so many kinds of peripheral is a waste of time, effort and maintenance.  I am being kind saying &#8220;maintenance&#8221;, because the drivers in U-Boot are typically &#8220;dumbed down&#8221; versions of the equivalent Linux driver that were forked irretrievably the moment all the Linux APIs were ripped out, so there&#8217;s no coherent effort to keep them up to date with the Linux ones.  Lately I saw that they try to ape some Linux APIs there&#8230; why not go the whole hog and just <strong>load and boot real Linux</strong>?  After all, modern CPUs can be running your driver probes in Linux in ~2 seconds from power using a bootloader that doesn&#8217;t get in the way.</p>
<p>You typically don&#8217;t even need to talk to the PMU in the bootloader, after all, you are running code fine already, right?  Otherwise you wouldn&#8217;t be able to run the bootloader code itself.</p>
<h2>Fat girl in Ibiza</h2>
<p>At least at Openmoko, code quality inside U-Boot was awful bad.  I called U-Boot on the lists there &#8220;the fat girl in Ibiza&#8221; because you know she&#8217;s going to do anything you want.  All kinds of constant-only code, weird new scripting keywords were added for test undocumented, you name it.  Hardware guys felt up to writing such code secretly by themselves once they learned the software engineering marvel that is *((unsigned int *)0x&#8230;) = 0x&#8230;;</p>
<h2>Your bootloader just tests SDRAM</h2>
<p>There&#8217;s only one test action your bootloader is suited to do, and that is SDRAM test.  Once you are in Linux, it can&#8217;t perform a full SDRAM test while it&#8217;s running.  But the bootloader is typically starting from on-CPU SRAM, so it can actually run a true SDRAM test from there.  Otherwise, the bootloader should be completely absent from the test plan.  All other tests should be performed in Linux via standard driver and rootfs tools.</p>
<p>More about test and board bringup will feature in another report of a lesson learned.</p>
<h2>Qi</h2>
<p>While at Openmoko (mainly) I wrote a bootloader that meets these ideals; you can find it <a title="Qi git" href="http://git.warmcat.com/cgi-bin/cgit/qi/log/?h=txtr">in git here</a>.  One of the nicest things about it is that, unlike the bloated bootloaders whose job never finishes as they try to become Linux cargo-cult style, Qi has been pretty much complete for a few months.  It&#8217;s a new job to support a new CPU, a much smaller job to add a new board, and it doesn&#8217;t want to talk to your peripherals anyway so no problem there.</p>
<p>Qi creates one binary per CPU, which supports all boards with that CPU.  That sounds like a big job, but we don&#8217;t care about your peripherals so all boards with the same CPU look almost identical.  You have to find something that can detect your particular board at runtime, for example a NOR device ID read check.  So there is zero build-time config and Qi generates all CPU support when it&#8217;s built; it takes 3 sec or so typically.</p>
<p>Typical bootloader binary size per CPU is 28-30KBytes.  That supports VFAT, ext2/3/4 and typically the SD controller as well.  The single Qi image also supports being booted from NAND, JTAG or SD Card on processors that support it, just by being copied into place and without any changes.</p>
<p>There is zero bootloader private state, however Qi can look in the rootfs and append kernel commandline text from the content of a filesystem file.  This maintains the rule that boot should be completely deterministic per rootfs.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2010/02/08/bootloader_envy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fosdem and the Linux Cross Niche</title>
		<link>http://warmcat.com/_wp/2010/02/08/fosdem-and-the-linux-cross-niche/</link>
		<comments>http://warmcat.com/_wp/2010/02/08/fosdem-and-the-linux-cross-niche/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 12:42:34 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Embedded Linux]]></category>
		<category><![CDATA[Openmoko Lessons]]></category>
		<category><![CDATA[Software design]]></category>
		<category><![CDATA[fosdem cross build distro fedora]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=56</guid>
		<description><![CDATA[
I was at Fosdem over the weekend, there were several interesting talks I attended but the most interesting one for me was a roundtable about the future of Cross distributions.  I was invited to give a 5 minute talk there which I gave, but unfortunately it was right at the end and the people before [...]]]></description>
			<content:encoded><![CDATA[<p><img class=" alignleft" title="fosdem" src="http://warmcat.com/fosdem.png" alt="fosdem" width="121" height="116" /></p>
<p>I was at Fosdem over the weekend, there were several interesting talks I attended but the most interesting one for me was a roundtable about the future of Cross distributions.  I was invited to give a 5 minute talk there which I gave, but unfortunately it was right at the end and the people before had overrun, so there was no time to make much of a coherent case.   So I am going to write some articles covering the issues involved here.</p>
<h2>Cross as a niche</h2>
<p>Cross itself remains absolutely necessary for systems below a certain level of horsepower.  For example, 8051, ARM7 and Cortex M3 parts are not really capable of a native build.  But processors get faster each year: a lot of things we would have used an 8051 for use an ARM7 or Cortex M3 now, and in a few years it is likely that the baseline will have moved further up to an ARM9 equivalent.  What I am suggesting, then, is that over time the niche where you need cross is shrinking.</p>
<p>All four of the cross distros at FOSDEM target a CPU that&#8217;s powerful enough to run Linux, but not powerful enough to build its own binaries.  That is the niche that I believe will shrink to the point that it won&#8217;t support all these cross Linux distro projects, possibly none of them in the end.</p>
<h2>My background with cross Linux</h2>
<p>A few years ago I created an RPM-based cross distro singlehanded, and used it on a product for a customer.  This was AT91RM9200-based, a 200MHz ARM9 with 32MBytes of SDRAM.  The amount of effort needed to create a set of cross packages sufficient to create a workable rootfs was huge; it took me many weeks.  Some packages like perl were just so cross-unfriendly that they were basically out of reach (although I later saw other people have done the invasive magic necessary).  It did work well, and I added patches for busybox RPM support that allowed it to do more useful things like erase and keep a package database.  The packaging was valuable in itself, but a nice advantage was the source RPMs it generated, ensuring GPL compliance.</p>
<h2>My background with Openmoko</h2>
<p>Subsequently I spent 14 months as (mainly) the kernel maintainer for Openmoko.  Openmoko had an OpenEmbedded basis for its rootfs, also a cross system.  I attempted to use it for &#8220;hello world&#8221; while I was at Openmoko, but it broke because I was on a newly released Fedora.  How it broke was very revealing: the official way to get started with it was to run a huge script that wgetted and locally built 1100 packages.  It died due to some assumption somewhere breaking while it tried to build <strong>host</strong> dbus libraries.</p>
<p>What I wanted was a cross toolchain that would let me package &#8220;hello world&#8221;.  What I got was a massive host build action including host dbus libs.  I have perfectly good host dbus libs in my Fedora install, I enquired about it and was told they were the &#8220;wrong&#8221; libs for the expectation of the rest of the packages, so they had to be rebuilt.</p>
<p>I gave up on trying to use OpenEmbedded, as I guess most of Openmoko&#8217;s customers did.</p>
<p>After Openmoko imploded, I designed the software architecture (and influenced the hardware design in some aspects) for the txtr reader device.  On this device, I put into action various lessons I had learned in how not to do things from Openmoko.  I will write further about the other lessons in future articles, but here&#8217;s the first one:</p>
<h2>Lesson #1: Don&#8217;t compile your own rootfs</h2>
<p>I was told by a manager at Openmoko that Openmoko had hired most of the main devs of OpenEmbedded and were paying for that accordingly.  This was a pretty big drain on their resources over a long period.</p>
<p>In contrast, nowadays you can head over to <a title="Fedora ARM project" href="http://fedoraproject.org/wiki/Architectures/ARM" target="_blank">http://fedoraproject.org/wiki/Architectures/ARM</a> and download a generic <a title="rootfs tarball" href="http://ftp.linux.org.uk/pub/linux/arm/fedora/rootfs/rootfs-f12.tar.bz2">rootfs tarball </a>of prebuilt binaries for ARMv5 and above[1].  It&#8217;s made from unpacking prebuilt binary packages.  Once you boot into it, you can install further packages with the usual yum install type action.  You can be up in a high quality rootfs in five minutes flat.</p>
<p>You do not need to go around compiling everything personally when binary packages exist from a reputable distro already.  Normal distros provide -dev and -devel packages for you to link against too, so you do not need to recompile the universe just because you want to build &#8220;hello world&#8221; either.  That&#8217;s how we do things on desktop and server systems, as the processors involved get stronger embedded does not have to be different.</p>
<p>If you want to cross-build specific packages, you install the <a title="cross toolchain" href="http://fedoraproject.org/wiki/Architectures/ARM/CrossToolchain">Fedora ARM Cross Toolchain RPMs</a> on your host via yum and you are ready to go in a couple of minutes.  This is very useful for cooking the kernel on your host both to get started and during development; you can&#8217;t native-build the bootstrap stuff needed to boot your platform.  But that&#8217;s just a cross compiler and related pieces, it&#8217;s not a cross distro.  (The guy from emdebian at this FOSDEM talk also made the point that you do not need to get into making your own toolchain; your distro should have one you can just install.)</p>
<p>Fedora ARM&#8217;s strategy is native build.  So you install gcc and other dependencies into the actual device, and use standard rpmbuild to build your package there; you can also just configure ; make ; make install for development too down there.  If something&#8217;s missing on the rootfs you can yum install it.</p>
<p><em>([1] To make the comparison fair to Openmoko, Fedora ARM came along too late for them to choose it from the start, and since the GTA02 s3c2442 was not a v5 class processor, they would have been into a distro recook after changing the distro-level compile options.  However, my worry is not repeating Openmoko&#8217;s errors, and today Fedora ARM is available.)</em></p>
<h2>Quality and Quantity</h2>
<p>Another major issue is distro quality.  I was so surprised to hear at Fosdem Dr Mickey Lauer of OpenEmbedded boast about the number of devices that managed to use that distro (including the sad shape of the GTA02) and say that unlike the other cross distros, OpenEmbedded focused on &#8220;Quantity not Quality&#8221;.  From my experience I think he&#8217;s right alright about not focusing on quality, and he did go on to explain there are problems with OpenEmbedded they are trying to address.</p>
<p>In the near future, there will be a carcrash between these difficult cross distros that have relatively poor quality and strange requirements to use them and standard, &#8220;proper distros&#8221; like Fedora ARM, because on higher-end ARMv5s say 400MHz and above, it is already perfectly possible to compare the two worlds on the same device.  I think many devs currently are trained by their experience with buildroot type systems to assume they have to personally build everything Gentoo style.  However as CPUs increase in power at the same price point, the ways of working with these systems efficiently change, and desktop / server &#8220;treat it like a PC&#8221; lessons like the value of packaging start to really show their traditional advantages over rootfs tarballs.</p>
<p>Like Debian, Fedora has all kinds of rules and requirements about packaging to ensure high quality, there are a huge number of users of these two normal distributions that leads to tested and debugged basic packages and their dependencies.  OpenEmbedded&#8217;s boast about number of users is not even a blip in comparison to Fedora or Debian&#8217;s consumers and contributors.</p>
<h2>Cross distros are locked into local patch hell</h2>
<p>An even worse problem for their quality than the small number of users is the patch load these projects are carrying.  I think all of the cross distro projects bemoaned that they were carrying huge patchsets across a large number of packages to get them to build cross at all, and that most upstreams did not care to take them (I assume they don&#8217;t want to have to get into testing them).  Uplevelling packages, which distros have to do daily when they have a large package universe, can become a nightmare of breakage because of the private patchsets being dragged around.</p>
<p>(BTW I also saw in another presentation that the <a title="limo" href="http://www.limofoundation.org/">limo foundation</a> are carrying around more than 80MBytes of diff between their distro and the upstream projects, and these are the guys who sent out a <a title="limo whitepaper" href="http://www.limofoundation.org/images/stories/pdf/limo%20economic%20analysis.pdf">whitepaper</a> explaining the massive cost of delaying sending patches upstream in dollar terms.)</p>
<p>A unified crossbuild patch promoting effort was proposed, but it seemed to consist only of a domain like &#8220;sends-patches.org&#8221; that you could use when sending patches instead of your own project name, which seems to be just tea and sympathy rather than a solution.</p>
<p>It&#8217;s clear that quality will tend to be higher if you are getting packages built with normal distro specfiles and no pile of local patches to get them to build cross (because they were built native).  Combined with higher quality thresholds at the project level and sheer number of users, native Fedora (or Debian) rootfs basis will provide Quantity <strong>and</strong> Quality if your processor is appropriate.</p>
<p>A couple of hours after the talk I had an interesting conversation with <a title="openinkpot" href="http://openinkpot.org">OpenInkpot</a> dev Mikhail Gusarov, who I found also <a title="openinkpot and openembedded" href="http://openinkpot.org/wiki/FAQ#Whyareyouusing.debsandIPlinux">shared my lack of enthusiasm for OpenEmbedded</a>, although he is trapped still in the cross niche generally by the weak processors he targets at the moment.</p>
<p>[update Feb 10 09:00] Mikhail has <a href="http://fossarchy.blogspot.com/2010/02/cross-build-systems-and-their-future.html">written his own response</a>, he still likes the speed of cross (and still hates OpenEmbedded).  But there&#8217;s some confusion about what Fedora ARM offers, it&#8217;s a generic ARMv5 rootfs, it doesn&#8217;t care what exact kind of CPU, vendor or peripherals available.  Build farms are less of a requirement when you are no longer building your rootfs but installing it from distro binary packages.  <a href="http://en.wikipedia.org/wiki/SheevaPlug">Sheevaplug</a> makes available a 1.2GHz Marvell ARM compatible with 512MBytes of SDRAM that Fedora ARM can work on if you need a native build machine.  Shortly fast dual processor Cortex A9 machines will become available.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2010/02/08/fosdem-and-the-linux-cross-niche/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Whirlygig Verification and rngtest analysis</title>
		<link>http://warmcat.com/_wp/2009/05/21/whirlygig-verification-and-rngtest-analysis/</link>
		<comments>http://warmcat.com/_wp/2009/05/21/whirlygig-verification-and-rngtest-analysis/#comments</comments>
		<pubDate>Thu, 21 May 2009 08:53:20 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=54</guid>
		<description><![CDATA[ENT
Here is 300MB of random from the device checked by ENT (notice I am not using -b as I was before, without it it is checking entropy on BYTE scale which is tougher):
$ ./ent dump
Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 306380800 byte file by 0 percent.

Chi square distribution for [...]]]></description>
			<content:encoded><![CDATA[<h2>ENT</h2>
<p>Here is 300MB of random from the device checked by ENT (notice I am not using -b as I was before, without it it is checking entropy on BYTE scale which is tougher):</p>
<pre>$ ./ent dump
Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 306380800 byte file by 0 percent.

Chi square distribution for 306380800 samples is 253.74, and randomly
would exceed this value 51.06 percent of the times.

Arithmetic mean value of data bytes is 127.5022 (127.5 = random).
Monte Carlo value for Pi is 3.141608288 (error 0.00 percent).
Serial correlation coefficient is 0.000074 (totally uncorrelated = 0.0).</pre>
<p>ENT gives better results for Whirlygig in line with how much you feed it.  With a 40MB test file, it reported entropy of 7.999996.  That makes sense when you consider the data being really random, it shows its true colours only in the longer term since sample by sample, it can be doing anything at all.</p>
<h2>rngtest</h2>
<p>Rngtest had always puzzled me so most of this post is devoted to picking apart the meaning from these results from 1.27Tbits of Whirlygig randomness (1271Gbits).</p>
<pre>rngtest: bits received from input: 1271367467008
rngtest: FIPS 140-2 successes: 63517865
rngtest: FIPS 140-2 failures: 50508
rngtest: FIPS 140-2(2001-10-10) Monobit: 6560
rngtest: FIPS 140-2(2001-10-10) Poker: 6444
rngtest: FIPS 140-2(2001-10-10) Runs: 18865
rngtest: FIPS 140-2(2001-10-10) Long run: 18947
rngtest: FIPS 140-2(2001-10-10) Continuous run: 12
rngtest: input channel speed: (min=39.329; avg=8626.930; max=19531250.000)Kibits/s
rngtest: FIPS tests speed: (min=332.192; avg=105801.561; max=114217.836)Kibits/s
rngtest: Program run time: 155670833366 microseconds</pre>
<p>Considering it calls itself &#8220;rngtest&#8221;, at first sight there are a shocking number of &#8220;failures&#8221;.  Over 63,568,373 &#8220;tests&#8221;, 50508 &#8220;failed&#8221;.  Is something wrong with Whirlygig?  I went and studied the <a href="http://gkernel.cvs.sourceforge.net/viewvc/gkernel/rng-tools/fips.c?revision=1.5&amp;view=markup">rngtest sources</a> to figure out what it was actually doing.</p>
<h2>FIPS 140-2</h2>
<p>rngtest is based on a <a href="http://www.scribd.com/doc/11305936/NIST-Statistical-Test-Suite-for-Random-and-PseudoRandom">document</a> from NIST which goes into detail about assessing random output.  It&#8217;s based on 2500-byte blocks of random data which have various tests applied to them.  But since the source is meant to be truly random, what does it mean to &#8220;test&#8221; the packet?  Any bit pattern can come in there, each is equally likely as any other, including a whole packet of 0 or 1.  How can some be considered &#8220;bad&#8221;?</p>
<p>Actually a &#8220;bad&#8221; packet cannot be considered &#8220;bad&#8221; in isolation.  Instead you have to look to the spread of packets meeting and &#8220;failing&#8221; the test criteria against the theoretical probability of their occurrence over time, to see if your random source has one kind of bias or another.  An individual &#8220;bad&#8221; packet can&#8217;t be said to be bad unless the history of failures is suggesting that there is a bias to generate these bad packets.</p>
<p>Unfortunately, I could not find any documentation about rngtest that explained the expected rate of failures from a genuinely random source.  I managed to calculate two of the five.</p>
<h2>Monobit</h2>
<p>Monobit is just looking for a 50% distribution of 1s in each 20000-bit packet.  If the count of 1s in a packet is 275 or more above or below the expected 10,000, then it&#8217;s a fail.  Obviously a packet with 1 or 10 extra bits is highly probable.  I found out that these counts should follow a &#8220;normal distribution&#8221;, but I was unable to calculate where on the curve &#8220;275 more or fewer 1s&#8221; should fall; it&#8217;s a 0.0275 skew on the expected figure of 10,000&#8230; if anyone can help me it would be most welcome.</p>
<p>In our case, the observed probability of a monobit failure from my Whirlygig was 0.000103, or 1:9690.</p>
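<p>For what it&#8217;s worth, here is a rough sketch of how the theoretical monobit figure could be estimated.  It assumes the count of 1s in a 20000-bit packet is approximately normal with mean 10,000 and standard deviation sqrt(20000 x 0.25), and that a packet fails once the count is 275 or more away from the mean.  The function name is made up for illustration and the normal approximation is an assumption, not something taken from the rngtest sources.</p>

```python
import math

PACKET_BITS = 20000      # bits per FIPS test packet (from the text)
FAIL_DEVIATION = 275     # distance from 10,000 ones that counts as a fail


def monobit_fail_probability(bits=PACKET_BITS, deviation=FAIL_DEVIATION):
    """Two-sided tail probability under a normal approximation (assumed)."""
    sigma = math.sqrt(bits * 0.25)          # binomial std dev with p = 0.5
    z = deviation / sigma                   # how many sigmas away the limit is
    return math.erfc(z / math.sqrt(2.0))    # P(|Z| >= z) for a standard normal


p = monobit_fail_probability()
print(f"approx. monobit failure probability per packet: {p:.6f}")
```

<p>Under these assumptions the estimate lands near 0.0001 per packet, which is the same order as the 0.000103 observed above.</p>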
<h2>Poker</h2>
<p>Poker is just looking at the distribution of nybbles.  It takes each byte as two 4-bit nybbles, and for each of the 5000 nybbles in the test packet, maintains a count of occurrences of 0 &#8211; 0xf.  These counts are squared and summed, and the sum is then compared to two constants: greater than 1576928 or less than 1563176 gets you a fail.</p>
<p>Again I have no idea how to calculate the theoretical probability of a &#8220;failure&#8221; here, but our observed probability is 0.000101 or 1:9864.</p>
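<p>To make the poker check concrete, here is a minimal sketch of the statistic.  It assumes the 16 nybble counts are squared and summed before being compared against the two constants; the function names are hypothetical and this is not the rngtest code itself.</p>

```python
POKER_LOW = 1563176    # sum-of-squares limits quoted in the text
POKER_HIGH = 1576928


def poker_sum(packet):
    """Sum of squared counts of each 4-bit value in a 2500-byte packet."""
    counts = [0] * 16
    for byte in packet:
        counts[byte // 16] += 1    # high nybble
        counts[byte % 16] += 1     # low nybble
    return sum(c * c for c in counts)


def poker_fails(packet):
    total = poker_sum(packet)
    # Exact boundary handling (strict or inclusive) is an assumption here.
    return total > POKER_HIGH or POKER_LOW > total


all_zero = bytes(2500)             # every nybble is 0, so one count is 5000
print(poker_sum(all_zero), poker_fails(all_zero))
```

<p>Note that the lower limit means a packet can also fail for being too evenly spread: counts of 312 and 313 for every nybble value give a sum below 1563176.</p>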
<h2>Run</h2>
<p>A run is a series of &#8220;all 1s&#8221; or &#8220;all 0s&#8221;.  rngtest counts how many times it sees a run of each length from 1 through 6 (any run longer than 6 bits is counted as being six bits).  The count for each run length is then compared against a magic table:</p>
<p>1-bit: 2315 &lt; run &lt; 2685<br />
2-bit: 1114 &lt; run &lt; 1386<br />
3-bit: 527 &lt; run &lt; 723<br />
4-bit: 240 &lt; run &lt; 384<br />
5-bit: 103 &lt; run &lt; 209<br />
6-bit: 103 &lt; run &lt; 209 (sic)</p>
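<p>The table looks less magic if you work out how many runs of each length a random packet should contain.  The sketch below assumes runs of 0s and runs of 1s are tallied separately and ignores what happens at the packet edges; the function name is only for illustration.</p>

```python
PACKET_BITS = 20000


def expected_runs(length, or_longer=False):
    """Expected number of runs of one bit value in a random packet.

    A run of exactly `length` needs an opposite bit before it, `length`
    equal bits, and an opposite bit after it: bits / 2**(length + 2).
    A run of `length` or more drops the trailing condition.
    """
    exponent = length + 1 if or_longer else length + 2
    return PACKET_BITS / 2 ** exponent


for n in range(1, 6):
    print(f"{n}-bit: {expected_runs(n):.2f}")
print(f"6-bit or longer: {expected_runs(6, or_longer=True):.2f}")
```

<p>Each expected count sits in the middle of its allowed interval, and because the 6-bit bucket also collects every longer run, its expected count is the same as the 5-bit one, which would explain the repeated limits.</p>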
<p>Once again I couldn&#8217;t find any estimate of probability of failing this test with a true random source.  Our observed probability of failing it was 0.000296 or 1:3369.</p>
<h2>Long run</h2>
<p>For rngtest a &#8220;long run&#8221; is seeing 26 or more bits at the same level in a row.  For any 26 bits, the chance of seeing a 26-bit run exactly is 2 in 2^26, or once every 32Mbits (there are two chances because it can be 0x3ffffff or 0x0000000).  However, to start the run it&#8217;s also a requirement that the previous bit is the opposite level, so it&#8217;s a 2 in 2^27 chance, or 1 in 2^26 overall, 1.49 x 10^-8.  For a 20000-bit test packet, that&#8217;s a 0.000298 or 1:3355 chance per packet.</p>
<p>We observed 18947 of these out of 63,568,373 test packets; that is <strong>exactly</strong> matching the theoretical chance of 0.000298 or 1:3355.</p>
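<p>The long run arithmetic can be checked in a few lines.  This is only a sketch: it ignores packet-edge effects, and it treats the number of failures as a rare-event count whose spread is roughly the square root of the expected value, which is an assumption.</p>

```python
import math

PACKET_BITS = 20000
PACKETS_TESTED = 63_568_373     # from the text
OBSERVED_LONG_RUNS = 18947      # from the text

p_per_bit = 2 / 2 ** 27                     # opposite bit, then 26 equal bits, either level
p_per_packet = PACKET_BITS * p_per_bit      # ignores packet-edge effects and overlaps

expected = PACKETS_TESTED * p_per_packet
sigma = math.sqrt(expected)                 # Poisson-style spread for rare events (assumed)
z = (OBSERVED_LONG_RUNS - expected) / sigma

print(f"per packet: {p_per_packet:.6f}, expected: {expected:.0f}, z: {z:+.2f}")
```

<p>The observed count is a tiny fraction of one standard deviation away from the expected count, which is about as good as agreement gets.</p>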
<h2>Continuous Run</h2>
<p>A &#8220;continuous run&#8221; is just seeing the same 32-bit pattern twice in a row, considering 32-bit boundaries.  For every 32 bits generated, there&#8217;s a 1 : 2^32 chance that it matches the previous one (without having to know what that was).  So the theoretical number of these &#8220;failures&#8221; is &lt;number of bits&gt; / 32 / 4G; for the 1.27Tbit in our sample it comes to 9.5.  We observed 12.  So this doesn&#8217;t seem unreasonable.</p>
<p>So overall after studying each test, it&#8217;s clear that a random source must fail rngtest with specific probabilities for each test.  In no way is a &#8220;failure&#8221; on the rngtest tests in itself indicating a problem with the random source.  But if your source does not cause the right number of failures over time, that is indicating a problem with your source.</p>
<p>It seems wrongheaded then that rngd will reject individual packets that &#8220;fail&#8221; the rngtest / FIPS140 tests.</p>
<h2>Dieharder with a vengeance</h2>
<p>Next I ran the current dieharder suite again, using the latest RPMs from Robert G Brown&#8217;s site http://www.phy.duke.edu/~rgb/General/dieharder.php.  I started running it directly hooked up to the RNG device /dev/hwrng, but then I realized that since a lot of the tests are looking for lagged correlation, I needed to give it a file that it could meaningfully rewind into.</p>
<p>So I generated a 12GByte random file and fed it to dieharder -a (run all the tests).  This got us the following summary (grepped just for the decision):</p>
<pre>Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for Diehard Birthdays Test
Assessment: PASSED at &gt; 5% for Diehard 32x32 Binary Rank Test
Assessment: PASSED at &gt; 5% for Diehard 6x8 Binary Rank Test
Assessment: PASSED at &gt; 5% for Diehard Bitstream Test
Assessment: PASSED at &gt; 5% for Diehard OPSO
Assessment: PASSED at &gt; 5% for Diehard OQSO Test
Assessment: PASSED at &gt; 5% for Diehard DNA Test
Assessment: PASSED at &gt; 5% for Diehard Count the 1s (stream) Test
Assessment: PASSED at &gt; 5% for Diehard Count the 1s Test (byte)
Assessment: PASSED at &gt; 5% for Diehard Parking Lot Test
Assessment: PASSED at &gt; 5% for Diehard Minimum Distance (2d Circle) Test
Assessment: PASSED at &gt; 5% for Diehard 3d Sphere (Minimum Distance) Test
Assessment: PASSED at &gt; 5% for Diehard Squeeze Test
Assessment: PASSED at &gt; 5% for Diehard Runs Test
Assessment: PASSED at &gt; 5% for Diehard Runs Test
Assessment: PASSED at &gt; 5% for Diehard Craps Test
Assessment: PASSED at &gt; 5% for Diehard Craps Test
Assessment: POSSIBLY WEAK at &lt; 5% for Marsaglia and Tsang GCD Test
Assessment: PASSED at &gt; 5% for Marsaglia and Tsang GCD Test
Assessment: PASSED at &gt; 5% for STS Monobit Test
Assessment: PASSED at &gt; 5% for STS Runs Test
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: POSSIBLY WEAK at &lt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: POOR at &lt; 1% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for Lagged Sum Test</pre>
<p>No way!  Two &#8220;possibly weak&#8221; and one &#8220;poor&#8221;.  I read the manpage for dieharder and got the advice from there to run the tests more times, because if the data is bad, feeding it more skewed badness will make the failing distribution of p-values &#8220;unambiguous&#8221;.  Dieharder has a default of 10,000 tests; I cranked it up to 20,000 and ran them all again on the same 12GByte sample.</p>
<pre>Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: POSSIBLY WEAK at &lt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Bit Distribution Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Generalized Minimum Distance Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Permutations Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for RGB Lagged Sum Test
Assessment: PASSED at &gt; 5% for Diehard Birthdays Test
Assessment: PASSED at &gt; 5% for Diehard 32x32 Binary Rank Test
Assessment: PASSED at &gt; 5% for Diehard 6x8 Binary Rank Test
Assessment: PASSED at &gt; 5% for Diehard Bitstream Test
Assessment: PASSED at &gt; 5% for Diehard OPSO
Assessment: PASSED at &gt; 5% for Diehard OQSO Test
Assessment: PASSED at &gt; 5% for Diehard DNA Test
Assessment: PASSED at &gt; 5% for Diehard Count the 1s (stream) Test
Assessment: PASSED at &gt; 5% for Diehard Count the 1s Test (byte)
Assessment: PASSED at &gt; 5% for Diehard Parking Lot Test
Assessment: PASSED at &gt; 5% for Diehard Minimum Distance (2d Circle) Test
Assessment: PASSED at &gt; 5% for Diehard 3d Sphere (Minimum Distance) Test
Assessment: PASSED at &gt; 5% for Diehard Squeeze Test
Assessment: PASSED at &gt; 5% for Diehard Runs Test
Assessment: PASSED at &gt; 5% for Diehard Runs Test
Assessment: PASSED at &gt; 5% for Diehard Craps Test
Assessment: PASSED at &gt; 5% for Diehard Craps Test
Assessment: PASSED at &gt; 5% for Marsaglia and Tsang GCD Test
Assessment: PASSED at &gt; 5% for Marsaglia and Tsang GCD Test
Assessment: PASSED at &gt; 5% for STS Monobit Test
Assessment: PASSED at &gt; 5% for STS Runs Test
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for STS Serial Test (Generalized)
Assessment: PASSED at &gt; 5% for Lagged Sum Test</pre>
<p>So the &#8220;poor&#8221; and &#8220;possibly weak&#8221; guys became happy when we doubled the number of tests, and there&#8217;s a new &#8220;possibly weak&#8221; guy.  But when I looked up the new guy&#8217;s p-value, it was only 0.02888045, which is a 1 in 34 chance and doesn&#8217;t seem that improbable (real dieharder failures tend to look like 0.00000001 or 0.99999998, and should look more like that the more tests you run).</p>
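<p>A quick back-of-envelope sketch shows why one or two flags in a listing this long is nothing special.  It assumes each of the 107 assessment lines behaves independently and that a good source trips the 5% &#8220;possibly weak&#8221; level 5% of the time; both are simplifying assumptions rather than statements about how dieharder works internally.</p>

```python
ASSESSMENTS = 107        # lines in each summary listing above
WEAK_THRESHOLD = 0.05    # the 5% cut-off used for POSSIBLY WEAK

expected_weak = ASSESSMENTS * WEAK_THRESHOLD
p_at_least_one = 1 - (1 - WEAK_THRESHOLD) ** ASSESSMENTS

print(f"expected 'possibly weak' flags: {expected_weak:.1f}")
print(f"chance of at least one flag:    {p_at_least_one:.3f}")
```

<p>On those assumptions a perfectly good source would be expected to pick up several flags per run, so seeing none at all would be the more surprising outcome.</p>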
<h2>Conclusion</h2>
<p>So far as I can tell these results are good.</p>
<p>If anyone has enough math power to calculate the theoretical failure rates for the rngtest monobit, poker and run tests I would be very grateful, so I can compare all the numbers.  On the two I was able to calculate, we seem to be very close.</p>
<p>Dieharder seemed happy with double the tests and the one test it flagged then only had a probability of 1:34 which is not unreasonable.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2009/05/21/whirlygig-verification-and-rngtest-analysis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Whirlygig PCB</title>
		<link>http://warmcat.com/_wp/2009/05/21/whirlygig-pcb/</link>
		<comments>http://warmcat.com/_wp/2009/05/21/whirlygig-pcb/#comments</comments>
		<pubDate>Thu, 21 May 2009 08:44:49 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=53</guid>
		<description><![CDATA[I built the first prototype Whirlygig PCB last weekend, it&#8217;s working well.  For testing I left out the noncritical inductors and some caps.  I also found the total current consumption at the USB side is 250mA with the CPLD macrocells in low power mode and 350mA with them in high power mode, comfortably within the [...]]]></description>
			<content:encoded><![CDATA[<p>I built the first prototype Whirlygig PCB last weekend, it&#8217;s working well.  For testing I left out the noncritical inductors and some caps.  I also found the total current consumption at the USB side is 250mA with the CPLD macrocells in low power mode and 350mA with them in high power mode, comfortably within the 500mA USB budget.  I decided to use the higher power mode because it should increase the ring oscillator frequencies and hence the randomness.  The CPLD runs hot, around 40 degrees C.</p>
<p><img src="/wb-pcb1-top.jpg" alt="" align="left" /></p>
<p><img src="/wg-pcb-1-bot.png" alt="" align="left" /></p>
<h2>Improvements</h2>
<p>I took the opportunity to make some improvements:</p>
<p>- Added JTAG programming of the CPLD to the SiLabs microcontroller over USB.  This allows change or update of the CPLD logic from the host PC without any hardware needed.  However because the kernel module blocks the logical USB interface, it&#8217;s safe from being rewritten while in use.</p>
<p>- Changed the random logic.  I&#8217;ll explain the changes and results in the rest of this article.</p>
<p>- Decreased the polling rate of the CPLD but increased the total USB random throughput to 1.0MBytes/sec sustained (for as long as you like) by making the code in the microcontroller &#8220;multithreaded&#8221;.  You can also plug in more Whirlygig devices to linearly increase random production; the kernel module allows hotplug and unplug without problems and combines the output seamlessly, all in /dev/hwrng.</p>
<p>- I was pleased to see the kernel module had hardly bitrotted at all; it only needed a one-line edit to build a working module against a current Fedora Rawhide kernel.</p>
<p>The second LED lights while the PC is requesting random packets from the device.  It lights briefly on plugging it in while the driver&#8217;s cache is filled, then it only lights when something is using the hard random numbers on the PC.</p>
<h2>New random scheme</h2>
<p>I had three main ideas about improving the random hardware inside the CPLD.</p>
<p>First I realized we can decrease predictability by having more oscillators than are used at one time to change an output bit.  We have 8 output bits, but we now have 16 oscillator sets.  Instead of combining them all, on average several will not be used on any given operation.</p>
<p><img src="/ring-rng-block.png" alt="" /></p>
<p>The second idea was that now we have a pool of oscillators greater than needed at any one time, we can randomly select from them for each output bit operation.  So I added an additional 32 oscillator sets (4 for each output bit) which are only used to select which of the pool of 16 we use for any operation.  The end result is that at least 8 oscillators from the pool will be unused for each operation, and which oscillators do get used for which bit are individually &#8220;random&#8221; with &#8220;no&#8221; correlation between output bits.  This makes any attacker&#8217;s attempt to model the pool oscillator states very tough because there&#8217;s no longer any knowledge about which bit contains information about which pool oscillator, or even if its state has affected any output bit.</p>
<p>Lastly we now operate from a clock (24MHz) that is 14 times faster than the sample rate.  This lets us mix 14 randomly chosen oscillator states by xor before the output is sampled for each bit.  Even if two output bits were mixed with the same 14 oscillators, the order would have to be the same as well to get the same result, since the oscillators are never standing still.  For this same reason selecting a pool oscillator more than once in the 14 operations is not equivalent to a NOP.</p>
<p>I added another small tweak: all of the random generators shift their original state by 1 generator on each clock.  This is intended to reduce the impact of any hard nonlinearity in individual generator routing on the CPLD.</p>
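<p>As a way of picturing the data flow, here is a toy software model of the select-and-mix scheme.  It is only a sketch: a software PRNG stands in for the free-running oscillators, the per-clock shift tweak is left out, and all of the names are invented for illustration rather than taken from the CPLD design.</p>

```python
import random

POOL_SIZE = 16       # pool oscillator sets
OUTPUT_BITS = 8      # bits per output byte
SELECT_BITS = 4      # selector oscillators per output bit (4 bits pick 1 of 16)
MIX_CYCLES = 14      # fast clocks per output sample


def sample_byte(rng):
    """Toy model: rng.getrandbits stands in for oscillator states."""
    out = [0] * OUTPUT_BITS
    for _ in range(MIX_CYCLES):
        # The oscillators never stand still, so each clock sees fresh states.
        pool = [rng.getrandbits(1) for _ in range(POOL_SIZE)]
        for bit in range(OUTPUT_BITS):
            # Four selector oscillators form a 4-bit index into the pool.
            index = rng.getrandbits(SELECT_BITS)
            out[bit] ^= pool[index]
    # Pack the eight mixed bits into one byte.
    return sum(b * 2 ** i for i, b in enumerate(out))


print(sample_byte(random.Random()))
```

<p>Since each of the 8 output bits picks one pool entry per clock, no more than 8 of the 16 pool oscillators can contribute on any one clock, which is where the &#8220;at least 8 unused&#8221; figure comes from.</p>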
<p>There were no problems with the PCB, but to save myself a headache working with the crossbar in the CPU I blobbed together pins 26 and 27 on the CPU.</p>
<p>In the next article we look at the random performance again with the new scheme.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2009/05/21/whirlygig-pcb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Exhaustion and the GPL</title>
		<link>http://warmcat.com/_wp/2008/05/23/exhaustion-and-the-gpl/</link>
		<comments>http://warmcat.com/_wp/2008/05/23/exhaustion-and-the-gpl/#comments</comments>
		<pubDate>Fri, 23 May 2008 09:55:58 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Licenses]]></category>
		<category><![CDATA[autodesk]]></category>
		<category><![CDATA[exhaustion]]></category>
		<category><![CDATA[first sale doctrone]]></category>
		<category><![CDATA[gpl]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=52</guid>
		<description><![CDATA[Some years ago I came across a guy Alexander Terekhov who worked then for IBM and had outspoken views about the viability of the GPL.
If I understood it, his opinion was that the license terms of the GPL would not survive resale, due to the well established &#8220;first sale doctrine&#8221; and its EU equivalent &#8220;exhaustion&#8221;.  [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" margin="5" src="http://warmcat.com/exhaustion.png" alt="exhaustion" />Some years ago I came across a guy, Alexander Terekhov, who then worked for IBM and had outspoken views about the viability of the GPL.</p>
<p>If I understood it, his opinion was that the license terms of the GPL would not survive resale, due to the well established <a href="http://en.wikipedia.org/wiki/First-sale_doctrine">&#8220;first sale doctrine&#8221;</a> and its EU equivalent <a href="http://en.wikipedia.org/wiki/Exhaustion_of_rights">&#8220;exhaustion&#8221;</a>.  It basically means that the copyright holder cannot stop you reselling your software, and that the license terms will not apply to the guy receiving it.</p>
<p>I tried to understand this further, but Alexander was not always easy for me to comprehend and then had a habit of linking to his own posts elsewhere to bolster his position, leading to a kind of echo chamber of Terekhovs all nodding vigorously at each other.  Back then, and evidently more recently too, he also explained legal decisions that did not fit his understanding by <a href="http://www.mail-archive.com/gnu-misc-discuss@gnu.org/msg06021.html">calling the Judges in question &#8220;morons&#8221;</a>, etc.  Well, the forum I met him at had a very high trolling quotient, so it just joined the rest of the anti-GPL sentiment there for me in the end and I ignored it.</p>
<h3>GPL is a license too</h3>
<p>But I was reminded of this last night when I read about a recent <a href="http://williampatry.blogspot.com/2008/05/first-sale-victory-in-vernor.html">decision against Autodesk</a> which is being widely seen as a victory for Joe Softwarebuyer.  From the Patry blog post link above:</p>
<blockquote><p>&#8230;many software companies have taken the position that they can convey the copy to the customer in an over-the-counter transaction for a one-time payment, but describe that transaction as a license; as a license, the first sale doctrine doesn&#8217;t apply, meaning copyright owners can prevent further distribution of the copy&#8230;</p></blockquote>
<p>Doesn&#8217;t this vindicate Alexander&#8217;s position?  How can GPL terms stick past resale if Autodesk EULA ones don&#8217;t?  Nothing stops &#8220;built-in&#8221; or &#8220;automated&#8221; resale to cleanse software of any licensing restriction.</p>
<p>A lot of people seem to be happy about the paid-for world being freed from license conditions.  Are they going to be happy if it turns out that everyone is also freed from GPL conditions?</p>
<h3>Civil infringement and Punishment</h3>
<p>What effect would this have on contribution, I wonder?  It seems to me the real-world advantages from being active in a project by contributing will still apply.  But it will enable private proprietary forking for products, the kind of thing that Harald Welte&#8217;s <a href="http://gpl-violations.org">gpl-violations.org</a> has had success attacking and punishing to date.  Contributors will see their work used in commercial products without the changes being open.</p>
<p>But the BSD folks seem to survive this outrage without it removing their motivation.Â  And from time spent looking at music licensing over the years, I kind of recognize an element of proprietary vindictiveness in gpl-violations&#8230; of course the member companies hiding behind the RIAA attacks are also &#8220;perfectly within their rights&#8221; to embark on much worse vindictive destruction, but they are not entirely dissimilar and that always bothered me.</p>
<h3>Playing ball or going home?</h3>
<p>Well, this decision is subject to appeal, will only apply to the jurisdiction of that court, etc, so the sky hasn&#8217;t fallen in yet.  But there is quite a bit of harmonization of copyright law thanks to the insistence of rich rightholder companies, mainly from the US side.  If this is upheld, it may come to contaminate most Western countries and turn GPL terms into unenforceable noise; the choices would be in effect public domain or closed.</p>
<p>I guess some people will go closed rather than have their work exploited, but I expect most people will just continue on, and contributions will continue to come perfectly fine.  The advantages from being a visible contributor and taking upstream directly are still going to apply, and so will the bitrot that happens to any additional code put on top and maintained privately.</p>
<h3>Too mature to care?</h3>
<p>Maybe we have now reached a point where the social, financial, engineering and public advantages from cooperation are ingrained enough that we don&#8217;t need a license to protect them anyway?  But I read this and I feel a sinking feeling about the naivety of such a proposal.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2008/05/23/exhaustion-and-the-gpl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Whirlygig GPL&#8217;d HWRNG</title>
		<link>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/</link>
		<comments>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/#comments</comments>
		<pubDate>Sat, 24 Nov 2007 10:45:44 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/</guid>
		<description><![CDATA[
Hardware random for the masses
I made available the result of the ring oscillator random generator as a GPL project called Whirlygig.  It&#8217;s a 2.75cm x 4cm PCB with a mini USB connector, it provides a sustained 5.5Mbps (~620KBytes/sec) of apparently very high quality random bits using the Linux hw_random API.  The large amount [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/whirlygig-logo.png" align=left hspace=5></p>
<h3>Hardware random for the masses</h3>
<p>I made available the result of the ring oscillator random generator as a GPL project <a href="http://warmcat.com/_wp/whirlygig-rng/">called Whirlygig</a>.  It&#8217;s a 2.75cm x 4cm PCB with a mini USB connector, it provides a sustained 5.5Mbps (~620KBytes/sec) of apparently very high quality random bits using the Linux hw_random API.  The large amount of randomness should make it useful for statistical tests as well as hard crypto.</p>
<p>I prototyped it using a couple of boards I had lying around, so I know it works fine, but I am waiting for the PCBs to come back from fabrication to actually build a final one.  I placed the CPLD VHDL, the board hardware design, the driver software and the firmware for the USB controller into <a href="http://git.warmcat.com">http://git.warmcat.com</a>.</p>
<h3>Dieharder</h3>
<p>I spent some time worrying about how to test the quality of the result.  I found that &#8220;diehard&#8221;, mentioned in an earlier post, has been superseded by <a href="http://www.phy.duke.edu/~rgb/General/dieharder.php">&#8220;dieharder&#8221;</a>.  This has a much tougher general testing regime, even though many of its tests are reproductions of the diehard ones: it runs each test many times and forms histograms of the p-value results from the many runs, and gives an assessment of fail, poor, possibly weak or pass on the spread of results rather than a single result.</p>
<p>At first the RNG failed three of the 18 tests, but on looking closer one of the tests (#2) currently fails for all RNG input and is marked up as not for use with assessing RNG quality, and the two others required by default more than the 400MBytes of randomness I had prepared.  Unfortunately in that case they simply rewind the randomness file and re-use the same data to make up the balance!  Of course this is no longer quite &#8220;random&#8221;.  When I adjusted those two tests to use a smaller sample that fitted into the 400MBytes without repetition, the output of the RNG got a &#8220;pass&#8221; on all 17 of the relevant dieharder suite tests.</p>
<h3>Max Entropy</h3>
<p>During the validation phase I changed the RNG algorithm in the CPLD significantly.  The scheme is described on the project page, but basically I moved away from a bit-centric to a byte-centric design with 8 identical sets of 3 oscillators.  To stop any characteristic of a particular oscillator&#8217;s routing from being associated with a particular bit of the result byte and creating a bias, I introduced a &#8220;mixer&#8221; that first generates 8 random bits by combining six oscillator outputs each with XOR, then rotates these oscillator sets between the result bits sequentially at 24MHz.  I also removed the toggling action and used the random bit directly.</p>
<p>I also found the Linux rng-tools suite, which repeatedly runs FIPS-140-2 tests on the bits.  It fails 1 in 1200 or so packets when testing over 20 billion bits.  I believe this is normal for a real random generator: with low probability it will produce sequences that don&#8217;t look very random in the short term.</p>
<p>Aside from passing dieharder and FIPS-140-2, the changes also got me a reported 8.000000 bits of entropy per byte from the ENT test, so there are reasons to imagine the quality of the output is very good.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FIPS-140-2 and ENT validation vs ring RNG</title>
		<link>http://warmcat.com/_wp/2007/11/15/fips-140-2-and-ent-validation-vs-ring-rng/</link>
		<comments>http://warmcat.com/_wp/2007/11/15/fips-140-2-and-ent-validation-vs-ring-rng/#comments</comments>
		<pubDate>Thu, 15 Nov 2007 09:20:13 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Embedded Linux]]></category>
		<category><![CDATA[Hardware design]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/15/fips-140-2-and-ent-validation-vs-ring-rng/</guid>
		<description><![CDATA[NIST lists some more test suites.  NIST also have their own suite, but it is now Windows-only, and lacks a necessary DLL to run there.  The last UNIX version segfaulted here before giving any results&#8230; sigh.  
I ran the last 10MByte sample against ENT and TestU01&#8230; to cut a long story short
$ [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://csrc.nist.gov/groups/ST/toolkit/rng/batteries_stats_test.html">NIST</a> lists some more test suites.  NIST also have their own suite, but it is now Windows-only, and lacks a necessary DLL to run there.  The last UNIX version segfaulted here before giving any results&#8230; sigh.  </p>
<p>I ran the last 10MByte sample against <a href="http://www.fourmilab.ch/random/">ENT</a> and <a href="http://www.iro.umontreal.ca/~simardr/testu01/TestU01.zip">TestU01</a>&#8230; to cut a long story short</p>
<blockquote><p><font size=-2>$ ./ent ../die.c/dump3<br />
Entropy = 7.999980 bits per byte.</p>
<p>Optimum compression would reduce the size<br />
of this 10002432 byte file by 0 percent.</p>
<p>Chi square distribution for 10002432 samples is 281.26, and randomly<br />
would exceed this value 25.00 percent of the times.</p>
<p>Arithmetic mean value of data bytes is 127.4958 (127.5 = random).<br />
Monte Carlo value for Pi is 3.140111525 (error 0.05 percent).<br />
Serial correlation coefficient is -0.000212 (totally uncorrelated = 0.0).</font></p></blockquote>
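<p>The headline figures in the ENT report above can be approximated in a few lines of Python.  This is a minimal sketch, not ENT&#8217;s own code: the function name byte_stats is made up here, and it only covers the entropy, arithmetic mean and chi-square figures.</p>

```python
import math
import sys
from collections import Counter


def byte_stats(data):
    """Return (entropy_bits_per_byte, mean, chi_square) for a bytes object."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one byte")
    counts = Counter(data)
    # Shannon entropy: -sum(p * log2(p)) over the byte values actually seen.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    mean = sum(data) / n
    # Chi-square against a flat distribution over all 256 byte values.
    expected = n / 256
    chi_square = sum((counts.get(v, 0) - expected) ** 2 / expected
                     for v in range(256))
    return entropy, mean, chi_square


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python byte_stats.py dump3  (the script name is hypothetical)
    with open(sys.argv[1], "rb") as f:
        entropy, mean, chi_square = byte_stats(f.read())
    print("Entropy = %.6f bits per byte." % entropy)
    print("Arithmetic mean value of data bytes is %.4f" % mean)
    print("Chi square is %.2f" % chi_square)
```

<p>A perfectly flat byte distribution gives exactly 8 bits per byte and a mean of 127.5, which is why the figures in the report are compared against those values.</p>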
<p>7.9999 bits of entropy per byte!  TestU01 is less turnkey than the other suites &#8212; it&#8217;s literally a test library with some example code.  I amended an example to call the FIPS-140-2 tests:</p>
<blockquote><pre><font size=-2>============== Summary results of FIPS-140-2 ==============

 File:             dump3
 Number of bits:   20000

       Test          s-value        p-value    FIPS Decision
 --------------------------------------------------------
 Monobit               9933           0.83       Pass
 Poker                11.88           0.69       Pass

 0 Runs, length 1:     2482                      Pass
 0 Runs, length 2:     1227                      Pass
 0 Runs, length 3:      630                      Pass
 0 Runs, length 4:      319                      Pass
 0 Runs, length 5:      161                      Pass
 0 Runs, length 6+:     166                      Pass

 1 Runs, length 1:     2466                      Pass
 1 Runs, length 2:     1302                      Pass
 1 Runs, length 3:      620                      Pass
 1 Runs, length 4:      311                      Pass
 1 Runs, length 5:      140                      Pass
 1 Runs, length 6+:     146                      Pass

 Longest run of 0:       16           0.14       Pass
 Longest run of 1:       14           0.46       Pass
 ----------------------------------------------------------
 All values are within the required intervals of FIPS-140-2</font></pre>
</blockquote>
<p>So the design&#8217;s output passes the FIPS-140-2 statistical tests, a requirement for many uses.</p>
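<p>For a feel of what the Monobit and Poker rows in the table measure, here is a minimal Python sketch of those two checks on one 20000-bit block.  It is not the TestU01 code I ran; the function names are made up, and the pass bands (9725 to 10275 ones, poker statistic between 2.16 and 46.17) are the FIPS 140-2 limits as I understand them.</p>

```python
BLOCK_BITS = 20000  # the FIPS 140-2 statistical tests work on 20000-bit blocks


def to_bits(block):
    """Expand bytes into a list of 0/1 ints, most significant bit first."""
    return [(byte >> shift) % 2 for byte in block for shift in range(7, -1, -1)]


def monobit(bits):
    """Return (ones_count, passed); assumed pass band is 9725 to 10275, exclusive."""
    ones = sum(bits)
    return ones, 10275 > ones > 9725


def poker(bits):
    """Return (statistic, passed); assumed pass band is 2.16 to 46.17, exclusive."""
    counts = [0] * 16
    # Split the block into 5000 four-bit groups and count each of the 16 values.
    for i in range(0, BLOCK_BITS, 4):
        nibble = bits[i] * 8 + bits[i + 1] * 4 + bits[i + 2] * 2 + bits[i + 3]
        counts[nibble] += 1
    x = (16 / 5000) * sum(c * c for c in counts) - 5000
    return x, 46.17 > x > 2.16


if __name__ == "__main__":
    import os
    bits = to_bits(os.urandom(BLOCK_BITS // 8))  # stand-in for device data
    print("monobit:", monobit(bits))
    print("poker:  ", poker(bits))
```

<p>Note that the poker test has a lower limit as well as an upper one: a block whose 4-bit groups are too evenly distributed is as suspicious as one that is too lumpy.</p>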
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/15/fips-140-2-and-ent-validation-vs-ring-rng/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diehard validation vs ring RNG</title>
		<link>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/</link>
		<comments>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/#comments</comments>
		<pubDate>Wed, 14 Nov 2007 15:56:03 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/</guid>
		<description><![CDATA[
RNG Quality assessment
A timely article flew by on Reddit about the RANDU pseudo-random generator algorithm widely used in the 1960s, which it turns out was very flawed indeed.  It was explained to one student that &#8220;We guarantee that each number is random individually, but we don&#8217;t guarantee that more than one of them is [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/catbowl1.png" align=left hspace=5></p>
<h3>RNG Quality assessment</h3>
<p>A timely article flew by on Reddit about the <a href="http://en.wikipedia.org/wiki/RANDU">RANDU</a> pseudo-random generator algorithm widely used in the 1960s, which it turns out was very flawed indeed.  It was explained to one student that &#8220;We guarantee that each number is random individually, but we don&#8217;t guarantee that more than one of them is random&#8221;.  Basically it produced numbers that belonged to one of 15 &#8220;planar&#8221; groupings and nothing in the gaps between the planes.  It isn&#8217;t just a minor annoyance, because many statistical studies in the 60s and 70s used it, and it can easily have contaminated their results.  That&#8217;s definitely not what I am trying to reproduce with the ring oscillator device.  So how can I figure out how &#8220;good&#8221; the randomness is in an objective way?</p>
<h3>RNG quality test suites</h3>
<p>It turns out that empirically testing RNG outputs has been the subject of a lot of work for decades, and there are some established testing suites available online.  A major one seems to be the &#8220;<a href="http://stat.fsu.edu/pub/diehard/">diehard</a>&#8221; suite; I guess it is a pun on die as the singular of dice.</p>
<p>It needs you to fetch 10M bytes of random numbers or more and let it run a bunch of tests on them.  The output was a little hard to assess initially: most tests issue a &#8220;p&#8221; number which only suggests something is bad if it is 0.000&#8230; or 0.999&#8230;.  All other numbers in between are to be taken as a good result, as I understood it.  Except there is a warning that even good RNGs can produce the occasional test fail.</p>
<blockquote><p> Thus you should not be surprised with  occasional p-values near 0 or 1, such as .0012 or .9983. When a bit stream really FAILS BIG, you will get p`s of 0 or 1 to six or more places.  By all means, do not, as a Statistician might, think that a p &lt; .025 or p &gt; .975 means that the RNG has &#8220;failed the test at the .05 level&#8221;.  Such p`s happen among the hundreds that DIEHARD produces, even with good RNGs.  So keep in mind that &#8220;p happens&#8221;</p></blockquote>
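<p>That advice can be turned into a rough screening rule.  The Python sketch below is only my reading of it: the function name and both thresholds are assumptions for illustration (1e-6 for &#8220;0 or 1 to six places&#8221;, 1e-3 for &#8220;worth a second look&#8221;), not anything diehard itself defines.</p>

```python
def classify_p(p, hard=1e-6, soft=1e-3):
    """Label one p-value as 'fail', 'suspicious' or 'ok'.

    hard and soft are assumed thresholds chosen for this example.
    """
    distance = min(p, 1.0 - p)  # how close p sits to either end of [0, 1]
    if hard >= distance:
        return "fail"
    if soft >= distance:
        return "suspicious"
    return "ok"


if __name__ == "__main__":
    for p in (0.0012, 0.9983, 0.000030, 0.000000):
        print("%.6f -> %s" % (p, classify_p(p)))
```

<p>With these thresholds the .0012 and .9983 examples from the warning come out as &#8220;ok&#8221;, while a p-value of 0.000000 is a hard fail.</p>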
<p>I duly fetched 10M bytes of 115kbps randomness from the device and fed it to diehard.  It seemed to give fine results except on &#8220;Count the 1s stream&#8221; and &#8220;Squeeze&#8221; (a devastating p=0.000000), &#8220;Count the 1s specific&#8221; for bits 1-11 (p=0.000030) and 9-16 (p=0.000064), and OQSO 2-6 (p=0.000005).  It passed the dozens of other tests, but it was disappointing: this looks like a big fat &#8216;failed&#8217;.</p>
<h3>Triple Scoop</h3>
<p>Well, since my test CPLD was an XC95288XL with 288 macrocells to burn, I naturally wondered if I could improve matters by tripling the number of ring oscillators getting XORed: that is, implement the three different-sized oscillators 3 times each, totalling nine, and sum them with a big XOR.  They&#8217;ll all be drifting around individually as much as together, so it should be a mighty noise-fest.</p>
<p>I edited the VHDL and blew it into the CPLD&#8230; visually the summed RNG output &#8220;bit&#8221; was an awful lot more noisy than before.   I pulled another 10M bytes from that setup: but just looking at the byte distribution as I did before told me something is still up.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-2.png"></p>
<p>That sawtooth type distribution is &#8220;not random&#8221;, to coin a phrase.  If you look at the large jump at 0x80 (128), it is telling us that we are more likely to get 10000000 binary than we are to get 01111111.  In other words, since this is over 10M bytes, there is a distribution problem favouring &#8216;0&#8217;.  When I analyze the distributions of 1s and 0s I find</p>
<table>
<tr>
<td>
<pre>0: 40436204, 1: 39563804... delta=872400, skew=1.090500%</pre>
</td>
</tr>
</table>
<p>You can see the same thing even better looking at 0x00 (42,000 hits) vs 0xFF (36,000 hits): each is about 8% away from the expected count of 39,000.  Clearly the distribution of 1s and 0s has to have a very small skew to stop these kinds of effects showing up, and equally clearly this is telling us something deep about the RNG hardware.</p>
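<p>The 0/1 tallies quoted in this post boil down to counting bits and expressing the difference as a percentage of the total.  A minimal Python sketch follows; the function names are made up and this is not the tool I actually used, but the arithmetic reproduces the figures shown.</p>

```python
def skew_from_counts(zeros, ones):
    """Return (delta, skew_percent) for the given bit tallies."""
    delta = abs(zeros - ones)
    return delta, 100.0 * delta / (zeros + ones)


def bit_skew(data):
    """Return (zeros, ones, delta, skew_percent) for a bytes object."""
    total = len(data) * 8
    if total == 0:
        raise ValueError("need at least one byte")
    ones = sum(bin(byte).count("1") for byte in data)
    zeros = total - ones
    return (zeros, ones) + skew_from_counts(zeros, ones)


def format_skew(zeros, ones):
    """Format the tallies in the same layout as the lines in this post."""
    delta, skew = skew_from_counts(zeros, ones)
    return "0: %d, 1: %d... delta=%d, skew=%f%%" % (zeros, ones, delta, skew)


if __name__ == "__main__":
    # Tallies taken from the triple-oscillator run described above.
    print(format_skew(40436204, 39563804))
```

<p>Feeding it the counts from the table gives delta=872400 and a skew of about 1.09%, matching the reported line.</p>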
<h3>Spiky</h3>
<p>Although the individual oscillators are quite slow thanks to the number of inverter stages, at 4 &#8211; 6MHz, the way they are being summed makes for trouble from bandwidth limitations inside the CPLD.  At the moment it just uses a dumb asynchronous XOR action, which means that potentially very fast spikes can be seen when one &#8220;slow&#8221; oscillator changes state very shortly after another &#8220;slow&#8221; oscillator.  For example:</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0002tek.jpg"></p>
<p>You can see on the left (note this is 5ns/div) a runt pulse where this happened: the XOR was convinced to rise by one oscillator changing and then countermanded when another oscillator changed state less than 5ns later, resulting in a doubtful pulse that was probably not seen as a &#8216;1&#8217;.  This also happens when going from &#8216;1&#8217; to &#8216;0&#8217;, but maybe the threshold for the transistors in the CPLD is not at exactly 50% of the 3.3V supply.  So we suddenly have it seeing more &#8216;0&#8217;s than &#8216;1&#8217;s on average when spikes are involved.</p>
<p>This whole high bandwidth summing step is completely needless; it&#8217;s only there because it is a literal interpretation of the diagram in the original RFC.  I changed it instead to have nine latches sample the nine oscillators every 125ns (there is an 8MHz clock on the prototype board) and sum those results with XORs into a single bit.  In turn this output is sampled by another latch at 8MHz to hide any metastability.</p>
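<p>As a software model of that structure (the real thing is VHDL in the CPLD, and this Python sketch with made-up names is only an approximation of it), each clock tick takes a snapshot of the nine latched oscillator levels, XORs them into one bit, and passes that bit through one more latch stage:</p>

```python
from functools import reduce
from operator import xor


def latched_xor(oscillator_levels):
    """XOR together the 0/1 levels captured by the sampling latches."""
    return reduce(xor, oscillator_levels, 0)


def sample_stream(snapshots):
    """Yield one summed bit per clock tick, delayed by one extra latch stage."""
    held = 0  # second latch: holds the XOR result from the previous tick
    for levels in snapshots:
        yield held
        held = latched_xor(levels)


if __name__ == "__main__":
    # Three hypothetical snapshots of nine oscillator levels.
    snapshots = [
        [1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 0, 0, 0],
    ]
    print(list(sample_stream(snapshots)))
```

<p>The point of the arrangement is that the XOR only ever sees levels that were stable when latched, so runt pulses between oscillator edges never reach the output.</p>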
<h3>Latched up</h3>
<p>The latched summing version performs much better and has gotten rid of most of the bit skew, and the sawtooth behaviour:</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-3.png"></p>
<p>&#8230;but there is still a problem with 0x00&#8230; the bit skew looks like this</p>
<table>
<tr>
<td>
<pre>0: 39960076, 1: 40039932... delta=79856, skew=0.099820%</pre>
</td>
</tr>
</table>
<p>so the skew is now on the side of &#8216;1&#8217;s but only by 0.1%.  You can see the byte count spread is much tighter than before too: about 1800 counts instead of the 6000 before.</p>
<h3>Balancing out the skew</h3>
<p>Well, if the remaining skew of as little as 0.1% comes from the ratio of rise to fall times, or from the oscillator outputs being non-square for some other reason, that is hard to do much about, especially as it may vary with the specific silicon die.</p>
<p>But it shouldn&#8217;t matter.  Now that the bandwidth situation at the XOR summer is sane, inverting the summed output 50% of the time should spread any excess of &#8216;1&#8217;s or &#8216;0&#8217;s onto the opposite value as well, cancelling any bias.  I added a couple of terms to the summer to XOR against the UART bit index LSB and a bit which toggles after every byte sent by the UART.  It&#8217;s the equivalent of XOR with 0x55 for the first byte and then 0xAA for the second byte, over and over.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-4.png"></p>
<p>That glitch in the middle is actually at 134 (0x86); maybe it is random, but I guess we will see&#8230; the skew is further reduced as anticipated</p>
<table>
<tr>
<td>
<pre>0: 39974218, 1: 40025790... delta=51572, skew=0.064465%</pre>
</td>
</tr>
</table>
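<p>The alternating inversion described above can be tried in software.  This is a minimal Python sketch with made-up function names, applied to bytes after the fact, whereas the real design does it bit by bit inside the CPLD:</p>

```python
def whiten(data):
    """XOR even-indexed bytes with 0x55 and odd-indexed bytes with 0xAA."""
    masks = (0x55, 0xAA)
    return bytes(byte ^ masks[i % 2] for i, byte in enumerate(data))


def count_ones(data):
    """Count the '1' bits in a bytes object."""
    return sum(bin(byte).count("1") for byte in data)


if __name__ == "__main__":
    stuck = bytes(1000)  # worst case: a source stuck at all zeros
    out = whiten(stuck)
    print("ones before:", count_ones(stuck), "of", len(stuck) * 8)
    print("ones after: ", count_ones(out), "of", len(out) * 8)
```

<p>Even a source stuck at zero comes out with exactly half its bits set, which shows that this step only balances the 1/0 count; it does not add any randomness of its own.</p>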
<h3>Diehard sequel</h3>
<p>I ran 10M bytes from this version through Diehard again&#8230; the really bad p-value results are gone.  For example Squeeze was a deadly 0.000000 before and is now 0.255260.</p>
<p>I made one last adjustment: I added the current state of the latched random value to the XOR term.  That means the RNG now decides whether to keep or invert the latched value; the latch no longer directly accepts the value from the RNG.  This got me to the promised land: 0.0005% skew between &#8216;1&#8217; and &#8216;0&#8217;.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-5.png"></p>
<table>
<tr>
<td>
<pre>0: 40000206, 1: 39999802... delta=404, skew=0.000505%</pre>
</td>
</tr>
</table>
<p>This also gets me the apparently good diehard results with no obvious failures on any tests, you can see the actual results <a href="/diehard.txt">here</a>.  So it seems the current version can tentatively be called a &#8220;real RNG&#8221;. </p>
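<p>The keep-or-invert feedback is easy to model.  In this minimal Python sketch (the function name and the starting state are assumptions for the example) each raw bit from the summer either leaves the held output alone or flips it:</p>

```python
def toggle_on_one(raw_bits, start=0):
    """Each raw bit decides whether the held output is kept (0) or inverted (1)."""
    held = start
    out = []
    for bit in raw_bits:
        held ^= bit  # XOR with the current latched state
        out.append(held)
    return out


if __name__ == "__main__":
    print(toggle_on_one([1, 1, 0, 1]))
```

<p>With this arrangement a raw source that slightly favours one level produces an output that stays put or flips slightly more often, instead of an output that favours one level.</p>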
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ring oscillator RNG performance</title>
		<link>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/</link>
		<comments>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/#comments</comments>
		<pubDate>Mon, 12 Nov 2007 01:33:12 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/</guid>
		<description><![CDATA[
Pretty random
After some scrabbling around porting my Jtag SVF interpreter to Octotux and creating a kernel module for the PIO end of it &#8212; and moving to a different board with a XC95288XL CPLD to prototype it, the triple ring oscillator RNG is working.    It issues a 9600 baud result, but after [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/dawg.png" align=left hspace=5></p>
<h3>Pretty random</h3>
<p>After some scrabbling around porting my JTAG SVF interpreter to Octotux and creating a kernel module for the PIO end of it, and moving to a different board with a XC95288XL CPLD to prototype it, the triple ring oscillator RNG is working.  It issues a 9600 baud result, but after some initial confusion I modified it to sit out a sample time 1/8th of the time, leaving &#8220;break&#8221; on the serial line.  This should make sure that the receiving UART does not mistake a data bit for a start bit.  The true data rate is something like 800 random bytes per second at 9600 baud.</p>
<p>Here are the three chains of inverters (19, 23 and 29 long) oscillating at the different fundamentals</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0016tek.jpg" height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0017tek.jpg" align=center height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0018tek.jpg" align=center height=263></p>
<p>&#8230; and here is what the xor summing looks like, first over 1s then sampled once.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0019tek.jpg" align=center height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0020tek.jpg" align=center height=263></p>
<p>Although the single shot sample doesn&#8217;t look very random, the oscillators are drifting around all the time.  If you wait a little while between samples (currently it is 104us, a 9600 baud bit-period) it&#8217;s pretty hard to guess what phase all the oscillators have drifted to &#8212; at least, that&#8217;s the plan.</p>
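<p>As a rough check on the three chain lengths, an ideal ring of N inverters oscillates at 1 / (2 x N x stage delay), since an edge has to travel round the ring twice per period.  The Python sketch below uses an assumed stage delay of 4.4ns, a figure picked for illustration and not a measured or datasheet value for this CPLD:</p>

```python
ASSUMED_STAGE_DELAY_S = 4.4e-9  # illustrative guess, not a measured value


def ring_frequency_hz(stages, stage_delay_s):
    """Ideal ring oscillator frequency: one period is two trips round the ring."""
    if stages % 2 == 0:
        raise ValueError("a simple inverter ring needs an odd number of stages")
    return 1.0 / (2 * stages * stage_delay_s)


if __name__ == "__main__":
    for stages in (19, 23, 29):
        mhz = ring_frequency_hz(stages, ASSUMED_STAGE_DELAY_S) / 1e6
        print("%d stages: about %.1f MHz" % (stages, mhz))
```

<p>Choosing chain lengths that share no common factor keeps the three fundamentals from locking into a simple ratio with each other, which is presumably why they are all prime.</p>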
<h3>Distribution of binary levels</h3>
<p>The first test I did was to see what the distribution of &#8216;1&#8217; and &#8216;0&#8217; in the results was&#8230; clearly if the device is really random it should on average be 50% each.  I fetched 1M random bytes, or 8Mbits:</p>
<table align=center>
<tr>
<td>0: 4008913, 1: 3991095&#8230; delta=17818, skew=0.222725%</td>
</tr>
</table>
<p>It&#8217;s okay for a really random source to deviate from 50:50 at any given time, although on average it should be 50:50.</p>
<h3>Octet distribution</h3>
<p>Next I looked at the distribution of the results from 0x00 through 0xFF as the result &#8220;random byte&#8221;.  This would show up if the RNG fails to ever issue some result or favours certain results over others: every result should on average have an equal chance of showing up and so an equal count.  I ran it for 1M random bytes&#8230;</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/rng-dist-1.png" align=center></p>
<p>This is pretty decent, every possible result is seen with a frequency within +/-200 counts of the 3,900 average after 1M bytes.</p>
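<p>The octet tally behind a plot like this is simple to reproduce.  Here is a minimal Python sketch (the function name is made up, and this is not the script that drew the graph) that reports the expected count per value alongside the lowest and highest counts actually seen:</p>

```python
from collections import Counter


def octet_spread(data):
    """Return (expected_count, lowest_count, highest_count) over all 256 values."""
    counts = Counter(data)
    # Include byte values that never appeared, so a missing value shows as 0.
    per_value = [counts.get(v, 0) for v in range(256)]
    return len(data) / 256, min(per_value), max(per_value)


if __name__ == "__main__":
    import os
    sample = os.urandom(1000000)  # stand-in for 1M bytes from the device
    expected, lowest, highest = octet_spread(sample)
    print("expected %.0f, lowest %d, highest %d" % (expected, lowest, highest))
```

<p>For 1M bytes the expected count is about 3,906 per value, so a lowest count of 0 would mean some result is never issued, and a very wide gap between lowest and highest would mean some results are favoured.</p>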
<h3>115200 baud results</h3>
<p>Encouraged by this I cranked the baud rate up to 115200, or 8.68us between samples and around 10K random bytes per second.  The skew is increased somewhat and the spread of result counts is increased a little.</p>
<table align=center>
<tr>
<td>0: 4028746, 1: 3971262&#8230; delta=57484, skew=0.718549%</td>
</tr>
</table>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/rng-dist-2.png" align=center></p>
<p>So far so good!</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
