<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Way of the exploding head &#187; Linux peripherals</title>
	<atom:link href="http://warmcat.com/_wp/category/linux-peripherals/feed/" rel="self" type="application/rss+xml" />
	<link>http://warmcat.com/_wp</link>
	<description>Embedded and desktop Linux</description>
	<lastBuildDate>Fri, 12 Feb 2010 23:49:41 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Bootloader Envy</title>
		<link>http://warmcat.com/_wp/2010/02/08/bootloader_envy/</link>
		<comments>http://warmcat.com/_wp/2010/02/08/bootloader_envy/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 20:14:54 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Embedded Linux]]></category>
		<category><![CDATA[Linux peripherals]]></category>
		<category><![CDATA[Openmoko Lessons]]></category>
		<category><![CDATA[Software design]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=67</guid>
		<description><![CDATA[Lesson #2:  A bootloader is to load and boot Linux
On the first day of FOSDEM I sat through a presentation on what could be called another &#8220;U-Boot derivative&#8221;.  One of the greatest asspains at Openmoko was the various kinds of Hell caused by the U-Boot bootloader and its philosophy, which can be summed up as [...]]]></description>
			<content:encoded><![CDATA[<h2>Lesson #2:  A bootloader is to load and boot Linux</h2>
<p><img class="alignleft" title="Qi" src="http://warmcat.com/qi.png" alt="" width="126" height="183" />On the first day of FOSDEM I sat through a presentation on what could be called another &#8220;U-Boot derivative&#8221;.  One of the greatest asspains at Openmoko was the various kinds of Hell caused by the U-Boot bootloader and its philosophy, which can be summed up as &#8220;I wanna be Linux when I grow up&#8221;.</p>
<h2>Configure system is a bad alternative to good bootloader design</h2>
<p>First, it has a config system.  That should be good though, right?  The problem with the config system is that if anything differs from your current config, you must build another incompatible binary with another config and take care of that.  When you have more than a handful of different boards, you are in a maze of incompatible bootloaders.  Openmoko took it one step further, they mandated a different bootloader binary per PCB revision, so left unchecked there would have been a continuous proliferation of incompatible bootloaders, all basically the same.</p>
<h2>All persistent bootloader private state is EVIL</h2>
<p>Second, U-Boot thinks it&#8217;s a good idea to have these environment &#8220;scripts&#8221;, because it&#8217;s &#8220;configurable&#8221;.  Actually, the job of a bootloader is to Load, then Boot Linux.  You don&#8217;t need any configurability for that if the bootloader can figure out what it&#8217;s running on and therefore where the memory is and how much there is.  These scripts expose a really deadly trap I call &#8220;private bootloader state&#8221;.  It means the bootloader stores stuff in nonvolatile memory on the PCB and acts different according to what it hides there.  The end result is that two boards from the same factory may act totally different even with the same rootfs due to &#8220;bootloader secrets&#8221;.  This is totally needless and ALL private bootloader state can be eliminated by correct design of the bootloader leading to completely deterministic boot action per rootfs.</p>
<p>A good example of how that leads you down the path to hell is hardcoding, in the U-Boot environment, the amount of kernel image to copy from somewhere.  People commonly set it to 2MBytes, forget about it, and one day they generate a 2.1MB kernel image and wonder why decompression blows up.  Actually, that whole procedure is insane: the kernels are uImages that report their length in a header.  The bootloader should examine the header and compute the length of image to pull.  But that doesn&#8217;t fit with this &#8220;environment&#8221; nonsense.</p>
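<p>As a rough illustration of reading the length from the header instead of hardcoding it, here is a minimal Python sketch.  It assumes the legacy 64-byte uImage header layout (big-endian fields, magic 0x27051956, payload size in the fourth word); the function names and the test helper are hypothetical, and header CRC checking is left out to keep it short.</p>

```python
import struct

# Legacy uImage header: 64 bytes, every multi-byte field big-endian.
# Fields: magic, header crc, timestamp, data size, load addr, entry point,
# data crc (7 x uint32), then os, arch, type, compression (4 x uint8),
# then a 32-byte image name.
UIMAGE_MAGIC = 0x27051956
UIMAGE_HEADER = struct.Struct("!7I4B32s")  # "!" selects network byte order


def uimage_total_length(header_bytes):
    """Return header + payload length in bytes, taken from the header."""
    fields = UIMAGE_HEADER.unpack(header_bytes[:UIMAGE_HEADER.size])
    magic, _hcrc, _time, data_size = fields[:4]
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    return UIMAGE_HEADER.size + data_size


def make_fake_uimage(payload):
    """Hypothetical helper for the demo: wrap payload in a header.

    CRC fields are left at zero, so this is not a bootable image.
    """
    header = UIMAGE_HEADER.pack(
        UIMAGE_MAGIC, 0, 0, len(payload),  # magic, hcrc, time, size
        0, 0, 0,                           # load, entry point, dcrc
        0, 0, 0, 0,                        # os, arch, type, compression
        b"demo",                           # name, zero padded to 32 bytes
    )
    return header + payload


if __name__ == "__main__":
    image = make_fake_uimage(bytes(2200000))  # a "2.1MB" kernel
    print(uimage_total_length(image[:64]))
```

<p>With something like this the copy length follows the image, so growing the kernel past 2MBytes needs no stored setting to change.</p>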
<h2>Do Linux Stuff In Linux</h2>
<p>In any of these bloated U-Boot style bootloaders, is there even one feature they do better than the same feature in Linux?  The startup time should be better by a few 100ms.  Other than that, no, every single bloated &#8220;I will add it to the bootloader because I can&#8221; feature is shittier than you get in Linux.  Every single feature!</p>
<p>If you need some advanced capability or backup / recovery boot action, check for a button held down at boot-time in the bootloader and go fetch a different Linux partition + kernel.  Use standard Linux tools and shells.  In return, get really high quality network stack, proper USB support, NAND access that&#8217;s compatible to your main Linux system access in BBT / ECC terms, and all the other advantages of Linux.</p>
<h2>Do your peripheral bringup in drivers in Linux</h2>
<p>Typically you do not need ANY bringup in the bootloader except SDRAM controller and chip init, since it&#8217;s a prerequisite to put Linux in the RAM that it&#8217;s initialized.</p>
<p>That&#8217;s right, all the megabytes of source spent in U-Boot providing support for so many kinds of peripheral is a waste of time, effort and maintenance.  I am being kind saying &#8220;maintenance&#8221;, because the drivers in U-Boot are typically &#8220;dumbed down&#8221; versions of the equivalent Linux driver that were forked irretrievably the moment all the Linux APIs were ripped out, so there&#8217;s no coherent effort to keep them up to date with the Linux ones.  Lately I saw that they try to ape some Linux APIs there&#8230; why not go the whole hog and just <strong>load and boot real Linux</strong>?  After all, modern CPUs can be running your driver probes in Linux in ~2 seconds from power using a bootloader that doesn&#8217;t get in the way.</p>
<p>You typically don&#8217;t even need to talk to the PMU in the bootloader, after all, you are running code fine already, right?  Otherwise you wouldn&#8217;t be able to run the bootloader code itself.</p>
<h2>Fat girl in Ibiza</h2>
<p>At least at Openmoko, code quality inside U-Boot was awful bad.  I called U-Boot on the lists there &#8220;the fat girl in Ibiza&#8221; because you know she&#8217;s going to do anything you want.  All kinds of constant-only code, weird new scripting keywords were added for test undocumented, you name it.  Hardware guys felt up to writing such code secretly by themselves once they learned the software engineering marvel that is *((unsigned int *)0x&#8230;) = 0x&#8230;;</p>
<h2>Your bootloader just tests SDRAM</h2>
<p>There&#8217;s only one test action your bootloader is suited to do, and that is SDRAM test.  Once you are in Linux, it can&#8217;t perform a full SDRAM test while it&#8217;s running from that same SDRAM.  But the bootloader is typically starting from on-CPU SRAM, so it can actually run a true SDRAM test from there.  Otherwise, the bootloader should be completely absent from the test plan.  All other tests should be performed in Linux via standard driver and rootfs tools.</p>
<p>More about board test and board bringup will feature in another report of a lesson learned.</p>
<h2>Qi</h2>
<p>While at Openmoko (mainly) I wrote a bootloader that meets these ideals, you can find it <a title="Qi git" href="http://git.warmcat.com/cgi-bin/cgit/qi/log/?h=txtr">in git here</a>.  One of the nicest things about it is that unlike the bloated bootloaders whose job never finishes trying to become Linux cargo cult style, Qi has been pretty much complete for a few months.  It&#8217;s a new job to support a new CPU, a much smaller job to add a new board and it doesn&#8217;t want to talk to your peripherals anyway so no problem there.</p>
<p>Qi creates one binary per CPU, that supports all boards with that CPU.  That sounds like a big job but we don&#8217;t care about your peripherals so all boards with the same CPU look almost identical.  You have to find something that can detect your particular board at runtime, for example NOR device ID read check.  So there is zero build-time config and Qi generates all CPU support when it&#8217;s built, which takes 3 sec or so typically.</p>
<p>Typical bootloader binary size per CPU is 28-30KBytes.  That supports VFAT, ext2/3/4 and typically the SD controller as well.  The single Qi image also supports being booted from NAND, JTAG or SD Card on processors that support it just by being copied into place and without any changes.</p>
<p>There is zero bootloader private state, however Qi can look in the rootfs and append kernel commandline text from the content of a filesystem file.  This maintains the rule that boot should be completely deterministic per rootfs.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2010/02/08/bootloader_envy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Whirlygig GPL&#8217;d HWRNG</title>
		<link>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/</link>
		<comments>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/#comments</comments>
		<pubDate>Sat, 24 Nov 2007 10:45:44 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/</guid>
		<description><![CDATA[
Hardware random for the masses
I made available the result of the ring oscillator random generator as a GPL project called Whirlygig.  It&#8217;s a 2.75cm x 4cm PCB with a mini USB connector, it provides a sustained 5.5Mbps (~620KBytes/sec) of apparently very high quality random bits using the Linux hw_random API.  The large amount [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/whirlygig-logo.png" align=left hspace=5></p>
<h3>Hardware random for the masses</h3>
<p>I made available the result of the ring oscillator random generator as a GPL project <a href="http://warmcat.com/_wp/whirlygig-rng/">called Whirlygig</a>.  It&#8217;s a 2.75cm x 4cm PCB with a mini USB connector, it provides a sustained 5.5Mbps (~620KBytes/sec) of apparently very high quality random bits using the Linux hw_random API.  The large amount of randomness should make it useful for statistical tests as well as hard crypto.</p>
<p>I prototyped it using a couple of boards I had lying around, so I know it works fine, but I am waiting for the PCBs to come back from fabrication to actually build a final one.  I placed the CPLD VHDL, the board hardware design, the driver software and the firmware for the USB controller into <a href="http://git.warmcat.com">http://git.warmcat.com</a>.</p>
<h3>Dieharder</h3>
<p>I spent some time worrying about how to test the quality of the result &#8212; I found that &#8220;diehard&#8221; mentioned in an earlier post has been superseded by <a href="http://www.phy.duke.edu/~rgb/General/dieharder.php">&#8220;dieharder&#8221;</a>.  This has a much tougher general testing regime, even though many of its tests are reproductions of the diehard ones &#8212; it runs each test many times and forms histograms of the p-value results from the many runs, and gives an assessment of fail, poor, possibly weak or pass on the spread of results rather than a single result.</p>
<p>At first the RNG failed three of the 18 tests, but on looking closer one of the tests (#2) currently fails for all RNG input and is marked up as not for use with assessing RNG quality, and the two others required by default more than the 400MBytes of randomness I had prepared.  Unfortunately in that case they simply rewind the randomness file and re-use the same data to make up the balance!  Of course this is no longer quite &#8220;random&#8221;.  When I adjusted those two tests to use a smaller sample that fitted into the 400MBytes without repetition, the output of the RNG gets a &#8220;pass&#8221; on all 17 of the relevant dieharder suite tests.</p>
<h3>Max Entropy</h3>
<p>During the validation phase I changed the RNG algorithm in the CPLD significantly.  The scheme is described on the project page, but basically I moved away from a bit-centric to a byte-centric design with 8 identical sets of 3 oscillators.  To stop any characteristic of a particular oscillator&#8217;s routing from being associated with a particular bit of the result byte and creating a bias, I introduced a &#8220;mixer&#8221; that first generates 8 random bits by combining six oscillator outputs each with XOR, then rotates these oscillator sets between the result bits sequentially at 24MHz.  I also removed the toggling action and used the random bit directly.</p>
<p>I also found the Linux rng-tools suite, which repeatedly runs FIPS-140-2 tests on the bits.  This fails 1 in 1200 or so packets of testing over 20 billion bits; I believe it is normal for a real random generator to produce, with low probability, sequences that don&#8217;t look very random in the short term.</p>
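<p>To give a feel for what one of those per-packet checks looks like, here is a small Python sketch of the FIPS 140-2 monobit test only: each 20,000-bit block passes if its count of 1 bits lies strictly between 9725 and 10275.  The real suite runs several other tests on each block as well, and the function names here are made up for the illustration.</p>

```python
FIPS_BLOCK_BITS = 20000          # FIPS 140-2 tests work on 20,000-bit blocks
FIPS_BLOCK_BYTES = FIPS_BLOCK_BITS // 8


def monobit_ok(block):
    """Monobit test: pass if 9725 < ones < 10275 in a 2500-byte block."""
    if len(block) != FIPS_BLOCK_BYTES:
        raise ValueError("need exactly 2500 bytes")
    ones = sum(bin(byte).count("1") for byte in block)
    return ones in range(9726, 10275)


def failure_rate(data):
    """Fraction of complete 2500-byte blocks in data that fail monobit."""
    last_start = len(data) - FIPS_BLOCK_BYTES
    blocks = [data[i:i + FIPS_BLOCK_BYTES]
              for i in range(0, last_start + 1, FIPS_BLOCK_BYTES)]
    if not blocks:
        raise ValueError("need at least one full block")
    failed = sum(1 for block in blocks if not monobit_ok(block))
    return failed / len(blocks)


if __name__ == "__main__":
    import os
    # os.urandom stands in for the hardware device in this demo.
    print(failure_rate(os.urandom(FIPS_BLOCK_BYTES * 400)))
```

<p>Because the limits are fixed, even an ideal source lands outside them now and then, which is why a small nonzero failure rate over billions of bits is expected rather than alarming.</p>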
<p>Aside from passing dieharder and FIPS-140-2, the changes also got me a reported 8.000000 bits of entropy per byte from the ENT test, so there are reasons to imagine the quality of the output is very good.</p>
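<p>The entropy figure is the Shannon entropy of the byte-value distribution.  A minimal Python sketch of that calculation follows; it is a from-scratch illustration, not the ENT program itself, and the function name is hypothetical.</p>

```python
import math
from collections import Counter


def entropy_bits_per_byte(data):
    """Shannon entropy of the byte values in data, in bits per byte.

    8.0 means all 256 values are equally frequent; 0.0 means a single
    value repeats throughout.
    """
    total = len(data)
    counts = Counter(data)
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    import os
    print("%.6f" % entropy_bits_per_byte(os.urandom(1000000)))
```

<p>This only measures how evenly the 256 values are used, so a simple counter cycling 0 to 255 would also score 8.0; that is why it sits alongside dieharder and FIPS rather than replacing them.</p>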
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/24/whirlygig-gpld-hwrng/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diehard validation vs ring RNG</title>
		<link>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/</link>
		<comments>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/#comments</comments>
		<pubDate>Wed, 14 Nov 2007 15:56:03 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/</guid>
		<description><![CDATA[
RNG Quality assessment
A timely article flew by on Reddit about the RANDU pseudo-random generator algorithm widely used in the 1960s, which it turns out was very flawed indeed.  It was explained to one student that &#8220;We guarantee that each number is random individually, but we don&#8217;t guarantee that more than one of them is [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/catbowl1.png" align=left hspace=5></p>
<h3>RNG Quality assessment</h3>
<p>A timely article flew by on Reddit about the <a href="http://en.wikipedia.org/wiki/RANDU">RANDU</a> pseudo-random generator algorithm widely used in the 1960s, which it turns out was very flawed indeed.  It was explained to one student that &#8220;We guarantee that each number is random individually, but we don&#8217;t guarantee that more than one of them is random&#8221;.  Basically it produced numbers that belonged to one of 15 &#8220;planar&#8221; groupings and nothing in the gaps between the planes.  It isn&#8217;t just a minor annoyance, because many statistical studies in the 60s and 70s used it, and it can easily have contaminated their results.  That&#8217;s definitely not what I am trying to reproduce with the ring oscillator device &#8212; so how can I figure out how &#8220;good&#8221; the randomness is in an objective way?</p>
<h3>RNG quality test suites</h3>
<p>It turns out that empirically testing RNG outputs has been the subject of a lot of work for decades, and there are some established testing suites available online.  A major one seems to be the &#8220;<a href="http://stat.fsu.edu/pub/diehard/">diehard</a>&#8221; suite &#8212; I guess it is a pun on die as the plural of dice.</p>
<p>It needs you to fetch 10M bytes of random numbers or more and let it run a bunch of tests on them.  The output was a little hard to assess initially: most tests issue a &#8220;p&#8221; number which only suggests something is bad if it is 0.000&#8230; OR 0.999&#8230;.  All other numbers in between are to be taken as a good result, as I understood it.  Except there is a warning that even good RNGs can produce the occasional test fail.</p>
<blockquote><p> Thus you should not be surprised with  occasional p-values near 0 or 1, such as .0012 or .9983. When a bit stream really FAILS BIG, you will get p`s of 0 or 1 to six or more places.  By all means, do not, as a Statistician might, think that a p < .025 or p> .975 means that the RNG has &#8220;failed the test at the .05 level&#8221;.  Such p`s happen among the hundreds that DIEHARD produces, even with good RNGs.  So keep in mind that &#8220;p happens&#8221;</p></blockquote>
<p>I duly fetched 10M bytes of 115kbps randomness from the device and fed it to diehard.  It seemed to give fine results except on &#8220;Count the 1s stream&#8221; and &#8220;Squeeze&#8221; (devastating p=0.000000), &#8220;Count the 1s specific&#8221; for bits 1-11 (p=0.000030) and 9-16 (p=0.000064), and QQSO 2-6 (p=0.000005).  It passed the dozens of other tests but it was disappointing, looks like a big fat &#8216;failed&#8217;.</p>
<h3>Triple Scoop</h3>
<p>Well, since my test CPLD was an XC95288XL with 288 Macrocells to burn, I naturally wondered if I could improve matters by tripling the amount of ring oscillators getting Xor-ed &#8212; that is to implement the three varying sized oscillators 3 times each, totaling nine, and sum them with a big XOR.  They&#8217;ll all be drifting around individually as much as together, it should be a mighty noise-fest.</p>
<p>I edited the VHDL and blew it into the CPLD&#8230; visually the summed RNG output &#8220;bit&#8221; was an awful lot more noisy than before.   I pulled another 10M bytes from that setup: but just looking at the byte distribution as I did before told me something is still up.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-2.png"></p>
<p>That sawtooth type distribution is &#8220;not random&#8221; to coin a phrase.  If you look at the large jump at 0x80 (128) it is telling us that we are more likely to get 10000000 binary than we are to get 01111111; in other words, since this is over 10M bytes, there is a distribution problem favouring &#8216;0&#8217;.  When I analyze the distributions of 1s and 0s I find</p>
<table>
<tr>
<td>
<pre>0: 40436204, 1: 39563804... delta=872400, skew=1.090500%</pre>
</td>
</tr>
</table>
<p>You can see the same thing even better looking at 0x00 (42,000 hits) vs 0xFF (36,000 hits), they are like 8% off the median of 39,000.  Clearly that distribution of 1s and 0s has to have a very small skew to stop these kinds of effects showing up, and equally clearly this is telling us something deep about the RNG hardware.</p>
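<p>For reference, the skew figure quoted in these posts appears to be the difference between the two bit counts divided by the total number of bits.  A short Python sketch of that bookkeeping, with hypothetical function names:</p>

```python
def skew_from_counts(zeros, ones):
    """Return (zeros, ones, delta, skew_percent) from two bit counts."""
    delta = abs(zeros - ones)
    return zeros, ones, delta, 100.0 * delta / (zeros + ones)


def bit_skew(data):
    """Count the 0 and 1 bits in a bytes object and report the skew."""
    ones = sum(bin(byte).count("1") for byte in data)
    zeros = len(data) * 8 - ones
    return skew_from_counts(zeros, ones)


if __name__ == "__main__":
    # Counts taken from the triple-oscillator run above.
    print("0: %d, 1: %d... delta=%d, skew=%f%%"
          % skew_from_counts(40436204, 39563804))
```

<p>Feeding in the counts from the run above gives the same delta of 872400 and a skew of about 1.09%.</p>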
<h3>Spiky</h3>
<p>Although the individual oscillators are quite slow thanks to the number of inverter stages, at 4 &#8211; 6MHz, the way they are being summed makes for trouble from bandwidth limitations inside the CPLD.  At the moment it just uses a dumb asynchronous XOR action, that means that potentially very fast spikes can be seen when one &#8220;slow&#8221; oscillator changes state very shortly after another &#8220;slow&#8221; oscillator.  For example:</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0002tek.jpg"></p>
<p>You can see on the left (this is 5ns/div notice) a runt pulse where this happened, the XOR was convinced to rise by one oscillator changing and then countermanded when another oscillator changed state less than 5ns later, resulting in a doubtful pulse that was probably not visible as a &#8216;1&#8217;.  This also happens when going from &#8216;1&#8217; to &#8216;0&#8217;, but maybe the threshold for the transistors in the CPLD is not at exactly 50% of the 3.3V supply.  So we suddenly have it seeing more &#8216;0&#8217;s than &#8216;1&#8217;s on average when spikes are involved.</p>
<p>This whole high bandwidth summing step is completely needless, it&#8217;s only there because it is a literal interpretation of the diagram in the original RFC.  I changed it instead to have nine latches sample the nine oscillators every 125ns (there is an 8MHz clock on the prototype board) and sum those results with XORs into a single bit.  In turn this output is sampled by another latch at 8MHz to hide any metastability.</p>
<h3>Latched up</h3>
<p>The latched summing version performs much better and has gotten rid of most of the bit skew, and the sawtooth behaviour:</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-3.png"></p>
<p>&#8230;but there is still a problem with 0x00&#8230;. the bit skew looks like this</p>
<table>
<tr>
<td>
<pre>0: 39960076, 1: 40039932... delta=79856, skew=0.099820%</pre>
</td>
</tr>
</table>
<p>so the skew is now on the side of &#8216;1&#8217;s but only by 0.1%.  You can see the byte count spread is much tighter than before too &#8212; 1800 instead of 6000 counts before.</p>
<h3>Balancing out the skew</h3>
<p>Well if the remaining skew is something to do with the ratio of rise to fall times, or the non-squareness of the oscillator outputs for some other reason by something as low as 0.1%, that is hard to do much about, especially as it may vary on the specific silicon die.</p>
<p>But it shouldn&#8217;t matter &#8212; now the bandwidth situation at the XOR summer is sane, if we invert the summed output 50% of the time it should spread any excess on &#8216;1&#8217;s or &#8216;0&#8217;s to the opposite as well, cancelling any bias.  I added a couple of terms to the summer to xor against the UART bit index LSB and a bit which toggles after every byte sent by the UART.  It&#8217;s the equivalent of xor with 0x55 for the first byte and then 0xAA for the second byte, over and over.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-4.png"></p>
<p>That glitch in the middle is actually at 134 (0x86), maybe it is random but I guess we will see&#8230;. the skew is further reduced as anticipated</p>
<table>
<tr>
<td>
<pre>0: 39974218, 1: 40025790... delta=51572, skew=0.064465%</pre>
</td>
</tr>
</table>
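<p>The alternating inversion can be tried out in software.  The Python sketch below is only a model: it fakes a source that leans towards &#8216;1&#8217; with a seeded pseudorandom generator (an assumption for the demo, nothing like the real oscillators), then applies the 0x55 / 0xAA masks described above.  All names are hypothetical.</p>

```python
import random


def biased_bytes(count, p_one, seed=1):
    """Stand-in for a skewed source: each bit is 1 with probability p_one."""
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(count):
        value = 0
        for bit in range(8):
            if p_one > rng.random():
                value += 2 ** bit
        out.append(value)
    return bytes(out)


def alternate_invert(data):
    """XOR even-indexed bytes with 0x55 and odd-indexed bytes with 0xAA."""
    masks = (0x55, 0xAA)
    return bytes(byte ^ masks[i % 2] for i, byte in enumerate(data))


def ones_fraction(data):
    """Fraction of all bits in data that are 1."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (len(data) * 8)


if __name__ == "__main__":
    raw = biased_bytes(20000, 0.55)
    print("before: %.4f" % ones_fraction(raw))
    print("after:  %.4f" % ones_fraction(alternate_invert(raw)))
```

<p>Each individual bit position keeps its own lean; the masks alternate its direction from byte to byte so that the totals even out.</p>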
<h3>Diehard sequel</h3>
<p>I ran 10M bytes from this version through Diehard again&#8230; the really bad p-value results are gone.  For example Squeeze was a deadly 0.000000 before and is now 0.255260.</p>
<p>I made one last adjustment, I added the current state of the latched random value to the XOR term.  That means it decides whether to keep or invert the latched value, it no longer directly accepts the value from the RNG.  This got me to the promised land: 0.0005% skew between &#8216;1&#8217; and &#8216;0&#8217;.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/byte-dist-10m-5.png"></p>
<table>
<tr>
<td>
<pre>0: 40000206, 1: 39999802... delta=404, skew=0.000505%</pre>
</td>
</tr>
</table>
<p>This also gets me the apparently good diehard results with no obvious failures on any tests, you can see the actual results <a href="/diehard.txt">here</a>.  So it seems the current version can tentatively be called a &#8220;real RNG&#8221;. </p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/14/diehard-validation-vs-ring-rng/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ring oscillator RNG performance</title>
		<link>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/</link>
		<comments>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/#comments</comments>
		<pubDate>Mon, 12 Nov 2007 01:33:12 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/</guid>
		<description><![CDATA[
Pretty random
After some scrabbling around porting my Jtag SVF interpreter to Octotux and creating a kernel module for the PIO end of it &#8212; and moving to a different board with a XC95288XL CPLD to prototype it, the triple ring oscillator RNG is working.    It issues a 9600 baud result, but after [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/dawg.png" align=left hspace=5></p>
<h3>Pretty random</h3>
<p>After some scrabbling around porting my JTAG SVF interpreter to Octotux and creating a kernel module for the PIO end of it &#8212; and moving to a different board with a XC95288XL CPLD to prototype it, the triple ring oscillator RNG is working.    It issues a 9600 baud result, but after some initial confusion I modified it 1/8th of the time to sit out a sample time leaving &#8220;break&#8221; on the serial line.  This should make sure that the receiving UART does not mistake a data bit for a start bit.  The true data rate is something like 800 random bytes per second at 9600 baud.</p>
<p>Here are the three chains of inverters (19, 23 and 29 long) oscillating at the different fundamentals</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0016tek.jpg" height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0017tek.jpg" align=center height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0018tek.jpg" align=center height=263></p>
<p>&#8230; and here is what the xor summing looks like, first over 1s then sampled once.</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0019tek.jpg" align=center height=263></p>
<p></p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/f0020tek.jpg" align=center height=263></p>
<p>Although the single shot sample doesn&#8217;t look very random, the oscillators are drifting around all the time.  If you wait a little while between samples (currently it is 104us, a 9600 baud bit-period) it&#8217;s pretty hard to guess what phase all the oscillators have drifted to &#8212; at least, that&#8217;s the plan.</p>
<h3>Distribution of binary levels</h3>
<p>The first test I did was to see what the distribution of &#8216;1&#8217; and &#8216;0&#8217; in the results was&#8230; clearly if the device is really random it should on average be 50% each.  I fetched 1M random bytes, or 8Mbits:</p>
<table align=center>
<tr>
<td>0: 4008913, 1: 3991095&#8230; delta=17818, skew=0.222725%</td>
</tr>
</table>
<p>It&#8217;s okay for a really random source to deviate from 50:50 at any given time, although on average it should be 50:50.</p>
<h3>Octet distribution</h3>
<p>Next I looked at the distribution of the results from 0x00 through 0xFF as the result &#8220;random byte&#8221;.  This would show up if the RNG fails to ever issue some result or favours certain results over others &#8212; every result should on average have an equal chance of showing up and so an equal count.  I ran it for 1M random bytes&#8230;</p>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/rng-dist-1.png" align=center></p>
<p>This is pretty decent, every possible result is seen with a frequency within +/-200 counts of the 3,900 average after 1M bytes.</p>
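<p>The octet count behind a plot like this is easy to reproduce.  Here is a minimal Python sketch, with hypothetical function names, that tallies each byte value and reports how far the extremes sit from the expected count of one 256th of the sample.</p>

```python
from collections import Counter


def byte_histogram(data):
    """Return a 256-entry list: how often each value 0x00..0xFF occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def spread(data):
    """Return (expected, lowest - expected, highest - expected)."""
    hist = byte_histogram(data)
    expected = len(data) / 256
    return expected, min(hist) - expected, max(hist) - expected


if __name__ == "__main__":
    import os
    # os.urandom stands in for the capture file in this demo.
    expected, low, high = spread(os.urandom(1000000))
    print("expected %.0f, spread %+.0f .. %+.0f" % (expected, low, high))
```

<p>For 1M bytes the expected count is about 3,906 per value, which lines up with the 3,900 average read off the plot.</p>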
<h3>115200 baud results</h3>
<p>Encouraged by this I cranked the baud rate up to 115200 or 8.68us between samples and around 10K random bytes per second.  The skew is increased somewhat and the spread of result counts is increased a little.</p>
<table align=center>
<tr>
<td>0: 4028746, 1: 3971262&#8230; delta=57484, skew=0.718549%</td>
</tr>
</table>
<p style="text-align:center; margin-top:0px; margin-bottom:0px; padding:0px;"><img src="/rng-dist-2.png" align=center></p>
<p>So far so good!</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/12/ring-oscillator-rng-performance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adding entropy to /dev/random</title>
		<link>http://warmcat.com/_wp/2007/11/07/adding-entropy-to-devrandom/</link>
		<comments>http://warmcat.com/_wp/2007/11/07/adding-entropy-to-devrandom/#comments</comments>
		<pubDate>Wed, 07 Nov 2007 11:13:54 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Hardware design]]></category>
		<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/2007/11/07/adding-entropy-to-devrandom/</guid>
		<description><![CDATA[
A hard RNG is good to find
The recent statistical analysis for drumbeat reminded me I could do with a proper source of random numbers, not generated by a pseudorandom feedback action.  Back in the early 1990s I was looking at statistical profiling of execution on microcontrollers, I was surprised then to discover that only [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/buffalo.png" align=left hspace=5></p>
<h3>A hard RNG is good to find</h3>
<p>The recent statistical analysis for drumbeat reminded me I could do with a proper source of random numbers, not generated by a pseudorandom feedback action.  Back in the early 1990s I was looking at statistical profiling of execution on microcontrollers, I was surprised then to discover that only by making the sampling period random could I get a true picture of execution distribution.  If the address bus was sampled at a fixed rate, say 100kHz, instead of a true picture it would be distorted by activity that was happening at some fraction or harmonic of the sampling frequency.  So you would alias out pieces of loops completely or get a bloated count for other areas.  Only by true randomness in the sampling timing could you see the reality &#8212; a paradox.</p>
<h3>Analogue RNG methodologies</h3>
<p>A Google or two around showed that most of the techniques are analogue one way or the other.  Many of the methods suffer from a problematic need to amplify some very tiny source of noise, a Zener diode or avalanche transistor junction, by really huge amounts, 90dB or more.  There are a couple of suppliers of RF &#8220;noise diodes&#8221; with flat spectra across a wide frequency range, but they are hard to source.</p>
<h3>Digital non-pseudorandom technique</h3>
<p>However there is one technique which while still relying on analogue noise is basically digital &#8212; to run multiple chains of unlocked inverting oscillators and xor the outputs.  The unlocked oscillators have no reference at all, they&#8217;re basically an inverter fed back on its own input &#8212; in fact a chain of inverters.  Such a circuit oscillates according to the period of the total delay through the inverter chain&#8230; and that is highly sensitive to temperature.  Normally with synchronous digital design we choose a clock rate for a circuit that is just below the maximum possible at the worst temperature it is expected to operate at &#8212; and after that we can forget about temperature.  But with this asynchronous unlocked oscillator concept, the micro- and macro- temperature dependence is revealed in all its freaky glory, causing the oscillation to drift unpredictably slightly every cycle and over larger period with gross temperature fluctuations.</p>
<h3>RFC4086</h3>
<p><a href="http://tools.ietf.org/html/rfc4086">RFC4086</a> mentions a recommendation for a RNG based on unlocked inverter chains that is found in IEEE 802.11i.</p>
<blockquote><pre>
             |\     |\                |\
         +-->| >0-->| >0-- 19 total --| >0--+-------+
         |   |/     |/                |/    |       |
         |                                  |       |
         +----------------------------------+       V
                                                 +-----+
             |\     |\                |\         |     | output
         +-->| >0-->| >0-- 23 total --| >0--+--->| XOR |------>
         |   |/     |/                |/    |    |     |
         |                                  |    +-----+
         +----------------------------------+      ^ ^
                                                   | |
             |\     |\                |\           | |
         +-->| >0-->| >0-- 29 total --| >0--+------+ |
         |   |/     |/                |/    |        |
         |                                  |        |
         +----------------------------------+        |
                                                     |
             Other randomness, if available ---------+</pre>
</blockquote>
<p>This has three unlocked, wandering oscillator chains of different lengths being summed at an XOR gate.</p>
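<p>One way to see why XORing several sources helps is to work out the bias of the result exactly.  The sketch below is only an illustration under two loud assumptions: that the three sources are independent (nothing here establishes that for real oscillators), and that each one alone outputs a 1 with a made-up probability of 0.6.  The function name <code>xor_one_probability</code> is hypothetical.</p>

```python
from itertools import product


def xor_one_probability(p_ones):
    """Exact P(output = 1) for the XOR of independent bits,
    where p_ones[i] is the probability that source i outputs a 1."""
    total = 0.0
    for bits in product((0, 1), repeat=len(p_ones)):
        prob = 1.0
        for bit, p in zip(bits, p_ones):
            prob *= p if bit else (1.0 - p)
        # The XOR of the bits is 1 when an odd number of them are 1.
        if sum(bits) % 2 == 1:
            total += prob
    return total


single = xor_one_probability([0.6])            # assumed bias, illustration only
triple = xor_one_probability([0.6, 0.6, 0.6])  # three such sources XORed
print(f"one source  : P(1) = {single:.3f}")
print(f"three XORed : P(1) = {triple:.3f}")
```

<p>Under those assumptions a single source that is off by 0.1 from a fair coin becomes 0.504 after the XOR, so the combined output is much closer to fair than any one input.</p>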
<h3>Implementing the RFC4086 RNG</h3>
<p>Since it needs 71 inverters, which would take 12 74HC04s or similar, it makes more sense to put it all in one CPLD.  I have an old XC95108 lying around, so I wrote up the design in VHDL and added a UART interface to issue the sampled random data.  This brings up the issue of how quickly it can be sampled and still give high quality randomness&#8230; clearly if we sampled it every 10ps it wouldn&#8217;t be very random at all, since it wouldn&#8217;t have had time to change between samples.  On the other hand, if we sample it at an interval that is some high multiple of the fastest free-running oscillator period, there is a lot of opportunity for each oscillator phase to have been affected over the longer time.  By using the UART we can control how often we sample the RNG by the baud rate&#8230; I initially set it to 9600 baud, or 104us/sample.  The oscillators should have periods on the order of 150 &#8211; 200ns (5 &#8211; 6.7MHz), so this allows 500+ cycles of jitter to accumulate in each oscillator before the summed sample is taken.</p>
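<p>The arithmetic behind those figures can be checked with a few lines of Python.  This is just a back-of-envelope sketch; it assumes one RNG sample is taken per UART bit time, and the variable names are mine.</p>

```python
import math

CHAIN_LENGTHS = (19, 23, 29)   # inverters per chain, from the diagram
INVERTERS_PER_CHIP = 6         # a 74HC04 is a hex inverter
BAUD = 9600
OSC_PERIODS_NS = (150, 200)    # oscillator period range quoted above

total_inverters = sum(CHAIN_LENGTHS)
chips_needed = math.ceil(total_inverters / INVERTERS_PER_CHIP)

# Assumption: one RNG sample is taken per UART bit time.
sample_interval_ns = 1e9 / BAUD

# Whole oscillator cycles that fit between two samples.
cycles = [int(sample_interval_ns // p) for p in OSC_PERIODS_NS]

print("inverters:", total_inverters)
print("74HC04 packages:", chips_needed)
print("sample interval: %.1f us" % (sample_interval_ns / 1000))
for period_ns, n in zip(OSC_PERIODS_NS, cycles):
    print("%d ns period: about %d cycles per sample" % (period_ns, n))
```

<p>Even the slowest assumed oscillator gets a little over 500 cycles between samples, which is where the 500+ figure comes from.</p>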
<p>I&#8217;m currently waiting for a programming tool to be delivered so I can program another device, which will in turn let me program the XC95108.  I realized yesterday that I no longer have any PCs with a printer port.  I am very interested to see what the performance and quality of the randomness is like!</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/11/07/adding-entropy-to-devrandom/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Out of your tree</title>
		<link>http://warmcat.com/_wp/2007/03/03/out-of-your-tree/</link>
		<comments>http://warmcat.com/_wp/2007/03/03/out-of-your-tree/#comments</comments>
		<pubDate>Sat, 03 Mar 2007 12:30:19 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=28</guid>
		<description><![CDATA[The willingness of the kernel devs to refactor stuff is both a huge strength and weakness for the kernel.  The strength is in the extraordinary continual optimization and improvement in the codebase, not just locally to an area of code but for cross-kernel concepts, like the recent workqueue changes.
But this has a pretty harsh [...]]]></description>
			<content:encoded><![CDATA[<p><img hspace=8 align=left src="http://warmcat.com/illustration-tree.png" alt="Out of your tree" />The willingness of the kernel devs to refactor stuff is both a huge strength and weakness for the kernel.  The strength is in the extraordinary continual optimization and improvement in the codebase, not just locally to an area of code but for cross-kernel concepts, like the recent workqueue changes.</p>
<p>But this has a pretty harsh cost for people writing or maintaining code that is outside the kernel tree and which therefore does not get the reworking applied to it as part of the core kernel.  Whatever code they put out is invalidated and broken again and again, sometimes in the space of just a few weeks.</p>
<p>The freedom to refactor despite breaking external code is a huge luxury for the devs seldom seen elsewhere in the coding world.  Some projects take some care to allow compilation of their drivers for all recent kernels, using conditional compilation based on the kernel tree it is being compiled against, but other projects have an attitude that it will only compile against the current Linus tree.</p>
<p>The foaming churn of change makes for pretty hard work trying to make any kernel code that is not in the main tree work for any length of time.  Greg KH at least is on record that his concept of the solution is to bring everything inside the kernel tree, but I don&#8217;t know how that will ever scale, and it loads the devs with having to understand and work with an ever-growing amount of device-specific code.  Aside from that, it makes the kernel devs gatekeepers for what will be accepted, and since not everything that can exist will be deemed acceptable, there will always be a class of device driver that is living outside the comforts of the main tree.</p>
<p>Anyway the end result is that for many projects that people need drivers for, the shelf-life of any instance of the driver sources is extremely short.  A Wifi driver for example touches many subsections of the kernel that have a history of changes in the recent past, yet requires a pretty recent kernel to compile at all with the stuff that it actually needs.  So each driver tree has a quite narrow slot of kernel versions that it will work with.  Annoyingly, current CVS from many drivers will not compile against current kernel source, and by that I do not even mean -git, just the -rc versions.  It means that out-of-tree drivers are a lottery for any recent kernel, and any kind of driver is a high-commitment project that needs constant revisiting to keep it alive.</p>
<p>There doesn&#8217;t seem to be an answer except that over time more and more critical subsystems in the kernel will surely mature to the point that they get fiddled with less and less, and things should therefore die down on the breakage front.  But in truth the adolescent codebase of Linux shows no signs at the moment of slowing down its crazed foaming froth of reinvention and massive damage and breakage to the code around it.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2007/03/03/out-of-your-tree/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Conexant ADSL Binary driver damage</title>
		<link>http://warmcat.com/_wp/2006/08/14/conexant-adsl-binary-driver-damage/</link>
		<comments>http://warmcat.com/_wp/2006/08/14/conexant-adsl-binary-driver-damage/#comments</comments>
		<pubDate>Sun, 13 Aug 2006 23:19:03 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=16</guid>
		<description><![CDATA[A couple who are friends with Jenny and I asked what must be getting on for two years ago about what could be done to remove the constant virus problems they were having with their Windows box.  Naturally after making sure they did not need anything that Linux was poor at, ie, 3D games [...]]]></description>
			<content:encoded><![CDATA[<p>A couple who are friends with Jenny and me asked, what must be getting on for two years ago, about what could be done to remove the constant virus problems they were having with their Windows box.  Naturally, after making sure they did not need anything that Linux was poor at, i.e. 3D games and so on, I recommended FC2 at that time.  I nuked their box with it and the guy has been very happy all this time.  I updated him to FC4 a while back.</p>
<p>But now that he is upgrading from dialup to ADSL, he needed this taking care of.  He had a Zoom PCI ADSL card, model 5506, with a Conexant chipset.  I found a driver for it here:<br />
<a href="http://patrick.spacesurfer.com/linux_conexant_pci_adsl.html">http://patrick.spacesurfer.com/linux_conexant_pci_adsl.html</a></p>
<p>Hm, so the first sign all was not well was the age of the page and of the results from Google, which are all from circa 2003.  Work on the project has continued in the last couple of months, though.  After some struggle trying to avoid the 4kBytes/sec modem download, we got the driver and kernel-devel sorted out and compiled it.  It quickly blew chunks on a #error because our kernel had CONFIG_REGPARM defined.  Well, we run the stock Fedora kernel and are not much interested in moving off it, so why on earth should the driver care about this detail?  Hm, closer inspection of the site showed:</p>
<table border="1">
<tr>
<td><strong class="em1">&#8220;Note:</strong> Linux 2.6.* users should note that their kernel must be compiled without the &#8220;use register arguments&#8221; (CONFIG_REGPARM) option. This is an experimental option that will almost certainly never work reliably with this driver or any other driver that uses proprietary object code. Newer versions of Fedora and SuSE come with kernels that use this option, in these cases you will have to recompile the kernel.&#8221;</td>
</tr>
</table>
<p>Ugh, so the reason it couldn&#8217;t survive CONFIG_REGPARM is because it has a binary blob which demands stack args!  No chance apparently to get two binary blobs compiled with and without.  This is a stupid situation, because the site itself documents that Fedora kernels after 2.6.9 on FC3 are compiled with CONFIG_REGPARM, since it should speed things up at no cost.  His solution is to insist on a vanilla kernel.org kernel solely to support the needs of the binary blob <img src='http://warmcat.com/_wp/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> </p>
<p>We had to give up trying to get it cooking, and instead the guy blew GBP20 on an ADSL router from ebuyer.  Just what awesome secrets do they think that binary blob is concealing?  What astounding concepts that would set the world on fire if their sources were known?<br />
Binary blobs: causing trouble and bitrotting wherever you find them.</p>
<p>I sent the guy running the project a polite email:</p>
<table border="1">
<tr>
<td>Hi Patrick &#8211;  First thanks for your work on the Conexant ADSL project.  I was trying to install a Zoom ADSL PCI card for a friend, we are both running Fedora Core 5.  I saw after some time that I was on a loser because there is a binary blob in the project which was basically compiled with different compiler switches to cut a long story short.  What is the situation with Conexant and this blob as you understand it?  It seems that the chipset dates from 2002 or 2003, is there no chance that this far down the road they might be willing to be more liberal with the sources for it?  My friend and I gave up on the PCI card and ordered a GBP20 ADSL router from ebuyer instead, simply due to there being a binary blob.  -Andy</td>
</tr>
</table>
<p>I got a reply a couple of hours later: the guy does not have a relationship with Conexant and says they are ignoring his mails.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2006/08/14/conexant-adsl-binary-driver-damage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RT73 Belkin stick depression</title>
		<link>http://warmcat.com/_wp/2006/08/09/rt73-belkin-stick-depression/</link>
		<comments>http://warmcat.com/_wp/2006/08/09/rt73-belkin-stick-depression/#comments</comments>
		<pubDate>Wed, 09 Aug 2006 11:12:15 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=13</guid>
		<description><![CDATA[Sadly I have thrown three days down the toilet on trying to get a Belkin &#8220;Wireless G Network Adapter&#8221;, F5D7050, containing a Ralink RT2571 chip to work using either the Ralink RT73 driver or the newer serialmonkey rt2&#215;00 driver which contains the rt73usb.ko driver, this is on my AT91 platform.
Initially I started, like a happy [...]]]></description>
			<content:encoded><![CDATA[<p>Sadly I have thrown three days down the toilet on trying to get a Belkin &#8220;Wireless G Network Adapter&#8221;, F5D7050, containing a Ralink RT2571 chip, to work on my AT91 platform using either the Ralink RT73 driver or the newer serialmonkey rt2x00 driver, which contains the rt73usb.ko driver.</p>
<p>Initially I started, like a happy idiot, trying to get either to work with wpa_supplicant, since we have a WPA2 802.11g network here.  The Ralink RT73 sources did not initially crosscompile cleanly because there is a bad reference to asm/i386/&#8230; in an include, but after that it went better.  However, at least when crosscompiled on gcc 4.0.2, this driver is a useless piece of crap.  I outlined the problems <a href="http://forums.ralinktech.com.tw/phpbb2/viewtopic.php?t=2373">here</a> but naturally there was zero response from Ralink.<br />
Well okay, I knew about the alternative serialmonkey driver from getting my elder stepson&#8217;s laptop working, which incorporated another Ralink chipset.  They did not seem to have any support in the form of the modified Ralink drivers, but they do have a beta 2x00 driver which supports the RT73 chipset.  This got a lot further: the MAC address was correctly initialized and in the end, with some coaxing, it could be made to show results from iwlist wlan0 scan that include our AP.  But it won&#8217;t associate and stay associated.  After I removed the encryption from the AP temporarily, I was once &#8211; one time only &#8211; able to contact the DHCP server long enough to get an IP, but then it immediately deassociated again.  And this is with no encryption!  Again I posted to the forums <a href="http://rt2x00.serialmonkey.com/phpBB2/viewtopic.php?t=1743&#038;sid=b5d497eeedba1b98361030f1e75ac857">here</a> and again there was zero response.  Perhaps it is the Arm crosscompile that is freaking the devs out, but since it is little-endian and 32 bits, it&#8217;s really not so wild to expect it to just work.</p>
<p>Another issue &#8211; actually here is the one bit of good news from the work &#8211; is that I found two versions of firmware for the RT73, in the form rt73.bin.  One is shipped with the Ralink driver and is also available on their site, and claims to be version 1.7.  The other was provided in the Win98 directory on the CDROM that came with the Belkin device and is referred to as version 1.0 in the debug output.  The Ralink-supplied driver has its own code to grab the file from a specific path &#8211; /etc/Wireless/something &#8211; and also has a private copy of the firmware in the sources of the driver itself, used if it can&#8217;t find the firmware in its magic path.  The serialmonkey driver does it the proper Linux way using the firmware API in the kernel.  Anyway this was the good news: I learned how this worked and created a hotplug script that is compatible with it, allowing it to load the firmware successfully from /lib/firmware.</p>
<p>Anyway, while I have been saying recently that the wifi driver problems are largely resolved in Linux, which has been my experience on x86 laptops, they sure as hell aren&#8217;t resolved for crosscompile usage <img src='http://warmcat.com/_wp/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> </p>
<p>Edit 2006-08-13: I <a href="http://sourceforge.net/mailarchive/forum.php?thread_id=30145643&#038;forum_id=40708">posted</a> to the serialmonkey project mailing list about it, since it&#8217;s too tantalizingly close to forget about.  Head serialmonkey replied, &#8220;send hardware&#8221;.  I am trying to see how feasible that is, since they need a build env and so on.</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2006/08/09/rt73-belkin-stick-depression/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Broadcomm and WPA</title>
		<link>http://warmcat.com/_wp/2006/07/11/broadcomm-and-wpa/</link>
		<comments>http://warmcat.com/_wp/2006/07/11/broadcomm-and-wpa/#comments</comments>
		<pubDate>Tue, 11 Jul 2006 08:45:10 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Linux peripherals]]></category>

		<guid isPermaLink="false">http://warmcat.com/_wp/?p=6</guid>
		<description><![CDATA[About 18 months ago on a periodic trip to gawk at PC World (a superstore for PC stuff here in the UK) I purchased a Belkin PC Card 54g adapter with a Broadcomm 4306 chipset.  Of course I took a flyer on the chipset, it was relatively cheap and I figured I would have [...]]]></description>
			<content:encoded><![CDATA[<p>About 18 months ago on a periodic trip to gawk at PC World (a superstore for PC stuff here in the UK) I purchased a Belkin PC Card 54g adapter with a Broadcom 4306 chipset.  Of course I took a flyer on the chipset: it was relatively cheap and I figured I would have some fun trying to get it to work with Linux.  Yes, the same madness that grips me every time in PC World.  The cheaper peripherals that do not have standardized interfaces (unlike, say, USB Audio devices like headsets, which always just work) always have a very new chip from a company that regards the interface to it as part of what makes their IP such a special flower and Must Never Be Told.  Webcams seem to be the chief culprit at the moment.<br />
Periodically I took it down from its box of dead things and tried to get it working with a new version of Fedora.  Well, I read that the BCM43xx driver was integrated into 2.6.17 and that is where Fedora are at (Fedora do a good job of tracking the latest kernels; there is a chart in a Linux magazine here in the UK this month showing Fedora has much later kernels than other distros except SuSE).  Since I was going to upgrade the laptop Rohan uses here to FC5, I did this and at the same time, without too much hope, tried the old Broadcom bookstop.</p>
<p>To my pleasure I was able to get it working here after extracting some firmware and sorting out wpa_supplicant, which I gained some experience in from getting this Samsung laptop working. I sat there loading webpages and looking at its power and data lights, which I was never before able to light.  Good old Linux!</p>
<p>Hum, later that evening the behaviour became intermittent.  I ran wpa_supplicant with a debug switch and saw it was having problems maintaining sync with the crypto.  Bringing the (eth1) interface down and up got it working for a while but then it would stutter into silence again.  I modprobe -r&#8217;d the bcm43xx driver and pulled out the card; it was hot but not so hot.  I know that wpa_supplicant is working fine on FC5 because this laptop&#8217;s wifi is super stable (ipw3945-based).  So the problem is either in the bcm43xx driver, or is a physical (heat?) problem with the adapter.  I guess it makes sense that a low-level problem could show up as WPA breakage.</p>
<p>Edit: a couple of days later, I changed the /etc/wpa_supplicant/wpa_supplicant.conf contents and that seemed to resolve the problem; we will have to see if the improvement is permanent.  Here are the contents:</p>
<p>ctrl_interface=/var/run/wpa_supplicant<br />
network={<br />
ssid=&quot;myssid&quot;<br />
scan_ssid=1<br />
key_mgmt=WPA-PSK<br />
proto=WPA<br />
pairwise=CCMP TKIP<br />
group=CCMP TKIP<br />
psk=xxxxxxxx&#8230;xxxx<br />
priority=3<br />
}</p>
]]></content:encoded>
			<wfw:commentRss>http://warmcat.com/_wp/2006/07/11/broadcomm-and-wpa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
