Archive for July, 2006

VMware networking in Fedora

Thursday, July 27th, 2006

Hum, the wonderful Czech guy who provides

http://ftp.cvut.cz/vmware/

allows VMware to work fine on Fedora normally (I have an XP install that runs inside VMware to provide Protel). But since I migrated the VM to this laptop, networking seems to be stuck on host-only. I can ping the host from inside the VM but nothing else, despite (or perhaps because of) my having selected “bridged”.

Going to have a fiddle.

Edit: Hm, looks like ARP proxying is broken on the wlan0 interface. Moving to a bridge on eth0 works fine.

Behind the Embedded sofa

Sunday, July 16th, 2006

Does your root filesystem df -h sometimes seem a bit more than the sum of its du -h / parts? My main embedded filesystem has seemed that way for some months, yet when I ran my script that lists all the files in the filesystem that did not come out of a package, nothing really showed up as out of place.
Yesterday my /etc/fstab was partially corrupted, so /proc and /tmp (a tmpfs) did not get mounted. To my surprise ll /tmp showed a bunch of stuff in there from January that had accidentally been created in the root filesystem's mountpoint dir /tmp before the tmpfs was placed there. The pile of ‘invisible’ junk (invisible because you can’t see it once the tmpfs takes up residence at the real /tmp) amounted to ~800KBytes uncompressed; in an 8MB root filesystem that is a very welcome new injection of space. Only yesterday I got a board into a bad place by updating the busybox package when there was not enough space left to create all the symlinks… these symlinks are all of the shell commands like ln, rm, ls etc… and that required a boot into the serially loaded kernel to repair… I guess that won’t be happening so much now I got some space out of my ass.
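For anyone wanting to check for the same kind of gap, here is a rough Python sketch that compares what the filesystem says is in use (the df view) with what can actually be reached by walking the tree without crossing mount points (the du -x view). It assumes a Linux-style system where st_blocks counts 512-byte units; the function names are made up for illustration and this is not the script mentioned above.

```python
import os

def used_bytes_df(path="/"):
    """Bytes in use according to the filesystem itself (roughly what df shows)."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize

def used_bytes_du(path="/"):
    """Bytes reachable by walking the tree while staying on one filesystem."""
    root_dev = os.lstat(path).st_dev
    seen = set()   # (device, inode) pairs, so hard links are counted once
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        try:
            st = os.lstat(dirpath)
        except OSError:
            dirnames[:] = []
            continue
        if st.st_dev != root_dev:
            # Something else is mounted here: do not descend. Anything that
            # lives underneath the mount point on the root filesystem is
            # exactly what this walk cannot see.
            dirnames[:] = []
            continue
        total += st.st_blocks * 512   # assumption: 512-byte units, as on Linux
        for name in filenames:
            try:
                fst = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            key = (fst.st_dev, fst.st_ino)
            if key not in seen:
                seen.add(key)
                total += fst.st_blocks * 512
    return total

if __name__ == "__main__":
    gap = used_bytes_df("/") - used_bytes_du("/")
    print("unaccounted bytes on /:", gap)
```

Some gap is normal (filesystem metadata, files that are deleted but still held open), so only a gap that looks too big is interesting. To actually look underneath the mount points, bind-mounting the root filesystem somewhere else (mount --bind / /mnt) shows the directories as they are on the underlying filesystem.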

Yahoeuvre broken by Yahoo changes

Saturday, July 15th, 2006

Yahoeuvre was a PHP project to capture and enhance the Yahoo boards related to the SCOG attack on Linux. It lasted a good couple of years, but today Yahoo have changed the format of their boards. It would require a fair amount of work to change the monitoring software to support the new format. Up until about 18 months ago Yahoeuvre served post content in various ways, including NNTP and email, provided full thread content on one page and so on. Somebody chose to complain about their content, freely visible and downloadable from Yahoo, being served additionally (unchanged) by Yahoeuvre, so I took all the content re-serving down and stopped visiting the forum myself (which was previously pretty addictive to me). It was still useful as a full-text search and archive of post actions by nyms, but it’s not worth the effort to me to bring it into line. The sources are GPL’d and downloadable from the old site if someone else wants to try, but I’m not too sure how useful the mix of features, chosen to complement the old board with its limited threading support, is with the new board.

Interesting AT91 clock quirk

Wednesday, July 12th, 2006

Just sent this to linux-arm-kernel. I saw the web archive screwed up on my signature, so I include it here to eventually get to Google.

Hi folks - There appears to be a subtle problem with the otherwise neat PLL setting API for AT91 found in ./arch/arm/mach-at91rm9200/clock.c

The nice code in at91_pll_calc() does a search at runtime for the best match for the requested PLL output frequency given the base clock rate. So if you tell it you have an 18.432MHz crystal, and want 96MHz, it will find a good PLL multiplier and divider pair. This is commendable and cool.

The problem comes from the code not having the free hand that it thinks it does to choose the PLL ratios. This is because the physical external PLL filter components must be matched to the details of the PLL settings, and of course these are chosen at design-time.
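To make the search concrete, here is a small Python sketch of the general idea: try every divider, pick the nearest multiplier, keep the pair with the smallest error. This is not the kernel's at91_pll_calc() code; the function name, the multiplier and divider limits, and the fixed_div parameter are all assumptions made for illustration. fixed_div stands in for a board whose filter components were chosen around one particular setting.

```python
def best_pll(base_hz, target_hz, max_mul=2047, max_div=255, fixed_div=None):
    """Brute-force the (mul, div) pair whose output base*mul/div is closest
    to target_hz. The limits are illustrative, not taken from a datasheet."""
    divs = [fixed_div] if fixed_div is not None else range(1, max_div + 1)
    best = None
    for div in divs:
        # Nearest integer multiplier for this divider.
        mul = (target_hz * div + base_hz // 2) // base_hz
        if not 1 <= mul <= max_mul:
            continue
        out_hz = base_hz * mul // div
        err = abs(out_hz - target_hz)
        if best is None or err < best[0]:
            best = (err, mul, div, out_hz)
    if best is None:
        raise ValueError("no usable multiplier/divider pair")
    return best[1:]   # (mul, div, out_hz)

if __name__ == "__main__":
    # Free search: 18.432MHz * 125 / 24 is exactly 96MHz.
    print(best_pll(18_432_000, 96_000_000))
    # Search pinned to a divider of 5 (hypothetical design-time choice):
    # the closest reachable output is now 95.8464MHz.
    print(best_pll(18_432_000, 96_000_000, fixed_div=5))
```

The free search happily lands on 125/24, which is fine arithmetically but says nothing about whether the external filter on a given board suits that pair; pinning the divider shows how the reachable frequencies shrink once the hardware has a say.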

(more…)

Chip of weirdness

Tuesday, July 11th, 2006

The telephony box I am working on uses a pair of AKM2304 quad codecs, and they work very nicely most of the time. But they have always been very sensitive to the power supply. With certain PSUs that issue too high a voltage, eg, 5.4V instead of 5V, they are prone to stopping working and getting hot, too hot to touch. On giving them the correct voltage they start working again. In addition, on fitting the chips to a board they have a relatively high dropout rate, again either working or getting irretrievably hot.
Yesterday I decided to examine the problem more closely, since we are nearing production. I reviewed the datasheet and saw the configuration of Digital and Analog power supply decoupling I remembered: DVdd taking the 5V directly, with a 10R series resistor between it and the Analog supply. But then I did a double-take… in fact the datasheet showed the Analog power getting the 5V directly and the 10R in series on the DVdd side. This made sense when combined with a warning note in the datasheet that AVdd must not fall below DVdd or there could be “damage”… their idea was to kneecap DVdd slightly and give AVdd the full 5V feed to avoid this. I shorted out the 10R series resistor I had wrongly placed in AVdd and now these codecs are happy with 5.4V… subtle…

Broadcom and WPA

Tuesday, July 11th, 2006

About 18 months ago on a periodic trip to gawk at PC World (a superstore for PC stuff here in the UK) I purchased a Belkin PC Card 54g adapter with a Broadcom 4306 chipset. Of course I took a flyer on the chipset, it was relatively cheap and I figured I would have some fun trying to get it to work with Linux. Yes, the same madness that grips me every time in PC World. The cheaper peripherals that do not have standardized interfaces (unlike, say, USB Audio devices like headsets, which always just work) always have a very new chip from a company that regards the interface to it as part of what makes their IP such a special flower and Must Never Be Told. Webcams seem to be the chief culprit at the moment.
Periodically I took it down from its box of dead things and tried to get it working with a new version of Fedora. Well, I read that the BCM43xx driver was integrated into 2.6.17 and that is where Fedora are at (Fedora do a good job of tracking the latest kernels; there is a chart in a Linux magazine here in the UK this month showing Fedora has much later kernels than other distros except SuSE). Since I was going to upgrade the laptop Rohan uses here to FC5, I did this and at the same time, without too much hope, tried the old Broadcom bookstop.

To my pleasure I was able to get it working here after extracting some firmware and sorting out wpa_supplicant, which I gained some experience in from getting this Samsung laptop working. I sat there loading webpages and looking at its power and data lights, which I was never before able to light. Good old Linux!

Hum, later that evening the behaviour became intermittent. I ran wpa_supplicant with a debug switch and saw it was having problems maintaining sync with the crypto. Bringing the (eth1) interface down and up got it working for a while but then it would stutter into silence again. I modprobe -r’d the bcm43xx driver and pulled out the card; it was hot but not so hot. I know that wpa_supplicant is working fine on FC5 because this laptop’s wifi is super stable (ipw3945-based). So the problem is either in the bcm43xx driver, or is a physical (heat?) problem with the adapter. I guess it makes sense that it can show up as WPA breakage if it is a low level problem.

Edit: a couple of days later, I changed the /etc/wpa_supplicant/wpa_supplicant.conf contents and that seemed to resolve the problem; we will have to see if the improvement is permanent. Here are the contents:

ctrl_interface=/var/run/wpa_supplicant
network={
    ssid="myssid"
    scan_ssid=1
    key_mgmt=WPA-PSK
    proto=WPA
    pairwise=CCMP TKIP
    group=CCMP TKIP
    psk=xxxxxxxx…xxxx
    priority=3
}
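The psk= line holds the 64 hex digit pre-shared key rather than the passphrase itself. For reference, that key is derived from the passphrase and the SSID with PBKDF2-HMAC-SHA1 (4096 iterations, 32 bytes of output), which is what the wpa_passphrase utility computes. A minimal Python sketch of the same derivation follows; the function name is my own and the SSID and passphrase in the example are placeholders, not the real ones.

```python
import hashlib

def wpa_psk(ssid, passphrase):
    """Derive the 256-bit WPA pre-shared key as 64 hex digits.

    WPA-PSK uses PBKDF2-HMAC-SHA1 with the SSID as salt, 4096 iterations
    and a 32-byte result. Passphrases must be 8 to 63 characters long.
    """
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("WPA passphrase must be 8..63 characters")
    key = hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),
        4096,
        32,
    )
    return key.hex()

if __name__ == "__main__":
    # Placeholder values for illustration only.
    print("psk=" + wpa_psk("myssid", "not-my-real-passphrase"))
```

Because the SSID is the salt, the hex key only works for that one network name; change the ssid= line and the psk= line has to be regenerated too.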

Coolest Mailserver

Sunday, July 9th, 2006

Sata-eating Monster

Losing my 24/7 local box to the SATA-eating monster left an immediate problem – nothing was taking my mail. My MX for warmcat.com ends up at my cablemodem, so incoming mail was just timing out with nobody to talk to. I thought for a bit about setting up an external postfix and moving the MX, but I didn’t like it as a permanent solution, and as a merely temporary one it would have been wasted effort. Annoyingly the AMD64 box went down just before I had to take a trip to Spain with Jenny for a few days. I shrugged my shoulders and hoped anyone with important mail would retry, and that the mailing lists I am on would be understanding of the rather unrefined behaviour.
When I returned from the few days of overheated childcare-in-another-land (formerly, 16 years ago now, known as “holiday”) I immediately fell ill with some kind of bad cold that laid me up in bed for a couple more days. So after nearly a week the stress was on to get a permanent solution to the missing mailserver question.

(more…)

Cursed AMD64 box

Sunday, July 9th, 2006

This AMD Athlon 64 X2 4400+ box had been working fine for about a year with a Western Digital 10KRPM SATA drive. It has a DFI Lanparty motherboard and a 450W PSU IIRC. The machine was up 24/7 for most of that year since it was acting as my mailserver amongst other things.

A few weeks ago the drive started acting erratically. I would wake in the morning and find that the ext3 filesystem on there had been remounted read-only because filesystem corruption had been detected. I was able to fsck the filesystem back into sanity and the drive would act fine for several days. Well, these stories always end the same way, with a drive that won’t complete a boot, and that was the case for this idiot too.

The particular disease was that the area of the disc that contained the LVM structure (Fedora sends in LVM by default now) was spewing hard IO errors when touched. Therefore it couldn’t get past trying to bring up the LVM on boot and simply dropped dead. I documented the evasive actions I took in this fedora-list mail; basically I was able to recover the ext3 filesystem that was inside the LVM block on to a new SATA drive. LVM’s physical footprint is basically a 0x30000 byte header before the ext3 filesystem starts.
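A quick way to sanity-check that offset before copying anything is to look for the ext2/ext3 superblock magic where it should be. The superblock sits 1024 bytes into the filesystem and its magic number, 0xEF53, is a little-endian 16-bit value 56 bytes into the superblock. The Python sketch below does just that; the function name is invented for illustration, and the 0x30000 default is simply the header size I found on my disc, so treat it as an assumption on any other LVM setup.

```python
import struct

LVM_HEADER_BYTES = 0x30000   # header size seen on this particular disc
SUPERBLOCK_OFFSET = 1024     # ext2/ext3 superblock starts 1024 bytes in
MAGIC_OFFSET = 56            # s_magic field within the superblock
EXT_MAGIC = 0xEF53

def has_ext_superblock(path, fs_offset=LVM_HEADER_BYTES):
    """Return True if an ext2/ext3 magic number sits where a filesystem
    starting at fs_offset would keep it."""
    with open(path, "rb") as f:
        f.seek(fs_offset + SUPERBLOCK_OFFSET + MAGIC_OFFSET)
        raw = f.read(2)
    return len(raw) == 2 and struct.unpack("<H", raw)[0] == EXT_MAGIC

if __name__ == "__main__":
    import sys
    # Pass a partition device or an image file of one as the argument.
    print(has_ext_superblock(sys.argv[1]))
```

If the magic is there, the filesystem can be reached by skipping the header, for example with a loop device set up at that byte offset (losetup -o) or by copying everything after it off to another drive.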
I installed FC5 on the new drive and brought over most of the data from the copy of the ext3 filesystem from the damaged drive, and went on pretty much as normal, with brief interruptions while I fished something I had forgotten I needed from the old filesystem. But then to my disbelief, after just a week, the new drive (the only drive in the machine) blew chunks in a similar way, with hard IO errors one morning. I came into my work room and heard it performing the click of death.
I recovered from this rather grimly from backups; I did not fancy attempting a second recovery of 60GB of data from a second drive inside of a week. I stared at the AMD box for a minute or two though… I could think of two likely causes, the most likely one being the power supply. If it was having trouble with its 12VDC line, serious trouble, it might cause the drive to reset itself as if a poweron was happening repeatedly. It’s not hard to imagine that a set of such resets at random intervals might eventually catch the drive out in its initialization phase and cause it to throw a fit ending in its head scratching the surface. The other possible cause is a bit more uncertain: both boxes were running the new FC5 2.6.17 kernel, which has had a lot of work going on with libata and the kernel code for SATA. I wonder if that is repeatedly attempting drive resets as a last resort.
Anyway it had caused enough trouble, so I swore off it and migrated back to running from this Centrino Duo laptop, which is plenty fast enough for a main workstation. One nice feature of VMware is that the XP I am running inside it has no idea that it has moved machine, so there is no activation crap (although this is of course a genuine retail copy of XP, one of two I own). I shall probably have cause to write about it another time but I have to have XP for Protel. It runs on top of Fedora Core thanks to VMware Workstation.

Blog logic

Sunday, July 9th, 2006

I lean very heavily on Google during my long working day… most of the time the collected wisdom inside Google gets me out of whatever technical trouble I am in, perhaps with a bit of headscratching and elbow grease on my part. Last week I was looking at a very specific problem that existed in a version of gcc when used with buildroot, and I found a mention of the problem from Rob Landley, who runs busybox now. He had some kind of blog type thing going where he noted stuff; Google got ahold of it and presented it to people who were interested in that specific thing, in this case me. The post of his was about a year old, and I don’t know if he kept it up or it fell into disrepair as many of these ventures do, but I decided to try this style out.