80 column limit

For the last few months I have been working on the Penumbra project. I started off patching wireless drivers in and out of the kernel tree to achieve the anonymous broadcast action that the project needs, but it became clear this would be completely unworkable for general use... getting wireless up in Linux can still be a struggle, and hoping people will patch their driver or kernel in addition isn't going to happen.

After trying a couple of other methods, in the end I created a radiotap-based packet injection patch for the mac80211 stack (formerly dscape / d80211), and bound it together with a patch from Michael Wu that provides radiotap-based Monitor mode. At the moment it is still in front of the linux-wireless folks and it's not clear what the result will be. If the patch is accepted, the code should make it into the mainline kernel, and all mac80211-based wireless drivers will work with Penumbra out of the box in the future. The patchset provides generic radiotap monitoring and injection that "just works" with libpcap both ways, so I am hoping it will get accepted without people having to form an opinion about Penumbra.

But one of the biggest hurdles in creating the patch was not technical, since I already had the core functionality working. It was in fact the Linux kernel coding style. In some ways the coding style fits well with my own personal style (formed over 20 years of writing C and C++): we basically use the same K&R style. There are some spacing and commenting rules that are actually better than my style, and I will adopt them wholesale. But that's where the fun stops and the recrimination begins! The basic problem is the combination of three rules, whose constraints together squeeze out what I consider good coding practice:
  • Tabs are 8 chars NO EXCEPTIONS
  • Lines are less than 80 cols NO EXCEPTIONS
  • Everything in { } is indented by a tab (except switch cases!)
Now almost everything is inside a function body, so that gets you down to 72 chars already. And if your function is doing something non-trivial, your code is probably inside a while() or a for(), and there are one or more levels of if() to decide whether to do it or not. Pretty soon you are writing code crushed up against the 80-col limit, with only 30 chars that are usable and 50 columns of whitespace in front of them. It strongly puts me in mind of the Bonsai Kittens fake website that showed how to push kittens into bottles so they would grow into the shape of the bottle. Under these abnormal circumstances, certain things become very difficult to do:
  • \t\t\t\t\t"any kind of long " \n\t\t\t\t\t "string has to be " \n\t\t\t\t\t "artificially chopped up"
  • nested if()s may make perfect sense to explain your code logic. But you can no longer afford them because of the tab each one adds. So you have to invert the if() sense and use a goto (I kid you not, this is preferred due to the coding style rules)
  • I strongly prefer descriptive variable names which include type. Type is part of the information you need to understand what that variable is when looking at it. "nCountWaysILoveHer" tells me (now, and in 6 months when I have forgotten the code) that it is an int that is counting a specific thing. "i" or "cnt" could be anything, although "cnt" is better. But you can't afford a long variable name with the rules above: you can get into a situation where there is not even enough room left after the tabbing to hold just the variable name on a single line.
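To make the nested-if() versus goto trade-off concrete, here is a minimal sketch in C. The struct frame type, the HDR_LEN constant and both function names are invented for illustration and do not come from the mac80211 patch. The two functions return the same result; the nested one has burned 40 columns on indentation by the time it does any work, while the inverted one stays at a single tab.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_LEN 4	/* hypothetical header size, for illustration only */

/* Hypothetical packet descriptor, not a real kernel structure. */
struct frame {
	const unsigned char *data;
	size_t len;
	int is_injected;
};

/* Nested form: every condition costs one more 8-column tab. */
static int count_payload_nested(const struct frame *f)
{
	int n_payload_bytes = -1;

	if (f) {
		if (f->data) {
			if (f->len > HDR_LEN) {
				if (f->is_injected) {
					/* 5 tabs = 40 columns gone */
					n_payload_bytes = (int)(f->len - HDR_LEN);
				}
			}
		}
	}
	return n_payload_bytes;
}

/* Inverted form: each test is negated and bails out through a goto. */
static int count_payload_flat(const struct frame *f)
{
	int n_payload_bytes = -1;

	if (!f)
		goto out;
	if (!f->data)
		goto out;
	if (f->len <= HDR_LEN)
		goto out;
	if (!f->is_injected)
		goto out;

	n_payload_bytes = (int)(f->len - HDR_LEN);
out:
	return n_payload_bytes;
}

/* Sanity helper: both forms must agree for any input. */
static void check_same(const struct frame *f)
{
	assert(count_payload_nested(f) == count_payload_flat(f));
}
```

Calling check_same() on a few frames (valid, too short, NULL) is a cheap way to confirm that inverting the conditions did not change the logic.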
On the variable-name point, there is some handwaving nonsense in the coding style doc for Linux that "C programmers don't use long variable names"... well, I call bullshit on that one. The truth is that because of the other tabbing and length rules, kernel C programmers can't use long variable names even if they realized it was better: they ran out of room for them.

To be fair to the coding style doc, it does have a point when talking about what to do when the indents get too much: it suggests breaking the indented content out into a new function and calling through to it. It also says that massively indented code means you were screwed anyway, because the logic was too complicated, and that can also be true. But calling through to functions can be a very bad fit if the code you are migrating out touches many variables defined at the top level of the parent function.

I am still working through the style rules, trying to see what I should take on board to replace my own style and what I have to "fake" just for kernel code, but it seems to me life would be better for everyone if they relaxed the line length to 120 chars instead of 80.
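As a rough illustration of the "break it out into a function" advice and its cost, here is a small C sketch. All names (struct scan_state, scan_one, count_matches) are hypothetical. The loop body moves into a helper that starts again at one tab, but every parent local it touches now has to travel with it, here bundled into a struct passed by pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bundle of parent locals that the inner block touches. */
struct scan_state {
	int n_seen;
	int n_matched;
	int last_match;	/* index of the most recent match, -1 if none */
};

/* The former loop body, moved out so it starts again at one tab. */
static void scan_one(struct scan_state *st, int value, int wanted)
{
	st->n_seen++;
	if (value != wanted)
		return;
	st->n_matched++;
	st->last_match = st->n_seen - 1;
}

/* The parent keeps the loop and hands its state to the helper. */
static int count_matches(const int *values, size_t n_values, int wanted)
{
	struct scan_state st = { 0, 0, -1 };
	size_t i;

	assert(values != NULL || n_values == 0);
	for (i = 0; i < n_values; i++)
		scan_one(&st, values[i], wanted);
	return st.n_matched;
}
```

With three shared variables the struct is tolerable. With a dozen, the plumbing starts to outweigh the indentation it saved, which is the bad fit described above.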