1983. The year of the IBM PC XT, the Apple Lisa, Pioneer 10 leaving the solar system, and Hooters opening up shop in Florida. It’s also the birth year of a 25-year-old BSD bug, squashed only a few days ago.
A few days ago, Marc Balmer, an OpenBSD developer, received an email from an OpenBSD user claiming that Samba would crash when serving files off an MS-DOS filesystem. Balmer got in contact with a few Samba developers, who told him that Samba uses a special workaround in order to function properly on BSD systems: the code for reading directories in all BSDs was flawed.
Understandably, Balmer’s first reaction was disbelief. “Of course my first reaction was to blame Samba,” he writes. Even so, he decided to dig deeper, and he uncovered a bug that had been sitting in the code of all BSDs (including Mac OS X) and in a lot of old releases. He confirmed that the bug was already present in 4.2BSD, released in August of 1983.
The bug itself? Well, I’m no programmer so the actual code is kind of gibberish to me, but I think I get the gist of the problem.
This code will not work as expected when seeking to the second entry of a block where the first has been deleted: seekdir() calls readdir() which happily skips the first entry (it has inode set to zero), and advances to the second entry. When the user now calls readdir() to read the directory entry to which he just seekdir()ed, he does not get the second entry but the third.
Marshall Kirk McKusick, the original developer of the *dir() library, commented on the issue in a personal conversation with Balmer:
As the original author of the *dir() library, you probably fixed one of my bugs :-). Prior to the *dir() commands, programs just opened, read, and interpreted directories directly. I had to update a shocking 22 programs (a large percentage of the programs available on UNIX at the time) to replace their direct interpretation of directories with the *dir() library calls.
This little bug’s fix was actually fairly trivial (as is common with these sorts of long-standing bugs): “The fix is surprisingly simple, not to say trivial: _readdir_unlocked() must not skip directory entries with inode set to zero when it is called from __seekdir().”
“Sorry that it took us almost twenty-five years to fix it,” Balmer adds, jokingly.
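For readers who want to see the failure mode concretely, here is a minimal sketch in Python. It is a toy model of the behaviour described above, not the actual BSD libc code: the `Entry` and `ToyDir` names, the list standing in for a directory block, and the `seekdir_buggy`/`seekdir_fixed` split are all assumptions made for illustration. The only difference between the two seek methods is whether the internal read skips entries whose inode is zero.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    inode: int  # 0 marks a deleted entry, as in the description above
    name: str


class ToyDir:
    """Toy stand-in for one directory block; not the real libc structures."""

    def __init__(self, entries: List[Entry]) -> None:
        self.entries = entries
        self.loc = 0  # index of the next entry to read

    def _read(self, skip_deleted: bool) -> Optional[Entry]:
        # Return the next entry, optionally stepping over deleted ones.
        while self.loc < len(self.entries):
            entry = self.entries[self.loc]
            self.loc += 1
            if skip_deleted and entry.inode == 0:
                continue
            return entry
        return None

    def readdir(self) -> Optional[Entry]:
        # What callers see: deleted entries are never returned.
        return self._read(skip_deleted=True)

    def seekdir_buggy(self, target: int) -> None:
        # Walks forward with the skipping read, so it can overshoot target.
        self.loc = 0
        while self.loc < target:
            if self._read(skip_deleted=True) is None:
                break

    def seekdir_fixed(self, target: int) -> None:
        # Walks forward without skipping, so it stops exactly at target.
        self.loc = 0
        while self.loc < target:
            if self._read(skip_deleted=False) is None:
                break


def make_block() -> ToyDir:
    # First entry deleted (inode 0), followed by two live entries.
    return ToyDir([Entry(0, "first"), Entry(11, "second"), Entry(12, "third")])


buggy = make_block()
buggy.seekdir_buggy(1)        # seek to the second entry
print(buggy.readdir().name)   # prints "third"

fixed = make_block()
fixed.seekdir_fixed(1)        # same seek, non-skipping read
print(fixed.readdir().name)   # prints "second"
```

In this model the buggy seek consumes the deleted first entry and the live second entry in a single read, which leaves the position one entry too far; the non-skipping variant stops exactly where the caller asked.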
A bug as old as I am. *lol*
Hehe, it’s older than me!
Way to go OpenBSD developers!
Older than me too. This shows that Open Source only helps in solving the problems, not in detecting them.
Most people hate others’ code, myself included. Hell, I even hate my own code when I haven’t looked at it for a while.
Gotta love those “Who the hell wrote this piece of…… oh.” moments
“Gotta love those “Who the hell wrote this piece of…… oh.” moments”
LMAO. That pretty much describes the last couple of months for me
Cripes I am old.
That’s what I was thinking…
Exactly what I was thinking too…
Those younglings! An OS in use is older than they are!
Ha!
In our time you had to manually start the computer! Naaa… I’m joking.
The first time I actually got to interact with a computer was in high school. Computers were quite exotic at the time, and it was only because ours was a “magnet school” that we had such resources. I started on the Altair Time Sharing Basic version of the Altair 8800 talking to serial terminals at 300 baud (Other students later upped that to 9600):
http://en.wikipedia.org/wiki/Altair_8800
Ours did not have the front panel switches and LEDs but did have an 8 inch floppy drive in a separate enclosure about the size of a modern mid tower on its side. We also had a DECwriter hard copy terminal and some old teletypes, which were also connected. One of the teletypes had a punch tape reader/writer. We also had a section of core memory. But that was just sitting on a shelf collecting dust.
There was a Data General Nova, mentioned in the Wikipedia article, upstairs. But it was nonfunctional.
Our school also had 2 TRS-80s. One in the physics lab and one in one of the math classrooms which sort of doubled as a computer science classroom.
Kids today, brought up on the Internet, have no conception of how things were in the late 70s and very early 80s. Of being 16 years old before touching a computer for the first time, and feeling privileged to do so at so young an age.
Alas, if IBM found a bug in z/OS which had gone undetected since the advent of OS/360, it would not quite predate me.
Edited 2008-05-11 18:47 UTC
OK… not THAT old
Er .. if you’re under 25, you really haven’t written much code. Trust me.
Actually, this doesn’t show that at all. To show that it doesn’t help detecting bugs, you’d need some closed source examples that match the open source examples, and the number of examples would have to be statistically significant, at least. One example of a bug that wasn’t reported to the people who matter does not prove that open source development models don’t help with detecting bugs.
As stated in the article, this isn’t a 25 year old ‘UNIX’ bug (what does that mean, anyway?) but a BSD bug.
It is a fringe case bug that caused some headaches with Samba. The problem was actually known for three years.
It comes down to directory access. A directory is stored as a block that points to all of its ‘contents’ somewhere else on the disk. To navigate these directories, originally, you had to manually pull out the contents from the pointer. This library addressed a lot of those issues for the BSDs.
UNIX is a concurrent system. Several people can do several things at once on the same system. This is where the bug comes from. If one person’s program were to seek (the term for iterating through the contents until you find what you’re looking for) through a directory to find a file, you get an index into that directory’s ‘contents array’ as a return value.
The index has several purposes, but that’s beside the point.
EDIT: This is not technically correct. Correction follows.
If you seeked to a point in the directory, and then someone else deleted something.
EDIT:
There was a special case when you were deleting the first file in a directory, and that case was not correctly accounted for when you seeked through a directory.
Your index would be off because of the way the deletion was handled. The result? You get the contents of another file — one that you didn’t ask for.
Edited 2008-05-11 01:06 UTC
You do know the history of BSD?
http://www.oreilly.com/catalog/opensources/book/kirkmck.html
It’s called Berkeley Unix
Sure, but it’s not called Berkeley UNIX. The 4.4Lite sources specifically are not derived from any of the AT&T UNIX sources, and you won’t find the same bug in anything that derives purely from SysV.
It’s a bug in a Unix implementation, but it isn’t a UNIX bug.
Actually, had you bothered to read the article, this BUG was found in 4.2BSD as well…
Yeeesss…and? 4.2BSD isn’t UNIX either: it just has slightly more SysV derived code in it than 4.4Lite. The current modern BSDs don’t derive from 4.2BSD anyway, and SysV doesn’t derive from 4.2BSD.
I think you’re wrong. At least judging from the websites of the “three big” BSDs, FreeBSD, NetBSD and OpenBSD, you can read:
FreeBSD(R) is an advanced operating system for x86 compatible (including Pentium(R) and Athlon(TM)), amd64 compatible (including Opteron(TM), Athlon(TM)64, and EM64T), UltraSPARC(R), IA-64, PC-98 and ARM architectures. It is derived from BSD, the version of UNIX(R) developed at the University of California, Berkeley.
The NetBSD Project is an international collaborative effort of a large group of people, to produce a freely available and redistributable UNIX-like operating system, NetBSD. In addition to our own work, NetBSD contains a variety of other free software, including 4.4BSD Lite from the University of California, Berkeley.
The OpenBSD project produces a FREE, multi-platform 4.4BSD-based UNIX-like operating system. Our efforts emphasize portability, standardization, correctness, proactive security and integrated cryptography. OpenBSD supports binary emulation of most programs from SVR4 (Solaris), FreeBSD, Linux, BSD/OS, SunOS and HP-UX.
The term “Berkeley Unix” or “Berkeley UNIX” isn’t mentioned anywhere. Referring to the article you linked to, the term “Twenty Years of Berkeley Unix” seems to be intended as a kind of title, not as a name; it could be “Twenty Years of UNIX development at Berkeley”, too.
You may want to have a look at the BSD family tree to learn about the BSDs’ origins:
http://www.freebsd.org/cgi/cvsweb.cgi/src/share/misc/bsd-family-tre…
EDIT: I changed the (TM), (R) and (C) special characters into a more convenient form because the OSAlert comments system doesn’t seem to handle them correctly. They are displayed correctly in the input form, but aren’t displayed correctly in the thread later on…
Edited 2008-05-11 14:40 UTC
http://en.wikipedia.org/wiki/BSD
Good lord boy… You obviously have never studied the history of Unix.
In the read more section.
The bug itself? Well, I’m no programmer so the actual code is kind of gibberish to me, but I think I get the gist of the problem.
Are those Balmer’s words? How do you solve a bug and not be a programmer? How does he know what an inode is?
Above it already says he is a developer.
I am confused. I guess it’s too early in the morning for me.
Perhaps when you read the original OSAlert post there was an error in the layout or something, but in its current state it’s pretty clear it was not a quote from the developer.
NOT as old as I am. But I am glad to hear that it eventually got fixed. Shows Persistence and Dedication, if nothing else.
With all due respect, it does not show persistence on the part of the original developers, or much of anything good about them, really. Any bug of this sort that’s known to cause data loss or corruption is generally considered a “no-ship” bug everywhere I’ve been. This bug also shows that they simply didn’t test very well or think about all the reasonably possible edge cases, and it existed for 25 years until some puzzled developer victimized by it tracked it down in a short time. Kudos to the developer that tracked this down; raspberries to the one(s) that should have caught and fixed this ages ago (how many generations does that count as in this field? ACK!)
Maybe you should try first to develop something like an operating system, then you should speak again. And now be calm, something professional is going on.
I understand how difficult it is to find every last single bug in a complex piece of software, but I still think it’s a little odd how all the BSD devs are getting praised for fixing this bug like they are.
I mean, think about the kind of comments that would be here if it was a story about how Microsoft just fixed a 25 year old bug.
I’m pretty sure even Linux would get hammered, as people would come in claiming that the Linux devs spend too much time on new features without polishing the old ones, and someone would try to blame it on the unstable API.
>but I still think it’s a little odd how all the BSD devs are getting praised for fixing this bug like they are.
Do you know the meaning of sarcasm? It’s some polite kind of it
Ah, shows you have absolutely no clue what I’ve developed in the past, or what I’m developing now:
1. Various CD premastering/analysis utilities, some multithreaded: a patent was involved with one project as I worked on the code for making CD+/CD-PLUS format (whatever they’re calling it these days). Oh, that also involved me having to hack the Linux kernel because it wasn’t set up to do what was needed previously.
2. Engine monitoring software for Cummins diesel engines from pickup-sized to ship engine-sized.
3. CNC/press brake software: yeah, I bet you’ll have to look that up! Multithreading throughout, and oh yeah, if that’s wrong, machinery is damaged and people may die, no exaggeration.
4. 3D CAD software: at least as complicated for all edge cases as an OS.
5. MPP cluster-based database running on Linux, currently in use on smaller systems than what it’ll be on in a month or two, which is a 1024-node system. Oh, yes, the database itself, not a “database application” and yes, it’s at one of the big internet companies, and this database beats Oracle’s best offerings for speed and running costs right now. I’m currently working as the white box QA engineer.
Before you have a clue what other people have done and have experience in, you really should think carefully. Fact of the matter is, this BSD bug has been there 25 years and was known by others. Sure, it’s a multithreaded bug, but it isn’t rocket science for this one, and it has been known about for several years before it was finally fixed, even though it was known to effectively lose data.
You seriously need to not assume things you don’t have a clue about.
This is the internet. One should almost always assume that someone out there will make baseless assumptions about the things you post. @.~
(Sorry, couldn’t resist.)
Also, it’s probably safe to assume that there will be people out there that doubt your claims regarding the nature of your current projects. [citation needed]
See also: http://xkcd.com/285/
EDIT: The Wikipedia joke was written before bothering to look at the parent poster’s page, which details (to paraphrase: screen captures forthcoming) an unfinished project last updated in 2005. It was not intended to mock the parent poster. After all, I haven’t done anything with my page (apart from halfheartedly moving it to different servers and updating a few links) in nearly five years. Paying web work and laziness always won; mock-ups that never became PHP/HTML/CSS don’t really count. *grins*
Edited 2008-05-13 12:08 UTC
Wow, you’re cool. Can we bow at your god-like programming feet?
Edited 2008-05-13 16:23 UTC
Only if you provide me a soft place to fall when I trip over the unexpected body I didn’t look to dodge when running/walking!
Oliver would seem to think that just because someone doesn’t link/explain everything in their OSAlert profile, and he’s got something he claims he’s done (he might even be telling the truth, but why should I care, or anyone else?) his post implies he thinks that nobody else but him and those he knows from first-hand interaction would be qualified to make such a comment on the topic from experience.
If I were to judge from only posts in this thread, I would have to peg you both as arrogant fifth wheels. But it would, of course, be unfair to judge only from posts in this thread.
I think this illustrates the fallacy of the “many eyeballs” meme that is taken as the gospel truth in open source circles.
Here, we have a bunch of open source developers (samba) who find a flaw in an open source product (BSD) and instead of stepping through the BSD source tree to find the problem, they code a samba hack that works around the problem on BSDs. In fact, it appears they didn’t even submit a bug report.
Guess what? This happens all the time in the closed source world. If we come across a bug in someone else’s code, we code a temporary hack around the problem and wait for the bug to be resolved. This article suggests that such development culture appears in open source projects too, which is understandable.
I hope that naive open source advocates who keep preaching the “many eyeballs” meme will stop doing so. The majority of developers do not have the desire or the inclination to fix other people’s bugs even if the source is available. Hell, it seems that some don’t even file bug reports…
So you’re taking one specific incident, and using it to back up generalized statements about the whole Open Source community?
One specific incident where developers of a popular open source project cba’ed to step through the source code of another popular open source project demonstrates the fallacy of the meme.
On the other hand, the reasoning that “many eyeballs” makes code more secure while intuitive, AFAIK is not substantiated. The reason for this is because application programmers rarely have the skill set necessary to muck about with kernel internals. And vice versa.
However, if this incident causes the meme to be updated to something along the lines of “many eyeballs makes secure code, but with notable exceptions” I’d call that a vast improvement.
As far as I recall, the saying is “many eyeballs make bugs shallow”.
Which is true. Not every fish in shallow water is caught by the fisher. It is just more likely.
Good coding style and good logical structures which lead to easily replaceable chunks of code can be achieved in closed-source as well as open-source programs. IRIX, AIX and the like have code as clean as the BSDs and Linux, I am sure about that.
On the other hand, lots of software companies have to rush out the next release, which makes lots of programmers resort to dirty hacks they want to “clean up later”.
I once read a study comparing the code quality of closed vs. open source software. OSS code quality is bad at the beginning, and if the project continues, the code quality increases. CSS has rather good code quality at the beginning, but it gets worse than OSS code quality over time.
The exception confirms the rule. The fact that this is newsworthy and thus uncommon in open source projects proves quite the opposite of what you are saying. Open source isn’t a perfect development system but these kinds of unfixed bugs aren’t exactly the norm in open development. Closed source software has all kinds of workarounds built into them to address issues with their interaction with other closed source software but we don’t hear about it because it is so common that it isn’t newsworthy. Workarounds are only necessary in the closed source world because there is no code to look at to confirm the bug and the developer can just deny the bug exists and claim it is the interaction with other software that is flawed.
> I hope that naive open source advocates who keep preaching the “many
> eyeballs” meme will stop doing so. The majority of developers do not
> have the desire or the inclination to fix other peoples bugs even if the
> source is available. Hell, it seems that some don’t even file bug
> reports…
Just to show you another point of view: I have had that situation just a few days ago, where I would rather code around a bug in Eclipse IDE than report details or even fix it. Guess why? The bug had been known for more than three years, and repeatedly been marked as “we won’t fix this, and we won’t accept fixes” (for backwards compatibility).
No, I do not have the desire nor the inclination to investigate any deeper when I know that the aim of the developers is NOT to fix those bugs.
Well the test case was quite complex, wasn’t it?
So I’m not surprised that this kind of thing wasn’t found early.
But the fact that Samba knew about this bug and that this wasn’t fixed in the BSDs for several years is bad, yes.
Great work to track this bug down and fix it!
This is typical of the really good work that the BSD devs do – keep up the great work, all!
Everything derived from BSD eh? Affects OSX?
Hey Apple, can I have this fix now please, or will this be another fix that you drag your heels on and maybe release later, with minimal change log information?
*sigh*
Want some cheese with that wine?
This bug has been undetected for 25 years. That speaks volumes about the effect it has on the end user.
Just to be clear. I am not the one that found the bug. I had nothing to do with it getting fixed. This article is the first time I had heard about it. I am in no way connected to this article or any of the events that took place.
Twenty-five years ago the internet did not exist.
Twenty-five years ago few public (as compared to websites today) existed.
Twenty-five years ago not very many people had access to the BSD source files.
Twenty-five years ago most people in America and Europe had never seen a personal computer.
Twenty-five years ago Berkeley, CA seemed further away than Mars does today.
Twenty-five years ago, or so, a bug entered an operating system that only a ***relative*** few people had access to the source code.
Twenty-five years ago people who didn’t have access to the source code and didn’t easily understand how to contact the person that introduced the bug, whom they didn’t know or know the name of, found that the only way to “fix” the bug was to make a work around.
MANY YEARS went by during which all significant systems that interacted with BSD already had the workaround in place for so long that most people never knew the bug existed anymore, because those people that had put in the workaround had done such a good job that nobody noticed.
The amazing thing is that someone did something different recently and encountered this bug and realized it was in fact a bug in BSD UNIX and they were in the position to be able to not only contact someone who could do something about it, but did exactly that. They didn’t get sidelined by all the other demands in life. They grabbed onto this due to a great curiosity of getting the initial bug fixed instead of creating a work around.
Twenty-five years ago I programmed on IBM and HP Mainframes. I started learning how in 1979, learning how to program in COBOL, RPT II, FORTRAN IV, and in 1981 BASIC on an Atari 400 and later C. I just might not even have heard of BSD UNIX. Yes I had heard of UNIX and Berkeley but maybe not BSD UNIX. That wasn’t part of my world.
Fixing other people’s code has little to do with liking to do this or not. It’s a matter of time and demands on our time. I was lucky enough to spend a lot of time not creating new code in the beginning but figuring out and fixing code that was unimportant enough for the main programmers to fix but important enough for ***someone*** to fix. I learned more in the first six months than I had spending two years writing brand new code.
Schrödinger’s cat problem. The bug doesn’t exist until you find it.
http://en.wikipedia.org/wiki/Schr%C3%B6dinger's_cat
Simple as that.
i am looking for they fix this bug.
i am not a big fan of linux but now thinks that linux is best replacement for windows vista. not for xp
http://readerszone.com
“This little bug’s fix was actually fairly trivial (as is common with these sorts of long-standing bugs)”
Why do you believe this is the case?
Anyone else observe this?
In my experience, bugs occur in clusters — so perhaps there are more decades old bugs out there.
It is surprising no one has quoted ESR in this thread:
“Given enough eyes, all bugs are shallow”
And he’s right. Just because it wasn’t fixed doesn’t mean it wasn’t found.
Perhaps in another 25 years, FreeBSD developers will get around to fixing the notorious hard disk geometry bug (the main reason why I don’t have FBSD installed now).
That doesn’t make any sense. The bug doesn’t actually interfere with hard drive geometry detection; it just pretends to. Here’s an example of someone remarking on the bug and, in an off-hand manner, mentioning that it doesn’t actually stop you from getting proper geometry data for an install:
http://www.softwareinreview.com/bsd/freebsd_6.2_review.html
I think the actual bug was:
sin(x)^2 + cos(x)^2 = -1
I’m sorry – perhaps I’m easily amused but I think it’s hilarious that a dude named Ballmer made BSD buggy. Not that Senor Microshonk is actually capable of more than defecating on himself in public (just search youtube for evidence)…
At 40 I still laugh at fart jokes too – so perhaps it’s a little more ingrained…