Lewis Mettler comments on Red Hat's filing in the SCO bankruptcy. Now, I disagree with him about SCO management. Darl has been through this kind of thing before. I think SCO's management knew exactly how flimsy and baseless their case was. Whatever their lawyers told them, I think they knew the reality. They'd simply decided that the strength of their case didn't matter. They figured IBM would settle because it was cheaper than fighting and winning.
SCO's problem was just that they miscalculated how IBM would figure the costs. If it's cheaper to settle than to win a lawsuit, you settle. SCO calculated the cost of settling as just the dollars IBM would have to pay them. IBM, though, felt that settling would be taken by their customers as an admission that the accusations had some merit. Since the accusations were that IBM had broken contracts just to make more money, they felt that'd negatively affect their business. Their software business runs around $6 billion a year, so even a 1% drop in business from customers getting nervous about IBM not honoring contract terms would be $60 million a year in lost revenue. That makes settling a lot more expensive, and IBM decided it was cheaper to throw a few tens of millions of dollars at defending their good name than to put hundreds of millions of dollars of revenue at risk.
SCO weren't deceived by their lawyers. They didn't believe they ever had a case. They simply figured that IBM would pay Danegeld if it wasn't too much. And they were wrong.
Wednesday, June 24, 2009
Tuesday, June 9, 2009
Web fail
A challenge most Web designers today fail: design a generic Web page. It must display readably on a 24" widescreen LCD monitor at 1920x1200 resolution, and readably on the 320x200 pixel display of a Web-capable cell phone. It must not fail to render because the browser lacks Flash, or Java, or .Net or anything else not present in the basic browser installation. It must not fail to render because the user isn't running any particular browser.
For extra points, it must render readably on a 24" widescreen LCD monitor running in 640x480 16-color mode (user has failed their check vs. INT to set the correct monitor type and Windows has defaulted to a minimal known-good resolution).
Tuesday, June 2, 2009
System capacity
Immediate thought: if your system gets overloaded handling 20 requests a minute, something's really hosed. I'm used to thinking in terms of rates 10-100x that, and counting them per second not per minute. The problem of course is that the offending software is a commercial package, so we can't dig into it to find out what the problem is and fix it.
Labels:
software
Wednesday, May 6, 2009
News from SCO
Two bits of news about SCO. First, in the bankruptcy case the US Trustee has moved to convert SCO to Chapter 7 liquidation. Second, in the appeal of the judgement in the Novell case, the appeals court heard oral arguments and my first impression is that they weren't impressed with SCO's arguments.
In the appeals hearing, it sounded very much like at least two of the three judges were looking at the contracts themselves, and at SCO's arguments, and going "You know, you're arguing that when the contract says 'excludes' it really means 'includes'. We aren't buying that." I expect the appeals ruling won't be favorable to SCO.
On the bankruptcy front, the next hearing is June 12th. I suspect the appeals ruling will be in, at least in preliminary form, before then. If it goes against SCO, that pretty much shreds the last hope SCO had for postponing the inevitable. And SCO really doesn't want to end up in Chapter 7. When that happens the current executives lose their jobs and the US Trustee takes over management of the company. He's got no dog in the fights between SCO and Novell and IBM; his only interest in them will be to settle them at the minimum cost to the bankruptcy estate. And he'll have access to all SCO's corporate and legal records. Attorney-client privilege won't apply because, as of his appointment, he'll be the client. If he finds records showing SCO knew they didn't have a case when they filed it, he'll have no problem whatsoever filing a sworn statement to that effect in a settlement deal and turning over the records to back it up. That could place BSF in a very bad position. Not that they're in a good one now, mind you.
I've said it before: SCO miscalculated the cost to IBM of fighting. SCO assumed IBM would look at the demand for a few million dollars and count it cheaper than the cost of fighting it out in court and winning. IBM looked at a threat to half or more of their annual revenue world-wide (their gross revenue tops $100 billion), multiplied by decades, and decided a few million was cheap. I can imagine the conversation with their lawyers: "You know it's going to cost to fight this." "Yes, we know. Here's a quarter of a billion for the initial deposit; call us when it gets low and we'll add more." To give you scale, that's half a percent of the first year of the revenue at risk. IBM's most likely looking at 50 years (less than half the time the company's been in business; they've got current product lines nearly that old).
Tuesday, April 7, 2009
The Internet turns 40
The Internet as we know it is 40 years old today. The very first RFC, RFC 1 - Host Software, was published on April 7, 1969.
Labels:
internet
Monday, April 6, 2009
Cloud computing
Cloud computing isn't anything new. Think of WorldNet from "Valentina: Soul in Sapphire" (1984), or "The Adolescence of P-1" (1977). The idea of a world-wide computer network where programs can run on any processor on the grid isn't new. The hard part, of course, is making it happen. Machine code's specific to a particular processor. You can't run x86 code on a PowerPC CPU without some hairy emulation happening. Ditto for running Windows software on a Linux system, or vice-versa. So the first big technical challenge is making it possible for code to migrate from machine to machine without getting caught in CPU and OS compatibility issues. Things like Java's bytecode solved that, though. From a technical standpoint, there's nothing stopping code from being shifted around between machines as long as the author didn't choose to build hardware-specific executables.
The non-technical problems are bigger, though. The first is obvious: when running on the cloud or the grid or whatever you call it, your programs and data reside on someone else's system. How do you protect your data from being exposed to people who shouldn't see it? How do you force the hardware's owner, someone you may not have a direct contract with, to protect it from theft or damage? How do you ensure you can get your data and programs back in the event you want to go elsewhere? They're all intertwined, you know. Amazon, for instance, solves the problems of data exposure and protection by keeping your data only on their systems, where you've got a contract with them. But at the same time, you don't entirely control the data formats. Your data resides on their systems, not yours, and in their formats, not yours. Unless you built special facilities into your software to send you copies of your data, if you decide to move from Amazon to Google you may find Amazon either won't export your data for you or won't export it in a format Google can import. And you may find your programs, written for Amazon's systems and APIs, won't compile for Google's without major rewriting.
The whole idea is a wonderful one, but you need to think about the logistics and remember that there are no silver bullets. There's no magic here. The obvious problems still exist, and still have to be dealt with, and they won't go away just because they're inconvenient to deal with.
Thursday, April 2, 2009
Trademark ownership
An interesting decision about trademark ownership. What's interesting about it is that the judge ruled that trademark use trumped a registered trademark. A lot of companies have been treating trademarks as if registration, however recent, trumped all common use no matter how long-term and well-established. The judge here threw that reasoning out, ruling that the Dallas Cowboys, even though they didn't have a formal registration on the phrase "America's Team", nonetheless owned it by virtue of long use of the phrase and its association with the team. This overturned the relatively recent registration of that phrase as a trademark by another company.
This is good news for people who've been using names and phrases associated with their products. When someone else comes along, registers that name or phrase as a trademark and tries to usurp your usage on the grounds that a registered trademark trumps an unregistered one, you can point to this decision and say "The courts say otherwise."
Labels:
intellectual property,
law,
trademark
Tuesday, March 3, 2009
Google and copyright claims
You know, I'm thinking that Google should be getting a bit more hard-nosed about copyright. When someone sues them claiming that Google can't internally cache anything without permission, Google should simply shrug and immediately blacklist everything by the plaintiff. From that point on, nothing belonging to the plaintiff will appear anywhere in any of Google's services. You don't want Google to maintain even necessary internal copies of your stuff? Fine, Google won't, and you'll live with the consequences.
Note that this is different from distributing copies of your work. Google can't hand out copies of your book. But returning a sentence or two in response to a search query and pointing to where the book can be bought? That's just as fine as a person mentioning something interesting from a book and telling their friend where to buy it. If an author doesn't like that, that author gets to live with nobody recommending their books to friends, either.
Thursday, February 26, 2009
The Kindle 2 and newspapers
Thinking about the Kindle 2, it may be the salvation of newspapers. A lot of the cost of newspapers is in the printing: the paper, the ink, the presses, the cost of distributing the sheer physical mass of paper. The Kindle 2 provides a secure subscription-based channel for delivering black-and-white printed content that doesn't require moving physical material around. Amazon already has a content distribution network set up. A newspaper could mark up their edition electronically and distribute it to Kindles. As long as nobody involved gets greedy, I think it could be profitable once the costs of physical printing and distribution are shed.
Tuesday, February 17, 2009
Verizon using mail submission port 587
Verizon is moving to using port 587 for mail submission, requiring encryption and authentication to send mail. That alone won't stop the spam originating from their networks, but it's a start. My thought is that there should be 3 ports for 3 different purposes:
- Port 25, no encryption or authentication required, is for server-to-server mail transfer. All mail arriving on it should be addressed to an in-network domain; anything else should be rejected, which means no relaying. Messages should not be modified except for adding an appropriate Received header.
- Port 587, encryption and authentication required, is for end-user mail submission only. Mail submitted to it should have the Sender header stripped and replaced with one based on the authenticated username.
- Port 465, encryption required and authentication allowed, is a hybrid. If the session isn't authenticated, it should act per the rules for port 25. Authenticated sessions should be allowed to relay. If relaying, authentication information should be added to the Received header and if no Sender header is present one should be added based on the authentication information. Messages should not be otherwise altered.
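As a rough illustration of the port 587 case (the hostnames, addresses and credentials here are invented), a submission session runs along these lines, with the client required to start TLS and authenticate before it's allowed to hand mail over:

S: 220 mail.example.net ESMTP
C: EHLO laptop.example.org
S: 250-mail.example.net
S: 250 STARTTLS
C: STARTTLS
S: 220 Go ahead
   (TLS handshake; the client repeats EHLO over the encrypted channel)
C: AUTH PLAIN <base64-encoded username and password>
S: 235 Authentication succeeded
C: MAIL FROM:<user@example.org>
S: 250 OK

Only after that 235 does the server accept MAIL FROM, and the authenticated username is what the server would use to rewrite the Sender header.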
The Pirate Bay trial
Apparently half the charges against The Pirate Bay have been dropped by the prosecution. This isn't based on a technicality, as I read it, but on such basic things as the screenshots the prosecution offered as evidence that the client was connected to the TPB tracker clearly showing it was not connected to the tracker. It's no wonder the prosecution dropped those charges rather than continue. Had they continued, the defense would've introduced the prosecution's own screenshots, and the prosecutor would've had no way to rebut them.
I don't particularly agree with piracy, but when the prosecutors screw up this badly they deserve to lose.
Labels:
copyright,
law,
peer to peer,
the pirate bay
Thursday, February 12, 2009
Code comments
Motivated by a Daily WTF entry on code comments. I know exactly what creates this: Comp Sci instructors. You know them, the ones who insist that every line of code be commented, no matter how trivial. After a couple of years of this, students get in the habit of including comments just to have a comment for that line of code. Of course the easy way to do this is to just restate what the line of code does.
Now, as a professional programmer doing maintenance on code I don't need to know what the code does. I can read that line of code and see exactly what it does. I need something a bit higher-level. I need to know what the code's intended to do, and I need to know why it's doing it and why that way of doing it was selected. I know it's iterating down a list looking for an entry; I need to know what that list is for and why the code's looking for that particular entry. Instead of comments describing how to iterate through a list, I need a block comment saying something like "We've got our list of orders entered this week, we know they're ordered by vendor, and we're trying to find the first order for a particular vendor so we can extract all his orders quickly." Probably followed by something like "Now that we have this vendor's first order, we'll grab everything until we see the vendor number change. When we see that, we've got all his orders." Much shorter, and it wouldn't satisfy that instructor at all since most of the code won't be commented, but it's much, much more useful when I have to change things. It tells me what the code's trying to do, and what assumptions it's making that I have to avoid invalidating.
Too many Comp Sci teachers need some real-life experience maintaining the kind of code they exhort their students to produce.
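To make that concrete, here's a made-up C fragment in the style I'm describing. The order structure, field names and the sorted-by-vendor assumption are all invented for the example; the point is where the comments sit and what they say:

struct order
{
    int vendor;              /* vendor number the order belongs to */
    /* ... other order fields ... */
};

/* Handle every order for one vendor, given this week's orders sorted by vendor number. */
void process_vendor_orders(const struct order *orders, int count, int vendor)
{
    /*
     * We've got our list of orders entered this week, and we know it's
     * sorted by vendor, so one vendor's orders form a single consecutive
     * run. Find where that run starts.
     */
    int first = -1;
    for (int i = 0; i < count; i++)
    {
        if (orders[i].vendor == vendor)
        {
            first = i;
            break;
        }
    }
    if (first < 0)
        return;              /* no orders for this vendor this week */

    /*
     * Now that we have this vendor's first order, grab everything until
     * the vendor number changes. The sort order is the assumption that
     * makes stopping at the first change safe.
     */
    for (int i = first; i < count && orders[i].vendor == vendor; i++)
    {
        /* ... do whatever processing the order needs ... */
    }
}

Not a comment on every line, but every assumption the maintainer could accidentally break is written down next to the code that depends on it.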
Labels:
computer science,
software,
teachers
Tuesday, February 3, 2009
Powerful and easy to use
"You'll find this system to be incredibly flexible and powerful. You'll also find it to be incredibly simple and easy to use with minimal training. However, don't ask it to be both at the same time."
Labels:
engineering,
it,
software
Monday, January 19, 2009
Security vulnerabilities and disclosure
Triggered by this SecurityFocus column. I've always been a proponent of full disclosure: releasing not just a general description of a vulnerability but the details on how it works and how to exploit it. I considered that necessary because vendors are prone to saying "There's no practical exploit there" and stalling on fixing it, and the only way to prove there is a practical exploit is to actually produce the code to exploit the vulnerability. It also removes any question about whether you're right or wrong about the vulnerability. There's the code; anybody can verify your claims for themselves. But I've also always been a proponent of telling the vendor first, and giving them the opportunity to close the hole themselves before the rest of the world gets the details. General public disclosure was, in my view, the last resort, the stick to wave at the vendor that you'd employ only if they didn't act with reasonable dispatch to actually fix the problem.
But, as this column points out, these days the vendor's most likely to respond not by trying to fix the problem but by hauling you into court to try and silence or even jail you for having the temerity to tell them they've got a problem in their software. Which is leading me to believe that responsible disclosure, while preferable, simply isn't viable anymore. The only safe thing to do, the only effective way to get vendors to respond to problems, is to dump all the details including working exploit code out into public view so the vendor can't ignore it, and to do it anonymously (making sure to cover your tracks thoroughly and leave no trail leading back to you) so the vendor doesn't have a target to go after. That's the only way to avoid months if not years of legal hassles and courtroom appearances, all for having tried to tell the vendor privately that they had a problem. IMO this is a sad state of affairs, but it also seems to be the way the vendors want it to be.
It's either this, or fight for court rulings saying that vendors have no legal right to hound researchers who try to disclose privately to the vendor. In fact, we need legal rulings saying that a vendor who tries to silence the reporters of a vulnerability instead of fixing the vulnerability makes itself legally liable for the results of that vulnerability. Short of that, researchers have to protect themselves.
Labels:
full disclosure,
law,
security
Tuesday, January 13, 2009
Meece
Grumble. The trusty old Microsoft Trackball Optical I've been using at work is starting to go. The optics work fine, but the ball is starting to stick and not want to roll easily. I've no idea what's causing it or how to correct it. It's done it and then cleared up a couple of times before, but it's happening more often and not clearing up as readily each time. So now I have to go get a replacement. Microsoft doesn't make this model of trackball anymore, the Logitech thumb-operated trackballs are all too narrow for my hand and the finger-operated trackballs I just can't get used to. So I guess it's back to a laser mouse for me.
Monday, January 12, 2009
Mass transit
The major problem with mass transit is, frankly, that it's inconvenient for the things people commonly need to do. Stuff like shopping, or quick runs to random places. It's hard to bring anything back, let alone large items like a television or a full load of groceries for a family, and usually the buses and trains take twice as long to get there as a car would even after allowing for traffic snarls. I don't see a fix for this as long as mass transit is designed around large-capacity transports running on fixed routes on a fixed schedule. What we need is a completely different design, which will require a street network designed to accommodate it.
First, the basic local unit is a transit pod running in a dedicated guideway. Stops are cut-outs where the pods can get out of the traffic flow. Pods would be in 3 sizes to cut down on the number of varieties needed. 2-seat pods are designed to hold just people, no cargo, to provide a physically small unit for getting individuals from point A to point B when they don't need to carry much more than a backpack or briefcase. Larger pods are 2-row and 4-row versions, with the rear rows designed to fold flat into the floor to convert seating into cargo deck as needed for that particular trip. These don't run on fixed schedules or routes, people call them to a stop as needed based on how many people are in their group and how much cargo they expect to have and pick the destination once they're in. Pods are routed automatically by the shortest, least-congested path. Guideways don't have to run on every single street, but they should run on enough that it's never more than half a block from any house to a pod stop. For instance, in a residential neighborhood the guideways might run on every east-west street so you have to walk no more than half a block north or south to a guideway. The preference, though, would be to have a guideway on every street so pods can stop literally at your driveway. With this kind of routing, you avoid the waits to change lines that're typical of conventional bus and train systems.
Pods would operate in a large area, and in theory you can take a pod for the entirety of a trip anywhere within a sane distance, but for longer-distance travel inter-area trams would be used. These wouldn't run everywhere. They'd connect transit hubs, and be organized into lines much the way trains are currently. There would, however, be more interconnection than is typical of train lines, so you could take a direct route with less going out of your way and changing trains at a central station. I call them trams rather than trains because I'd design them using dedicated guideways like the pods rather than rails, so a tram could at a hub choose between multiple ways out. That way the system can dynamically allocate trams to routes at each hub to accommodate traffic. If you're going further than a few miles, you'd typically take a pod to the nearest hub and grab a tram to a hub near your destination. If you picked up cargo that couldn't be delivered, you'd take a pod the whole way back.
Using guideways also allows another trick: commercial pods could be designed that'd run on both pod and tram guideways. A store could, for instance, load up a delivery pod with loads for several customers in the same area and route it out (on a pod guideway to the nearest tram hub, then over the tram guideways to a hub near its destination, and finally via pod guideways to a stop near the delivery address) to drop off deliveries for customers.
The major problem I see with implementing this is that you'd need to majorly disrupt the street network to build the guideways. You literally can't do this on top of the existing streets. You'd need to redesign the streets to accommodate the guideways (no more on-street parking, the guideways will be occupying that space) and have a whole new way to handle cars crossing the guideways without interfering with pod traffic (probably requiring traffic-control gates). IMO it'd be worth it once implemented, but the up-front cost of implementing it makes it a hard sell.
Labels:
public transit
Monday, January 5, 2009
Programmers and undefined behavior
ISAGN for a sadistic C/C++ compiler to school the current generation of programmers in the dangers of relying on undefined behavior. Too many of them do things like assume that dereferencing a null pointer should cause a crash and core dump. The problem is that nothing says that. The C++ standard leaves the behavior in that case undefined, which means the code, and possibly the compiler, is free to do anything it wants to at that point. It doesn't even have to consistently do the same thing every time.
So, a sadistic compiler. At run time it'd check for various sorts of undefined behavior. When it detected them, e.g. an attempt to use the result of dereferencing a null pointer (as by calling a method through a null object pointer), it'd branch to a routine that'd randomly select from a list of dangerous things to do, such as spewing the contents of /dev/random onto the console, kill -9ing a random process on the system, or zeroing all blocks on a random attached storage device. In addition, the compiler would check for any undefined behavior it could feasibly detect at compile time, and take similar actions when asked to compile such code. Thus, trying to compile "x = a++ - a++;" might result in your system disk being wiped.
The goal: to impress upon programmers that you cannot rely on undefined behavior. At all. Not on what it does, not even that it does the same thing all the time. The only guarantee you have is that you won't like what'll happen. So avoid it like the plague, or pay the price.
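Until someone builds that compiler, the closest thing we have is compiler warnings plus the sanitizers, which at least make undefined behavior loud instead of silent. A tiny invented example (the file name and flag choices are mine; assuming a reasonably recent gcc or clang, built with -Wall and -fsanitize=undefined):

#include <stdio.h>

/*
 * Two classic bits of undefined behavior. Nothing about what this program
 * does is guaranteed: not a crash, not a core dump, not even the same
 * result on two consecutive runs.
 */
int main(void)
{
    int a = 1;
    int x = a++ - a++;       /* unsequenced modifications of a: undefined.
                                gcc and clang will warn about this at
                                compile time with -Wall. */
    printf("x = %d\n", x);   /* whatever this prints proves nothing */

    int *p = NULL;
    printf("*p = %d\n", *p); /* null dereference: undefined. A SIGSEGV is
                                not guaranteed; -fsanitize=undefined makes
                                the runtime report the bad load instead of
                                leaving the behavior to chance. */
    return 0;
}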
Labels:
software
Saturday, January 3, 2009
Zune lock-up bug
Owners of Microsoft's Zune MP3 player saw their devices lock up hard at the end of 2008. It turns out there's a leap-year bug in Microsoft's code. The clock in the Zune records time as the number of days and seconds since 1/1/1980. To convert that into a normal calendar time, the Zune starts with this code to convert the number of days to years and days:
year = ORIGINYEAR;
while (days > 365)
{
    if (IsLeapYear(year))
    {
        if (days > 366)
        {
            days -= 366;
            year += 1;
        }
    }
    else
    {
        days -= 365;
        year += 1;
    }
}
It's basically looping through incrementing the year and decrementing days by the number of days in that year until it's got less than a full year's worth of days left. The problem comes on the last day of a leap year. In that case, days will be 366 and IsLeapYear() will return true. The loop won't terminate because days is still greater than 365, but the leap-year path inside the loop won't decrement days because days isn't greater than 366. End result: infinite loop on 12/31 of any leap year. This bug should've been caught during standard testing. Leap years are a well-known edge case when dealing with dates; likewise the first and last days of the year and the transition from one year to the next are standard problem points where any errors tend to show up.
Microsoft's proposed solution: wait until sufficiently far into 1/1 of the next year, then force a hard reset of your Zune. Yeah, that'll get your Zune working but it doesn't fix the bug. Bets that Microsoft's fix for this code causes a different kind of failure on 1/1/2013?
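For what it's worth, a minimal fix (my sketch, not Microsoft's actual patch) only needs to handle the case the original missed: days equal to 366 in a leap year means we've landed on December 31st and can stop.

year = ORIGINYEAR;
while (days > 365)
{
    if (IsLeapYear(year))
    {
        if (days > 366)
        {
            days -= 366;
            year += 1;
        }
        else
        {
            break;  /* days == 366: Dec 31 of a leap year, we're done */
        }
    }
    else
    {
        days -= 365;
        year += 1;
    }
}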
Friday, January 2, 2009
JournalSpace mistakes mirroring for backups, dies
JournalSpace is dead.
Short form: JournalSpace depended on drive mirroring for back-ups. Something proceeded to overwrite the drives with zeros, and the mirroring politely overwrote the mirrors with zeros too. Their entire database is gone, all blogs, everything.
Repeat after me: mirroring is not a back-up. RAID and drive mirroring are for reliability and fault-tolerance. They'll protect you against hardware failure. They won't protect you against software doing something stupid or malicious. If the software says "Write this data to this location on the disk", the mirroring software and RAID drivers will write it, no questions asked. If you're depending on your mirrors to contain something other than exactly what the main drive contains, well, you'll end up where JournalSpace is. You need point-in-time backups to external media, something that won't duplicate what's on the main drive unless and until you do the duplication yourself. That's the only way to ensure that, if your software writes corrupted data to your main disks, the backups don't have the same corruption written to them, as long as you catch the problem before you overwrite the backups with new ones. This is also, BTW, why you have more than one set of backup media: so if you do run your backups before catching a problem, you've got older backups to fall back on.
This should also be a cautionary tale for anybody wanting to host their data, application or whatever "in the cloud". If you do, and it's at all important, make sure you a) can and do make your own local backups of everything and b) have a fall-back plan in the event "the cloud" suddenly becomes unavailable. Unless, of course, you want to end up like the people who had their journals on JournalSpace: everything gone, no way to recover, because of somebody else's screw-up.
Labels:
backup,
cloud computing,
software