Copland
Apple's Copland versus NeXT: looking back at what happened, and what might have been. Apple is shipping their 3rd version of MacOS X, and I think we can recap and try to learn from history. NOTE: I think Apple's new management was far better than its old management. And I'm not saying I could have done a better job than they did. But I'm very into honest post-mortems to try to learn the lessons that life is offering.
History
Let's recap. In the mid-1990s Apple had been working hard on Copland (their next version of MacOS -- System 8: the first time). This was a new kernel underneath the MacOS, with a new UI and new features up above. It was bringing MacOS into the 90s (and into the next millennium).
There are three things to be balanced for any project: (1) time, (2) features (requirements), and (3) resources.
With Copland, management was putting unrealistic expectations on the team, basically mandating 100% backwards compatibility while simultaneously changing the architecture and functions completely. Many on the Copland (Maxwell) team were saying that there should be a separation between older apps and newer applications (like Classic and Carbon), or ways to drop and scale in features. In engineering you can often deliver 90% of the functionality in 10% of the time, and the last 10% of features or functions are what take most of the time; so you need to pick wisely. But management and marketing would not hear of it; engineering had to do everything that marketing wanted, no matter how difficult (or unrealistic) that was. So there was no flexibility in features.
Management was also setting schedules based on what they wanted to happen, instead of listening to what was possible. Just because management wants it in a certain time doesn't mean it can be delivered in that time. And if you try to hit impossible deadlines, you waste more time because of burnout, loss of team members, lack of communication (as people rush), or too much communication (as management bogs everyone down with meetings to figure out why things aren't hitting their unrealistic schedules, and then wastes hours arguing because they don't want to accept the reasons), and so on. Many on the Copland team were fighting for more realistic timelines, but there was no flexibility in time.
Resources are not infinite: a team can only develop so fast. More people does not mean more productivity; it means more overhead (communications, training, collisions, etc.), and that can basically level out how fast you can develop. It can get to a point where adding more people only slows down a project. This is the whole "mythical man-month" argument; the overhead on larger teams can kill you, and you can really only develop at some maximum speed. After that you crash and burn.
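To put rough numbers on that argument, here is a toy model. It is only a sketch: it assumes every engineer adds a fixed amount of output and every pair of engineers costs a fixed amount of coordination. The pair count n(n-1)/2 is the standard figure; the two constants (output_per_person and cost_per_path) are made up purely for illustration.

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct pairs of people who may need to coordinate."""
    return team_size * (team_size - 1) // 2


def net_output(team_size: int,
               output_per_person: float = 1.0,   # assumed, illustrative only
               cost_per_path: float = 0.05) -> float:  # assumed, illustrative only
    """Raw output of the team minus the overhead of keeping everyone in sync."""
    raw = team_size * output_per_person
    overhead = cost_per_path * communication_paths(team_size)
    return raw - overhead


if __name__ == "__main__":
    # Print team size, communication paths, and net output side by side.
    for n in (5, 10, 20, 30, 40):
        print(n, communication_paths(n), round(net_output(n), 2))
```

With these made-up constants, net output climbs until the team reaches about 20 people and then falls, which is the "adding more people only slows down a project" effect in miniature. Real teams are messier, but the shape of the curve is the point.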
With Copland, the features weren't going to give, the time wasn't going to give, and the resources were basically fixed or maxed out. So what is left? The least understood or discussed part of the equation: quality.
When engineers are forced to hit schedules that are too aggressive, they cannot change the physics of the situation. They can't make time slow down, or put more hours in the day. Engineers can only be so productive; if you try to push them beyond that, you actually get less productivity out of them. Since something had to give, and it wasn't time, features or resources, it was quality.
Engineers start taking shortcuts, documenting less, hacking more, and doing whatever it takes to get things done as quickly as possible, not as well as possible. This is what is known in engineering as a death-march; everyone knows they are doomed because management won't listen, so they just hack features out to make the schedules look good and stay employed as long as possible, even though they know it is never going to work when put together. It is the old "the operation was a success, but the patient died" kind of thing.
Ironically quality is related to feasibility and time. Failures in one area cascade to other parts of the project, and the whole project becomes less stable and takes more time; with the resulting finger pointing. Then when the project fails to hit the date, you have to start over, and spend a lot of time reworking the screwed up parts, or throwing them out and starting over. So pushing past a certain threshold results in less delivered, even though on project timelines it can look like they are delivering more (early on).
Bail-out
After a few iterations of engineers not being able to deliver the impossible, the last stage of a "death-march" is shopping around.
Management doesn't blame itself for bad management; it blames engineers and shops around for a safety net. The logic is that if they can't build it, they can buy a competing solution quickly; then they'll keep themselves employed and hopefully no one will notice how badly they are screwing up.
This is good business reasoning; if you can't build it, then buy it. It is also a decision that can destroy businesses.
What organizations are actually doing is giving up all their known evils and well understood issues for someone else's less known and less well understood issues; trading their sins for someone else's (based on what that other person tells them). This is the equivalent of trading in your entire hand in poker and then betting based on what you hope you get. This is not a very safe path, and it seldom works out well. The company that you are going to buy is usually failing or having problems of its own, which is usually why it is selling. I've seen many organizations put out of business, or set back many years. And set-backs of years are expensive in rapidly changing businesses like computers.
Now of course NeXT was selling themselves (see lying/marketing, as I often can't tell the difference). They could deliver everything within a year or two, it would be great, it would slice, it would dice, it would feed the cat. The first promise was that within a year they'd have a beta, and soon after a release. It would save Apple, everyone would love it, and of course it flopped.
Apple management bought into the safety net, and tried to buy their way out of their problems; and things went better than normal, but with most of the same issues. They traded their problems for NeXT's. And in a rare move, they traded their management for NeXT's as well; unintentionally. Apple management never had to learn important engineering lessons (how to tell the truth, how to listen to engineering, etc.), and instead learned other business lessons; like other people's agendas might not be your own, or in your (or the organization's) best interests.
Once NeXT took over, they had to deliver; and the truth was that NeXT was not in a better place than Apple had been before the buyout (short term). There were many long term gains: the engineers, some of the technologies, and the marketing aspects (you bought time on Wall Street, which understands as much about running an engineering business as a flea understands about nuclear physics). NeXT had inherited problems and had to get out from under their lies, without alienating Wall Street and getting replaced themselves. Rhapsody wasn't going to work without backwards compatibility (the biggest millstone on the Copland project); developers needed some compatibility and some growth path.
After a year of developing Rhapsody, they stopped playing their own death-march and realized that developers were not going to rewrite everything they had just to make Apple's life easier. There needed to be a way to bring Mac Apps forward; and Carbon was added -- which was really something that Copland/Gershwin people had been advocating all along; 90% compatibility was easy, and long before, Apple engineers had already decided which 90%. All the new management had to do was to reject the old requirement of 100% compatibility and accept 90% compatibility as good enough, and Carbon was ready to go.
So Carbon on a new kernel (NuKernel) was done long before the rest of OSX was ready. So what did they need NeXT for?
It took years for NeXTSTEP, Cocoa and the rest of the OS to catch up, with new features thrown in to cover the delay. But the primary reason they didn't ship sooner isn't that they couldn't have; it is that NeXT needed or wanted to justify their own existence and guarantee their own technologies' future, so they needed to buy time to get their technologies interwoven. Plus those technologies were part of the sales-pitch.
Apple had traded a known evil (a couple years) for an unknown evil that they thought was shorter; since initially NeXT was selling it as one year to 18 months (instead of 2+ years for Copland). What they got was actually twice as bad as what they'd already had in terms of time. I'll go into features, functions and marketing tradeoffs elsewhere; this is just about the timeline. A truism for many project competitions is that when bidding against one another, the biggest liar wins. Copland was a known couple years (and they were being honest), the new team just had to lie (claim to be shorter) and take their shot; which they did.
Some are going to say BeOS would have been different; and they would be wrong. Be had a different set of sins, but sins nonetheless. I have no doubt that had Be been the chosen candidate, enough new and ugly surprises would have cost Apple enough time that we would have had roughly the same outcome: years more than expected to actually deliver. And if you doubt that, then you need to be an engineer or manager on a few more projects until you learn how many projects slip and by how much. If Be said it would take them a year or two, you can bet it would have taken at least double that. (A common engineering practice is to take your best guess at how much time it will take, and then multiply that by 3.) And if you look at NeXT, they said 18-24 months, so the 5 years it has taken is right in the middle of that (3 times) projection.
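The multiply-by-3 rule is easy to check against the numbers above. This is just the arithmetic written out; the function name and the default factor of 3 are mine, taken from the rule of thumb rather than from any formal method.

```python
def padded_estimate_years(low_months: float, high_months: float, factor: float = 3):
    """Apply the rule-of-thumb multiplier to an estimate range and convert to years."""
    return (low_months * factor / 12, high_months * factor / 12)


# NeXT's pitch as described above: 18 to 24 months.
low, high = padded_estimate_years(18, 24)
print(low, high)  # 4.5 6.0
```

So an 18-24 month promise turns into 4.5 to 6 years once you apply the multiplier, and the roughly 5 years it actually took sits inside that range.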
Conclusion
In the end, it took about five years to deliver a usable version of the next generation Mac OS. What is interesting is that it would have taken only 2 or 3 more years for Apple to deliver Copland if they had followed the initial course, because Copland already had a 2-3 year head start on what would have been a similar 5 year delivery to the one OSX ended up taking. So the numbers didn't change; a 5 year lifecycle for Copland was too long (even with a 2-3 year head-start), but the same 5 year lifecycle for OSX was fine, because management believed they could deliver it in 2 years.
I'm sick of hearing that the Copland project failed because of engineering. From what I know of engineering and people inside of Apple, it was mostly because of bad management decisions inside of Apple. Apple could have easily reduced the features of Copland to include 90% compatibility (Carbon) and Classic (for compatibility) and shipped years earlier. And many knew that and had been advocating that all along. Apple already had Classic running before NeXT, via an older project called MAE (Macintosh Application Environment). Apple had already done the Carbon work of deciding which APIs they needed to drop (the ones that weren't reentrant, meaning able to work in a preemptive environment). Apple already had a kernel and driver model working (and arguably a better one) with NuKernel. In truth, at the time of the buyout, Apple was closer to delivering something usable than NeXT was, but Apple management was too stupid to realize it.
That wasn't the only "blow it". Before that, if Apple management had followed through with Taligent, they would have been ahead of the game years earlier, probably with a better technology (and a better partner). But they chickened out there too. The same goes for OpenDoc. Apple had the engineering expertise and good solutions; they just didn't have the managerial fortitude to know which path to follow through on.
Some see NeXT or Steve Jobs as a savior on a white horse. The truth is much less black and white.
Apple could have solved their problems without throwing everything away and starting over, but they didn't.
Steve and NeXT have delivered, and I like OS X... but the truth is the technology was not nearly as important as replacing the goals and management.
The single biggest contribution has been to focus and provide vision, then to follow through with it. Apple technically could have done that without Steve (using their own technologies), and it would have taken less time, if they had had better management. But they didn't have that, so Steve Jobs had to take them over and fire the incompetent ones to get them focused on a single vision. So NeXT was a solution, but not the only one. Yet, in the end, changing to NeXT was better than no change at all.
The moral of this story? Changing directions on software projects has a huge cost. Trying to buy your way out of a problem is a much bigger risk than most realize. And when you bet the farm, sometimes you lose it. But in this case, it turned out for the best.
Written 2002.09.10