
Monday, January 24, 2022

What goes into making an OS to be Unix compliant certified?

The story of making MacOS UNIX compliant
Terry Lambert 

A lot.

I was the tech lead at Apple for making Mac OS X pass UNIX certification, and it was done to get Apple out of a $200M lawsuit filed by The Open Group, for use of the UNIX™ trademark in advertising.

The lawsuit was filed because the owner of Mac OS X Server kept putting "UNIX" on the web site, and all other marketing collateral for the Server product.

The options were:

1. Make Mac OS X actually UNIX™, to defang the lawsuit; this would also make The Open Group industry relevant, when at the time they were losing a lot of that to Linux's increasing popularity, which is why it was an option on the table at all
2. Buy The Open Group for about a billion dollars so that Apple could freely use the Trademark to describe a power cord, if they wanted to; this would not get them out of existing contractual obligations with Sun Microsystems, IBM, and others, who had already licensed the use of the trademark, however
I was asked if I could lead a team to do #1. I said "Yes, under the condition that I could use the compliance project as a hammer to force other parts of the organization to make changes in their own code base, and that I could play it rather loose with commit rules regarding what it said in the bugs database for a given code change, and what the given code change actually did, in addition to what it said in the bugs database".

I was given the "go".

And so we ran the compliance test suite against the existing Mac OS source base, and it immediately errored out because of the header files.

And Ed Moy and I made a two line change that moved a type definition from <stdio.h> to where it was supposed to be, instead. One line of change in <stdio.h>, and another in the file the type was actually supposed to be located in.

And we ran the tests again, and one of the header file errors in the tests went away.

So we did a "world build", where everything that was in Mac OS X, including iTunes, got rebuilt.

That change, essentially one line, caused 152 projects to fail to build (from memory; that number sticks out, but it might have been only 137).

Including iTunes.

And so Ed and I went through, and fixed every single one of those projects to build with the change, or without the change.

And we did another "world build", and everything built.

Yes, we had access to all of Apple's source code, at that point in the game.

And so we submitted high priority bug fixes to the projects. Some of them downgraded the priority immediately, and some simply applied the fix, since we had already provided them with the patch.

And then the VP of engineering, Bertrand Serlet, re-escalated the priority on the ones which had been downgraded.

And we committed the header file changes.

At this point, we had to go back, and reassess the feasibility of the whole project.

Ed and I felt it was doable in the timeframe, given the preconditions I had already placed on the project.

Ed was unwilling to say anything directly.

I said "yes", putting my job on the line, were we given the "go".

It escalated up to Steve.

We were given the "go".

It was, after all, saving Apple either $1 billion or $200 million plus revamping all of the Mac OS X Server marketing collateral.

We were promised 1/10th of the $200 million, or $20 million in stock, on completion: $10 million to me, $5 million to Ed, and $5 million to Karen Crippes, who was looking for a home in Mac OS X development, whom I knew to be an amazing engineer, and who could be roped into being technical liaison, periodically kicking off the tests, and complaining to Ed and me about things not passing.

I got the $10 million, because it was going to be my job on the line, and potentially, my ability to work in industry at a high level, ever again, in the future.

Also, the tech lead has to fix anything no one else fixes, or no one else can fix, because they are the DRI (Directly Responsible Individual).

I also wasn't just tech lead; I was de facto project manager.

I wore a lot of hats.

It was going to be a long slog.

I had estimated a year, for a team of 5 individuals: the three mousekateers (not a misspelling) and two contractors. One contractor, Len Lattanzi, was for mostly user space code; the other, Jaime Delgadillo, was for full time test automation and bug filing, and also contributed patches, where possible.

We had two more temporary contractors; one for tools compliance, and one for the man pages.

And then anyone we could rope in from elsewhere in Apple, on a case by case basis, for a short term.

This was mostly to make sure they were invested in the project; we didn't actually need them to write code.

Our first red letter day was when all the header files passed testing, so that the other tests in the test suite would start running.

We actually committed all the header file changes to the rest of Mac OS X, by that time. The headers were standards compliant by the time Tiger shipped.

This broke the heck out of CodeWarrior. I fully intended to fix that, but was never given the opportunity, and CodeWarrior was more or less collateral damage.

But it was a red letter day when the header files passed testing, and we celebrated by going out to IL6 — the informal name for the BJ's restaurant, just off the Apple campus.

As far as the rest of Apple was concerned, we had just closed the "Fix Header Files" bug, which encompassed a lot of other bugs that were for individual header files.

We had spent about 3 months doing this. I had promised a year.

How was I going to hit the one year estimate?

I knew going in that forcing the header file changes — and the project changes associated with them — would be the biggest individual part of the project.

Once we could run the other tests, there was a lot of "low hanging fruit" to fix in other areas.

That took about two months, playing fast and loose with the commit rules, and we made short work of them. Ed did most of libSystem (libc plus other system libraries, rolled into one), with assistance from me in moving things out of the namespace; this is why there are header files in /usr/include/sys that begin with "_", for example.
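To make the namespace point concrete, here is a minimal sketch of the general technique, not Apple's actual headers. Every name in it is hypothetical. A real system header would spell the private name in the reserved namespace (leading underscores), which ordinary code is not allowed to do, so the sketch uses a `sketch_priv_` prefix instead. The idea is that the type lives once in a private header, public headers expose it through a guarded typedef, and headers that only need the type internally use the private spelling so they do not leak the public name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Part 1: contents of a hypothetical private header, in the spirit of the
 * "_"-prefixed files under /usr/include/sys. It defines only the private
 * spelling of the type.
 */
#ifndef SKETCH_PRIV_TYPES_H
#define SKETCH_PRIV_TYPES_H
typedef long sketch_priv_off_t;
#endif

/*
 * Part 2: a public header that is supposed to export the type. The guard
 * macro lets several public headers carry the same typedef safely.
 */
#ifndef SKETCH_OFF_T_DEFINED
#define SKETCH_OFF_T_DEFINED
typedef sketch_priv_off_t sketch_off_t;
#endif

/*
 * Part 3: a second public header repeating the guarded typedef. Because the
 * guard is already defined, this is skipped and nothing is redefined.
 */
#ifndef SKETCH_OFF_T_DEFINED
#define SKETCH_OFF_T_DEFINED
typedef sketch_priv_off_t sketch_off_t;
#endif

/*
 * Part 4: a header that needs the type internally but must not export the
 * public name. It uses only the private spelling.
 */
struct sketch_stream {
    sketch_priv_off_t position;
};

/* Both spellings name the same underlying type. */
size_t sketch_public_off_size(void)
{
    assert(sizeof(sketch_off_t) == sizeof(sketch_priv_off_t));
    return sizeof(sketch_off_t);
}
```

In a real tree the four parts would be separate files; they are folded into one translation unit here only so the sketch compiles on its own.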

While waiting on submissions, you could do other work in parallel, and we did.

After the low hanging fruit, there was a lot of other work, like rewriting the signal system in the kernel, which was not that low hanging.

By this time, we had roped in Umesh (I won't give his last name), because he didn't want us touching his pthreads code, and he wanted to make changes there anyway, and having the project as a means of hammering through those changes pleased him greatly.

We bought begrudging buy-in from Mike Smith (yes, *that* Mike Smith) by having him rewrite the file locking code. I'm the one who pushed it out of individual filesystems, and into the layer above, so that it was one common code implementation. But Mike was the one who made it work.
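For readers who have not used it, this is roughly what the user-visible side of that file locking looks like: POSIX advisory record locks taken through `fcntl()`. This is a hedged sketch of the standard interface only, and says nothing about how the kernel code was actually structured. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Apply (F_WRLCK) or release (F_UNLCK) an advisory lock covering the whole
 * file. F_SETLK is the non-blocking form: it fails instead of waiting if a
 * conflicting lock is held by another process.
 */
static int sketch_set_whole_file_lock(int fd, short type)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* a length of 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Lock and then unlock a temporary file. Returns 0 if both steps succeed,
 * -1 otherwise.
 */
int sketch_lock_roundtrip(void)
{
    FILE *fp = tmpfile();
    int rc;

    if (fp == NULL)
        return -1;

    rc = sketch_set_whole_file_lock(fileno(fp), F_WRLCK);
    if (rc == 0)
        rc = sketch_set_whole_file_lock(fileno(fp), F_UNLCK);

    fclose(fp);
    return rc == 0 ? 0 : -1;
}
```

Because the semantics are defined by the standard rather than by each filesystem, a single common implementation above the filesystems is a natural place for this logic.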

We finally bought Joe Sokol off by asking questions about the trap path, and about the stack frame saves surrounding the signals system.

Of these, however, it was Umesh who was most helpful in meeting our deadline.

Eventually, we had everything working and passing the tests. We were ready to pull the trigger.

And then we were told to wait two weeks, and in that time they pulled in the Intel code changes, which crapped all over everything.

It was a mess.

And so I went on a three day bender, reintegrating all our patches on the conformance branch into the post-Intel kernel code.

By this time, I knew pretty much every one of the 13 million lines of kernel code in the Mac OS X kernel.

And we were back to passing the tests.

And then we were told we could not integrate for Tiger.

That we would miss our self-imposed deadline. Because it was "too much change at once, with the Intel changes".

Tiger dragged on for another 6 months before release, with week-for-week slips. This was because of Intel-specific bugs — not in the kernel.

We could have, in other words, easily shipped in Tiger.

And hit our self-imposed deadline.

If I were asked to do the same thing for Linux, it likely would take five years, and two dozen people.

Linux is pretty balkanized, has a lot of kingdom building, and you have to pee on everything to make it smell like Linux.

I could do the same in FreeBSD in about a year and a half, with a dozen co-conspirators to run the changes through.

A lot of the work would happen in the "ports" tree.

All told, probably 4% of the 6% of the Mac OS X kernel that I wrote?

It came from the UNIX conformance changes.

It came from committing massive signals changes, and attributing them to a simple signal bug resulting in a kernel crash, in the "Radar" bugs database.

A lot of the things Ed did to libc header files, and libc itself, had similar "fibs" in Radar.

We had a lot of gratitude in the Open Source community, particularly for our fixes to make bash pass the tests.

You have absolutely no idea how much Apple contributed to the Open Source community, as part of this project, because it was a secret project — at least to people outside Apple — so we didn't advertise the fact.

But I expect we contributed about two million lines of code, to hundreds of Open Source projects, over the course of that year.

A lot of gratitude — but it wasn't collective, and so Apple was still faulted for "using Open Source code, but never contributing back".

We fixed at least 15 major gcc bugs, for example.

You have no idea.

So overall?

It's a pretty big project to get compliance.

And that's before all the things that Karen did, on the self-certification, contracts, getting test exceptions based on the existing exception for OSF/1 Mach, and so on.

It was, indeed, a long slog.
