Smoke, Fire, and Recovery
Apollo’s troubles began in September 1965, when NAA’s second stage ruptured during a structural test.91 Engineers pinpointed the fault, and in the process MSFC managers concluded that NAA’s management was to blame for shoddy workmanship. By October, the Industrial Operations manager, Brig. Gen. Edmund O’Connor, told von Braun, ‘‘The S-II program is out of control.’’ He believed its management was to blame. O’Connor was equally blunt in a letter to Space and Information Systems Division (S&ID) President Harrison Storms: ‘‘The continued inability or failure of S&ID to project with any reasonable accuracy their resource requirements, their inability to identify in a timely manner impending problems, and their inability to assess and relate resource requirements and problem areas to schedule impact, can lead me to only one conclusion, that S&ID management does not have control of the Saturn S-II program.’’92
Phillips went immediately to NAA with a ‘‘tiger team’’ of nearly one hundred NASA personnel to ‘‘terrorize the contractor,’’93 reporting the team’s findings in December 1965 in what later became known as the Phillips Report. While writing to NAA that ‘‘the right actions now’’ could improve the program, Phillips privately wrote Mueller that NAA’s president was too passive. Storms, Phillips said, should ‘‘be removed as president of S&ID and be replaced by a man who will be able to quickly provide effective and unquestionable leadership for the organization to bring the division out of trouble.’’94
Apollo with its major contractors identified. Apollo was perhaps the largest single R&D project of all time, integrating many contractors for its stages and requiring massive launch and operations facilities and organizations. Saturn V contractors not identified. Courtesy NASA.
NAA responded by placing Gen. Robert Greer, retired from the air force, in charge of the S-II program. Greer updated the management control center and ensured more rapid exchange and collection of information through Black Saturday meetings modeled after those in Bernard Schriever’s ballistic missile program. Greer also instituted forty-five-minute meetings every morning, eventually cutting back to twice a week. Greer’s reforms began to take hold but did not prevent the May 1966 loss of another test stage because of faulty procedures. NASA clamped down further, requiring NAA to develop better methods for managing and planning its work. In the summer of 1966, after two years of studies and preparation, NAA deployed work package management for the S-II and Command and Service Module.95
Work package management extended project management to lower project levels and combined accounting and contracting procedures by creating a specific work package for each program task. The company assigned responsibility for each task to one person, a mini project manager for the task who accounted for performance, cost, and schedule in the same way and with the same tools as the overall project manager. Each work package was a ‘‘fundamental building stone,’’ with specifications, plans, costs, and schedules to help managers in their monitoring. Prior to the development of work packages, ‘‘It was difficult to say what manager was responsible for a particular cost increase because there were 10 or 15 functional and subcontractor areas involved.’’96 In later versions, the work package numbering scheme matched that for cost accounting.
Grumman’s difficulties on the lunar module also attracted NASA attention. Troubles first appeared in schedule slips on its ground support equipment in the spring of 1966. Alarmed at Grumman’s growing costs, Phillips sent a management review team to the company that summer, prompting Grumman to sack the program manager, establish a program control office, and move its vice president to the factory floor to monitor work. By fall, NASA pushed Grumman into adopting work package management.97 The new system did not immediately solve Grumman’s difficulties. The primary problem was a late start due to NASA’s delayed decision to use lunar orbit rendezvous. However, work package management and the new program control office found and resolved problems more quickly than before.
Despite these difficulties, Apollo moved briskly forward until its most severe crisis struck on January 27, 1967. That day, astronauts and KSC personnel were performing tests in preparation for launch of the first manned Apollo mission. At 6:30 that evening, the three astronauts scheduled for that mission, Virgil Grissom, Edward White, and Roger Chaffee, were in the spacecraft command module testing procedures. At 6:31, launch operators heard a cry from the astronauts over the radio, ‘‘There is a fire in here!’’ Those were their last words. All three astronauts died of asphyxiation before launch personnel reached them.98
KSC personnel immediately notified NASA headquarters. Administrator Webb hurriedly planned for the political fallout. He sent Seamans and Phillips to Florida, while he persuaded the president and Congress to let NASA perform the investigation.99 NASA’s investigation concluded that the causes of the disaster were faulty wiring, a drastic underestimation of the dangers of an all-oxygen atmosphere, and a capsule design that precluded rapid escape. No one had realized how dangerous the combination was. NASA had used a pure oxygen atmosphere in all of its prior flights, as did air force pilots in their high-altitude flying. As Col. Frank Borman, one of NASA’s most experienced astronauts, put it during the Senate investigation, ‘‘Sir, I am certain that I can say now the spacecraft was extremely unsafe. I believe what the message I meant to imply was that at the time all the people associated and responsible for testing, flying, building, and piloting the spacecraft truly believed it was safe to undergo the test.’’100
Congress did not prove NASA to be negligent or incompetent. One of the investigation’s important results was a nonfinding. Despite searching long and hard, Congress did not find fault with Phillips’s management system. Phillips had already uncovered problems with NAA and had been working for some time to make improvements to its organization and performance. The management system used to organize the capsule design was NASA’s original committee-based structure, upon which Phillips had superimposed configuration management. He and his management system came out unscathed.
Congressional investigations did uncover some of NASA’s dirty laundry, particularly problems with command module contractor NAA. Sen. Walter Mondale of Minnesota learned of the Stage II Phillips Report and confronted Webb about problems between NASA and NAA. Caught by surprise, Webb said he did not know of any such report, which at that moment he did not. After the hearing, he found out about it from Mueller and Phillips. Furious, Webb launched a ‘‘paper sweep’’ to search for more skeletons in the closet. The sweep uncovered a memo written by GE to Apollo spacecraft director Joseph Shea, warning Shea of the danger of fire in the command module. Shea had passed the memo on to his safety and quality assurance people, who responded that no significant dangers existed. GE, already in a sensitive situation because MSC considered it to be spying for headquarters, did not push it any further.101
Webb reacted angrily to these revelations. He believed OMSF had been far too independent and secretive. Webb told Seamans, ‘‘You have to penetrate the [OMSF] system, don’t let Mueller get away with bullshit.’’ The problem, according to Webb, was a lack of supervision by NASA’s executive management. Mueller had ‘‘followed the policy in Houston of obtaining the very best men they could for the senior positions, and had, as a part of the process of obtaining them, given assurances that they would have almost complete freedom in carrying out their responsibilities.’’102
After Seamans left NASA in late 1967, Webb expressed shock at the poor management system.103 Webb probably did not realize how decentralized NASA’s management really was. Executive managers routinely delegated most decisions to lower levels. In the wake of the fire, this did not seem wise.
When NAA refused to make swift and comprehensive changes, and even expected to be paid a fee for the burned-out spacecraft, Webb called Boeing to see if it would take the job. Boeing said that although it did not want to take over the job, if pressed it would do so. Webb returned to NAA, demanding that it remove S&ID head Storms, further centralize Apollo project management, eliminate any fee for the failed spacecraft, and pay for improvements. NAA did not take the chance that he was bluffing. NAA was extremely unhappy with the entire situation because from its viewpoint, NASA was at fault. Shortly after contract award, over NAA’s objections, NASA had directed a change from a nitrogen-oxygen atmosphere to an all-oxygen atmosphere.104
One problem uncovered during the investigation was GE’s unwillingness to contest NASA over safety issues with a pure oxygen atmosphere. At the heart of the problem was industry’s reluctance to confront NASA when industry was dependent on government funding. Despite his substantial political acumen, Webb appeared not to comprehend this. He had hired GE and Bellcomm to strengthen headquarters’ ability to monitor the field centers in 1962; after the fire, Webb repeated his mistake by expanding Boeing’s role from integrator of the Saturn V to integrator of the entire Apollo-Saturn system to ‘‘penetrate the OMSF system.’’ Phillips, who understood the political problems inherent in the GE and Boeing integration efforts, revised the Boeing contract to avoid the negative consequences of Webb’s misconception.105 In essence, Webb wanted to use GE, Bellcomm, and Boeing as an arm of NASA headquarters to control MSC, MSFC, and KSC. This could not work because these contractors could not challenge NASA field center personnel for fear of losing their contracts.
Boeing, as part of its contract, further integrated the management system. The ‘‘teleservices network’’ connected NASA project control rooms with hard copy data transmittal, computer data transmission, and the capability to hold a teleconference involving MSC, MSFC, KSC, Michoud (where the Saturn I was manufactured), and Boeing’s facility near Seattle. Boeing copied MSFC’s program control center design at each facility.106
After the fire, NASA placed even more emphasis on achieving high quality and safety through procedural means. In September 1967, NASA set up safety offices at each field center, along with the first project safety plan. The next month, MSC established a Spacecraft Incident Investigation and Reporting Panel to look into anomalies. A month later, NAA created a Problem Assessment Room to report and track problems.
Phillips ordered an astounding array of program reviews to prepare for Apollo’s upcoming missions. He wrote to field center managers to ensure that they used the upcoming Design Certification Reviews to evaluate all potential single-point failures.107 In January 1968, he ordered a complete system safety review, analyzing the interaction of the mission with the hardware, astronauts, ground systems, and personnel. Other reviews included those for quality and metrology, launch vehicle and spacecraft schedules, the communications network, flight readiness, mission planning, subcontractors, site selection, the Lunar Receiving Laboratory, flight evaluations, anomalies, crew safety, interface management, software, and lunar surface activities.108
NAA’s procedures exemplified the upgraded problem reporting system. Engineers reported failures on a Problem Action Record form. Reliability engineers sent failed components to the appropriate organization, which responded by filling out a Failure Analysis Report describing the physical cause of the failure and the corrective actions taken or recommended. If the organization determined that an engineering change was necessary, it submitted a change request to the change boards. The program control center tracked report status, and a centralized reliability ‘‘data bank’’ recorded the problem and its resolution. Follow-up failure reports and dispositions closed all failure reports.109
Another change in the aftermath of the fire was a further strengthening of configuration management, primarily through changing CCB operating procedures. An October 1967 rule disallowed nonmandatory changes for the first command and lunar modules and required the MSC Senior Board to rule on any and all changes to these spacecraft. A February 1968 ruling required managers to consider software changes and their ramifications in CCBs. In May 1968, Apollo Spacecraft Manager George Low specified that the MSC CCB had authority over all design and manufacturing processes.110
By 1968, tough CCB rules slowed the program as trivial changes came to the attention of top managers. Eventually, even Phillips realized that centralization through configuration management could go too far. In September, MSC managers classified changes into two categories: Class I changes, which MSC would pass judgment upon, and Class II changes, which could be approved by the contractors. Classification did not by itself help much, so in October 1968 Phillips gave Level II CCBs more authority, while higher levels ruled on schedule changes.111
The Apollo project met its technical and schedule objectives, landing men on the Moon in July 1969 and returning them safely to Earth. Anchored by configuration management, Phillips’s system weathered the storm of problems uncovered through testing and Apollo’s most severe crisis, the 1967 death of the three astronauts and the ensuing investigations. Despite strenuous efforts, congressional critics did not find many flaws with Phillips’s management scheme and concurred with NASA that the fire resulted from a tragic underestimation of the danger.
Configuration management was Phillips’s most powerful tool. Whenever problems occurred, his almost invariable response was to strengthen configuration management. By the end of 1968, having found that his favorite method could be overused, Phillips gave lower-level CCBs more authority. Configuration management formed the heart of Apollo’s system and has remained at the core of NASA’s organization ever since.