A recent discussion on the University of York safety-critical systems mailing list (look for “Certification of Tools/Components” in the archive) led me to think about some of the main issues in the safety engineering of systems with computer-based components. Here are some of these issues and my views on them.
Issue 1. “Where is the hard part of safety engineering? At the requirements level, or below?”
I think there is consensus that many of the most significant failures of the last decades have come down to requirements failures (mismatch between the engineering functional specification and the actual environment in which the system was operating). This was shown most visibly by Robyn Lutz on NASA mission failures some two decades ago. It was also substantiated by an HSE report whose name I don’t recall. And it fits with the experience of most of us.
So here is a heretical thought. Maybe this is changing as the systems become more complex? The 1997 Mars Pathfinder failure occurred because of a known weakness of an operating-system prioritisation scheme. It turned out that the designers didn’t know the right computer science; others did. With simpler systems than a Unix-based real-time kernel with priority schemes, it is likelier that everyone involved would have known the right science and any problems would not have lain in the OS.
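The weakness in question was priority inversion: a low-priority task holding a mutex needed by a high-priority task is itself preempted by medium-priority work, and the fix uploaded to Pathfinder was to enable priority inheritance on the mutex. Here is a minimal, deterministic toy scheduler illustrating the effect. The task names, priorities, and timings are invented for illustration; this is a sketch of the phenomenon, not of the VxWorks kernel.

```python
# Toy fixed-priority scheduler illustrating priority inversion and its cure.
# One time unit of work is done per tick by the highest-priority ready task.

def schedule(tasks, inheritance, horizon=20):
    """Run the toy scheduler; return the tick at which the HIGH task finishes."""
    mutex_owner = None
    for t in range(horizon):
        def prio(task):
            p = task["prio"]
            # Priority inheritance: the mutex owner temporarily runs at the
            # priority of the highest-priority task blocked on the mutex.
            if inheritance and task is mutex_owner:
                waiters = [x["prio"] for x in tasks
                           if x["needs_mutex"] and x["release"] <= t
                           and x["work"] > 0 and x is not mutex_owner]
                if waiters:
                    p = max(p, max(waiters))
            return p
        # A task is ready if released, unfinished, and not blocked on the mutex.
        ready = [x for x in tasks if x["release"] <= t and x["work"] > 0
                 and not (x["needs_mutex"] and mutex_owner not in (None, x))]
        if not ready:
            continue
        run = max(ready, key=prio)
        if run["needs_mutex"] and mutex_owner is None:
            mutex_owner = run          # acquire the mutex on first dispatch
        run["work"] -= 1
        if run["work"] == 0:
            if run is mutex_owner:
                mutex_owner = None     # release on completion
            if run["name"] == "HIGH":
                return t + 1
    return None                        # HIGH missed its "watchdog" horizon

def make_tasks():
    return [
        {"name": "LOW",    "prio": 1, "release": 0, "work": 3,  "needs_mutex": True},
        {"name": "MEDIUM", "prio": 2, "release": 2, "work": 10, "needs_mutex": False},
        {"name": "HIGH",   "prio": 3, "release": 1, "work": 2,  "needs_mutex": True},
    ]

print(schedule(make_tasks(), inheritance=False))  # → 15: HIGH stuck behind MEDIUM
print(schedule(make_tasks(), inheritance=True))   # → 5: lock holder boosted out of the way
```

Without inheritance, the high-priority task sits behind all of the medium-priority work; with inheritance, the lock holder is boosted just long enough to get out of the way.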
The B777 has suffered two significant problems. One came to light in the 2005 pitch-excursion incident in Western Australia. The other was a series of Byzantine failures on the control-system bus which almost led to the airworthiness certificate being withdrawn. Neither of these was a requirements failure.
The A330 has also suffered multiple pitch-excursion incidents, involving two aircraft in Western Australia in late 2008. That seems to have something to do with the fault-tolerance architectural behavior. No requirements failure there.
Except, of course, that one person’s design is another person’s requirements. Dealing with requirements involves developing one specification. Design involves comparing two. Because of social constraints (subcontracting, proprietary information, and so on), maybe many engineers are working with one specification when they should really be comparing two? That doesn’t make associated failures into requirements problems, although the organisations involved may like to characterise them this way. They are traceability and refinement issues.
Issue 2. “Where should we concentrate effort in safety-critical system design?”
The obvious answer is: anywhere and everywhere there is a weakness. There are weaknesses all the way from initial conception (“requirements elicitation”) down to object code and thence to the HW.
One issue: Almost any code into which anyone has ever looked has bugs in it. Indeed, since Rod Chapman discovered the bug in the Praxis/NSA Tokeneer project, to that point thought likely to be bug-free, while writing the project up for public release, I think I can eliminate the word “almost”. (Recent work at NICTA in Sydney has verified a real-time operating system kernel in C, using Larry Paulson’s Isabelle prover. However, I would like to know what NICTA assumes about the operational semantics of the precise C code they verified, and how they assure that semantics in the compiler they recommend using: if they do so, I’ll put “almost” back in.)
These “bugs” are design features which lead in certain circumstances to the system behaving differently from intended. That is, the SW engineers knew what their programs were supposed to do, but what they delivered does not always do that. Sometimes it can be humdrum, but sometimes it can be very tricky: huge amounts of intellectual effort are expended by some of the best computing-scientific minds on the planet trying to solve such issues as Byzantine disagreement on an abstract level.
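To make the Byzantine-disagreement point concrete, here is a toy version of the single-round oral-messages algorithm OM(1) of Lamport, Shostak, and Pease, in the case where the commander is the traitor and sends the three loyal lieutenants inconsistent orders. The message values are invented for illustration; the point is that each loyal lieutenant ends up voting over the same multiset of messages, so they agree.

```python
# Toy OM(1): four nodes (one traitorous commander, three loyal lieutenants).
from collections import Counter

def majority(values):
    # Sort first so every loyal node breaks ties the same deterministic way.
    return Counter(sorted(values)).most_common(1)[0][0]

def om1_traitorous_commander(sent):
    """sent[i] is the (possibly inconsistent) order the commander gives
    lieutenant i. Each loyal lieutenant relays what it heard to the others,
    then takes a majority over its own message plus the relays."""
    n = len(sent)
    decisions = []
    for i in range(n):
        heard = [sent[i]] + [sent[j] for j in range(n) if j != i]
        decisions.append(majority(heard))
    return decisions

print(om1_traitorous_commander(["attack", "retreat", "attack"]))
# → ['attack', 'attack', 'attack']: the loyal lieutenants agree
```

With only two lieutenants the majority vote over two conflicting messages cannot break the tie, which is the intuition behind the classical 3t+1 lower bound. Getting this right in a real control-system bus, under realistic fault assumptions, is exactly the kind of problem that consumes serious intellectual effort.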
And here we come to an issue I address again below: there is some feeling that all appropriate techniques should be usable by an average engineer in the field. My view is rather that if it takes people with PhDs in computer science to do this work of ensuring as-bug-free-as-possible, then, please, let them, indeed, encourage them. I don’t think your average bachelor’s-level engineer can check for or handle Byzantine agreement problems. Indeed, I don’t think your average bachelor’s-level engineer even knows what they are.
I am not very sympathetic to views which suggest that any part of this general problem is not important, but some prominent people in the field appear to hold such views.
Issue 3. “Formal methods are only ‘math’ and not ‘real world’.”
This issue comes up regularly, and has come up regularly for the last quarter-century I have been in the field. It rests on a “folk” view of the nature of math, which to my mind has been largely discredited by philosophers of mathematics and science. Nevertheless, it remains current amongst some of us in system engineering.
Consider the numerical mathematicians who have built codes to calculate the exact aerodynamic behavior of modern aircraft before they are even put into a wind tunnel.
Consider the logicians who analyse a requirements specification and point out that such a system cannot be built (because the requirements are inconsistent).
I don’t think you would find either group sympathetic to the suggestion that what they do doesn’t have real-world import. I don’t see why anyone else should be sympathetic to it either.
Issue 4. “There was plenty of rigorous reasoning around before formal logic was invented. So we don’t need formal logical languages to write and check requirements specifications.”
The first statement is true. The very same history, however, gives general reasons to doubt the second. Most system engineers and computer scientists do not have sufficient background in the history of reasoning to draw this conclusion effectively for themselves.
I think there is general agreement that Aristotle’s analysis of syllogistic reasoning was largely accurate. However, for the next couple of thousand years people had difficulty formulating the principles of reasoning with multiple quantifiers (a quantifier is a term such as “every”, “each”, “some”). The medievals had an elaborate theory of “distribution” of terms. It didn’t work. Frege solved the issue with the notion of (what we nowadays call) scope and a formal language which had a means of indicating scope. There is no doubt that, after Frege, humankind had the means of saying rigorously what it meant and being understood. Those means were formal languages based on first-order predicate logic, and later other formal languages besides.
Meanwhile, linguists are still researching the semantic structure of natural language quantifiers. John Barwise discovered thirty years ago that some sentences appear to have a semantics best indicated with parallel sequences of quantifiers. Isn’t it a good idea in safety-critical system specification to stick to language structures we know all about and can handle, for example Fregean scoping, rather than possibly to have to confront all these semantic subtleties unwittingly? How does one check a natural-language requirement for consistency if the linguistic jury is still out on quite what it could mean? Yet some safety-critical system development standards, for example those for European railway electronics, still require natural language requirements specifications.
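Fregean scoping makes the ambiguity concrete. The English sentence “Every student read some book” has two distinct first-order readings, distinguished purely by which quantifier lies in the scope of the other:

```latex
% Reading 1 (universal outside existential): each student read some book,
% possibly a different book for each student.
\forall x\,\bigl(\mathit{Student}(x) \rightarrow
  \exists y\,(\mathit{Book}(y) \wedge \mathit{Read}(x,y))\bigr)

% Reading 2 (existential outside universal): there is one particular book
% that every student read.
\exists y\,\bigl(\mathit{Book}(y) \wedge
  \forall x\,(\mathit{Student}(x) \rightarrow \mathit{Read}(x,y))\bigr)
```

A natural-language requirement that leaves such scope open cannot even be checked for consistency until someone decides which reading was meant.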
Another consideration. One of the main assurance issues in design is that of comparing two specifications and showing that the one implements the other. Indeed, this assurance issue arises at every stage of design all the way down to the box in your hand except for one. The work of three out of the four most significant logicians in the history of humankind (Frege, Gödel, Tarski) has shown unequivocally in the last one hundred and some years that the most effective methods for comparing two complex, rigorous statements (which is what specifications are) is through using the formal languages of logic.
A claim that, say, the language of first-order logic is not expressive enough for formulating all the specifications that might be necessary for any engineered system just seems to me to be ludicrous. You can say anything you want in first-order logic, as the great philosopher W. V. O. Quine pointed out regularly for fifty years until his death in 2000. You can’t necessarily say it in the slickest possible way – although you can sometimes, despite the plethora of formal languages we have around us nowadays – but you can do it, and then check it rigorously for consistency.
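As a down-to-earth illustration of “check it rigorously for consistency”, here is a brute-force consistency check for a propositional fragment of a specification. The three requirements are invented for illustration; a real specification would need first-order machinery and a prover rather than truth tables, but the principle is the same: a set of requirements is consistent exactly when it has a model.

```python
# Enumerate all truth assignments and look for a model of the requirements.
from itertools import product

def consistent(requirements, atoms):
    """True iff some truth assignment satisfies every requirement."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(req(env) for req in requirements):
            return True
    return False

atoms = ["alarm", "shutdown"]
reqs = [
    lambda e: (not e["alarm"]) or e["shutdown"],  # "If the alarm sounds, shut down."
    lambda e: e["alarm"],                         # "The alarm sounds."
    lambda e: not e["shutdown"],                  # "Do not shut down."
]
print(consistent(reqs, atoms))       # → False: no design can satisfy all three
print(consistent(reqs[:2], atoms))   # → True: drop one and a model exists
```

No amount of clever implementation can satisfy an inconsistent specification; finding this out mechanically, before design starts, is the payoff.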
Indeed, we tried this recently. Bernd Sieker derived a system for mimicking the required communications for train dispatching on German railways (these requirements are part of German administrative law) and, using formal refinement with hand proofs, was able to derive SPARK code for the mimicking system from a demonstrably complete set of high-level safety requirements, which were derived and written in first-order logic. It works, just as Quine said. Maybe not always, but a lot more often than people in system engineering seem to think.
Issue 5. “The current formal languages proposed for various system purposes are inadequate.”
Well, sure. Or equally not. Almost any tool, such as a formal language, has areas for which it works well and areas for which it doesn’t. I am not sure how much sense blanket approbation or condemnation makes. And, of course, we can do better. No matter how good or bad we are in any area of human endeavor, we can do better.
For example, SPARK seems to work very well for control of coding at a level corresponding to Ada source code. It is not targeted, for example, at requirements elicitation, and I don’t think it would work so well there. Indeed, its producer uses other methods for requirements elicitation and analysis.
Z works well for analysis of architecturally-complex systems in which scoping is complex and tricky. There are many of those. I don’t think it works so well for specification and analysis where operator interface plays a significant role.
TLA+ works well for analysis of algorithmically-complex systems in which scoping is tricky. There are many of those also. The same caveat applies to operator interfaces.
SCADE works well (at least, I presume it does; I have no experience with it, since we can’t run it on any of our machines) for systems which predominantly have state-machine-like functionality (i.e., things Lustre can talk easily about). It might do better on operator interfaces than TLA+ or Z, since many of those have state-machine character even if their designers have no clue what a state machine is.
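For readers unsure what “state-machine-like functionality” amounts to, here is a minimal sketch: a transition table plus an interpreter. The mode names and events are invented, not taken from any real system; in Lustre or SCADE this would be a synchronous node, but a plain table makes the same point, including the point that undefined (state, event) pairs are exactly what a formal analysis flags.

```python
# A toy mode-logic state machine as an explicit transition table.
TRANSITIONS = {  # (state, event) -> next state; invented for illustration
    ("OFF",    "engage"):    "ARMED",
    ("ARMED",  "capture"):   "ACTIVE",
    ("ARMED",  "disengage"): "OFF",
    ("ACTIVE", "disengage"): "OFF",
}

def run(events, state="OFF"):
    for ev in events:
        # Here an undefined pair silently keeps the current state; a formal
        # completeness check would instead report it as a specification gap.
        state = TRANSITIONS.get((state, ev), state)
    return state

print(run(["engage", "capture", "disengage"]))  # → OFF
```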
There are a lot of issues which these formal languages and tools try to solve simultaneously: having unambiguous meaning; handling complex scoping questions precisely and well; being expressive, especially for those purposes for which they were devised (which might conflict with expressiveness for other purposes); being automatically checkable for consistency; enabling formal manipulation for checking statements for certain properties (safety and liveness, for example); and doing all of this as easily as possible. When one thinks about all this, it is quite understandable why there is no one simple, best solution to all these varying needs. And, when one thinks about it more, maybe there never will be.
Issue 6. “We need a requirements specification language which is easily understandable by domain experts and systems engineers alike.”
It sure would be nice. And probably more efficient. But I am not sure the issue is language. More than “a language”, I think it is important that all kinds of engineers understand the techniques involved. I mean, French is the language of diplomacy, but that doesn’t mean everyone can write it like Molière or Yasmina Reza. I certainly can’t. But it would be good to have some idea of how they do it and to try to copy it, if I were communicating with folks in that language.
In the aftermath of the Überlingen mid-air collision, I looked hard at the algorithms behind TCAS, including the human procedures (algorithms for humans). I needed to introduce some simple state-machines-evolving-through-time techniques to point out to TCAS domain experts that the cognitive architecture of TCAS had problems. These cognitive-architecture problems are present with all versions of TCAS which emit Resolution Advisories, and they were not discovered by previous analysts, some of whom are quite distinguished practitioners. I did have the benefit of an actual accident to work from, in which the conditions were present for the problems to manifest themselves. The point here is that the domain experts needed to understand a formal means of expressing the problem – the techniques, not the language – before they could acknowledge that a problem was there.
Issue 7: “The techniques we use should be usable by the average engineer with a Bachelor’s or Master’s degree.”
I want the brakes on my car to work (at least, I would if I had one). I really don’t care if they were designed by 20 engineers with Bachelor’s degrees or one genius who had to be hired away from Harvard. And neither will the company if they don’t work, something bad happens, and it gets slapped with the bill. Ask a certain automobile components supplier to whom this happened a few years ago. And this company is stocked to the gills with people with doctorates in computer science, engineering and physics.
Boeing and Airbus got where they are by identifying and using exceptional talent. Airbus develops its SW using tools that have come from an institution, one of whose scientists has won the Turing Award in the area this tool exploits.
It would be great to leverage the work of the average engineer by exploiting tools he or she can use with only a Bachelor’s degree. But it’s not a necessity. If my brake company cannot hire away the Harvard genius, I am quite content for them to stick with hydraulics as long as the brakes work.
Issue 8: “A sample of students makes more mistakes in specification when trying to do so in a formal language with rigorous semantics than the students make when specifying more “naturally”.”
Maybe, but so what? One could imagine that James Watt and Richard Trevithick would probably have made more mistakes if they’d had to use Sadi Carnot’s thermodynamics to design their steam engines, because there is no evidence that either of them was much good at mathematics.
As for me, I usually discover a lot of issues I had overlooked when I actually do sit down to formulate ideas formally. Martyn Thomas, a founder of the system and tool builders Praxis HIS, says this matches his experience, as does the continuing experience of Praxis.
It takes many people a dickens of a time to formulate their bicycle braking distances in calculus, and most of those I have suggested do it (all of them computer science students) turned out to do it better, and more accurately, when I asked them to assess braking distances qualitatively. But I’m not like that. I know how to check my calculus for mistakes, and I use it to assess the braking performance of my bicycles and the phenomena involved, rather than the other way round. If there is any issue here, it is that of training the skill in others who apparently don’t have it.
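For the record, the calculus involved is short. Assuming friction-limited braking at constant deceleration a = μ·g, integrating v dv = −a dx from speed v down to rest gives d = v²/(2μg). The friction coefficient μ = 0.6 below is an assumed illustrative value, not a measurement:

```python
# Braking distance under constant friction-limited deceleration a = mu * g.
def braking_distance(v_mps, mu=0.6, g=9.81):
    """Distance (m) to stop from speed v (m/s): d = v^2 / (2 * mu * g)."""
    return v_mps ** 2 / (2 * mu * g)

print(round(braking_distance(8.0), 1))   # 8 m/s is roughly 29 km/h
```

The qualitative check falls straight out of the formula: doubling your speed quadruples your braking distance, which is exactly the kind of sanity check I would want the qualitative assessors and the calculus to agree on.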
Similarly, I see the main issue as that of training the skill set of people who write specifications, not that of adapting the specification languages to their existing, rather fragile, skill set.