In October 2013, an Oklahoma civil court found that it was more likely than not that faulty control SW had caused a car crash in which the car accelerated, contrary to the apparent intention of the driver, resulting in death and severe injury to the occupants. Readers can find discussion from that time in the archive of the System Safety Mailing List at http://www.systemsafetylist.org , starting with my post of October 31 at http://www.systemsafetylist.org/0668.htm .
The jury reached their conclusion after considering evidence from Michael Barr, an embedded-software expert, and Professor Philip Koopman, a digital-embedded-systems expert at Carnegie Mellon University (PBL Disclosure: I know and admire Phil and his work). Barr had found a fault in the control SW that could have caused the phenomena recounted by the accident investigators. Koopman has extensive material on the case at https://betterembsw.blogspot.de/2014/09/a-case-study-of-toyota-unintended.html
Recently, Rod Chapman drew our attention to a blog post by embedded-software developer David Cummings at https://www.embedded.com/electronics-blogs/say-what-/4459136/Why-every-embedded-software-developer-should-care-about-the-Toyota-verdict . Cummings is highly qualified, with over three decades of experience, some of it at JPL, and higher degrees including a PhD from UCLA. And he criticises the testimony of Koopman and Barr.
I think Cummings’s arguments are poor.
Cummings starts by addressing the alleged quality of the Toyota code, which the Oklahoma plaintiffs asserted was low. He says the plaintiffs have a “flawed causation theory”, which he had addressed in an article for an IEEE magazine.
(This is an aside from the arguments Cummings deploys here, but let me address it anyway. I didn’t read any “causation theory” in the Oklahoma testimony when it came out. I read testimony saying that there was a flaw in the code which could have caused the accident, and that neither the manufacturer itself nor a NASA investigation had discovered that flaw, which showed that the manufacturer did not know exactly what its code did. Accompanying that point was lots of testimony that, no matter how much effort had been spent, it was likely that this was just one of many undiscovered faults in the SW. And the jury judged it more probable than not that some then-undiscovered SW fault caused the crash.)
Cummings’s first point about quality in his blog post is the company’s use of about 10,000 global variables, which Koopman criticised as exhibiting poor quality. Cummings looked at some code on Phil’s site, for a project called Ballista, and noticed use of global variables, as well as some commentary from the authors explaining why they used global variables. Cummings suggests that the ratio of global variables to LOC in Ballista is broadly the same as in the Toyota code.
His second point concerns violations of MISRA C coding guidelines. He finds two pieces of code on what I presume is Michael Barr’s site, one computing CRCs, and another performing memory testing, amounting to 572 LOC in total. He finds MISRA C violations in that code at a higher violation-to-LOC ratio than in the Toyota code.
His third point is the use in the court testimony of some research on fault density by Roman Obermaisser, a professor of embedded systems at the University of Siegen ( https://networked-embedded.de/es/index.php/staff-details/obermaisser.html ) and a former student of Hermann Kopetz in Vienna. Cummings says that Obermaisser characterised 2% of faults as “arbitrary” in his investigations, but that Phil changed this to “dangerous” in his Oklahoma testimony and used this to estimate the temporal occurrence of dangerous failures (let us take here a “failure” as an untoward event caused by a fault).
There are two observations to be made right off.
First, Cummings’s first and second points are straightforwardly ad hominem. They have nothing to do with the quality (high, low, whatever) of Toyota’s code. He is suggesting that people providing small, free pieces of code on their WWW sites should hold themselves to the same quality metrics to which Toyota’s code is being held. Well, maybe so, maybe not, but that is irrelevant to determining the quality of Toyota’s code, which is what the testimony concerned.
Second, when discussing the use of Obermaisser’s results, Cummings makes no distinction between fault and failure, despite claiming expertise. Arbitrary faults could well cause dangerous failures. They could also cause non-dangerous failures. The point is that no one knows which. The Oklahoma testimony apparently said that, according to an application of Obermaisser’s results, one can expect an arbitrary failure to occur in the car fleet in question every week or two. Cummings objects to the linking of “arbitrary” with “dangerous”. In fact, it is standard practice in safety-critical systems (at least in Europe) to treat as dangerous any failure which you don’t know to be safe (whole engineering practices have sprung up around this, such as the calculation of “safe failure fraction” in IEC 61508-conformant development). The testimony Cummings quotes seems to conform to this practice.
That is surely a very weak collection of technical arguments.
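To make concrete the convention mentioned above, under which any failure not known to be safe is counted as dangerous, here is a minimal sketch of a safe-failure-fraction calculation. It assumes the commonly cited form of the quantity: safe plus dangerous-detected failure rates, divided by the total failure rate. The function name and all the rates are hypothetical, chosen only for illustration; none of them comes from the testimony, from Obermaisser’s work, or from the text of IEC 61508.

```python
def safe_failure_fraction(safe, dangerous_detected, dangerous_undetected,
                          unclassified=0.0):
    """Return the safe failure fraction for the given failure rates.

    Assumed formula (commonly cited form, not quoted from the standard):
        SFF = (safe + dangerous_detected) / total
    Failures of unknown effect ("unclassified") are conservatively
    added to the dangerous-undetected rate.
    """
    rates = (safe, dangerous_detected, dangerous_undetected, unclassified)
    if any(r < 0 for r in rates):
        raise ValueError("failure rates must be non-negative")

    # Conservative step: what is not known to be safe counts as dangerous.
    dangerous_undetected_total = dangerous_undetected + unclassified
    total = safe + dangerous_detected + dangerous_undetected_total
    if total == 0:
        raise ValueError("total failure rate must be positive")
    return (safe + dangerous_detected) / total


# Hypothetical rates (arbitrary units), for illustration only.
conservative = safe_failure_fraction(60.0, 30.0, 8.0, unclassified=2.0)

# Optimistic alternative: pretend the unclassified failures are safe.
optimistic = safe_failure_fraction(62.0, 30.0, 8.0)

print(f"conservative SFF: {conservative:.2f}")
print(f"optimistic SFF:   {optimistic:.2f}")
```

With these made-up numbers the conservative figure is lower than the optimistic one, which is the whole point of the convention: uncertainty about a failure’s effect counts against the system, not in its favour.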
Cummings must know that the quality control an author applies to a few hundred lines of code made freely available on a WWW site is a very different kind of process from the kinds of quality control you need to apply to million-LOC software in a safety-critical application.
Take the few hundred lines of CRC code which Cummings cites as an example. There is a reasonable chance that the author knows (and I mean “knows”) that it works. There is no chance of that with a million-LOC program such as the Toyota SW under consideration in the testimony. The cognitive task is way beyond humans.
Besides, if you were to want to use that CRC code in a safety-critical application, there are reams of inspections and investigations to perform before you do so – IEC 61508 lists close to sixty documentation requirements which are target-system-specific, and I bet ISO 26262 isn’t that far behind (I haven’t explicitly checked ISO 26262).
Further, I don’t think there is any suggestion that any of this free code is specifically intended for road-vehicle applications, whereas the MISRA C coding guidelines were developed specifically for road-vehicle software, so it is appropriate to assess road-vehicle software for compliance with them, as in the Oklahoma evidence.
It does seem to me to be faintly ridiculous to take some academic software (such as the Ballista SW) to compare with a key program component in production-car control SW.
Now time for me to go ad hominem. Why would an experienced and academically highly-qualified SW developer such as Cummings make such poor arguments? One reason may be that he was looking for arguments contra Koopman/Barr and couldn’t find any better ones. But why would he be looking for such arguments in the first place? Phil points out in his rebuttal https://www.embedded.com/electronics-blogs/say-what-/4459140/A-rebuttal-to—-Why-every-embedded-software-developer-should-care-about-the-Toyota-verdict— that Cummings’s company is under contract to a large automobile company itself involved in unintended-acceleration litigation. That might be something for readers to consider. Cummings should have mentioned it.