Senior Engineer, NASA Jet Propulsion Laboratory
The Bugs that went to Mars and Terrorized Earth
Since its dramatic landing in Gale crater in August 2012, the Curiosity Rover has been busy exploring the surface of Mars, looking for evidence of past habitable environments. Having just completed its fourth year on Mars, and with 15 kms on its odometer, Curiosity has already made historic discoveries, finding evidence of an ancient freshwater streambed, organic molecules and other key ingredients necessary for life. Yet, in spite of its great successes, the mission has not been without a few hiccups. In this talk, we discuss the most significant of these: the Sol-200 anomaly, when the failure of a single memory chip uncovered three latent software bugs that nearly killed the mission. We describe how the anomaly manifested itself, how recovery was achieved, and lessons learnt from the experience. The work described in this talk was carried out at Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.
Rajeev Joshi is a Senior Software and Systems Engineer at the Lab for Reliable Software at NASA's Jet Propulsion Laboratory, where he works on building and applying tools based on formal methods to improve mission software reliability. He was a member of the Curiosity rover flight software development team, as engineer responsible for onboard data management and communication behaviors. He was also a member of the surface operations team, serving as data management chair and supporting anomaly investigations. For his work on Curiosity, he received two JPL Mariner Awards and the NASA Exceptional Achievement Medal. He holds a B.Tech in Computer Science from the Indian Institute of Technology, Bombay, and an MS/PhD (also in Computer Sciences) from the University of Texas at Austin. His previous employment includes 4 years at the DEC/Compaq/HP Systems Research Center (SRC) in Palo Alto, CA, and 2 years at AT&T Bell Labs in Murray Hill, NJ. He is a member (and current secretary) of IFIP Working Group 2.3 on Programming Methodology.
Associate Professor, Carnegie Mellon University
Embedded System Software Quality: Why is it so often terrible? What can we do about it?
Failures of embedded system software increasingly make the news. Everyday products we rely upon are suffering from safety issues, security issues, and just plain bugs. While perfection is unrealistic, surely we can improve this situation. Two key ideas apply: (1) embedded products often aren’t created by computer specialists, and (2) teaching application domain specialists just how to code is more of a problem than a solution.
Dr. Philip Koopman is an Associate Professor of Electrical and Computer Engineering at Carnegie Mellon University, where he has worked in the areas of wearable computers, software robustness, embedded networking, dependable embedded computer systems, and autonomous vehicle safety. Previously, he was a submarine officer in the U.S. Navy, an embedded CPU architect for Harris Semiconductor, and an embedded system researcher at United Technologies. Dr. Koopman has served as a testifying expert witness for automotive unintended acceleration cases on the topic of software safety. In addition to authoring numerous scholarly publications and the book Better Embedded System Software, he is a named inventor on 26 U.S. patents. He has conducted nearly 200 embedded system design reviews for industry projects. He is a senior member of IEEE and the ACM, and a member of IFIP WG 10.4 on Dependable Computing and Fault Tolerance.
R&D Director, ClearSy
Industrial use of Formal Methods for Safe and Secure Systems
Industrial applications involving formal methods are still exceptions to the general rule. Lack of understanding, employees without proper education, difficulty to integrate existing development cycles, no explicit requirement from the market, etc. are explanations often heard for not being more formal. The rare metrics available comparing “traditional” and formal methods are usually specific to a given context and are difficult to transpose to any other situation. Understanding the technical / organisational / economical / regulation reasons leading to their adoption would be more informative and decisive than just comparing source code production ratios.
Summarizing a 20-year return of experience in the effective application of a formal method – namely B and Event-B – in diverse application domains (railways, smartcard, automotive), this talk presents a number of distinct application cases where formalities have been applied, making clear why and where formal methods have been applied and explaining the added value obtained so far. Contribution to design, verification & validation, and third-party assessment are considered with a focus on the aspects that are directly linked to acceptance, especially for highest safety (SIL3/SIL4 EN50128 and IEC61508) and security (EAL6+ Common Criteria 3.1) levels. To obtain such results, a very strong support of public research was required through regional, national and European funded collaborative projects, as well as industrial support from key industrial players.
Thierry Lecomte is R&D Director at ClearSy and is involved in the development of Atelier B since 1995, with a focus on proof and code generation (C, LLVM, VHDL), and in more than 20 collaborative R&D projects (industry, National, European, and Japanese fundings). He has special interests in applying formal methods to safety critical systems (railways, energy, space) and to secure systems (smartcards).