Reliable Systems


The Programming Reliable Systems research projects seek to develop approaches that improve the reliability of complex software systems. The focus is on both the software development process and the resulting products.

The first goal is to identify and characterize reliability problems within applications through analysis of the source program, dynamic testing, and user studies. How did the software fail to work properly? Did it crash or reboot the machine, abort the application, or perhaps fail silently and produce incorrect results? The second goal is to identify common programming problems and then find approaches for mitigating them without imposing added complexity or severe restrictions on the programmer. Common causes of software reliability problems include non-orthogonal error reporting, a lack of knowledge management during the software development process, and over-abstraction of the computer system.

FlakyIO is an extreme I/O testing project that examines the ability of applications to handle exceptions. We use callee-generated software exceptions to determine an application's ability to handle error conditions. The FlakyIO architecture can be applied to any subsystem or module that generates an exception for the caller to handle. We chose to explore I/O because I/O has a well-known standard for error generation, which lets us test a large number of applications without having to customize the exception generation.
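The idea of callee-generated exceptions can be illustrated with a small fault-injection wrapper. This is only a minimal sketch under stated assumptions, not the FlakyIO implementation: the names `FlakyWriter`, `save_report`, and `find_unhandled_failures` are hypothetical, and the choice of `ENOSPC` as the injected error is arbitrary.

```python
import errno
import io


class FlakyWriter:
    """Wraps a file-like object and raises OSError on selected write calls.

    Hypothetical illustration only; not the FlakyIO project's code.
    """

    def __init__(self, inner, fail_on, err=errno.ENOSPC):
        self.inner = inner
        self.fail_on = set(fail_on)  # 1-based indices of write calls that fail
        self.err = err
        self.calls = 0

    def write(self, data):
        self.calls += 1
        if self.calls in self.fail_on:
            # The callee (the I/O layer) generates the error, not the application.
            raise OSError(self.err, "injected I/O failure")
        return self.inner.write(data)


def save_report(out, lines):
    """Example application code: returns True on success, False on an I/O error."""
    try:
        for line in lines:
            out.write(line + "\n")
    except OSError:
        return False
    return True


def find_unhandled_failures(app, n_calls):
    """Fail each write call in turn and return the call indices at which
    the application let the exception escape."""
    unhandled = []
    for i in range(1, n_calls + 1):
        try:
            app(FlakyWriter(io.StringIO(), fail_on=[i]))
        except OSError:
            unhandled.append(i)
    return unhandled
```

Because the wrapper raises the same error type that the real I/O layer would, the application under test needs no changes. The same harness can be pointed at any function that accepts a writable object.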

Program Data Characteristics is an approach that guides a programmer by explaining how data is currently being used in the application. For example, a variable may never have been declared const, but does it behave like one?
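The const question can be approximated with a very small static check. The sketch below is an assumption-laden stand-in, not the project's tool: it analyzes Python source with the standard `ast` module rather than C, treats a name as const-like when it is bound exactly once and never inside a loop, and ignores scoping entirely. The function name `const_like_names` and the `SAMPLE` program are hypothetical.

```python
import ast
from collections import Counter


def _stored_names(node):
    """Yield every name that is bound (assigned to) somewhere under `node`."""
    for inner in ast.walk(node):
        if isinstance(inner, ast.Name) and isinstance(inner.ctx, ast.Store):
            yield inner.id


def const_like_names(source):
    """Return the sorted names bound exactly once and never inside a loop.

    Simplification: scoping is ignored, so same-named variables in
    different functions are counted together.
    """
    tree = ast.parse(source)
    counts = Counter(_stored_names(tree))
    in_loop = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.For, ast.While)):
            # Loop targets and anything assigned in a loop may change per iteration.
            in_loop.update(_stored_names(node))
    return sorted(n for n, c in counts.items() if c == 1 and n not in in_loop)


SAMPLE = """
limit = 10
total = 0
for i in range(limit):
    total += i
label = "sum"
"""
```

On `SAMPLE`, the check reports `label` and `limit` as const-like, while `total` and `i` are excluded because they are rebound inside the loop.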

Thread Analysis reports program characteristics that help eliminate threading problems. Our approach is to identify the thread-safe code: rather than forcing a developer to inspect the entire application code base, we eliminate from consideration what we know is thread-safe.

Null Pointer Analysis examines the pointers that are passed into subroutines. Which subroutines fail to check that a pointer is not NULL before dereferencing it, causing a crash? If the subroutine does not protect itself, then the caller should. We simply let you know where the vulnerabilities are.
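A toy version of this check can be written with Python's standard `ast` module, with `None` standing in for NULL. This is a hedged sketch, not the project's analysis: it is flow-insensitive (a check anywhere in the function counts, even after the dereference), it only recognizes `is None` and `is not None` comparisons, and the names `unguarded_params` and `SAMPLE` are hypothetical.

```python
import ast


def _is_none_test(node):
    """True for comparisons of the form `name is None` or `name is not None`."""
    return (
        isinstance(node, ast.Compare)
        and isinstance(node.left, ast.Name)
        and any(isinstance(op, (ast.Is, ast.IsNot)) for op in node.ops)
        and any(isinstance(c, ast.Constant) and c.value is None
                for c in node.comparators)
    )


def unguarded_params(source):
    """Map each function name to the sorted parameters it dereferences
    (attribute or subscript access) without ever comparing them to None."""
    report = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        params = {a.arg for a in func.args.args}
        derefs, checked = set(), set()
        for node in ast.walk(func):
            # p.attr or p[index] counts as a dereference of p.
            if isinstance(node, (ast.Attribute, ast.Subscript)):
                if isinstance(node.value, ast.Name):
                    derefs.add(node.value.id)
            elif _is_none_test(node):
                checked.add(node.left.id)
        risky = sorted((params & derefs) - checked)
        if risky:
            report[func.name] = risky
    return report


SAMPLE = """
def safe(buf):
    if buf is None:
        return 0
    return buf.size

def unsafe(buf, idx):
    return buf[idx]
"""
```

For `SAMPLE`, only `unsafe` is reported, with `buf` as the unguarded parameter. Callers of such a function are the places where a check would be needed instead.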

SCUM is a static program analysis mechanism that provides a robustness metric for an application. Identifying general robustness problems gives the programmer feedback that directs the manual insertion of error checks at the most appropriate locations in the application code.