Gleb Naumovich's Research


Finite State Verification of Concurrent Software

Finite state verification (FSV) is a family of approaches to software quality assurance that attempt to verify correctness of some aspects of the software system under analysis. FSV techniques provide an important alternative to testing. Because exhaustive testing of realistic software systems is impractical, testing approaches does not provide guarantees about the correctness of the program. In contrast, FSV techniques prove, where possible, that all possible behaviors of the software system under analysis conform to a set of requirements, or properties. FSV also provides an alternative to theorem proving techniques in that FSV techniques are generally more automated and easier to use than theorem proving techniques.

A wide variety of FSV techniques have been proposed. I have worked on a FSV technique called FLAVERS (FLow Analysis for VERification of Systems). This technique was originally proposed by Matt Dwyer and Lori Clarke and is based on performing data flow analysis on a representation of a concurrent program, where the data being propagated through this representation are the states of the property being analyzed. FLAVERS has several interesting characteristics that make it stand out among other FSV techniques. First, FLAVERS is capable of improving the precision of verification by incorporating additional information in the analysis. Second, FLAVERS targets analysis of real programming language structures. Finally, FLAVERS uses a simple and easy to use property specification language.

My contributions to the FLAVERS project include numerous optimizations of the technique (many of these are described in the next section of this research overview). I have proposed and implemented a version of FLAVERS that can handle concurrency primitives of Java (originally, FLAVERS was developed for the Ada programming language). Finally, I have carried out several case studies investigating the applicability of FLAVERS to systems from several application domains.

Relevant publications


Optimizing Finite State Verifiers

Finite State Verification (FSV) techniques are important for software quality assurance. Unfortunately, in the worst case, the problem of verifying that a given property holds for a finite representation of a concurrent system is NP-hard in the size of this finite representation. This limitation is known in FSV literature as the state space explosion problem. To make FSV practical, FSV techniques attempt to improve their efficiency by using heuristics to reduce the amount of checking that needs to be done.

I have developed several optimization techniques for the data flow analysis based FSV technique FLAVERS. One of these techniques is based on the general algorithm for determining regions that can be executed in parallel in a concurrent system, described in the next section of this research overview. Another technique uses the partial order-like refinement of the flow graph representation of the system in FLAVERS.

My interest in optimization techniques is not confined to FLAVERS. Recently, I received an NSF CAREER award for investigating applicability of various optimization approaches to different FSV techniques, with the goal of consolidating the knowledge accumulated from doing optimizations on several different FSV techniques. Currently, I am looking for a Ph.D. student who would like to work on this project.

Relevant publications


May Happen in Parallel Analysis

Knowledge about which statements in a concurrent program can possibly execute in parallel is important for program understanding, error detection, and optimization of concurrent programs. For example, such analysis can help detect data races, situations where a memory location is simultaneously modified by two different threads of control. In collaboration with George Avrunin and Lori Clarke, I have developed a novel polynomial-time data flow algorithm for computing a conservative estimate of pairs of statements that may happen in parallel (MHP). We have proposed versions of the MHP analysis for the Ada and Java concurrency models. Our experiments with the MHP algorithm for Ada showed that it is significantly more precise than the most precise of the previously proposed polynomial-time approaches.

At present, our MHP analysis has an important limitation of being intraprocedural, which means that it requires inlinining of the functions called by the threads of control in a program. Also, at present, the implementation of the algorithm assumes that the aliasing information for the program is known. I am working on addressing these limitations. Once an interprocedural, alias-aware MHP algorithm is defined, experiments with an implementation of the algorithm will be performed to determine its performance and precision on concurrent Java programs of realistic size.

Relevant publications


Static Analysis of Application Security

Open distributed systems are becoming increasingly popular. Such systems include components that may be obtained from a number of different sources. For example, Java allows run-time loading of software components residing on remote machines. One unfortunate side-effect of this openness is the possibility that ``hostile'' software components may compromise the security of both the program and the system on which it runs. Java offers a built-in security mechanism, using which programmers can give permissions to distributed components and check these permissions at run-time. This security model is flexible, but using it is not straightforward, which may lead to insufficiently tight permission checking and therefore breaches of security.

I am interested in development of automated analyses of security model usage. I have developed a static data flow algorithm for analysis of the flow of permissions in Java programs. This algorithm produces, for a given instruction in the program, a set of permissions that are checked on all possible executions up to this instruction. This information can be used in program understanding tools or directly for checking properties that assert what permissions must always be checked before access to certain functionality is allowed.

Relevant publications


Replay of FSV Counterexamples

When a FSV technique finds a violation of the property of interest, this violation is reported as a counterexample. Generally, a user of a FSV technique expects that counterexamples correspond to bugs in the system being verified. Unfortunately, this is not always true. Since FSV techniques operate on models of software systems instead of systems themselves, often violations reported by FSV techniques do not correspond to bugs in the actual systems. The reason for this is that, for efficiency reasons, models only approximate the actual behaviors of software systems. The FSV technique FLAVERS advocates adding the analysis precision incrementally, by iteratively constructing and checking more and more precise models, until either the property of interest is proved or the counterexample corresponds to a real bug in the system.

At present, the decision about whether a counterexample corresponds to a bug or is just a result of imprecise modeling is left to the human analyst. Making this decision is far from straightforward and, as a consequence, is laborious and error-prone. In collaboration with Phyllis Frankl, I have proposed a method for combining FSV with dynamic analyses (testing), so that an automated testing process attempts to find an actual execution of the program that corresponds to a counterexample found by a FSV technique. I have implemented a prototype of the test driver that schedules threads in a concurrent program according to the counterexample. We are looking for graduate students interested in working on this project.

Relevant publications


Software Obfuscation

With commercial software increasingly distributed in forms from which source code can be extracted easily, protection of intellectual property on software is of great importance. Obfuscations are program transformations that preserve the program functionality while obscuring the code, thereby protecting the program against reverse engineering. Unfortunately, the existing obfuscation techniques are limited to obscuring variable names, transformations of local control flow, and obscuring expressions using variables of primitive types.

In collaboration with Nasir Memon, Heather Yu, and Mikhail Sosonkin, I developed three techniques for obfuscation of program design. The class coalescing obfuscation replaces several classes with a single class. The class splitting obfuscation replaces a single class with multiple classes, each responsible for a part of the functionality of the original class. The type hiding obfuscation uses the mechanism of interfaces in Java to obscure the types of objects manipulated by the program. We show the results of our initial experiments with a prototype implementation of these techniques. In particular, we show that the runtime overheads of these obfuscations tend to be small.


Software Watermarking

Software watermarking has been recently proposed for proving authorship of software. In collaboration with Nasir Memon and Heather Yu, I am working on a dynamic watermarking technique for software programs. In this approach, dynamic data structures of a program are used to embed a message that identifies the author (owner) of this program.

A problem related to watermarking is obfuscation of software programs, making programs difficult to reverse-engineer. In the context of watermarking, obfuscation makes it difficult for software pirates to detect and remove the watermark.