With funding from the Office of Naval Research, we are working on a project to determine whether a system's security can be enhanced by dynamically reconfiguring it at the moment an intrusion attempt is detected. The idea itself is quite simple: since an intruder must commonly know something about a target system in order to access its software in some unanticipated way, it seems likely that the intruder could be foiled if we rapidly changed the system's underlying components to ones that are implementationally different but functionally equivalent. Others have applied this approach with success at a small scale by means of code randomization, so the essence of our work is to consider what added value accrues from reconfiguration at the systems level.

Evaluating this idea involves several key activities:

  1. Leveraging our previous experience in developing a dynamic reconfiguration mechanism, we are building the experimental apparatus needed to evaluate reconfiguration triggered by distributed detection of potential intrusions.
  2. We are developing the analytic tools needed to predict and recognize potential intrusion attempts, and to suggest likely alternate configurations which would disrupt them.
  3. We are deriving the statistical tools needed to evaluate whether the approach does indeed yield an improvement in security, and if so, by how much in likely scenarios. This thrust in particular requires the development of a substantial corpus of exploits to serve as a basis for objective evaluation of the approach.
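As a concrete illustration of the statistical evaluation in item 3, the sketch below compares exploit success rates with and without reconfiguration using a two-proportion z-test. The counts are made up for illustration; they are not measured results.

```python
# Hypothetical sketch: a two-proportion z-test comparing exploit success
# rates against a static system vs. a dynamically reconfiguring one.
from math import sqrt, erfc

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: the success rates are equal."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)      # pooled proportion
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))                    # two-sided tail
    return z, p_value

# Illustrative counts: 40/50 exploits succeed against the static system,
# 12/50 against the reconfiguring one (made-up numbers).
z, p = two_proportion_z(40, 50, 12, 50)
print(f"z = {z:.2f}, p = {p:.4f}")
```

In a real evaluation the corpus of exploits would supply the trial counts, and a small p-value would indicate that the observed difference in success rates is unlikely to be due to chance.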


BugBox is a security vulnerability dataset designed to support the evaluation of techniques for responding to, mitigating, or preventing exploits. Many methods for improving the security of software function in either a static manner – locating or reducing the risk of vulnerabilities at development time – or a dynamic manner – mitigating or preventing exploits at runtime. Evaluating static techniques requires a program's source code or binaries, while evaluating dynamic techniques can only be done in the context of a running system.

By coupling an automated exploit mechanism, a reproducible runtime environment, and a corpus of open-source applications with known vulnerabilities, BugBox allows for both varieties of security enhancements to be tested and evaluated. A virtual machine test harness automatically stages vulnerable applications and invokes scripts which exploit these known vulnerabilities. Planned enhancements for BugBox include the addition of static information on vulnerabilities, allowing for static and dynamic techniques to be directly compared. Please contact us for access to the updated images.
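To illustrate the exploit-automation pattern, here is a minimal, hypothetical sketch of how a harness might stage exploit scripts against a vulnerable application. The class, method names, and URL are illustrative assumptions, not BugBox's actual API.

```python
# Hypothetical sketch of an exploit-script pattern a BugBox-style harness
# might use; all names here are illustrative assumptions.
class ExploitScript:
    """One exploit targeting a known vulnerability in a staged app."""

    def setup(self, target_url):
        # Point the script at the staged vulnerable application.
        self.target = target_url

    def run(self):
        # A real script would send a crafted request to self.target and
        # record whether the vulnerable behavior was triggered.
        return {"target": self.target, "exploited": False}

def run_all(scripts, target_url):
    """Stage each exploit against the target and collect results."""
    results = []
    for script in scripts:
        script.setup(target_url)
        results.append(script.run())
    return results

results = run_all([ExploitScript()], "http://localhost:8080/app")
print(results)
```

The value of such a harness is that every exploit runs the same way, so a mitigation technique can be evaluated simply by re-running the full suite and counting which exploits still succeed.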


This dataset contains security vulnerability data and computed machine learning features for multiple versions of three PHP web applications: phpMyAdmin, Moodle, and Drupal. The data was collected for a vulnerability prediction study; however, it can also be used for other empirical security vulnerability research not related to prediction.

All vulnerabilities in this dataset were verified and localized to individual files by hand. In cases where multiple releases of an application were studied, the origin of each vulnerability and the path of its migration through the code over time are also recorded.
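As a sketch of how such per-file vulnerability records might be consumed, the snippet below reads rows from CSV text. The column names and sample rows are assumptions for illustration, not the dataset's actual schema.

```python
# Minimal sketch of consuming a per-file vulnerability dataset.
# The CSV columns (app, version, file, vulnerable) are illustrative
# assumptions, not the dataset's actual schema.
import csv
import io

sample = """app,version,file,vulnerable
phpmyadmin,3.4.0,libraries/common.inc.php,1
phpmyadmin,3.4.0,index.php,0
"""

def vulnerable_files(csv_text):
    """Return the files flagged as vulnerable in the given CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["file"] for row in reader if row["vulnerable"] == "1"]

print(vulnerable_files(sample))  # → ['libraries/common.inc.php']
```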

Download the dataset


Numerous methods for assessing the security of software have been proposed, such as measurement of attack surface, statistical prediction, and the analysis of potential attack paths. However, these assessment methods are not always formulated in ways that are actionable (and verifiably effective) for developers or end users. Some models and methods can be trained and evaluated with easy-to-obtain empirical data but can only be acted upon in limited or infeasible ways. Other security assessment methods appear to be highly actionable but have not undergone the extensive empirical evaluation that would demonstrate their value in practice.

Our goal is to generate assessments (or measurements) of software security that can guide developers and end users toward feasible mitigations (such as reconfiguration) whose effectiveness can be demonstrated against known, past security issues. Current and planned activities toward this goal include:

  • Security vulnerability data collection: Datasets of security vulnerabilities with extensive metadata (such as a vulnerability's manifestation at runtime and the circumstances and root cause of its introduction) can be used to evaluate software security assessment methods. However, such datasets are uncommon due to the time and expertise required to collect them. We are developing a public security vulnerability dataset and investigating techniques for automatically deriving additional metadata.
  • Deriving guidance from predictive models: It has been demonstrated that developers can use predictive models utilizing a variety of features to locate source code which is more likely to contain vulnerabilities. We are seeking alternative ways to leverage these models and features to motivate activities which end users are capable of performing, such as reconfiguration.
  • Supporting reproducible research: To facilitate better comparison of software security models and methodologies, we are developing shared tools which complement shared datasets by making it easier to replicate the methodology used in software security experiments.
  • Improving prediction methodologies: As a byproduct of our research, we are also developing methodological improvements related to predictive models, such as improving cross-product prediction performance, identifying software attributes likely to improve prediction, and investigating the reasons for observed relationships between software attributes and the presence of vulnerabilities.
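The idea behind feature-based prediction in the second bullet can be sketched as a toy ranking: score each file on simple code metrics and surface the highest-scoring files as candidates for review or mitigation. The features, weights, and file names below are illustrative assumptions, not our study's actual model.

```python
# Toy sketch of feature-based vulnerability prediction: rank files by a
# weighted score over simple code metrics. Features, weights, and file
# names are illustrative assumptions, not an actual trained model.
def score(features, weights):
    """Weighted sum of a file's metric values."""
    return sum(weights[k] * v for k, v in features.items())

files = {
    "login.php":  {"loc": 800, "inputs": 12, "past_fixes": 3},
    "theme.php":  {"loc": 300, "inputs": 1,  "past_fixes": 0},
    "upload.php": {"loc": 500, "inputs": 9,  "past_fixes": 2},
}
weights = {"loc": 0.001, "inputs": 0.1, "past_fixes": 0.5}

# Highest-scoring files become candidates for review, or for an
# end-user mitigation such as reconfiguring to disable a risky module.
ranked = sorted(files, key=lambda f: score(files[f], weights), reverse=True)
print(ranked)
```

A real model would learn the weights from labeled vulnerability data rather than fixing them by hand; the point of the sketch is only that a ranked output maps naturally onto actions an end user can take.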

Project Lead: Jeff Stuckman


Facial recognition has been a rapidly advancing field of research over the past several years. An important factor in all existing facial recognition algorithms is a strong database: in order to recognize a person, a machine needs reliable data about that person's facial features. This problem becomes particularly apparent when the desired subject is a stranger, a pedestrian passing by in day-to-day life. Project Dataface strives to meet this need by using lightweight, camera-equipped wearable devices, specifically Google Glass, to collect large amounts of data in a nonintrusive manner. The project is driven by the idea of democratized surveillance: that any citizen can freely surveil the public domain. Recently, the project has investigated the plausibility of Google Glass as a platform for facial data collection and prototyped an application for collecting such data.

Project Lead: Jeremy Krach


Attackers often target individual systems to gain access to an entire network by exploiting vulnerable service-level applications. Since a server must deliver content to clients indiscriminately, it is difficult to configure these systems to serve legitimate content to a legitimate client and false data to a malicious client without modifying the application server. In response, we have developed a container-based reconfiguration model that mitigates security threats posed by malicious clients while maintaining service to all clients, hostile and benign. Its central component is live session migration, which occurs seamlessly during real-time client-server sessions. Our system provides a layer between individual server applications and their clients, mitigating complex attacks by internally reconfiguring isolated containers that host modified instances of the server application. The system in development is flexible with respect to the target application and provides API hooks that allow cyber systems engineers to implement dynamically reconfigurable applications without modifying application server code.

Project Lead: Greg Bekher
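A conceptual sketch of the reconfiguration loop follows, assuming a toy controller interface. All names are illustrative; the real system exposes API hooks for engineers rather than this simplified interface.

```python
# Conceptual sketch of container-based reconfiguration: on a suspected
# intrusion, migrate the client's session to a functionally equivalent
# but implementationally different container variant.
# All names are illustrative assumptions, not the actual system's API.
import random

class ReconfigurationController:
    def __init__(self, variants):
        self.variants = variants   # equivalent container images
        self.sessions = {}         # client id -> variant currently serving it

    def connect(self, client_id):
        """Start a new session on the default variant."""
        self.sessions[client_id] = self.variants[0]

    def on_alert(self, client_id):
        """Migrate the session to a different, randomly chosen variant,
        keeping service alive for the (possibly malicious) client."""
        current = self.sessions[client_id]
        choices = [v for v in self.variants if v != current]
        self.sessions[client_id] = random.choice(choices)

ctl = ReconfigurationController(["app:variant-a", "app:variant-b", "app:variant-c"])
ctl.connect("client-1")
ctl.on_alert("client-1")          # suspected intrusion triggers migration
print(ctl.sessions["client-1"])   # no longer "app:variant-a"
```

The design point illustrated here is that the client's session survives the swap: the attacker keeps receiving service, but the implementation it was probing is no longer the one behind the connection.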