HexHive PhD, MSc, BSc projects
This is a list of possible open, unassigned BSc or MSc research projects in the HexHive group for EPFL students.
The projects are designed to be adjusted and scaled to a BSc, MSc, or short PhD research project, depending on the depth of the evaluation and exploration. For all projects we expect an open mind, good coding skills (especially in C/C++), and the willingness to learn. Previous experience with compilers (LLVM), build systems, and reverse engineering helps but is not required.
If you are interested in any of these topics then contact the doctoral students assigned with the project. Please start your email with "[(MSc|BSc) (Project|Thesis)]", e.g., "[MSc Project] LLVM-based code analysis". Please explain your background such as particular coding skills or classes you took in sufficient detail, also highlight why you're interested in the project.
In the HexHive we also welcome independent ideas from students, as long as they focus on the topics of system and software security, especially (but not exclusively) sanitization of unsafe code, interactions between different components, mitigations, compiler-based security analyses, or fuzzing. So if you have an idea for your own project, let us know and we can discuss! Reach out to the people closest to your ideas and we'll let the ideas bubble up.
Forensic analysis of memory corruption errors
- Point of contact: Flavio Toffalini
- Keywords: Memory forensic, memory corruption
Memory corruption errors may lead to control-flow or data-only attacks that hijack the program's execution. This project investigates novel approaches to observe such attacks and eventually provide forensic evidence for future analyses, focusing on the context of event-based programs (e.g., Web APIs). In short, we want to inspect the process memory before and after serving a request and identify memory regions that were erroneously written.
The research questions are:
- Can we identify memory corruption errors for forensic analysis?
- Can we use such information to derive the root cause?
This is preliminary work; the student will need to instantiate a minimal setup in which to verify future hypotheses. Specifically, the student will pursue the following tasks (or a subset of them):
- Setting up the environment (a VM) and installing a minimum set of applications.
- Developing a snapshotting mechanism to extract the memory overwritten after a request.
Technical skills (not all needed):
- C/C++ userspace
- System manager
- kernel/event based
- Python (eventually)
Automatic driver generation for library fuzzing
- Point of contact: Flavio Toffalini
- Keywords: Linux, library, fuzzing
Unlike fuzzing CLI programs, whose input is modelled as a stream of bytes, fuzzing libraries requires drivers (library consumers) that bridge an input into a sequence of API calls. Code coverage and error discovery depend on the API combinations within the driver, so it is crucial to have interesting drivers to deeply test a target library. Unfortunately, building such drivers is challenging due to a lack of semantic information about the APIs and their usage. Moreover, insidious errors may appear only with rare API sequences. Current techniques infer API usage from already-existing programs; however, the quality of the new drivers is inevitably limited by the existing consumers. In this project, we aim to generate library drivers without looking into existing consumers. Specifically, we use a combination of static analysis and automatic testing to mine API usage and automatically build drivers that explore a larger portion of the library code and trigger more complex errors.
The research questions in this project are:
- How can we design a static analysis that infers API dependency information and uses it to build interesting drivers?
- How can we use feedback from automatic testing to refine driver generation (e.g., remove incorrect API sequences)?
The candidate will assist in the design and development of a prototype for testing different driver-building strategies. The prototype will combine several technologies, such as static analysis over LLVM IR, Python modules for driver generation, and a fuzzer for automatic testing.
A candidate should be interested in (or familiar with) at least one of the following topics.
- LLVM/Clang (also C/C++ will help)
- Basic knowledge of static analysis
- Python and OOP
GUI application fuzzing
- Point of contact: Luca Di Bartolomeo
- Keywords: Android, GUI, fuzzing
Fuzzing has recently seen a lot of use for analyzing libraries and CLI applications. So far, GUI applications have remained elusive due to the complexity of interactions and the multiple entry points into the code. In particular, the complex interaction patterns with the code make fuzzing GUI applications challenging. Additionally, a driver needs to create events for the individual aspects of the application. Similarly, modern applications frequently interact with a server whose state may be unknown.
The key research questions in this project are:
- How can we design a fuzzer that handles event-based systems?
- When fuzzing event-based systems, how can the input generation keep track of fired events and interactions with the GUI app?
- If the app has a remote component, how can we model the abstract space of that component?
After reading into the topic of fuzzing and event-based state machines, the student will develop a model for event-based fuzzing with GUI interactions as the key part of the project. To evaluate the project, the student will then evaluate the newly generated fuzzer using a set of different GUI applications, targeting the exploration of Linux GUI apps. As a second goal, if time permits, the student will extend the exploration to apps that have remote components.
A candidate should be interested in (or familiar with):
- GUI programming
- C/C++ programming and event-based systems
Reverse-engineering CPU microarchitecture
- Point of contact: Atri Bhattacharyya
- Keywords: microarchitectural attacks, side channels
A number of microarchitectural side-channel attacks have emerged since 2018, beginning with Spectre and Meltdown in January of that year. A critical step in all of these attacks has been a reverse-engineering effort to determine the exact behavior of microarchitectural components: the cache organization and replacement policy, or the branch prediction algorithm and its structures.
We need to characterize the behavior of these components by designing experiments that target specific behavior and can leak the structure of internal components. An example is cache associativity: an experiment that leaks this information measures the size of eviction sets. By accessing a set of memory addresses that we grow until we observe cache evictions, we can infer the associativity. By modifying the access patterns, we can also infer the cache's eviction policy. A great example of reverse engineering branch predictors on older Intel processors is this blog post series.
We propose a semester project for master's students or dedicated bachelor's students in which we tackle this challenge. An ideal candidate will have taken a course on computer architecture and be familiar with microarchitectural components. Experience with coding in x86 or ARM assembly will be very useful (but is not essential).
Ground-up software stack for compartmentalized systems
- Point of contact: Florian Hofhammer
- Suitable for: BSc/MSc semester project
- Keywords: compartmentalization, operating systems, runtime environment
Compartmentalization is the principle of dividing software into smaller modules (the so-called compartments) that are isolated from each other with respect to their view of memory and that can only communicate via well-defined interfaces. The goal of this principle is to minimize the attack surface and the impact of successful attacks against the compartmentalized software by having each compartment operate according to the principle of least privilege.
In the HexHive lab, we built a system supporting efficient compartmentalization architecturally and adapted a microkernel operating system to support this custom architecture. We propose a semester project in which the student will extend the ecosystem around this architecture and operating system to make it more generally applicable and to get more complex software running on this system.
This includes, but is not limited to, device drivers, development frameworks, and system libraries. Ideally, at the end of the project we will be able to run extensive benchmarks on our system to evaluate both its performance and security guarantees. The project’s scope therefore comprises the full software stack from the operating system and its components up to general-purpose userspace software.
Software Compartmentalization Benchmark suite
- Point of contact: Andrés Sánchez
- Keywords: compartmentalization, modularity, web applications
Compartmentalization is a software-development principle to reduce a program's attack surface and limit the exploitability of bugs. A compartmentalized program is separated into a number of compartments, each of which executes with minimal privileges and rights and communicates through a structured API only. Essentially, an exploit in one compartment should not trivially compromise other compartments.
We propose a semester/thesis project for master's students, and for skilled bachelor's students with software-development expertise, in which we will compartmentalize high-risk software. Prime examples of such software are web servers, browsers, and operating systems; we are open to other suggestions. We would eventually like to have a set of representative software comprising a benchmark suite against which to evaluate the different compartmentalization techniques.
A benchmark suite should preferably be portable, running on different operating systems/libraries and hardware, and be easy to port onto hardware or software research prototypes for better compartmentalization.
API flow graph sanitization
- Point of contact: Nicolas Badoux
- Keywords: sanitizer, API graph, component analysis, compiler analysis
Software components (e.g., libraries) often expose a complex API. When using a component, sequences of API calls build up deep API state that is specific to the component. These API calls are often highly complex and dependent on each other. API call sequences for high-level languages (e.g., Java) have been analyzed in software engineering, but C/C++ libraries have so far been neglected due to a lack of type information.
In this project, we will build an API inference graph that, using a whole system analysis, collects all uses of a library (starting with a simple file-based search but potentially moving to a compiler-based analysis). This API graph encodes dependencies between individual API calls (e.g., read can only be called after open) and parameters for API calls (e.g., open may specify if the file can be written or not). The API graph encodes all possible API interactions and defines the parameter space for each method.
Based on the API graph we will then search for violations of the inferred values in real applications by implementing a sanitizer that tracks the API state and searches for violations of the valid state.
Incremental fuzzing and sanitization
- Point of contact: Nicolas Badoux
- Keywords: fuzzing, sanitization
Dynamic software testing is currently an all-or-nothing approach: software is compiled with sanitization and fuzzing instrumentation and then evaluated. Since software is developed in small steps, it would be advantageous to fuzz only the delta that was added. This project will focus on developing a software-testing environment that keeps track of which areas were already fuzzed or tested and which test cases achieved what coverage. Then, whenever a part of the software changes, testing can focus on the new part.
- Develop a mechanism to track coverage across input and sanitizer checks
- Correlate the coverage to specific areas of software
- Prioritize test cases that explored the changed area and evaluate whether testing can focus on only the changed part
Evaluation targets are common software repositories for open source projects.
Mining Specification from Blob Dissectors
- Point of contact: Ahmad Hazimeh
- Suitable for: MSc semester project/thesis, internship
- Keywords: fuzzing, grammar, static analysis
Software testing often requires identifying the input vectors of a system and supplying it with meaningful data that exercises its different functionalities. Fuzz testing is one such technique, which heavily relies on generating structurally-valid inputs, albeit mostly through random mutations and strokes of luck. Fuzzing can benefit greatly from a target-specific description of the input formats and sequences in order to evaluate the system-under-test (SUT) more effectively. However, writing these specifications is a time-consuming task and poses a daunting challenge for most developers. One key insight, nonetheless, is that specification is all around us: all protocol-compliant software presents a machine-language encoding of some specification. Developers have already done the work of translating human-readable standards and descriptions into machine code, and as such, have already made a leap closer to machine-parsable specification which fuzzers could leverage. The data is there, in the source code, and all that’s left is to trim away the application-specific code that consumes the parsed data.
One good starting point would be blob dissectors, such as those available in Wireshark, Suricata, and Zeek IDS, which mostly encode the bare specification without additional processing. The lack of “clutter” in the code should make it easier to identify the different fields in the data blobs, how to parse them, and what dependencies exist among them. These dissectors also often encode the (partial) protocol state machines for sanity-checking the sequence of messages and tracking requests and sessions. Although the encoded specification may not be entirely complete, it still serves as a goldmine for fuzzers since it should enable more concise and state-aware input generation that could evaluate the SUT more efficiently and more comprehensively.
The project would take place across several phases:
- Reconnaissance: Research what’s already been done in the field of static analysis and extraction of code semantics, and more specifically, work on dissector synthesis and analysis.
- Enumeration: Inspect the existing implementations of dissectors, how they interface with the software (e.g. Wireshark), how they encode specification, and how to manually map source code to a set of rules (i.e. parsing and state-tracking).
- Abstraction: Lift the source code into a higher-level abstraction of the specification, and systematize the specification extraction process. Static code analysis may come in handy at this phase, to help you reason about code from a semantic viewpoint.
- Assessment: Test your system against existing dissectors and identify its shortcomings and how we could address them.
- Packaging: Translate the extracted specification into a fuzzer-readable encoding that enables (optionally, state-aware) input generation with dependency resolution primitives.
Maintaining Magma: A Ground-Truth Fuzzing Benchmark
- Point of contact: Ahmad Hazimeh
- Suitable for: BSc semester project, MSc optional project
- Keywords: fuzzing, evaluation, benchmark
Magma is a fuzzer evaluation framework that enables accurate performance measurements by leveraging ground-truth information on bugs in real software. Magma includes a library of real targets (e.g. libpng, libtiff, openssl, etc…) with real bugs that have been re-introduced into those targets based on previous bug reports and fix commits. By reverse-engineering the commit which fixed a certain bug, we can identify what the root cause of the bug was, reintroduce it, and add a check (a canary) to determine when that bug is triggered, based on program state information available at runtime (i.e., variable values).
As fuzzers are tuned and improved on a regular basis, the benchmark upon which they’re evaluated must equally be upgraded to keep up with progress and avoid becoming outdated. To achieve this, new targets and bugs must be added frequently, and old targets and bugs must be re-checked for relevance, in case some bugs become unreachable/untriggerable, or in case the target’s source code has changed enough to disallow the reintroduction of some bug without reintroducing old code functionality.
For this project, you are expected to:
- Identify good target additions to Magma, justify their impact on the evaluation framework, and study the viability of their bug history;
- Enumerate the bugs for a selected target and process them individually, in decreasing order of recency: reintroduce them into the target and provide the canaries that determine whether or not a bug is triggered.
- Add the target to the Magma code base and evaluate it against the available set of fuzzers to identify the feasibility of fuzzing it and triggering the injected bugs.
As a tangent, Magma could also benefit greatly from functional improvements:
- automatic distributed task scheduling, work collection, and post-processing pipelines
- periodic scheduling of fuzzing campaigns with real-time monitoring of progress through the web-based reports page
- CI/CD pipeline for evaluating new targets and bugs, identifying build errors on the dev branch, or regression-testing old bugs that can no longer be triggered due to updated target code bases
Extracting Bug Constraints from Execution-Flow Deltas
- Point of contact: Ahmad Hazimeh
- Suitable for: MSc semester project/thesis, internship
- Keywords: root cause, constraints, control-flow, delta analysis
Patch mining is the process of analyzing a fix in program code to identify the semantics of the change and reconstruct the root cause of the bug based on the applied modifications. This method can be used for the development of 1-day exploits, or for program repair [2,3]. Patch mining has also been used for the development of Magma: by inspecting previous bug fixes, we can identify and localize the changes in source code responsible for the fix, and consequently infer the original root cause of the bug, reintroduce it, and provide a set of constraints that determine whether or not the bug is triggered at runtime. However, this process was applied manually during Magma’s development, making it the most time-consuming aspect of the benchmark. In order to enable faster updates and swifter maintenance of Magma, automating the patch-mining and constraint-extraction process is key.
By examining the flow of control and data before and after the patch and identifying the differences, we should be able to reconstruct the original constraints for triggering the bug.
For this project, you are thus expected to:
- Research methods of statically analyzing code snippets and constructing partial CFGs describing the operations and their context within the modified code
- Establish a method for collecting symbolic constraints from these partial graphs, and calculate the diff before and after the patch
- Translate the diff in constraints into a (set of) boolean statement(s) that could determine whether the bug is triggered in the original (unfixed) program
- Evaluate your approach on existing bugs in Magma (as control), and on new bugs handpicked from the plethora of CVEs published daily
ARM64 Kernel Driver Retrowriting
- Point of contact: Luca Di Bartolomeo
- Keywords: Retrowrite, binary rewriting, mobile reverse engineering
A common feature of the Android ecosystem is proprietary binary blobs. Vendors may not update these and may not compile them with the latest exploit mitigations. A particular cause for concern are kernel modules, given their privileged access.
HexHive’s Retrowrite project is a state-of-the-art binary rewriting tool that can retrofit mitigations to legacy binaries without the need for source code. It currently supports ARM64 and x86-64 in user space, and x86-64 in kernel mode. The goal of this project is to target ARM64 kernel modules, with the ability to add, for example, kASAN instrumentation. We would aim to:
- Identify kernel modules of particular interest, including open source modules to act as ground truth.
- Produce a framework to evaluate the effectiveness of binary rewriting these modules by exercising their functionality, using fuzzing where appropriate.
- Modify Retrowrite to support ARM64 kernel modules.
- Evaluate the implementation against ground-truth targets and against targets of interest. Evaluate the cost of instrumentation passes.
Students should have a basic understanding of how Linux kernel modules are built and loaded, and a good grasp of Linux internals. Ambitious students may also have Android Internals knowledge and be interested in testing their work on Android hardware.
Improving application security through memory tagging
- Point of contact: Andrés Sánchez
- Keywords: Software development, virtual memory, compilers
Memory tagging is a hardware extension that adds a level of restriction when dereferencing memory addresses: the key held must match the memory’s key. Implementations include Memory Protection Keys (MPK) on x86-64 and the Memory Tagging Extension (MTE) on ARM64, which differ in granularity (a page vs. 16 bytes) and in how the key is stored (in a register vs. per pointer), resulting in substantially different programming models.
Adopting such a technology would be decisive for finding memory-safety bugs in existing code such as databases, cryptographic toolkits, operating system kernels, web servers, and web browsers. Although these technologies are well known (for MPK, for example, the Linux kernel provides an interface), their adoption on the application side requires a study that remains to be done.
This project includes:
- Acquisition of familiarity with a relevant program source code base that would benefit from memory tagging
- Source code modification of the codebase to include support for memory tagging
- Functionality testing and performance-impact benchmarking
- Potential adoption of the source code modification in the project upstream
This project can be performed by either bachelor or master students, as there are different challenging codebases that can be addressed. It is also possible to turn it into a master’s thesis by creating a compiler-based framework that soundly outlines the possible protections an application can receive and analyzes them.
Improving Clang Static Analyzer’s False-positive and False-negative Rates
- Point of contact: Gwangmu Lee
- Keywords: static analyzer, symbolic analysis, software analysis
Modern static analyzers like the Clang Static Analyzer partially incorporate symbolic execution to improve the correctness of their analysis. In particular, they provide generic symbolic-analysis APIs to “checkers”, so that these can perform simple symbolic analysis on the source code to discover a dedicated class of issues.
However, static analyzers are still known for their high false-positive and false-negative rates, meaning that many issues reported by the analyzer are non-existent (false positives) or many real issues are not reported at all (false negatives). In this project, we will measure how severe these limitations are and identify the reasons that contribute to them. Finally, we will try to improve the issue-detection capability by addressing those reasons.
To be specific, you are expected to do the following:
- Understand the inner workings of Clang Static Analyzer.
- Given a selected set of checkers, find the false-positive and false-negative rates on various source samples.
- Identify the reasons for false-positives and false-negatives with collected samples.
- Improve the issue-detection capability by addressing some of the identified reasons.
Several other projects are possible in the areas of software and system security. We are open to discussing possible projects around the development of security benchmarks, using machine learning to detect vulnerabilities, secure memory allocation, sanitizer-based coverage tracking, and more.