HexHive

HexHive PhD, MSc, BSc projects

This is a list of possible open, unassigned BSc or MSc research projects in the HexHive group for EPFL students.

The projects are designed to be adjustable and scalable according to the type of BSc, MSc, or short PhD research project depending on the depth of the evaluation and exploration. For all projects we expect an open mind, good coding skills (especially in C/C++), and the willingness to learn. Previous experience with compilers (LLVM), build systems, and reverse engineering helps but is not required.

If you are interested in any of these topics then apply through our application form. All project applications have to go through this form and we will internally discuss all applications and then invite students for interviews. The first application deadline for projects in fall '24 is July 04 and the second deadline is August 31, 2024.

In the HexHive we welcome independent ideas from students as well, as long as they focus on the topics of system and software security, especially (but not exclusively) in sanitization of unsafe code, interactions between different components, mitigations, compiler-based security analyses, or fuzzing. So if you have an idea for your own project let us know and we can discuss! Reach out to the people closest to your ideas and we'll let the ideas bubble up.

Library Fuzzing

Point of contact: Flavio Toffalini
Keywords: Linux, library, fuzzing

Unlike fuzzing CLI programs, whose input is modeled as a stream of bytes, fuzzing libraries requires drivers (library consumers) to translate an input into a sequence of API calls. The code coverage and error discovery depend on the API combinations within the driver. Therefore, it is crucial to have interesting drivers to deeply test a target library. Unfortunately, building such drivers is challenging due to a lack of semantic information about the APIs and their usage. Moreover, insidious errors may appear only with rare API sequences. Current techniques infer API usage from existing programs; however, the quality of the new drivers is inevitably limited by the existing consumers. In this project, we aim to generate library drivers without looking into existing consumers. Specifically, we use a combination of static analysis and automatic testing to mine the API usage and automatically build drivers that explore a larger portion of the library code and trigger more complex errors.
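To make the role of a driver concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: `ToyLibrary` stands in for a target library, `API_TABLE` for the set of exported functions, and `driver` for a generated consumer that decodes fuzzer bytes into API calls. It is not the project's tooling.

```python
class ToyLibrary:
    """Hypothetical stateful library with a bug behind a rare API order."""

    def __init__(self):
        self.handle = None
        self.buffer = []

    def open(self):
        self.handle = object()
        self.buffer = []

    def write(self, value):
        if self.handle is None:
            return -1  # graceful error: nothing is open
        self.buffer.append(value)
        return 0

    def close(self):
        # Pending data is (incorrectly) left in the buffer.
        self.handle = None

    def flush(self):
        # Bug: flushing pending data after close touches a dead handle.
        if self.buffer and self.handle is None:
            raise RuntimeError("use of closed handle")
        self.buffer.clear()


API_TABLE = ["open", "write", "close", "flush"]


def driver(data: bytes):
    """Decode the input two bytes at a time: (API selector, argument)."""
    lib = ToyLibrary()
    calls = []
    for i in range(0, len(data) - 1, 2):
        name = API_TABLE[data[i] % len(API_TABLE)]
        calls.append(name)
        if name == "write":
            lib.write(data[i + 1])
        else:
            getattr(lib, name)()
    return calls
```

In this toy setup only the order open, write, close, flush reaches the bug, which is why the choice of API sequences in the driver matters as much as the raw input bytes.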

The research questions in this project are:

how can we design static analysis to infer API dependency information and use them to build interesting drivers?
how can we use feedback from automatic testing to refine the driver generation (e.g., remove incorrect API sequences)?

The candidate will be required to assist in the design and development of a prototype for testing different driver-building strategies. The prototype will combine different technologies, such as static analysis over LLVM IR, Python modules for the driver generation, and a fuzzer for the automatic testing.

A candidate should be interested in (or familiar with) at least one of the following topics.

LLVM/Clang (also C/C++ will help)
Basic knowledge of static analysis
Python and OOP

Android acropalypse

Point of contact: Luca Di Bartolomeo
Suitable for: MSc Semester Project / Thesis
Keywords: Reverse Engineering, Static Analysis

You might have heard about the recent security disaster that is aCropalypse. It turns out that the cause of this bug is Google silently changing an Android API for opening files, so that files are no longer truncated when they are opened.

This is pretty wild and we think that there might be many more instances of aCropalypse, not just cropped screenshots. This project is about writing tooling to automatically analyze Android apks and search for potential alternative data leaks.
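The underlying failure mode can be reproduced outside Android. The sketch below uses plain POSIX-style file flags from Python rather than the Android API, and the file name and contents are made up: overwriting a file with shorter data, without asking for truncation, leaves the tail of the old content behind.

```python
import os
import tempfile


def overwrite(path, data, truncate):
    """Overwrite the start of an existing file, optionally truncating it first."""
    flags = os.O_WRONLY | (os.O_TRUNC if truncate else 0)
    fd = os.open(path, flags)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return (content without truncation, content with truncation)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "screenshot.bin")  # hypothetical file
        with open(path, "wb") as f:
            f.write(b"ORIGINAL-SECRET-PIXELS")

        overwrite(path, b"CROPPED", truncate=False)
        with open(path, "rb") as f:
            leaked = f.read()

        overwrite(path, b"CROPPED", truncate=True)
        with open(path, "rb") as f:
            clean = f.read()
    return leaked, clean
```

Without truncation the file keeps its original length, so the bytes past the new data are still recoverable. An analysis tool would look for write paths in apps that rely on implicit truncation.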

A candidate should be interested in:

Android application reverse engineering
Static analysis tooling for Android apks

Type confusion test suite

Point of contact: Nicolas Badoux
Keywords: sanitizer, type confusion, test suite

Type confusion is a common vulnerability in C/C++ programs. It occurs when an object is incorrectly cast to another type. This can lead to memory corruption and code execution. HexHive has published a number of works trying to detect and mitigate the impact of type confusions. The goal of this project is to create a test suite for type confusion detection tools. Recent works have been evaluated on a common run time performance benchmark but they lack validation on a common set of type confusion bugs. The test suite will be composed of a set of programs and unit tests with type confusion bugs. Some bugs should be based on real world vulnerabilities while others can be purely synthetic.
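As a rough illustration of what a synthetic test case exercises, the following sketch uses Python's `ctypes` to mimic an unchecked C-style cast between two unrelated types. The `Circle` and `Label` structures and both helper functions are hypothetical, and a real test suite would of course consist of C/C++ programs.

```python
import ctypes


class Circle(ctypes.Structure):
    _fields_ = [("radius", ctypes.c_double)]


class Label(ctypes.Structure):
    _fields_ = [("length", ctypes.c_uint64)]


def unchecked_cast(obj, target_type):
    """Reinterpret obj's memory as target_type without any type check."""
    pointer = ctypes.cast(ctypes.pointer(obj), ctypes.POINTER(target_type))
    return pointer.contents


def checked_cast(obj, target_type):
    """What a detection tool enforces: reject casts to an unrelated type."""
    if not isinstance(obj, target_type):
        raise TypeError(
            f"cannot cast {type(obj).__name__} to {target_type.__name__}"
        )
    return obj


circle = Circle(1.0)
confused = unchecked_cast(circle, Label)
# The bit pattern of the double 1.0 is now read as an integer length.
confused_length = confused.length
```

Here the confused read stays within the object because both structures are 8 bytes. When the target type is larger than the real object, the same mistake reads or writes out of bounds, which is where memory corruption comes from.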

We would aim to:

Identify a set of type confusion bugs in real world programs.
Create a set of synthetic type confusion bugs.
Create a representative set of unit tests for type confusion detection tools.
Evaluate state-of-the-art type confusion detection tools on the test suite.

Students should have a basic understanding of how C/C++ programs are built and a good grasp of Linux internals.

Fuzzing C++ libraries

Point of contact: Nicolas Badoux
Keywords: library fuzzing, fuzzing, C++

The candidate will be required to identify the necessary adaptations to the existing C library fuzzing tool and to implement support for them in the existing framework. The prototype will combine different technologies, such as static analysis over LLVM IR, Python modules for the driver generation, and a fuzzer for the automatic testing. The candidate will also be in charge of finding and motivating the choice of suitable C++ libraries to test.

A candidate should be interested in (or familiar with) the following topics.

LLVM/Clang (also C/C++ will help)
Python

ARM64 Kernel Driver Retrowriting

Point of contact: Luca Di Bartolomeo
Keywords: Retrowrite, binary rewriting, mobile reverse engineering

Proprietary binary blobs are a common feature of the Android ecosystem. Vendors may not update these and may not compile them with the latest exploit mitigations. Kernel modules are a particular cause for concern given their privileged access.

HexHive’s Retrowrite project is a state-of-the-art binary rewriting tool that can retrofit mitigations to legacy binaries without the need for source code. This currently works on ARM64 and x86-64 platforms, and x86-64 in kernel mode. The goal of this project would be to target ARM64 kernel modules, with the ability to add, for example, kASAN. We would aim to:

Identify kernel modules of particular interest, including open source modules to act as ground truth.
Produce a framework to evaluate the effectiveness of binary rewriting these modules by exercising their functionality, using fuzzing where appropriate.
Modify Retrowrite to support ARM64 kernel modules.
Evaluate the implementation against ground truth targets and against targets of interest. Evaluate the cost of instrumentation passes.

Students should have a basic understanding of how Linux kernel modules are built and loaded, and a good grasp of Linux internals. Ambitious students may also have Android Internals knowledge and be interested in testing their work on Android hardware.

Benchmarking Fuzzers for Structured Text Input Software

Point of contact: Chibin Zhang
Suitable for: Master Thesis Project
Keywords: fuzzing, benchmark, compilers, data analysis

Fuzzing is an effective technique for finding bugs in software. Prior works have created benchmarks to assess the performance of fuzzers. However, these benchmarks are biased towards targets that accept binary inputs and towards fuzzers that mutate at the byte level. Additionally, they suffer from saturation, meaning the performance differences between top fuzzers are often insignificant. It is a known issue that existing byte-level fuzzers do not perform well on targets accepting structured text inputs. Current fuzzing benchmarks do not include state-of-the-art structure-aware fuzzers, such as grammar fuzzers, in their baselines. This is due to the fact that these fuzzers typically require additional grammars, dictionaries, or large seed corpora. Furthermore, existing structure-aware fuzzers have been evaluated on a limited set of disparate targets, run with different specifications, making it challenging to compare their performance quantitatively or even qualitatively.
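The gap between byte-level and structure-aware input generation can be sketched in a few lines. The grammar below is a deliberately tiny, hypothetical subset of JSON, and `generate`, `mutate_byte`, and `parses` are illustrative helpers rather than parts of any fuzzer named in this project.

```python
import json
import random

# Hypothetical toy grammar: nonterminals map to lists of expansion rules.
GRAMMAR = {
    "<value>": [["<object>"], ["<array>"], ["<number>"], ["<string>"],
                ["true"], ["null"]],
    "<object>": [["{", "}"], ["{", "<string>", ":", "<value>", "}"]],
    "<array>": [["[", "]"], ["[", "<value>", ",", "<value>", "]"]],
    "<number>": [["0"], ["42"], ["-7"]],
    "<string>": [['"a"'], ['"key"']],
}


def generate(symbol, rng, depth=0, max_depth=4):
    """Expand a grammar symbol into text; past max_depth take the shortest rule."""
    if symbol not in GRAMMAR:
        return symbol  # terminal
    rules = GRAMMAR[symbol]
    rule = min(rules, key=len) if depth >= max_depth else rng.choice(rules)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in rule)


def mutate_byte(text, rng):
    """Byte-level mutation: overwrite one random byte with a random value."""
    data = bytearray(text.encode())
    data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def parses(candidate):
    """Return True if the target (here: Python's json parser) accepts the input."""
    try:
        json.loads(candidate)
        return True
    except ValueError:
        return False
```

Every grammar-generated input passes the parser by construction, whereas a byte-mutated input may be rejected before it reaches any deeper logic. Measuring that difference across real fuzzers and targets is what the benchmark is for.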

In this project, you will create an extensive benchmark for targets that accept structured text inputs. You are expected to integrate at least 8 structure/syntax-aware fuzzers and 16 new targets (latest version), along with the required grammars, dictionaries, and corpora. It is suggested to use the Nix build system, as its build configurations are written declaratively and build artifacts are deterministic. This choice is anticipated to streamline the benchmarking process and ensure reproducibility. You will then conduct fuzzing campaigns and analyze the results quantitatively. A potential focus could be assessing the impact of the provided grammars, dictionaries, and corpora on the performance of the fuzzers. The build, run, and analysis scripts will be open-sourced to facilitate future research.

Examples of interesting fuzzers and targets for integration:

Fuzzers: AFL++ with cmplog and autodict, Token-level AFL, Gramatron, Nautilus, Grimoire, Superion, Polyglot, CSmith.
Targets:
- All targets included in fuzzbench.
- Compilers/Interpreters/Assemblers accepting code inputs: clang, hotspot, python, php, ruby, v8, JavaScriptCore, SpiderMonkey.
- Document formats: html, postscript, word, rtf, roff, markdown.
- Data (interchange) formats and their processors: json, yaml, toml, xml, csv, tsv, jq, yq, sqlite.

Recommended Background:

Completion of compiler and software security-related courses.
Familiarity with NixOS and Nix-based build tools.
Experience with fuzzing and triaging compiler/interpreter bugs.

Exploring Proprietary Android System Services

Point of contact: Philipp Mao
Suitable for: MSc project, MSc semester project
Keywords: Android, reverse engineering, frida

Android system services (high-privileged userspace processes) are an important piece of Android’s security architecture. Smartphone vendors usually modify Android and add their own features, which often include additional system services. These system services are an interesting target for malicious apps, since the services’ API is usually accessible to an app. While past research has predominantly focused on native (C++) system services, this project aims to investigate the system services that appear to be implemented in Java.

The objective of this project is to develop tools for analyzing proprietary Java system services. We aim to understand how these services operate, their privilege levels, and any potential vulnerabilities they may have. An ideal outcome of the project is a tool that can be deployed against a phone to then automatically analyze all system services.

Project tasks (in no particular order):
- Writing frida hooks to dynamically analyze running proprietary Java services
- Reverse-engineering proprietary Java services

Students interested in this project should have written at least one frida script to hook an Android app.

Why is QEMU’s plugin interface widely ignored?

Point of contact: Florian Hofhammer
Suitable for: MSc semester project
Keywords: Emulation, dynamic binary instrumentation

Many dynamic analysis projects leveraging the QEMU emulator for binary instrumentation heavily patch or fork QEMU to add their instrumentation. For example, Cannoli, a tracing engine for qemu-user, maintains a set of patches that needs to be applied to a certain commit of QEMU. Similarly, the AFL++ maintainers forked QEMU to add coverage instrumentation into binaries at runtime.

The approach of forking or patching QEMU has a significant drawback: the patches need to be meticulously updated every time QEMU touches the affected files, and a user needs to compile QEMU themselves.

QEMU has support for plugins. This raises the question of why projects fork or patch QEMU instead of creating plugins that can be shipped independently of QEMU. The goal of this project is to evaluate the limitations of QEMU’s plugin system along two main questions:

Can common dynamic binary instrumentation use cases such as AFL++’s coverage instrumentation also be implemented as a QEMU plugin instead of a set of patches to QEMU’s source code?
If no, what are the limitations of QEMU’s plugin system that prevent the usage of plugins for dynamic binary analysis?
If yes, what are the reasons for forking or patching QEMU instead of using plugins? Is there a significant performance overhead when using plugins instead of directly patching QEMU?
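For orientation, the bookkeeping that AFL-style coverage instrumentation performs on every executed basic block is small. The sketch below models it in Python, following the edge-hashing scheme described in the AFL documentation. The block identifiers are hypothetical, and a QEMU plugin would have to do the equivalent work in a per-block callback.

```python
MAP_SIZE = 1 << 16  # size of the coverage map, as in classic AFL


def edge_coverage(trace):
    """Return a sparse coverage map {index: hit count} for a block trace."""
    bitmap = {}
    prev = 0
    for block_id in trace:
        # An edge is identified by hashing the current and previous block.
        index = (block_id ^ prev) % MAP_SIZE
        bitmap[index] = bitmap.get(index, 0) + 1
        # Shifting keeps A->B distinct from B->A and A->A distinct from B->B.
        prev = block_id >> 1
    return bitmap


# Hypothetical block identifiers, e.g. derived from block addresses.
BLOCK_A = 0x1234
BLOCK_B = 0x0F0F
```

Because this update runs on every block, the cost of the callback mechanism itself is a natural thing to measure when comparing a plugin against a patched QEMU.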

An interested student should be comfortable with C code in potentially large code bases such as QEMU. The student should also be comfortable with reading and writing assembly (potentially multiple architectures) for instrumentation purposes. Already having basic familiarity with dynamic instrumentation use cases such as coverage instrumentation for fuzzing or memory tracing is a plus.

Creating a cycle-accurate multi-architecture simulator

Point of contact: Florian Hofhammer
Suitable for: MSc semester project
Keywords: Microarchitecture, simulation

r2wars is a game in which participants write small assembly bots that execute in the same address space. Whichever bot crashes first (e.g., by being overwritten by the competing bot) loses. This game is an adaptation of the original Corewars idea but with a twist: instead of being based on a programming language similar to assembly designed specifically for this kind of game, r2wars builds on top of the radare2 reverse engineering tooling and allows bots to be written in real-world architecture assembly (e.g., x86, Arm, RISC-V, MIPS, …). However, while the supported ISAs are taken from the real world, the execution model is far from reality. Most instructions take the same amount of time, no matter the ISA or their complexity. In a real system, more complex instructions would execute in more cycles, and microarchitectural state such as pipeline state, caches, etc. would affect performance.

In this project, we aim to model r2wars “closer to the real world”, i.e., execute instructions in a cycle-accurate simulation. For this purpose, we can leverage a cycle-accurate simulator such as gem5.
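A toy model shows why the timing model changes the game. The cycle costs below are invented for illustration and do not come from any real microarchitecture or from gem5. The function counts how many instructions a bot retires within a fixed cycle budget.

```python
# Hypothetical per-instruction cycle costs, for illustration only.
CYCLE_COST = {"nop": 1, "mov": 1, "add": 1, "mul": 3, "load": 4, "div": 20}
UNIFORM_COST = {name: 1 for name in CYCLE_COST}


def instructions_retired(program, cycle_budget, costs):
    """Run a looping, non-empty program until the next instruction no longer fits."""
    retired = 0
    remaining = cycle_budget
    while True:
        cost = costs[program[retired % len(program)]]
        if cost > remaining:
            return retired
        remaining -= cost
        retired += 1
```

Under the uniform model a bot made of `div` instructions makes as much progress as one made of `mov` instructions. With per-instruction costs it falls far behind, so bot authors would have to weigh instruction choice against speed.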

The project requires familiarity with assembly code (at least reading and understanding) as well as basic knowledge about microarchitectural state.

SECCOMP implementation for double fetch protection

Point of contact: Luca Di Bartolomeo
Suitable for: MSc thesis
Keywords: kernel security, data race protection, security policy

System call filtering is a crucial part of protection policies ubiquitous in cloud, desktop and mobile environments (Android, Docker, etc.). The existing SECCOMP filter system is unable to inspect arguments passed by reference since the user can modify the values in memory, resulting in a TOCTTOU exploit.

Midas is a novel mitigation for TOCTTOU bugs in the kernel, exploiting the user memory access API to provide double fetch protection. In this project, you will implement and evaluate SECCOMP filtering for system call arguments passed by reference, leveraging Midas to protect the kernel from the double fetch introduced in the process.
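The double fetch problem can be simulated deterministically. In the sketch below, `UserMemory` is a hypothetical stand-in for a user-controlled buffer that a racing thread changes between two reads, and the two filter functions are illustrative only. The single-fetch variant conveys only the general idea of checking and using one consistent value; it is not how Midas is implemented.

```python
class UserMemory:
    """Hypothetical user buffer whose content changes after the first read."""

    def __init__(self, first, later):
        self.first = first
        self.later = later
        self.reads = 0

    def read(self):
        self.reads += 1
        return self.first if self.reads == 1 else self.later


def filter_double_fetch(mem, allowed):
    """Vulnerable: the filter checks one fetch, the syscall uses another."""
    if mem.read() not in allowed:  # time of check
        return None
    return mem.read()              # time of use: value may have changed


def filter_single_fetch(mem, allowed):
    """Check and use the same snapshot of the argument."""
    snapshot = mem.read()
    if snapshot not in allowed:
        return None
    return snapshot
```

With an allow-list of `{"/tmp/ok"}`, the double-fetch variant can be tricked into operating on a path the filter never approved, while the single-fetch variant cannot.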

This project requires:
- Expert experience in C development
- Experience with standard C/GNU build, development and debug tools (gdb, Makefiles)
- Understanding of OS principles
- Basic experience of OS coding/course project
- Understanding of the x86 architecture and assembly coding/debugging

Leveraging Static Analysis on Binaries to Uncover Time-of-Check-Time-of-Use Bugs

Point of contact: Marcel Busch
Suitable for: MSc semester project
Keywords: software engineering, reverse engineering, binary analysis, static analysis

TOCTOU bugs can lead to severe memory corruptions. These memory corruptions might allow adversaries to compromise and take full control of the affected system. In this project, we want to port and adapt an existing binary static analysis to uncover TOCTOU bugs in proprietary real-world software.

A candidate should be interested in (and ideally already be familiar with):

Python
Ghidra/Ghidrathon and/or angr
ARM assembly
Static analysis (e.g., RDA)

Evaluation on Syscall Filtering Techniques

Point of contact: Zhiyao Feng
Keywords: syscall filtering, Linux, Android, 0-day exploit

It is common for OS kernels (e.g., Android) to go without security updates for years, due to labor-intensive update costs and limited software support lifespans. This leaves those kernels vulnerable to exploits. To protect them, we can use syscall filtering techniques to block potentially malicious syscall sequences before they cause any damage, so that the kernels remain safe even if they are not updated.

There are some existing techniques or features that can be used for this purpose, like Seccomp, Seccomp-cBPF, and Seccomp Notify provided by the Linux kernel, along with some methods from research papers. They offer various capabilities and trade-offs in filtering syscalls for certain vulnerabilities.

In this project, we will evaluate these syscall filtering techniques, by reproducing some known 0-day exploits, applying the syscall filtering techniques, and checking if the exploits can be successfully blocked.

A candidate should be proficient in C programming and have a good grasp of Linux internals.

Benchmarking Fuzzers For Seed Selection Capability

Point of contact: Han Zheng
Keywords: Benchmark, Fuzzing

Fuzzing is an efficient software testing technique to reveal bugs. Therefore it has been widely investigated both in academia and industry. Despite the growth of the newly proposed fuzzing prototypes, evaluating the fuzzer’s coverage capability is still challenging.

Existing platforms like fuzzbench pick well-constructed harnesses, which enable the fuzzers to iterate over each seed in the queue exhaustively.
Nevertheless, real-world scenarios might deviate from this ideal: seed explosion is widespread, so a fuzzer’s seed selection capability is critical and should not be deprioritized in the evaluation.

In this project, we will extend fuzzbench to more complex targets, which allows a more thorough assessment of fuzzer’s seed selection capability.

The goal of this project:

design a metric to define and select the “complex” targets
integrate the target into the fuzzbench and evaluate existing fuzzers
propose some metrics other than coverage to assess the seed selection capability.

A candidate should be interested in (ideally familiar with) the following:

Python
Basic knowledge of configure/cmake/make
Experience with a coverage-guided greybox fuzzer (e.g., AFL/AFL++)

A Ground-Truth Fuzzing Benchmark for Stateful Protocols

Point of contact: Qiang Liu
Suitable for: BSc semester project, MSc semester project
Keywords: Benchmark, Protocols, Fuzzing

Protocols are important in facilitating seamless and secure communication among multiple parties, users, hardware, and software. Serving as the foundational trust upon which complex systems and networks are connected, they are essential for ensuring security across various domains and industries. In recent years, there has been a sudden increase in the development of protocol fuzzers. However, none of them has provided a comprehensive and empirical comparison to each other. Therefore, without sufficient data support, we cannot draw a strong conclusion regarding what constitutes the state-of-the-art or determine the next steps to take.

In this project, specifically, we are going to answer three questions.

Regarding code coverage and bug discovery, how effective are existing fuzzers?
Regarding code coverage and bug discovery, how much improvement can be achieved through the methodology of existing fuzzers?
Regarding other metrics, what are the answers to the above two questions?

We are building a new benchmark based on Magma. A candidate is required to have experience with programming (Python and Bash), compiling (gcc or clang), and testing (integration testing). Experience with Docker containers, git, fuzzing, and software security is helpful but not required. A candidate can choose one of the following tasks.

Inject old bugs into the code space of a protocol implementation.
Add more fuzzers and more protocol implementations to the benchmark.
Collect data based on new metrics and draw figures based on the new data.

Hyper-Cube2: A Scalable, Performant, Multi-thread Blackbox Virtual Device Fuzzer

Point of contact: Qiang Liu
Suitable for: Master Semester/Thesis Project
Keywords: Blackbox, Virtual Device, Fuzzing

Virtual devices remain the main attack surface to hypervisors. Vulnerabilities in virtual devices lead to denial of service, data breaches, execution hijacking, and other security problems. Hyper-Cube was proposed to fuzz virtual devices. It is a blackbox fuzzer and has very high throughput. Hyper-Cube has great usability but suffers from the following two problems:

Hyper-Cube is only compatible with hypervisors that support an x86 32-bit OS.
Hyper-Cube cannot discover race conditions in virtual devices.

In this project, we aim to achieve three specific goals.

Implement Hyper-Cube2: Implement a userspace virtual device fuzzer (based on libFuzzer) running on top of a 64-bit OS (e.g., Ubuntu) for both x86-64 and ARM
Optimize Hyper-Cube2: Optimize Hyper-Cube2 regarding performance (by choosing a small and fast 64-bit OS and reducing code complexity)
Extend Hyper-Cube2: Implement a multi-thread mode for race conditions

We are building a new blackbox virtual device fuzzer. A candidate is required to have experience with programming in C, compiling with Clang, program profiling (e.g., FlameGraph), and algorithm optimization. It is preferable but not necessary to have experience with fuzzing, the development of operating systems and virtual devices, ARM, Hyper-V, VMware ESXi, and macOS.

Other projects

Several other projects are possible in the areas of software and system security. We are open to discussing possible projects around the development of security benchmarks, using machine learning to detect vulnerabilities, secure memory allocation, sanitizer-based coverage tracking, and others.