HexHive PhD, MSc, BSc projects
This is a list of possible open, unassigned BSc or MSc research projects in the HexHive group for EPFL students.
Check out our list of completed projects to get an idea of past projects.
The projects are designed to be adjusted and scaled to fit a BSc semester project, an MSc semester project or thesis, or a short PhD research project, depending on the depth of the evaluation and exploration. For all projects we expect an open mind, good coding skills (especially in C/C++), and a willingness to learn. Previous experience with compilers (LLVM), build systems, or reverse engineering helps but is not required.
If you are interested in any of these topics then apply through our application form. All project applications have to go through this form and we will internally discuss all applications and then invite students for interviews. Apply early as spots are limited.
In the HexHive we also welcome independent ideas from students, as long as they focus on system and software security, especially (but not exclusively) sanitization of unsafe code, interactions between different components, mitigations, compiler-based security analyses, or fuzzing. So if you have an idea for your own project, let us know and we can discuss it! Reach out to the people closest to your idea and we'll let the ideas bubble up.
Android acropalypse
- Point of contact: Luca Di Bartolomeo
- Suitable for: MSc semester project / thesis
- Keywords: Reverse Engineering, Static Analysis
You might have heard about the recent security disaster that is aCropalypse. It turns out that the root cause of this bug is Google silently changing the behavior of an Android API for opening files, so that existing files are no longer truncated when opened for writing.
This is pretty wild and we think that there might be many more applications of aCropalypse, not just cropped screenshots. This project is about writing tooling to automatically analyze Android apks and searching for potential alternative data leaks.
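The root cause can be sketched in a few lines of (hypothetical, non-Android) Python: overwriting an existing file without truncating it leaves a recoverable tail of the old contents. The file names and contents below are invented for illustration.

```python
import os
import tempfile

# Illustration of the aCropalypse root cause: overwriting an existing file
# *without* truncating it leaves a tail of the old contents on disk.
path = os.path.join(tempfile.mkdtemp(), "screenshot.png")

# Original (larger) file, e.g. a full screenshot containing sensitive data.
with open(path, "wb") as f:
    f.write(b"FULL-IMAGE-" + b"SECRET" * 10)

# Buggy overwrite: "r+b" writes from offset 0 but never truncates,
# mimicking the changed non-truncating write behavior on Android.
with open(path, "r+b") as f:
    f.write(b"CROPPED")

with open(path, "rb") as f:
    leftover = f.read()
print(leftover[:7])           # the new, cropped data: b'CROPPED'
print(b"SECRET" in leftover)  # True: the old data is still recoverable
```

The same pattern applies to any app that overwrites user files in place, which is exactly what the tooling in this project would search for.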
A candidate should be interested in:
- Android application reverse engineering
- Static analysis tooling for Android apks
ARM64 Kernel Driver Retrowriting
- Point of contact: Luca Di Bartolomeo
- Keywords: Retrowrite, binary rewriting, mobile reverse engineering
Proprietary binary blobs are a common feature of the Android ecosystem. Vendors may not update them and may not compile them with the latest exploit mitigations. Kernel modules are a particular cause for concern given their privileged access.
HexHive’s Retrowrite project is a state-of-the-art binary rewriting tool that can retrofit mitigations onto legacy binaries without needing source code. It currently supports ARM64 and x86-64 in user space, and x86-64 in kernel mode. The goal of this project is to extend support to ARM64 kernel modules, with the ability to add, for example, kASAN instrumentation. We would aim to:
- Identify kernel modules of particular interest, including open source modules to act as ground truth.
- Produce a framework to evaluate the effectiveness of binary rewriting these modules by exercising their functionality, using fuzzing where appropriate.
- Modify Retrowrite to support ARM64 kernel modules.
- Evaluate the implementation against ground truth targets and against targets of interest. Evaluate the cost of instrumentation passes.
Students should have a basic understanding of how Linux kernel modules are built and loaded, and a good grasp of Linux internals. Ambitious students may also have Android Internals knowledge and be interested in testing their work on Android hardware.
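The kind of check a kASAN-style instrumentation pass inserts can be sketched as shadow-memory bookkeeping. The toy heap below is a deliberately simplified model (byte-granular shadow, bump allocator); real kASAN uses 8-byte shadow granularity and far more machinery.

```python
REDZONE = 8  # bytes of poisoned padding around each allocation

class ShadowHeap:
    """Toy heap with byte-granular shadow memory (1 = poisoned)."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.shadow = bytearray([1] * size)  # everything starts poisoned
        self.bump = 0                        # trivial bump-pointer allocator

    def alloc(self, n):
        addr = self.bump + REDZONE           # leading redzone stays poisoned
        for i in range(addr, addr + n):      # unpoison the object itself
            self.shadow[i] = 0
        self.bump = addr + n + REDZONE       # leave a trailing redzone
        return addr

    def load(self, addr):
        # This is the check a binary rewriter would insert before each load.
        if self.shadow[addr]:
            raise MemoryError(f"invalid access at {addr:#x}")
        return self.mem[addr]

heap = ShadowHeap(64)
buf = heap.alloc(4)
heap.load(buf + 3)            # in bounds: fine
try:
    heap.load(buf + 4)        # off-by-one into the redzone
except MemoryError as e:
    print("caught:", e)
```

Retrofitting exactly this kind of load/store check into a stripped ARM64 kernel module, without source, is what the project is about.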
Towards Deterministic Dynamic Analysis of Inter-connected Services
- Point of contact: Florian Hofhammer
- Suitable for: MSc thesis
- Keywords: Emulation, dynamic binary instrumentation, re-hosting, firmware
Previous re-hosting work has mainly focused on bringing a single firmware image up in a virtualized environment for various dynamic analysis use cases. This approach is certainly valid for use cases such as fuzzing, where we want to simply throw as many inputs at a target’s exposed interfaces as possible, not caring too much about how realistic those inputs are.
Take as an example a car’s engine control unit (ECU): if we want to find bugs via fuzzing, we “simply” (a pretty bold claim in the re-hosting world :)) execute it over and over again with randomized inputs to cover as much of its code as possible. The input interfaces might include temperature sensors inside the car’s engine. Assuming the firmware crashes when handling a reported temperature of -999°C, such a crash is certainly an interesting finding but far off from reality and therefore potentially a false positive bug (I’d like to actually see a combustion engine running at negative temperatures, especially at temps below 0K before I consider this a true positive :)).
To weed out false positives in fuzzing or also for other dynamic analysis use cases (could be as simple as quick debugging iteration during firmware development), we would like to re-host firmware in a more realistic environment. One approach for this aim would be to constrain the input spaces to realistic values. Another approach (and the one we want to target in this project) is to model a full system instead of only single components. In the above example, this could mean re-hosting both the ECU firmware and potential sensor endpoints so that the sensor endpoints would provide the ECU with more realistic data based on their internal, potentially opaque calculations. However, we need to ensure deterministic execution of such a composed system to, e.g., reliably reproduce bugs. This project therefore has two goals:
- Build a multi-system dynamic analysis framework
- Design the framework in such a way that execution is deterministic
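The two goals above can be sketched as a lockstep co-simulation: components run in fixed-order rounds and exchange messages through ordered queues, so every run produces the same interleaving. The component names and the sensor model below are invented placeholders, not part of any existing framework.

```python
from collections import deque

class Component:
    """One emulated system in the composed simulation."""
    def __init__(self, name):
        self.name, self.inbox, self.log = name, deque(), []

    def step(self, tick, send):
        raise NotImplementedError

class Sensor(Component):
    def step(self, tick, send):
        temp = 80 + (tick % 5)        # deterministic "temperature" model
        send("ecu", ("temp", temp))

class ECU(Component):
    def step(self, tick, send):
        while self.inbox:             # drain messages in arrival order
            kind, value = self.inbox.popleft()
            self.log.append((tick, kind, value))

def run(components, ticks):
    def send(dst, msg):
        components[dst].inbox.append(msg)
    for tick in range(ticks):
        # Fixed iteration order over a fixed dict => deterministic schedule.
        for comp in components.values():
            comp.step(tick, send)
    return components["ecu"].log

log = run({"sensor": Sensor("sensor"), "ecu": ECU("ecu")}, 3)
print(log)
```

Because the schedule is a pure function of the tick count and insertion order, re-running the composed system replays exactly the same message stream, which is the property needed to reliably reproduce bugs.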
A student working on this project should have:
- Experience with emulation via tools such as QEMU, PANDA, Unicorn, or similar
- Strong systems programming skills (at least C, potentially also other languages such as Rust or Zig)
- Reverse engineering experience
- Experience with, or the willingness to learn about, embedded systems (both microcontroller- and application-processor-based, e.g., Arm Cortex-M and Cortex-A)
Building Better Benchmarks
- Point of contact: Florian Hofhammer
- Suitable for: MSc semester project, potentially also BSc semester project
- Keywords: benchmarking, evaluation, CI/CD, devops
The replication crisis in academia in general, and in CS in particular, has led to developments such as conferences requiring the submission of paper artifacts and awarding badges for reproduced results.
While this is an important step forward in making research results more reproducible, the results are still not necessarily comparable. Different papers on the same subject may use different benchmarks to drive their point home, or use the same benchmarks but report different metrics obtained from those benchmarks (e.g., one paper focuses on compute overhead whereas another paper focuses on memory overhead).
We are aiming to create a standardized set of benchmarks and metrics to evaluate papers’ artifacts and compare their results fairly. In this project, you will integrate existing research artifacts into an evaluation pipeline and extend this pipeline with additional benchmarks. You will then process the benchmark results into a standardized set of metrics.
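As one example of what "a standardized set of metrics" could look like, the sketch below reduces per-benchmark timings to a single geometric-mean overhead relative to an uninstrumented baseline. The benchmark names and timings are invented placeholders, not real measurements.

```python
import math

# Invented example data: wall-clock seconds per benchmark.
baseline = {"libpng": 1.00, "openssl": 2.50, "sqlite": 4.00}
hardened = {"libpng": 1.10, "openssl": 3.00, "sqlite": 4.40}

def geomean_overhead(base, tool):
    """Geometric mean of per-benchmark slowdown ratios (tool / base)."""
    ratios = [tool[b] / base[b] for b in base]
    return math.exp(sum(map(math.log, ratios)) / len(ratios))

overhead = geomean_overhead(baseline, hardened)
print(f"geomean overhead: {(overhead - 1) * 100:.1f}%")
```

The geometric mean is the usual choice for ratios because it is symmetric under inversion; part of the project is deciding which such metrics the pipeline should standardize on.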
This project requires:
- A good understanding of build systems and how to build and run systems software (GNU make, cmake,…)
- Experience with containerization tools such as Docker or Podman
- Scripting/data processing experience (Python, Bash, or your favorite other language if applicable)
- Experience with CI/CD pipelines for automated benchmark running is a plus
SECCOMP implementation for double fetch protection
- Point of contact: Luca Di Bartolomeo
- Suitable for: MSc thesis
- Keywords: kernel security, data race protection, security policy
System call filtering is a crucial part of protection policies ubiquitous in cloud, desktop and mobile environments (Android, Docker, etc.). The existing SECCOMP filter system is unable to inspect arguments passed by reference since the user can modify the values in memory, resulting in a TOCTTOU exploit.
Midas is a novel mitigation for TOCTTOU bugs in the kernel, exploiting the user memory access API to provide double fetch protection. In this project, you will implement and evaluate SECCOMP filtering for system call arguments passed by reference, leveraging Midas to protect the kernel from the double fetch introduced in the process.
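The problem and the Midas-style fix can be simulated deterministically in a few lines: a filter that re-fetches a by-reference argument can be raced, while a single snapshot cannot. All names below are illustrative; the real Midas works on kernel page mappings, not Python objects.

```python
# Single-threaded simulation of a double fetch (TOCTTOU) on a syscall
# argument passed by reference: the "kernel" checks a user buffer, the
# "user" mutates it, then the kernel fetches it again for actual use.

def filter_allows(fetch):
    return fetch().startswith(b"/tmp/")        # the check (first fetch)

def syscall_vulnerable(buf):
    if not filter_allows(lambda: bytes(buf)):
        return "denied"
    buf[:] = b"/etc/shadow"                    # user flips the buffer (race)
    return f"opened {bytes(buf).decode()}"     # the use (second fetch)

def syscall_midas(buf):
    snapshot = bytes(buf)                      # single fetch, then immutable
    if not filter_allows(lambda: snapshot):
        return "denied"
    buf[:] = b"/etc/shadow"                    # too late: kernel uses snapshot
    return f"opened {snapshot.decode()}"

print(syscall_vulnerable(bytearray(b"/tmp/safe")))  # opened /etc/shadow
print(syscall_midas(bytearray(b"/tmp/safe")))       # opened /tmp/safe
```

A SECCOMP filter for by-reference arguments has exactly the vulnerable shape unless it is layered on a snapshotting mechanism like Midas, which is the point of this project.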
This project requires:
- Expert experience in C development
- Experience with standard C/GNU build, development and debug tools (gdb, Makefiles)
- Understanding of OS principles
- Basic OS coding experience (e.g., a course project)
- Understanding of the x86 architecture and assembly coding/debugging
Maintaining Magma: A Ground-Truth Fuzzing Benchmark
- Point of contact: Qiang Liu
- Suitable for: BSc/MSc semester project
- Keywords: Fuzzing, Evaluation, Benchmark
Magma is a fuzzer evaluation framework that enables accurate performance measurements by leveraging ground-truth information on bugs in real software. Magma includes a library of real targets (e.g., libpng, libtiff, or openssl) with real bugs that have been re-introduced into those targets based on previous bug reports and fix commits. By reverse-engineering the commit that fixed a certain bug, we can identify its root cause, reintroduce the bug, and add a check (a canary) that determines when the bug is triggered, based on program state available at runtime (i.e., variable values).
As fuzzers are tuned and improved on a regular basis, the benchmark upon which they are evaluated must likewise be updated to keep pace with progress and avoid becoming outdated. To achieve this, new targets and bugs must be added frequently, and old targets and bugs must be re-checked for relevance, in case some bugs become unreachable/untriggerable, or in case the target’s source code has changed so much that a bug can no longer be reintroduced without reintroducing old functionality.
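A Magma-style canary can be illustrated with a toy parser: a patched-out off-by-one is reintroduced, guarded by a check that reports when runtime state would trigger the bug instead of letting it corrupt memory. The function names, bug ID, and the bug itself are invented for illustration.

```python
triggered_bugs = set()

def magma_log(bug_id):
    """Stand-in for Magma's runtime canary reporting."""
    triggered_bugs.add(bug_id)

def parse_chunk(data, buf_size=8):
    length = data[0]              # attacker-controlled length byte
    if length > buf_size:         # reintroduced buggy bounds check:
        return None               # the fix changed this to `>=`
    if length == buf_size:        # canary: the exact state under which the
        magma_log("TOY001")       # off-by-one would fire
    return data[1:1 + length]

parse_chunk(bytes([4]) + b"AAAA")    # benign input: canary stays silent
parse_chunk(bytes([8]) + b"B" * 8)   # triggering input: canary fires
print(triggered_bugs)
```

Porting such oracles forward as target code evolves (and keeping the canary condition faithful to the original bug) is the maintenance work this project involves.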
For this project, you are expected to:
- Add a few new fuzzers to Magma
- Port existing bug oracles to recent targets
- Develop CI/CD to handle third-party testing requests
Legacy Rebooted: A Comparative Study of Unix Utilities in Rust and C
- Point of contact: Rafaila Galanopoulou
- Suitable for: MSc semester project or thesis, potentially also BSc semester project
- Keywords: systems security, language safety, vulnerability analysis
Developed for over 30 years, Linux has become the computing foundation of today’s digital world; from gigantic, complex mainframes (e.g., supercomputers) to cheap, wimpy embedded devices (e.g., IoT nodes), countless applications are built on top of it, yet most of its core utilities are written in memory-unsafe C. Rust, by contrast, is a statically and strongly typed language whose safety model regulates access to memory: at any given time, a memory location may have either a single writer or many readers, but not both.
We focus on a selection of widely used Unix utilities such as sed, grep, tar, find, netcat, and sort. For each utility, we will either port an existing Rust-based clone or develop a minimal functional reimplementation in Rust using automated translation tools. The original GNU or BSD implementations written in C will serve as the baseline for comparison.
For this project, the objectives are to:
- Identify and compare the types and historical evolution of vulnerabilities in C and Rust implementations.
- Analyze how Rust’s safety guarantees mitigate specific classes of bugs.
- Evaluate the practicality and trade-offs of rewriting or porting Unix utilities from C to Rust.
- Investigate the impact of feature set size and code complexity on the overall security surface.
How deep is your love?
- Point of contact: Rafaila Galanopoulou
- Suitable for: MSc semester project or thesis, potentially also BSc semester project
- Keywords: cross-language analysis, dependency mapping, software supply chain security
Modern software ecosystems increasingly rely on cross-language architectures. These dependencies are often deeply nested and opaque, making it difficult to fully understand control flow, data flow, and potential risk boundaries. Using multiple programming languages and libraries brings additional challenges, such as increased complexity and support burden: developers must track versions, bugs, and compatibility across multiple compilers and runtimes rather than a single toolchain.
We focus on a curated set of 10–50 real-world PyPI software projects, selected based on factors such as popularity and maturity. For each project, we will construct call graphs using either static analysis or dynamic tracing. These graphs aim to capture complete invocation paths, including transitions between high-level code and native components written in other programming languages.
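The dynamic-tracing side of this can be sketched with Python's built-in profiling hook, which distinguishes Python-to-Python calls from calls that cross into native code. This only demonstrates the tracing idea; real tooling would also need import- and FFI-aware symbol resolution. The `workload` function is an invented example target.

```python
import json
import sys

edges = set()  # (caller, callee, kind) call-graph edges

def profiler(frame, event, arg):
    caller = frame.f_code.co_name
    if event == "call" and frame.f_back is not None:
        # Python -> Python edge: frame is the callee, f_back the caller.
        edges.add((frame.f_back.f_code.co_name, caller, "python"))
    elif event == "c_call":
        # Python -> native edge (a builtin or C-extension function).
        edges.add((caller, arg.__name__, "native"))

def workload():
    data = json.dumps({"a": 1})    # exercises both Python and C code paths
    return len(data)               # len is a native (C) call

sys.setprofile(profiler)
workload()
sys.setprofile(None)

print(sorted(e for e in edges if "workload" in e))
```

Even this tiny run yields both a Python edge (`workload` to `dumps`) and a native edge (`workload` to `len`); annotating and classifying the unresolved native side of such edges is one of the project objectives.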
For this project, the objectives are to:
- Select and analyze a diverse set of Python projects using cross-language components.
- Build static or dynamic call graphs that trace API and function invocations across language boundaries.
- Annotate and classify unresolved edges, particularly those crossing into foreign code, to identify fragile or opaque integration points.
- Visualize and quantify key metrics: total functions, call graph coverage, language breakdown, and which functionalities are implemented in each language.
Other projects
Several other projects are possible in the areas of software and system security. We are open to discussing possible projects around the development of security benchmarks, using machine learning to detect vulnerabilities, secure memory allocation, sanitizer-based coverage tracking, and others.