HexHive PhD, MSc, BSc projects

This is a list of possible open, unassigned BSc or MSc research projects in the HexHive group for EPFL students.

Check out our list of completed projects to get an idea of past projects.

The projects are designed to scale to a BSc, MSc, or short PhD research project, depending on the depth of the evaluation and exploration. For all projects we expect an open mind, good coding skills (especially in C/C++), and the willingness to learn. Previous experience with compilers (LLVM), build systems, and reverse engineering helps but is not required.

If you are interested in any of these topics then apply through our application form. All project applications have to go through this form and we will internally discuss all applications and then invite students for interviews. Apply early as spots are limited.

In the HexHive group we welcome independent ideas from students as well, as long as they focus on the topics of system and software security, especially (but not exclusively) sanitization of unsafe code, interactions between different components, mitigations, compiler-based security analyses, or fuzzing. So if you have an idea for your own project, let us know and we can discuss! Reach out to the people closest to your ideas and we'll let the ideas bubble up.

Android aCropalypse
  • Point of contact: Luca Di Bartolomeo
  • Suitable for: MSc semester project / thesis
  • Keywords: Reverse Engineering, Static Analysis

You might have heard about the recent security disaster known as aCropalypse. It turns out that the cause of this bug is Google silently changing an Android API for opening files, so that files are no longer truncated when they are opened.

This is pretty wild and we think that there might be many more applications of aCropalypse, not just cropped screenshots. This project is about writing tooling that automatically analyzes Android APKs and searches for other potential data leaks.
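The effect of a missing truncation can be reproduced outside Android. The following is a minimal sketch at the POSIX level, not the Android API itself: the helper names `overwrite` and `demo` and the file contents are hypothetical, and the only mechanism shown is that opening an existing file for writing without `O_TRUNC` leaves the tail of the old content in place.

```python
import os
import tempfile


def overwrite(path, data, truncate):
    """Write data at offset 0 of path, optionally truncating first, and
    return the resulting file content."""
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC  # discard the old content on open
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Return (content without truncation, content with truncation)."""
    original = b"ORIGINAL-UNCROPPED-IMAGE-DATA"  # stand-in for the full image
    cropped = b"CROPPED"                         # stand-in for the edited image
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "screenshot.bin")
        overwrite(path, original, truncate=True)
        leaky = overwrite(path, cropped, truncate=False)
        overwrite(path, original, truncate=True)
        safe = overwrite(path, cropped, truncate=True)
    return leaky, safe


if __name__ == "__main__":
    leaky, safe = demo()
    print("without truncation:", leaky)  # old trailing bytes survive
    print("with truncation:   ", safe)
```

A static analysis for this class of leak would, under the same assumption, look for write paths that reuse an existing file without requesting truncation.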

A candidate should be interested in:

  • Android application reverse engineering
  • Static analysis tooling for Android APKs

ARM64 Kernel Driver Retrowriting
  • Point of contact: Luca Di Bartolomeo
  • Keywords: Retrowrite, binary rewriting, mobile reverse engineering

Proprietary binary blobs are a common feature of the Android ecosystem. Vendors may not update these and may not compile them with the latest exploit mitigations. Kernel modules are a particular cause for concern given their privileged access.

HexHive’s Retrowrite project is a state-of-the-art binary rewriting tool that can retrofit mitigations to legacy binaries without the need for source code. It currently works on ARM64 and x86-64 platforms, and on x86-64 in kernel mode. The goal of this project is to target ARM64 kernel modules, with the ability to add, for example, kASAN. We would aim to:

  • Identify kernel modules of particular interest, including open source modules to act as ground truth.
  • Produce a framework to evaluate the effectiveness of binary rewriting these modules by exercising their functionality, using fuzzing where appropriate.
  • Modify Retrowrite to support ARM64 kernel modules.
  • Evaluate the implementation against ground truth targets and against targets of interest. Evaluate the cost of instrumentation passes.

Students should have a basic understanding of how Linux kernel modules are built and loaded, and a good grasp of Linux internals. Ambitious students may also bring knowledge of Android internals and be interested in testing their work on Android hardware.

Investigating the RP2350 bootrom
  • Point of contact: Florian Hofhammer
  • Suitable for: MSc semester project
  • Keywords: Emulation, dynamic binary instrumentation, re-hosting

The new RP2350 microcontroller has a wide range of capabilities, including Arm TrustZone-M for its Cortex-M33 cores, dual-architecture (RISC-V/Arm) support, special RPi-only peripherals (HSTX, PIO), etc. Furthermore, it supports secure boot and booting from encrypted flash. These security features are new additions in comparison to the features of the (older and less capable) RP2040 microcontroller.

In order to properly support all those features, not only the hardware but also the bootrom needed to be redesigned. Raspberry Pi luckily open-sourced the code for the bootrom, which allows for static analysis of the code. However, static analysis only gets us so far; dynamic analysis provides us with even more insights into its functionality and run-time values for variables, peripheral state, etc.

In this project, we aim to re-host the RP2350 bootrom into a virtual environment, which allows us to step through the bootrom’s code at runtime and interact with its runtime state.

An interested student ideally has experience with embedded development for microcontrollers and is comfortable with reading/understanding Arm and/or RISC-V assembly. Familiarity with emulation frameworks such as Unicorn is a plus.

Creating a cycle-accurate multi-architecture simulator
  • Point of contact: Florian Hofhammer
  • Suitable for: MSc semester project
  • Keywords: Microarchitecture, simulation

r2wars is a game in which participants write small assembly bots that execute in the same address space. Whichever bot crashes first (e.g., by being overwritten by the competing bot) loses. This game is an adaptation of the original Corewars idea but with a twist: instead of being based on an assembly-like programming language designed specifically for this kind of game, r2wars builds on top of the radare2 reverse engineering tooling and allows bots to be written in real-world architecture assembly (e.g., x86, Arm, RISC-V, MIPS, …). However, while the supported ISAs are taken from the real world, the execution model is far from reality. Most instructions take the same amount of time, no matter the ISA or their complexity. In a real system, more complex instructions would take more cycles to execute, and microarchitectural state such as pipeline state, caches, etc. would affect performance.

In this project, we aim to model r2wars “closer to the real world”, i.e., execute instructions in a cycle-accurate simulation. For this purpose, we can leverage a cycle-accurate simulator such as gem5.
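The difference between the two execution models can be sketched with a toy cost table. Everything in this sketch is hypothetical: the names `TOY_COSTS`, `total_cycles`, and `first_to_finish` are made up, the cycle counts do not describe any real microarchitecture, and a cycle-accurate simulator would also model pipelines and caches, which this ignores.

```python
# Illustrative per-mnemonic cycle costs. These numbers are invented for the
# sketch and are not taken from any real CPU or from r2wars.
TOY_COSTS = {"nop": 1, "mov": 1, "add": 1, "mul": 3, "store": 4, "div": 20}


def total_cycles(program, costs=None):
    """Cycles needed to run a straight-line program (a list of mnemonics).

    With costs=None every instruction takes one cycle, mirroring the uniform
    timing described above; otherwise each mnemonic is looked up in costs.
    """
    if costs is None:
        return len(program)
    return sum(costs[op] for op in program)


def first_to_finish(bots, costs=None):
    """Name of the bot whose program completes in the fewest cycles."""
    return min(bots, key=lambda name: total_cycles(bots[name], costs))


if __name__ == "__main__":
    bots = {
        "short_but_heavy": ["div", "store"],
        "long_but_light": ["mov", "add", "nop"],
    }
    print("uniform timing:", first_to_finish(bots))
    print("toy cost model:", first_to_finish(bots, TOY_COSTS))
```

Under uniform timing the shorter program wins, while under the toy cost table the longer program made of cheap instructions finishes first, which is the kind of shift a more realistic model would introduce.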

The project requires familiarity with assembly code (at least reading and understanding) as well as basic knowledge about microarchitectural state.

SECCOMP implementation for double fetch protection
  • Point of contact: Luca Di Bartolomeo
  • Suitable for: MSc thesis
  • Keywords: kernel security, data race protection, security policy

System call filtering is a crucial part of protection policies that are ubiquitous in cloud, desktop, and mobile environments (Android, Docker, etc.). The existing SECCOMP filter system is unable to inspect arguments passed by reference because the user can modify the values in memory after the filter has checked them, which results in a time-of-check-to-time-of-use (TOCTTOU) exploit.

Midas is a novel mitigation for TOCTTOU bugs in the kernel, exploiting the user memory access API to provide double fetch protection. In this project, you will implement and evaluate SECCOMP filtering for system call arguments passed by reference, leveraging Midas to protect the kernel from the double fetch introduced in the process.
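The double fetch problem can be illustrated with a deterministic toy model. This is only a sketch: `user_mem` stands in for user memory, the `between` callback stands in for a racing user-space thread, and `ALLOWED_PREFIX` is a made-up policy. The snapshot variant merely shows why a single consistent view defeats the race; it is not a description of how Midas or SECCOMP are implemented.

```python
ALLOWED_PREFIX = b"/tmp/"  # hypothetical filter policy


def filter_then_use(user_mem, between):
    """Vulnerable pattern: fetch once for the check, fetch again for the use."""
    checked = bytes(user_mem)              # first fetch (time of check)
    if not checked.startswith(ALLOWED_PREFIX):
        raise PermissionError("path rejected by filter")
    between()                              # a user thread may run here
    return bytes(user_mem)                 # second fetch (time of use)


def snapshot_then_use(user_mem, between):
    """Check and use the same snapshot, so a later modification has no effect."""
    snapshot = bytes(user_mem)             # single fetch
    if not snapshot.startswith(ALLOWED_PREFIX):
        raise PermissionError("path rejected by filter")
    between()
    return snapshot


if __name__ == "__main__":
    mem = bytearray(b"/tmp/ok")

    def race():
        mem[:] = b"/etc/pw"                # attacker swaps the argument

    print("double fetch uses:", filter_then_use(mem, race))
    mem[:] = b"/tmp/ok"
    print("snapshot uses:    ", snapshot_then_use(mem, race))
```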

  • This project requires:
    • Expert experience in C development
    • Experience with standard C/GNU build, development and debug tools (gdb, Makefiles)
    • Understanding of OS principles
    • Basic experience of OS coding/course project
    • Understanding of the x86 architecture and assembly coding/debugging

Benchmarking Fuzzers For Seed Selection Capability
  • Point of contact: Han Zheng
  • Keywords: Benchmark, Fuzzing

Fuzzing is an efficient software testing technique for revealing bugs and has therefore been widely investigated in both academia and industry. Despite the growing number of newly proposed fuzzing prototypes, evaluating a fuzzer’s coverage capability remains challenging.

Existing platforms like FuzzBench pick well-constructed harnesses, which enable fuzzers to iterate over each seed in the queue exhaustively.
Real-world scenarios, however, may deviate from this ideal: seed explosion is widespread, so a fuzzer’s seed selection capability is critical and should not be deprioritized in the evaluation.

In this project, we will extend FuzzBench to more complex targets, which allows a more thorough assessment of a fuzzer’s seed selection capability.

The goals of this project:

  • design a metric to define and select the “complex” targets
  • integrate the targets into FuzzBench and evaluate existing fuzzers
  • propose metrics other than coverage to assess the seed selection capability.

A candidate should be interested in (ideally familiar with) the following:

  • Python
  • Basic knowledge of configure/cmake/make
  • Experience with coverage-guided greybox fuzzers (e.g., AFL/AFL++)

Hyper-Cube2 for 64-bit Hypervisors
  • Point of contact: Qiang Liu
  • Suitable for: MSc semester project
  • Keywords: Blackbox, Virtual Device, Fuzzing

Virtual devices remain the main attack surface of hypervisors. Vulnerabilities in virtual devices lead to denial of service, data breaches, execution hijacking, and other security problems. Hyper-Cube was proposed to fuzz virtual devices. It is a blackbox fuzzer and has very high throughput. Hyper-Cube has great usability but suffers from the following two problems:

  • Hyper-Cube is only compatible with hypervisors that support an x86 32-bit OS.

In this project, we aim to achieve three specific goals.

  • Adjust Hyper-Cube OS to be 64-bit compatible.
  • Optimize the implementation to achieve higher throughput.
  • Analyze the bugs found, report them, and help fix them.

We are building an extensive blackbox virtual device fuzzer based on Hyper-Cube. A candidate is required to have experience with programming in C, compiling with Clang, program profiling (e.g., FlameGraph), and algorithm optimization. It is preferable but not necessary to have experience with fuzzing, the development of operating systems and virtual devices, ARM, Hyper-V, VMware ESXi, and macOS.

Maintaining Magma: A Ground-Truth Fuzzing Benchmark
  • Point of contact: Qiang Liu
  • Suitable for: BSc/MSc semester project
  • Keywords: Fuzzing, Evaluation, Benchmark

Magma is a fuzzer evaluation framework that enables accurate performance measurements by leveraging ground-truth information on bugs in real software. Magma includes a library of real targets (e.g., libpng, libtiff, openssl) with real bugs that have been re-introduced into those targets based on previous bug reports and fix commits. By reverse-engineering the commit which fixed a certain bug, we can identify the root cause of the bug, reintroduce it, and add a check (a canary) that determines when the bug is triggered, based on program state information available at runtime (i.e., variable values).
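The canary idea can be sketched on a toy target. The code below is not Magma's API: the `canary` helper, the bug identifier `TOY001`, the `read_field` function, and the imagined fix commit are all hypothetical, and counting "reached" separately from "triggered" is an assumption of this sketch. It only illustrates how the condition from a removed bounds check can be re-evaluated at runtime to record when the reintroduced bug fires.

```python
from collections import Counter

reached = Counter()    # how often the buggy code location was executed
triggered = Counter()  # how often the bug condition actually held


def canary(bug_id, condition):
    """Record that a bug site was reached and whether its condition held."""
    reached[bug_id] += 1
    if condition:
        triggered[bug_id] += 1


def read_field(buf, offset, length):
    """Toy parser step with a reintroduced out-of-bounds read.

    The imagined fix commit added a guard that rejected the input whenever
    offset + length > len(buf). Reintroducing the bug means dropping that
    guard; the canary evaluates the same condition on runtime values.
    """
    canary("TOY001", offset + length > len(buf))
    # Python slicing clamps silently; in C this would read past the buffer.
    return buf[offset:offset + length]


if __name__ == "__main__":
    read_field(b"abcdef", 0, 3)   # benign input: reached, not triggered
    read_field(b"abcdef", 4, 10)  # malformed input: reached and triggered
    print("reached:", reached["TOY001"], "triggered:", triggered["TOY001"])
```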

As fuzzers are tuned and improved on a regular basis, the benchmark upon which they’re evaluated must equally be upgraded to keep up with the progress and avoid becoming outdated. To achieve this, new targets and bugs must be added frequently, and old targets and bugs must be checked again for relevance, in case some bugs become unreachable/untriggerable, or in case the target’s source code has changed enough to disallow the reintroduction of some bug without reintroducing old code functionality.

For this project, you are expected to:

  • Add a few new fuzzers to Magma
  • Port existing bug oracles to recent targets
  • Develop CI/CD to handle third-party testing requests

Other projects

Several other projects are possible in the areas of software and system security. We are open to discussing possible projects around the development of security benchmarks, using machine learning to detect vulnerabilities, secure memory allocation, sanitizer-based coverage tracking, and others.