HexHive PhD, MSc, BSc projects

This is a list of possible open, unassigned BSc or MSc research projects in the HexHive group for EPFL students.

The projects are designed to scale to a BSc, MSc, or short PhD research project, depending on the depth of the evaluation and exploration. For all projects we expect an open mind, good coding skills (especially in C/C++), and the willingness to learn. Previous experience with compilers (LLVM), build systems, and reverse engineering helps but is not required.

If you are interested in any of these topics then apply through our application form. All project applications have to go through this form and we will internally discuss all applications and then invite students for interviews. The application deadline for projects in spring '24 is December 15, 2023.

In the HexHive we welcome independent ideas from students as well, as long as they focus on the topics of system and software security, especially (but not exclusively) in sanitization of unsafe code, interactions between different components, mitigations, compiler-based security analyses, or fuzzing. So if you have an idea for your own project let us know and we can discuss! Reach out to the people closest to your ideas and we'll let the ideas bubble up.

Library Fuzzing

Unlike fuzzing CLI programs, whose input is modeled as a stream of bytes, fuzzing libraries requires drivers (library consumers) to bridge an input into a sequence of API calls. The code coverage and error discovery depend on the API combinations within the driver. Therefore, it is crucial to have interesting drivers to deeply test a target library. Unfortunately, building such drivers is challenging due to a lack of semantic information about the APIs and their usage. Moreover, insidious errors may appear only with rare API sequences. Current techniques infer API usage from already-existing programs; however, the quality of the new drivers is inevitably limited by the existing consumers. In this project, we aim at generating library drivers without looking into existing consumers. Precisely, we use a combination of static analysis and automatic testing to mine the API usage and automatically build drivers able to explore a larger portion of the library code and trigger more complex errors.
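As a minimal sketch of what "bridging an input into a sequence of APIs" means, the following toy driver decodes a byte string into calls on a small stack class. The `Stack` class, its planted bug, and the byte-to-call encoding are all hypothetical and chosen only for illustration; they are not the drivers or the tooling used in this project.

```python
class Stack:
    """Toy 'library' under test (hypothetical, for illustration only)."""

    def __init__(self):
        self.items = []

    def push(self, value):
        self.items.append(value)

    def pop(self):
        # Deliberate bug: no emptiness check, so pop() on an empty
        # stack raises IndexError.
        return self.items.pop()

    def clear(self):
        self.items.clear()


def driver(data: bytes) -> list:
    """Decode fuzzer-provided bytes into a sequence of API calls and run them."""
    stack = Stack()
    trace = []
    i = 0
    while i < len(data):
        op = data[i] % 3  # one byte selects the API to call
        i += 1
        if op == 0:
            if i >= len(data):
                break  # not enough bytes left for the argument
            arg = data[i]
            i += 1
            stack.push(arg)
            trace.append(f"push({arg})")
        elif op == 1:
            stack.pop()
            trace.append("pop()")
        else:
            stack.clear()
            trace.append("clear()")
    return trace
```

In this toy, the input `bytes([0, 7, 1])` exercises only the benign sequence push, pop, whereas `bytes([0, 7, 2, 1])` produces push, clear, pop and hits the planted bug. The error is reachable only through one particular API sequence, which is why the set of sequences a driver can express matters.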

The research questions in this project are:

  • how can we design static analysis to infer API dependency information and use it to build interesting drivers?
  • how can we use feedback from automatic testing to refine the driver generation (e.g., remove incorrect API sequences)?

The candidate will be required to assist in the design and development of a prototype for testing different driver building strategies. The prototype will be a combination of different technologies, such as static analysis over LLVM IR, Python modules for the driver generation, and a fuzzer for the automatic testing.

A candidate should be interested in (or familiar with) at least one of the following topics.

  • LLVM/Clang (also C/C++ will help)
  • Basic knowledge of static analysis
  • Python and OOP

Android aCropalypse
  • Point of contact: Luca Di Bartolomeo
  • Suitable for: MSc Semester Project / Thesis
  • Keywords: Reverse Engineering, Static Analysis

You might have heard about the recent security disaster that is aCropalypse. Well, it turns out that the reason behind this bug is that Google silently changed the behavior of one of Android’s APIs for opening files, so that files are no longer truncated when they are opened.

This is pretty wild and we think that there might be many more applications of aCropalypse, not just cropped screenshots. This project is about writing tooling to automatically analyze Android apks and search for potential alternative data leaks.
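The effect of a missing truncation can be reproduced at the POSIX file-descriptor level. The sketch below is only an analogy under that assumption: it uses Python's `os.open` with and without `O_TRUNC` rather than the Android API in question, and the file name and contents are made up.

```python
import os
import tempfile


def overwrite(path: str, data: bytes, truncate: bool) -> None:
    """Write data at the start of an existing file, optionally truncating it first."""
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo() -> tuple:
    """Return (original, leaked, clean) file contents for the two open modes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "screenshot.bin")  # hypothetical file name
        original = b"HEADER" + b"SECRET-PIXELS" * 4
        with open(path, "wb") as f:
            f.write(original)

        # Without O_TRUNC the shorter "cropped" data overwrites only the
        # prefix; the tail of the old file stays on disk.
        overwrite(path, b"CROPPED", truncate=False)
        with open(path, "rb") as f:
            leaked = f.read()

        # With O_TRUNC the old contents are discarded before writing.
        overwrite(path, b"CROPPED", truncate=True)
        with open(path, "rb") as f:
            clean = f.read()
    return original, leaked, clean
```

A static analysis looking for similar leaks would, roughly speaking, search for code paths that rewrite an existing file with shorter content through an open call that does not truncate.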

A candidate should be interested in:

  • Android application reverse engineering
  • Static analysis tooling for Android apks

Software Compartmentalization Benchmark suite
  • Point of contact: Andrés Sánchez
  • Keywords: compartmentalization, modularity, web applications

Compartmentalization is a software-development principle to reduce a program’s attack surface and limit the exploitability of bugs. A compartmentalized program is separated into a number of compartments, each of which executes with minimal privileges and rights, and communicates only through a structured API. Essentially, an exploit in one compartment should not trivially compromise other compartments.

We propose a semester/thesis project for master’s students with software development expertise to compartmentalize high-risk software. Prime examples of such software are webservers, browsers and operating systems. We are open to other suggestions. We would like to eventually have a set of representative software comprising a benchmark suite against which to evaluate the different compartmentalization techniques.

A benchmark suite would preferably be portable, running on different operating systems, libraries, and hardware, and be amenable to porting onto hardware or software research proposals for better compartmentalization.

WebAssembly-based protection, strengths and limitations
  • Point of contact: Andrés Sánchez
  • Keywords: compartmentalization, program analysis, webassembly

WebAssembly is a standard virtual architecture to which a program can be compiled. Thanks to its high performance and isolation through a sandbox, a developer can compile regular source code (e.g., written in C or Rust) to WebAssembly, ensuring that the interaction with the WebAssembly module is limited to the interfaces it exports. Software known for containing vulnerabilities can therefore be placed in an external module.

In this project, tailored for an MSc project/thesis, the student will analyze existing code and determine the shortcomings produced by its conversion to WebAssembly for security purposes. Ideally, a monolithic program can be split in such a way that the resulting version will be composed of several WebAssembly modules. This study requires the characterization of the limitations of running WebAssembly code and a fine-grained runtime analysis of the resulting software. The outcome shall be compared with other existing techniques.

This project can also be accomplished by extending the features of the WebAssembly standard to support more software.

Type confusion test suite
  • Point of contact: Nicolas Badoux
  • Keywords: sanitizer, type confusion, test suite

Type confusion is a common vulnerability in C/C++ programs. It occurs when an object of one type is incorrectly cast to another type. This can lead to memory corruption and code execution. HexHive has published a number of works trying to detect and mitigate the impact of type confusions. The goal of this project is to create a test suite for type confusion detection tools. Recent works have been evaluated on a common run-time performance benchmark but they miss a validation on a common set of type confusion bugs. The test suite will be composed of a set of programs and unit tests with type confusion bugs. Some bugs should be based on real world vulnerabilities while others can be purely synthetic.

We would aim to:

  • Identify a set of type confusion bugs in real world programs.
  • Create a set of synthetic type confusion bugs.
  • Create a representative set of unit tests for type confusion detection tools.
  • Evaluate state-of-the-art type confusion detection tools on the test suite.

Students should have a basic understanding of how C/C++ programs are built and a good grasp of Linux internals.

Fuzzing C++ libraries
  • Point of contact: Nicolas Badoux
  • Keywords: library fuzzing, fuzzing, C++

Unlike fuzzing CLI programs, whose input is modeled as a stream of bytes, fuzzing libraries requires drivers (library consumers) to bridge an input into a sequence of API calls. The code coverage and error discovery depend on the API combinations within the driver. Recent work at HexHive has shown promising results for automatically generating these drivers for C libraries. The goal of this project is to extend this work to C++ libraries. In particular, some adaptations will be necessary to handle the object-oriented nature of C++ and to support casting operations.

The candidate will be required to identify the necessary adaptations to the existing C library fuzzing tool as well as implement support for them in the existing framework. The prototype will be a combination of different technologies, such as static analysis over LLVM IR, Python modules for the driver generation, and a fuzzer for the automatic testing. The candidate will also be in charge of finding and motivating the choice of suitable C++ libraries to test.

A candidate should be interested in (or familiar with) the following topics.

  • LLVM/Clang (also C/C++ will help)
  • Python

ARM64 Kernel Driver Retrowriting
  • Point of contact: Luca Di Bartolomeo
  • Keywords: Retrowrite, binary rewriting, mobile reverse engineering

Proprietary binary blobs are a common feature of the Android ecosystem. Vendors may not update these and may not compile them with the latest exploit mitigations. Kernel modules are a particular cause for concern given their privileged access.

HexHive’s Retrowrite project is a state-of-the-art binary rewriting tool that can retrofit mitigations to legacy binaries without the need for source code. This currently works on ARM64 and x86-64 platforms, and x86-64 in kernel mode. The goal of this project would be to target ARM64 kernel modules, with the ability to add, for example, kASAN. We would aim to:

  • Identify kernel modules of particular interest, including open source modules to act as ground truth.
  • Produce a framework to evaluate the effectiveness of binary rewriting these modules by exercising their functionality, using fuzzing where appropriate.
  • Modify Retrowrite to support ARM64 kernel modules.
  • Evaluate the implementation against ground truth targets and against targets of interest. Evaluate the cost of instrumentation passes.

Students should have a basic understanding of how Linux kernel modules are built and loaded, and a good grasp of Linux internals. Ambitious students may also have Android Internals knowledge and be interested in testing their work on Android hardware.

Leveraging application security through memory tagging
  • Point of contact: Andrés Sánchez
  • Keywords: Software development, virtual memory, compilers

Memory tagging is a hardware extension that adds a level of restriction when dereferencing memory addresses: the key held should match the memory key. Two implementations are Memory Protection Keys (MPK) on x86-64 and Memory Tagging Extension (MTE) on ARM64. They differ in granularity (page vs. 16 bytes) and in where the key is stored (register vs. per-pointer), resulting in substantially different programming models.
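To make the "key must match" check concrete, here is a toy software model of the MTE-style variant only: memory is tagged in 16-byte granules and the pointer carries a 4-bit tag in its upper bits. The class and function names are hypothetical, and the model ignores everything else about the real hardware (tag storage, fault modes, MPK's per-page keys).

```python
GRANULE = 16     # MTE tags memory in 16-byte granules
TAG_SHIFT = 56   # MTE keeps a 4-bit tag in pointer bits 59:56
TAG_MASK = 0xF


class TagFault(Exception):
    """Raised when the pointer tag does not match the memory tag."""


class TaggedMemory:
    def __init__(self, size: int):
        self.data = bytearray(size)
        self.tags = [0] * ((size + GRANULE - 1) // GRANULE)

    def tag_region(self, addr: int, length: int, tag: int) -> int:
        """Tag every granule covering [addr, addr + length) and return a tagged pointer."""
        first = addr // GRANULE
        last = (addr + length + GRANULE - 1) // GRANULE
        for granule in range(first, last):
            self.tags[granule] = tag & TAG_MASK
        return addr | ((tag & TAG_MASK) << TAG_SHIFT)

    def load(self, ptr: int) -> int:
        """Load one byte, checking the pointer tag against the granule tag."""
        tag = (ptr >> TAG_SHIFT) & TAG_MASK
        addr = ptr & ((1 << TAG_SHIFT) - 1)
        if self.tags[addr // GRANULE] != tag:
            raise TagFault(f"pointer tag {tag} != memory tag {self.tags[addr // GRANULE]}")
        return self.data[addr]
```

In this model, two adjacent 16-byte objects tagged 3 and 5 can each be read through their own pointer, but reading one byte past the end of the first object through its pointer lands in a granule with a different tag and faults. That is the kind of linear overflow such a scheme is intended to catch.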

The adoption of such a technology would be decisive for finding memory safety bugs in existing pieces of code such as databases, cryptographic toolkits, operating system kernels, web servers, web browsers… Although these technologies are already supported (like MPK, for which the Linux kernel provides an interface), their adoption from the application side requires a preliminary study which remains to be done.

This project includes:

  • Acquisition of familiarity with a relevant program source code base that would benefit from memory tagging.
  • Source code modification of the codebase to include support for memory tagging.
  • Functionality testing and performance impact benchmarking.
  • Potential adoption of the source code modification in the project upstream.

This project can be performed by either bachelor or master students, as there are different challenging codebases that can be addressed. It is also possible to do a master thesis out of it by creating a compiler-based framework that outlines in a sound way the possible protections an application can receive and analyzes them.

Benchmarking Fuzzers for Structured Text Input Software
  • Point of contact: Chibin Zhang
  • Suitable for: Master Thesis Project
  • Keywords: fuzzing, benchmark, compilers, data analysis

Fuzzing is an effective technique for finding bugs in software. Prior works have created benchmarks to assess the performance of fuzzers. However, these benchmarks are biased towards targets that accept binary inputs and towards fuzzers that mutate at the byte level. Additionally, they suffer from saturation, meaning the performance differences between top fuzzers are often insignificant. It is a known issue that existing byte-level fuzzers do not perform well on targets accepting structured text inputs. Current fuzzing benchmarks do not include state-of-the-art structure-aware fuzzers, such as grammar fuzzers, in their baselines. This is due to the fact that these fuzzers typically require additional grammars, dictionaries, or large seed corpora. Furthermore, existing structure-aware fuzzers have been evaluated on a limited set of disparate targets, run with different specifications, making it challenging to compare their performance quantitatively or even qualitatively.
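The gap between byte-level and structure-aware input generation can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not one of the fuzzers named in this project: it uses a hand-written JSON subset grammar, a single random byte replacement as the "byte-level mutation", and Python's `json.loads` as a stand-in for the target's parser.

```python
import json
import random

# Hypothetical toy grammar for a small JSON subset. The first rule of each
# nonterminal is non-recursive or leads to termination within one step.
GRAMMAR = {
    "value": [["object"], ["array"], ["number"], ["string"]],
    "object": [["{", "}"], ["{", "string", ":", "value", "}"]],
    "array": [["[", "]"], ["[", "value", "]"], ["[", "value", ",", "value", "]"]],
    "number": [["0"], ["42"], ["-7"]],
    "string": [['"a"'], ['"key"']],
}


def generate(symbol: str, rng: random.Random, depth: int = 0, max_depth: int = 4) -> str:
    """Expand a grammar symbol into a string, bounding recursion depth."""
    if symbol not in GRAMMAR:
        return symbol  # terminal token
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        rules = rules[:1]  # force the terminating rule
    rule = rng.choice(rules)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in rule)


def byte_mutate(text: str, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    raw = bytearray(text.encode())
    raw[rng.randrange(len(raw))] = rng.randrange(256)
    return bytes(raw)


def is_valid(blob) -> bool:
    """Return True if the parser accepts the input."""
    try:
        json.loads(blob)
        return True
    except ValueError:
        return False


def validity_rates(samples: int = 500, seed: int = 0) -> tuple:
    """Fraction of parser-accepted inputs for grammar output vs. mutated output."""
    rng = random.Random(seed)
    generated = [generate("value", rng) for _ in range(samples)]
    mutated = [byte_mutate(g, rng) for g in generated]
    return (sum(map(is_valid, generated)) / samples,
            sum(map(is_valid, mutated)) / samples)
```

By construction every grammar-generated input is accepted by the parser, while `validity_rates()` lets one measure how often a single byte replacement already produces an input that the parser rejects before any deeper logic is reached. A benchmark would measure this kind of effect on real targets with coverage and bug metrics rather than parse success alone.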

In this project, you will create an extensive benchmark for targets that accept structured text inputs. You are expected to integrate at least 8 structure/syntax-aware fuzzers and 16 new targets (latest version), along with the required grammars, dictionaries, and corpora. It is suggested to use the Nix build system, as its build configurations are written declaratively and build artifacts are deterministic. This choice is anticipated to streamline the benchmarking process and ensure reproducibility. You will then conduct fuzzing campaigns and analyze the results quantitatively. A potential focus could be assessing the impact of the provided grammars, dictionaries, and corpora on the performance of the fuzzers. The build, run, and analysis scripts will be open-sourced to facilitate future research.

Examples of interesting fuzzers and targets for integration:

  • Fuzzers: AFL++ with cmplog and autodict, Token-level AFL, Gramatron, Nautilus, Grimoire, Superion, Polyglot, CSmith.
  • Targets:
    • All targets included in fuzzbench.
    • Compilers/Interpreters/Assemblers accepting code inputs: clang, hotspot, python, php, ruby, v8, JavaScriptCore, SpiderMonkey.
    • Document formats: html, postscript, word, rtf, roff, markdown.
    • Data (interchange) formats and their processors: json, yaml, toml, xml, csv, tsv, jq, yq, sqlite.

Recommended Background:

  • Completion of compiler and software security-related courses.
  • Familiarity with NixOS and Nix-based build tools.
  • Experience with fuzzing and triaging compiler/interpreter bugs.

Emulating Trusted Applications
  • Point of contact: Philipp Mao
  • Suitable for: MSc project, MSc semester project
  • Keywords: memory safety, reverse-engineering, emulation, Android, ARM

To safely manage a user’s secrets, modern Android devices leverage TAs (trusted applications), running in a TEE (Trusted Execution Environment). These TAs are closed-source and hard to analyze, since they run isolated from the rest of the Android framework.

The goal of this project is to build an emulator that can run TAs. By emulating TAs we’ll be able to debug or even fuzz the TAs. For this project we’ll focus on TAs from the beanpod TEE. The beanpod TEE implementation runs on low-end Xiaomi devices. We will build our emulator on top of Qiling, an emulator written in Python.

Project tasks (in no particular order):

  • Reverse-engineering of TAs to check if emulation is working correctly.
  • Implementing emulation support for Global Platform APIs and standard libc functions. (The Global Platform API is a standard for TAs.)
  • Reverse-engineering of the relevant beanpod libraries to add emulation support for custom beanpod-specific APIs used by TAs.
  • Adding cross-TA communication support.
  • (Optional) Implementing a fuzzing framework on top of our emulator using AFL’s Unicorn mode.

Students interested in this project should be comfortable with both reverse engineering (think Ghidra, Binja, or IDA) and programming in Python. Familiarity with ARM or TEE/TAs is a plus but not required.

Why is QEMU’s plugin interface widely ignored?
  • Point of contact: Florian Hofhammer
  • Suitable for: MSc semester project
  • Keywords: Emulation, dynamic binary instrumentation

Many dynamic analysis projects leveraging the QEMU emulator for binary instrumentation heavily patch or fork QEMU to add their instrumentation. For example, Cannoli, a tracing engine for qemu-user, maintains a set of patches that needs to be applied to a certain commit of QEMU. Similarly, the AFL++ maintainers forked QEMU to add coverage instrumentation into binaries at runtime.

The approach of forking or patching QEMU has a significant drawback: the patches need to be meticulously updated every time QEMU touches the affected files, and a user needs to compile QEMU themselves.

QEMU has support for plugins. This raises the question of why projects fork or patch QEMU instead of creating plugins that can be shipped independently of QEMU. The goal of this project is to evaluate the limitations of QEMU’s plugin system along the following questions:

  • Can common dynamic binary instrumentation use cases such as AFL++’s coverage instrumentation also be implemented as a QEMU plugin instead of a set of patches to QEMU’s source code?
  • If no, what are the limitations of QEMU’s plugin system that prevent the usage of plugins for dynamic binary analysis?
  • If yes, what are the reasons for forking or patching QEMU instead of using plugins? Is there a significant performance overhead when using plugins instead of directly patching QEMU?

An interested student should be comfortable with C code in potentially large code bases such as QEMU. The student should also be comfortable with reading and writing assembly (potentially multiple architectures) for instrumentation purposes. Already having basic familiarity with dynamic instrumentation use cases such as coverage instrumentation for fuzzing or memory tracing is a plus.

Creating a cycle-accurate multi-architecture simulator
  • Point of contact: Florian Hofhammer
  • Suitable for: MSc semester project
  • Keywords: Microarchitecture, simulation

r2wars is a game in which participants write small assembly bots that execute in the same address space. Whichever bot crashes first (e.g., by being overwritten by the competing bot) loses. This game is an adaptation of the original Corewars idea but with a twist: instead of being based on an assembly-like programming language designed specifically for this kind of game, r2wars builds on top of the radare2 reverse engineering tooling and allows bots to be written in real-world architecture assembly (e.g., x86, Arm, RISC-V, MIPS, …). However, while the supported ISAs are taken from the real world, the execution model is far from reality. Most instructions take the same amount of time, no matter the ISA or their complexity. In a real system, more complex instructions would execute in more cycles, and microarchitectural state such as pipeline state, caches, etc. would affect performance.

In this project, we aim to model r2wars “closer to the real world”, i.e., execute instructions in a cycle-accurate simulation. For this purpose, we can leverage a cycle-accurate simulator such as gem5.

The project requires familiarity with assembly code (at least reading and understanding) as well as basic knowledge about microarchitectural state.

SECCOMP implementation for double fetch protection
  • Point of contact: Luca Di Bartolomeo
  • Suitable for: MSc thesis
  • Keywords: kernel security, data race protection, security policy

System call filtering is a crucial part of protection policies ubiquitous in cloud, desktop and mobile environments (Android, Docker, etc.). The existing SECCOMP filter system is unable to inspect arguments passed by reference, since the user can modify the values in memory after the filter has checked them and before the kernel uses them, resulting in a time-of-check-to-time-of-use (TOCTTOU) exploit.
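The double-fetch problem can be sketched deterministically in a few lines. In the toy below, a `bytearray` stands in for user memory, an allow-list stands in for the filter policy, and a callback models a concurrent user thread that runs between check and use; all names are hypothetical. The single-fetch variant shows the generic "snapshot once" idea and is not meant to describe how Midas works internally.

```python
ALLOWED_PATHS = {b"/tmp/log"}  # hypothetical filter policy


def checked_open_double_fetch(user_mem: bytearray, between=None) -> bytes:
    """Check the argument, then fetch it again for use (vulnerable)."""
    # First fetch: the filter inspects the argument in user memory.
    if bytes(user_mem) not in ALLOWED_PATHS:
        raise PermissionError("path rejected by filter")
    if between is not None:
        between()  # models a racing user thread modifying the argument
    # Second fetch: the system call reads the argument again.
    return bytes(user_mem)


def checked_open_single_fetch(user_mem: bytearray, between=None) -> bytes:
    """Copy the argument once, then check and use the same copy."""
    snapshot = bytes(user_mem)
    if snapshot not in ALLOWED_PATHS:
        raise PermissionError("path rejected by filter")
    if between is not None:
        between()  # the race no longer affects the value that is used
    return snapshot
```

With a buffer that initially holds `/tmp/log` and a callback that rewrites it to another path, the double-fetch variant passes the check yet returns the attacker-chosen path, while the single-fetch variant returns the value that was actually checked.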

Midas is a novel mitigation for TOCTTOU bugs in the kernel, exploiting the user memory access API to provide double fetch protection. In this project, you will implement and evaluate SECCOMP filtering for system call arguments passed by reference, leveraging Midas to protect the kernel from the double fetch introduced in the process.

  • This project requires:
    • Expert experience in C development
    • Experience with standard C/GNU build, development and debug tools (gdb, Makefiles)
    • Understanding of OS principles
    • Basic experience of OS coding/course project
    • Understanding of the x86 architecture and assembly coding/debugging

Leveraging Static Analysis on Binaries to Uncover Time-of-Check-Time-of-Use Bugs
  • Point of contact: Marcel Busch
  • Suitable for: MSc semester project
  • Keywords: software engineering, reverse engineering, binary analysis, static analysis

TOCTOU bugs can lead to severe memory corruptions. These memory corruptions might allow adversaries to compromise and take full control of the affected system. In this project, we want to port and adapt an existing binary static analysis to uncover TOCTOU bugs in proprietary real-world software.

A candidate should be interested in (and ideally already be familiar with):

  • Python
  • Ghidra/Ghidrathon and/or angr
  • ARM assembly
  • Static analysis (e.g., RDA)

Syzkaller Profiling
  • Point of contact: Zhiyao Feng
  • Keywords: kernel fuzzing

Syzkaller is a state-of-the-art coverage-guided kernel fuzzer. It employs various strategies and optimizations for system call fuzzing and is actively maintained. It has uncovered thousands of bugs in multiple OSes (e.g., Linux, FreeBSD, Windows). Consequently, many research prototypes aimed at improving kernel fuzzing efficiency are built upon Syzkaller.

In this project, we will take a deep look into Syzkaller’s specific fuzzing strategies and optimizations. The goal is to evaluate their effectiveness, identify the fuzzing objects they excel or struggle with (e.g., system calls with a certain type of argument), and gather insights for potential improvement.

A candidate should be proficient in C/C++/Go programming and have a good grasp of Linux internals.

Benchmarking Fuzzers For Seed Selection Capability
  • Point of contact: Han Zheng
  • Keywords: Benchmark, Fuzzing

Fuzzing is an efficient software testing technique to reveal bugs. Therefore it has been widely investigated both in academia and industry. Despite the growing number of newly proposed fuzzing prototypes, evaluating a fuzzer’s coverage capability is still challenging.

Existing platforms like fuzzbench pick well-constructed harnesses, which enable the fuzzers to iterate over each seed in the queue exhaustively. Nevertheless, real-world scenarios might deviate from this ideal: seed explosion is widespread, so a fuzzer’s seed selection capability is critical and should not be deprioritized in the evaluation.

In this project, we will extend fuzzbench to more complex targets, which allows a more thorough assessment of fuzzer’s seed selection capability.

The goal of this project:

  • design a metric to define and select the “complex” targets
  • integrate the targets into fuzzbench and evaluate existing fuzzers
  • propose some metrics other than coverage to assess the seed selection capability.

A candidate should be interested in (ideally familiar with) the following:

  • Python
  • Basic knowledge of configure/cmake/make
  • Experience with Coverage Guided Greybox Fuzzers (e.g., AFL/AFL++)

A Ground-Truth Fuzzing Benchmark for Stateful Protocols
  • Point of contact: Qiang Liu
  • Suitable for: BSc semester project, MSc semester project
  • Keywords: Benchmark, Protocols, Fuzzing

Protocols are important in facilitating seamless and secure communication among multiple parties, users, hardware, and software. Serving as the foundational trust upon which complex systems and networks are connected, they are essential for ensuring security across various domains and industries. In recent years, there has been a sudden increase in the development of protocol fuzzers. However, no work has yet provided a comprehensive empirical comparison among them. Therefore, without sufficient data support, we cannot draw a strong conclusion regarding what constitutes the state-of-the-art or determine the next steps to take.

In this project, specifically, we are going to answer three questions.

  • Regarding code coverage and bug discovery, how effective are existing fuzzers?
  • Regarding code coverage and bug discovery, how much improvement can be achieved through the methodology of existing fuzzers?
  • Regarding other metrics, what are the answers to the above two questions?

We are building a new benchmark based on Magma. A candidate is required to have experience with programming (Python and Bash), compiling (gcc or clang), and testing (integration testing). Experience with Docker containers, git, fuzzing, and software security is a plus but not required. A candidate can choose one of the following tasks.

  • Inject old bugs into the code space of a protocol implementation.
  • Add more fuzzers and more protocol implementations to the benchmark.
  • Collect data based on new metrics and draw figures based on the new data.

Other projects

Several other projects are possible in the areas of software and system security. We are open to discussing possible projects around the development of security benchmarks, using machine learning to detect vulnerabilities, secure memory allocation, sanitizer-based coverage tracking, and others.