ERC StG CodeSan: Code Sanitization for Vulnerability Pruning and Exploitation Mitigation
Despite massive efforts in securing software, about 60 security bugs are publicly reported each month. Systems software is prone to low-level bugs caused by undefined behavior (memory corruption, type confusion, or API confusion). Exploits abuse undefined behavior to execute attacker-specified code or to leak information.
We propose code sanitization (CodeSan), a comprehensive approach to improve code quality. CodeSan will sanitize software by (i) automating bug discovery during development through software testing and (ii) protecting deployed software through reflective mitigations. CodeSan trades formal completeness for practical scalability in three steps. First, policy-based sanitization makes undefined behavior (through violations of memory safety, type safety, or API flow safety) explicit and detectable given concrete test inputs. Second, automatic test case generation increases testing coverage for large programs without the need for pre-existing test cases, enabling broader and automated use of policy-based sanitization. Third, for deployed software, reflective mitigations place runtime checks precisely where they are needed based on data-flow and control-flow coverage from our testing efforts. CodeSan complements formal approaches by protecting software that is currently out of reach due to its size, complexity, or low-level nature.
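As a rough illustration of the first two steps, the toy model below shows how a bounds policy turns a silent out-of-bounds write into an explicit, detectable violation, and how randomly generated inputs can trigger it without pre-existing test cases. This is only a sketch under stated assumptions: `ToyHeap`, `PolicyViolation`, `target`, and `fuzz` are hypothetical names invented for this example and are not part of any CodeSan prototype, and real sanitizers instrument compiled code rather than a Python byte array.

```python
import random


class PolicyViolation(Exception):
    """Raised when an access breaks the toy memory-safety policy."""


class ToyHeap:
    """Flat memory holding adjacent objects, each with recorded bounds."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.bounds = {}      # object name -> (start offset, length)
        self.next_free = 0

    def alloc(self, name, length):
        self.bounds[name] = (self.next_free, length)
        self.next_free += length

    def write(self, name, index, value, sanitize):
        start, length = self.bounds[name]
        # Policy check: the access must stay inside the object's bounds.
        if sanitize and not (0 <= index < length):
            raise PolicyViolation(f"{name}[{index}] outside {length}-byte object")
        # Without the check, an out-of-bounds index silently hits a neighbor.
        self.mem[start + index] = value


def target(heap, index, sanitize):
    """Hypothetical program under test: one store at an input-controlled index."""
    heap.write("buf", index, 0x41, sanitize)


def fuzz(trials, seed=0):
    """Generate random inputs and return the first one that violates the policy."""
    rng = random.Random(seed)
    for _ in range(trials):
        index = rng.randrange(0, 16)
        heap = ToyHeap(16)
        heap.alloc("buf", 8)
        heap.alloc("secret", 8)
        try:
            target(heap, index, sanitize=True)
        except PolicyViolation:
            return index
    return None
```

Calling `fuzz(1000)` returns an index that falls outside the 8-byte `buf` object, whereas the same write with `sanitize=False` would overwrite `secret` without any visible error.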
CodeSan is a compelling, comprehensive, and adaptive approach to thoroughly address undefined behavior in complex software. The three proposed thrusts complement each other naturally and will immediately guard large software systems such as Google Chromium, Mozilla Firefox, the Android system, or the Linux kernel, making them resilient against attacks. In line with PI Payer’s track record of open-sourcing his group’s research artifacts on cast sanitization, transformative fuzzing, and control-flow hijacking mitigations, all prototypes produced during CodeSan will be released as open source.
Duration: March 2020 to February 2025
Publications, Prototypes, and Artifacts
2021
Igor: Crash Deduplication Through Root-Cause Clustering
Zhiyuan Jiang, Xiyue Jiang, Ahmad Hazimeh, Chaojing Tang, Chao Zhang, and Mathias Payer.
In CCS'21: ACM Conference on Computer and Communications Security, 2021 (source, DOI)
Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads
Cesar Avalos Baddouh, Mahmoud Khairy, Roland N. Green, Mathias Payer, and Timothy G. Rogers.
In MICRO'21: International Symposium on Microarchitecture, 2021
Code Specialization through Dynamic Feature Observation
Priyam Biswas, Nathan Burow, and Mathias Payer.
In CODASPY'21: ACM Conference on Data and Application Security and Privacy, 2021 (source, DOI)
LIGHTBLUE: Automatic Profile-Aware Debloating of Bluetooth
Jianliang Wu, Ruoyu Wu, Daniele Antonioli, Mathias Payer, Nils Ole Tippenhauer, Dongyan Xu, Dave (Jing) Tian, and Antonio Bianchi.
In SEC'21: Usenix Security Symposium, 2021 (source)
Too Quiet in the Library: An Empirical Study of Security Updates in Android Apps' Native Code
Sumaya Almanee, Arda Unal, Mathias Payer, and Joshua Garcia.
In ICSE'21: International Conference on Software Engineering, 2021 (video, source, DOI)
Seed Selection for Successful Fuzzing
Adrian Herrera, Hendra Gunadi, Shane Magrath, Michael Norrish, Mathias Payer, and Tony Hosking.
In ISSTA'21: ACM SIGSOFT International Symposium on Software Testing and Analysis, 2021 (DOI)
2020
FuZZan: Efficient Sanitizer Metadata Design for Fuzzing
Yuseok Jeon, WookHyun Han, Nathan Burow, and Mathias Payer.
In ATC'20: Usenix Annual Technical Conference, 2020 (source)
USBFuzz: A Framework for Fuzzing USB Drivers by Device Emulation
Hui Peng and Mathias Payer.
In SEC'20: Usenix Security Symposium, 2020 (source)
HALucinator: Firmware Re-hosting Through Abstraction Layer Emulation
Abraham A. Clements, Eric Gustafson, Tobias Scharnowski, David Fritz, Christopher Kruegel, Giovanni Vigna, Saurabh Bagchi, and Mathias Payer.
In SEC'20: Usenix Security Symposium, 2020 (HALucinator source, HALfuzz source)