Finding vulnerabilities in software is a challenge. Traditionally, it requires skilled engineers to manually reverse engineer software at the assembly code level to find flaws. While this approach is often successful, it’s hard to scale. Skilled engineers are scarce, and, even if that weren’t an issue, the cost of additional manpower could be prohibitively expensive.
A common solution to scaling vulnerability research is to automate it with software. Often referred to as a fuzzing framework, this software sends different inputs to a target binary while it’s running, hoping to crash it. If a crash is detected, the input that caused it is saved, and further research is conducted to analyze the crash case. Fuzzing frameworks can be a cheaper way to scale vulnerability research because they reduce the cost of manpower over time. Once the initial framework is developed, the primary costs are:
- tool maintenance
- computing power
However, most fuzzing frameworks are not as effective as advertised:
- They aren’t built to parallelize work efficiently,
- They only work against specific targets, or
- They struggle to find vulnerabilities in more robust software.
Tackling Vulnerability Research Automation
Ultimately, we need to find vulnerabilities more effectively than existing fuzzers do.
The solution is to build a scalable, advanced fuzzing framework that supports targets of different architectures and operating systems.
But how? Let’s break down the various components and necessary technology.
1) Improve Bug Discovery in Your Framework
Discovering bugs is the single most important feature of a fuzzing framework. There are two main principles that can improve bug discovery:
Increased code coverage: Tracking code already executed in a previous fuzz case is one of the most critical aspects of automated fuzzing.
It’s often the more obscure code paths that yield the best vulnerabilities, so a good fuzzer attempts to cover as much of an application’s code base as possible.
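To make the coverage idea concrete, here is a minimal sketch of a coverage-guided loop. It assumes the target can report which branches each input executed. Here `toy_target` is a hypothetical stand-in for an instrumented binary, and the branch names and `mutate` helper are our own illustration rather than part of any particular framework.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the IDs of the branches it ran."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("magic_0")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("magic_1")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("magic_2")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Keep only the inputs that reach code no earlier fuzz case has executed."""
    corpus = [seed]
    seen = set(toy_target(seed))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_target(candidate)
        if not coverage <= seen:      # at least one branch is new
            seen |= coverage
            corpus.append(candidate)  # mutate this input further in later rounds
    return corpus, seen


corpus, seen = fuzz(b"AAA", 2000, random.Random(1))
print(len(corpus), sorted(seen))
```

Because inputs that uncover new branches are kept and mutated again, the loop can work its way into nested checks one byte at a time instead of having to guess the whole sequence at once.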
One important technology that helps to increase a fuzzer’s code coverage is concolic execution. Concolic execution combines symbolic execution (running a program over symbolic values rather than concrete ones) with normal concrete execution to identify and solve the constraints on code paths. When symbolic and concrete execution run in tandem, they can quickly work out what realistic inputs are needed to execute one code path versus another. Feeding this information back to a fuzz generator allows both sides of a branch to be executed, which increases code coverage.
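As a rough illustration of that feedback loop, the toy below runs a target concretely while recording each branch condition symbolically, then solves for an input that takes the other side. It is a sketch under heavy assumptions: it only understands a single integer input, multiplication and addition by constants, and equality checks. The `Sym` class, `target`, and `input_for_other_side` are hypothetical names of ours, and a real engine would hand the recorded constraints to a constraint solver instead of doing the algebra by hand.

```python
class Sym:
    """An input carried as a concrete value plus the symbolic form a*x + b."""

    def __init__(self, concrete, a=1, b=0, trace=None):
        self.concrete = concrete
        self.a, self.b = a, b
        self.trace = [] if trace is None else trace  # shared log of branch conditions

    def __mul__(self, k):
        return Sym(self.concrete * k, self.a * k, self.b * k, self.trace)

    def __add__(self, k):
        return Sym(self.concrete + k, self.a, self.b + k, self.trace)

    def __eq__(self, k):
        taken = self.concrete == k                     # concrete execution picks the path
        self.trace.append((self.a, self.b, k, taken))  # symbolic side records a*x + b == k
        return taken


def target(x):
    """Hypothetical code under test with a branch random inputs rarely hit."""
    if x * 2 + 3 == 47:
        return "rare path"
    return "common path"


def input_for_other_side(trace):
    """Solve a*x + b == k for the last branch if the concrete run did not take it."""
    a, b, k, taken = trace[-1]
    if taken or a == 0 or (k - b) % a != 0:
        return None  # already taken, or no integer solution in this toy model
    return (k - b) // a


x = Sym(5)
first = target(x)                       # concrete run follows the common path
flipped = input_for_other_side(x.trace)
second = target(Sym(flipped))           # the solved input reaches the other side
print(first, flipped, second)
```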
Advanced introspection: A good fuzzer is designed to not only emulate a binary, but introspect the execution of the binary in real time.
This allows the fuzzing engine to watch not only for crashes, but also for anomalies that may not crash the binary yet still signal that a bug has been found:
- Stacks can be watched for any overflows of critical stack pieces such as return addresses and stack cookies.
- Heap introspection involves tracking all memory allocations and deallocations to look for any overflows or reads/writes to memory that appear out of place from normal expected execution.
- Data sections can be analyzed along with code accesses to the data section such that unexpected reads or writes can be flagged.
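The heap case in the list above can be sketched as a small shadow allocator that records every allocation and checks each write against it. This is an illustrative toy, not how any specific fuzzer does it. The `HeapTracker` class and its method names are hypothetical, and in a real framework these checks would be driven by hooks on the emulated allocator and memory accesses.

```python
class HeapTracker:
    """Toy shadow heap: tracks live allocations and flags suspicious activity."""

    def __init__(self):
        self.live = {}        # base address -> size of each live allocation
        self.freed = set()    # base addresses that have already been freed
        self.findings = []    # anomalies seen so far, even if nothing crashed
        self._next = 0x1000   # next fake address to hand out

    def malloc(self, size):
        base = self._next
        self._next += size + 16  # gap between chunks so overflows land in untracked memory
        self.live[base] = size
        return base

    def free(self, base):
        if base in self.freed:
            self.findings.append(("double-free", base))
        elif base not in self.live:
            self.findings.append(("invalid-free", base))
        else:
            del self.live[base]
            self.freed.add(base)

    def write(self, addr, length):
        """Return True if the write fits inside a live allocation; flag it otherwise."""
        for base, size in self.live.items():
            if base <= addr and addr + length <= base + size:
                return True
        # Covers both overflows past a chunk and writes to freed memory.
        self.findings.append(("invalid-write", addr, length))
        return False


heap = HeapTracker()
buf = heap.malloc(32)
heap.write(buf, 32)        # fits exactly: no finding
heap.write(buf + 24, 16)   # runs 8 bytes past the end of the chunk
heap.free(buf)
heap.free(buf)             # second free of the same chunk
print(heap.findings)
```

None of these events has to crash the target for the tracker to record them, which is the point of introspection.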
2) Scale Your Framework
There are two essential factors needed to scale a fuzzer properly:
Modularize components: A fuzzing framework needs the ability to generate different inputs, feed the inputs to a running binary, record any results of note, and repeat.
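Each of these steps can be broken into its own component. Two that we refer to below are the feeder, which feeds inputs to the running binary, and the datahouse, which records the results.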
By separating these components, each can run independently on a different piece of hardware. So a datahouse might use hardware specialized for databases, while the feeder might use hardware with a powerful CPU.
By matching the hardware to the needs of each component, you prevent bottlenecks in the framework and can speed up the entire system.
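One way to picture this separation is as three small components that only talk to each other through queues. The sketch below is our own illustration: the class name `Generator`, the toy target, and the use of in-process `queue.Queue` objects are assumptions, and in a distributed deployment the queues would presumably be replaced by a network transport so each component can live on its own hardware.

```python
import queue


class Generator:
    """Produces new inputs from a seed (here: inverts one byte per case)."""

    def __init__(self, seed: bytes):
        self.seed = seed

    def produce(self, count, out_q):
        for i in range(count):
            buf = bytearray(self.seed)
            buf[i % len(buf)] ^= 0xFF
            out_q.put(bytes(buf))
        out_q.put(None)  # sentinel: no more cases


class Feeder:
    """Feeds each input to the target and reports anything of note."""

    def __init__(self, target):
        self.target = target

    def run(self, in_q, out_q):
        while (case := in_q.get()) is not None:
            try:
                self.target(case)
            except Exception as exc:  # a Python exception stands in for a crash
                out_q.put((case, repr(exc)))
        out_q.put(None)


class Datahouse:
    """Stores results; a real one would sit in front of a database."""

    def __init__(self):
        self.crashes = []

    def collect(self, in_q):
        while (record := in_q.get()) is not None:
            self.crashes.append(record)


def toy_target(data: bytes):
    """Hypothetical target that fails when the first byte has its high bit set."""
    if data[0] > 0x7F:
        raise ValueError("high bit set in header")


cases, results = queue.Queue(), queue.Queue()
datahouse = Datahouse()
Generator(b"ABCD").produce(4, cases)
Feeder(toy_target).run(cases, results)
datahouse.collect(results)
print(datahouse.crashes)
```

Because each class depends only on the queue it reads from and the queue it writes to, any one of them can be moved, replaced, or replicated without touching the others.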
Parallelize work: An important component of a fuzzer is generating and feeding new input into a target binary.
Each new input is usually based on what’s called a seed input, and generating new input can be as simple as flipping a bit of the seed input. But if two or more threads are trying to create additional input from the same seed, you must make sure the results don’t collide. Otherwise, your work is wasted. To solve this problem, create pseudo-random, deterministic inputs. If the sequence of inputs derived from a seed is deterministic, every thread can compute the same sequence and take its own share of it, which makes it reliable and easy to prevent two threads from stepping on each other’s toes.
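Here is a minimal sketch of one way to do this, assuming single-bit flips as the only mutation. The order of the flips is shuffled by a generator seeded from a hash of the seed input, so it looks random but every worker computes exactly the same order and then takes every Nth entry. The function names are hypothetical.

```python
import hashlib
import random


def mutation_order(seed: bytes) -> list:
    """Pseudo-random but reproducible order in which to flip the bits of seed."""
    digest = hashlib.sha256(seed).digest()
    rng = random.Random(int.from_bytes(digest, "big"))  # same seed input, same order
    order = list(range(len(seed) * 8))
    rng.shuffle(order)
    return order


def cases_for_worker(seed: bytes, worker: int, workers: int):
    """Yield this worker's share of the mutated inputs; shares never overlap."""
    for bit in mutation_order(seed)[worker::workers]:
        buf = bytearray(seed)
        buf[bit // 8] ^= 1 << (bit % 8)  # flip exactly one bit of the seed
        yield bytes(buf)


seed_input = b"GET /"
share_0 = list(cases_for_worker(seed_input, 0, 2))
share_1 = list(cases_for_worker(seed_input, 1, 2))
print(len(share_0), len(share_1), set(share_0) & set(share_1))
```

The workers need no locking or shared state, since each one derives its own share from the seed alone. Any case can also be regenerated later from the seed and its position in the order.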
3) Support Multiple Architectures and Operating Systems
The key to scalability is building a fuzzing framework that is flexible.
Many fuzzing frameworks are created with a specific target in mind, whether it’s a specific piece of software or a category of software, like a browser. This approach is often quite powerful and effective. However, if your goal is to build a fuzzing framework capable of analyzing many different pieces of software, you need a more generic approach. One option is to use QEMU to emulate your target. Leveraging Unicorn, a popular framework built on top of QEMU, allows you to use QEMU more like a CPU emulator. Unicorn supports a number of different architectures out of the box, and can also support any operating system provided there’s an emulation layer built on top of Unicorn.
Developing a modern fuzzing framework is no small task. Many of the newest technologies must be utilized for even a chance at finding bugs in modern software. Consider this blog post a general guide on what we think are some of the critical problems to solve when creating an efficient, effective fuzzing framework.