Assignment 2

A. Instructions

B. General hints

This page will be updated with more hints as I get more questions.

If you get stuck at any stage, immediately contact me.
Try to keep notes of everything you do and how you overcome issues. This will be useful later if you have to perform the same steps again.
It may be easier to prepare your answers in a text file and copy/paste them into Blackboard when you are done.
Your first steps could be as follows:
- Download and build AFL++.
- Read the documentation of AFL++ about how to use it effectively.
- Download and build Intel XED:
  - First build xed with a regular gcc/clang compiler (no AFL++). You may not need that, but it is a helpful check that everything works as it should.
  - Understand how to use xed from the command line.
  - Run xed on an example file to check that it works.
  - Then build XED with AFL++.
  - Start fuzzing XED.
Building and fuzzing commands quickly get complex (many parameters). It is often helpful to put such long commands into a script file (e.g. myscript.sh), make the script executable (chmod +x myscript.sh), then run the script file (./myscript.sh). This has the added advantage of allowing you to re-run the exact same building or fuzzing commands later.
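As a minimal sketch of this workflow, the commands below write a build script, make it executable, and leave it ready to run. The script name build-xed.sh, the AFL_BIN directory, and the chosen options are assumptions for illustration. The mfile.py options themselves are described in section C. The script also assumes it is started from the directory that contains the xed checkout.

```shell
# Write a small build script (hypothetical name: build-xed.sh).
# AFL_BIN is an assumed install location; change it to match your system.
cat > build-xed.sh <<'EOF'
#!/bin/sh
set -eu                      # stop at the first error or undefined variable
AFL_BIN=/usr/local/bin       # assumption: directory containing afl-clang-fast
cd xed
./mfile.py examples \
    --cc="$AFL_BIN/afl-clang-fast" \
    --cxx="$AFL_BIN/afl-clang-fast++" \
    -j 8
EOF
chmod +x build-xed.sh        # make the script executable
# ./build-xed.sh             # run it now, and re-run it later with identical options
```

Keeping one script per build variant (plain clang, AFL++, AFL++ with sanitizers) makes it easy to reproduce each build later.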

C. Compiling XED (not platform-specific)

XED uses its own bespoke build system called mbuild. (This is in part because some of its code is generated by Python scripts from tables describing the Intel architecture.)

git clone https://github.com/intelxed/mbuild
git clone https://github.com/intelxed/xed
cd xed
./mfile.py

The use of ./mfile.py is equivalent to the use of make (or cmake, or ninja, …) in other projects. It will look for mbuild at ../mbuild.

There are many options to mfile.py, which you can get by running ./mfile.py --help. A few important ones (which can be combined):

./mfile.py examples                    # build examples of XED uses, including the xed utility
./mfile.py -v 10                       # show mbuild debugging messages, incl. compiler path and parameters
./mfile.py --opt=3                     # adds -O3 to compiler arguments to enable optimization for speed
./mfile.py "--extra-flags=-ggdb"       # adds compiler arguments (in this example: -ggdb)
./mfile.py "--extra-linkflags=-ggdb"   # adds linker arguments (in this example: -ggdb)
./mfile.py --cc=.../afl-clang-fast     # specify C compiler path
./mfile.py --cxx=.../afl-clang-fast++  # specify C++ compiler path
./mfile.py --linker=.../afl-ld-lto     # specify linker path
./mfile.py -j 8                        # run up to 8 build jobs in parallel

You must replace the .../ paths above with the paths to your AFL++ compiler executables. Prefer full absolute paths here (starting with /). Examples include: /opt/homebrew/bin/afl-clang-fast, /usr/local/bin/afl-clang-fast, /usr/bin/afl-clang-fast.
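One way to find the absolute path is to probe the example locations listed above. The helper below is a sketch only: find_afl_tool is a hypothetical name, and the directory list is an assumption taken from those examples, so extend it if AFL++ is installed elsewhere.

```shell
# Print the absolute path of an AFL++ tool, or fail if it cannot be found.
# The candidate directories are the example locations mentioned in the text.
find_afl_tool() {
    tool=$1
    for dir in /opt/homebrew/bin /usr/local/bin /usr/bin; do
        if [ -x "$dir/$tool" ]; then
            echo "$dir/$tool"
            return 0
        fi
    done
    echo "$tool not found in the usual locations" >&2
    return 1
}

# Usage (run from the xed directory once the tools are found):
# ./mfile.py examples --cc="$(find_afl_tool afl-clang-fast)" --cxx="$(find_afl_tool afl-clang-fast++)"
```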

If you use -fsanitize=..., the option must be passed to both the compiler and the linker.
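For instance, a sanitizer build could be wrapped as follows. This is a sketch under assumptions: -fsanitize=address is only an example value, and build_xed_sanitized is a hypothetical helper name. The point is that the same flag appears in both --extra-flags and --extra-linkflags.

```shell
# Example value only; substitute the sanitizer option you actually want.
SAN_FLAGS="-fsanitize=address"

# Hypothetical helper: build the examples with the sanitizer flag passed
# to the compiler AND to the linker. Extra mfile.py options can be appended.
build_xed_sanitized() {
    ./mfile.py examples \
        "--extra-flags=$SAN_FLAGS" \
        "--extra-linkflags=$SAN_FLAGS" \
        "$@"
}

# Usage, from the xed directory:
# build_xed_sanitized --cc=.../afl-clang-fast --cxx=.../afl-clang-fast++ -j 8
```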

You can use afl-clang-lto if it is available on your platform. Fuzzing will be slightly faster with afl-clang-lto, but compilation will be significantly slower. One option would be to test everything and start fuzzing with afl-clang-fast variants, and only use LTO for long (e.g. overnight) runs in an effort to find more crashes.
Once compilation is done, check that xed is working. You need to run ./mfile.py examples to compile it, and it will be located in obj/wkit/bin/xed.
You may need to compile XED multiple times (e.g. with clang or with afl-clang, with or without sanitizers, etc.) Once you successfully produce a xed executable, I would advise copying it somewhere for safekeeping.
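A small helper along these lines can do the copying. It is a sketch: save_xed, the destination directory, and the label are all hypothetical names chosen for illustration.

```shell
# Hypothetical helper: copy a freshly built xed to a labelled file for safekeeping.
# $1 = path to the built binary, $2 = destination directory, $3 = label for this build
save_xed() {
    src=$1; dest_dir=$2; label=$3
    if [ ! -x "$src" ]; then
        echo "no executable found at $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir"
    cp "$src" "$dest_dir/xed-$label"
    echo "saved $dest_dir/xed-$label"
}

# Usage, from the xed directory:
# save_xed obj/wkit/bin/xed "$HOME/xed-builds" afl-fast-asan
```

Using a label per build variant avoids overwriting a working binary when the next build fails.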
One downside of parallel builds is that compilation errors can be harder to understand. In that case, re-run ./mfile.py without the -j option and try to fix the compilation error. Once it is fixed, you can interrupt the non-parallel build (control+C), then resume building in parallel.

D. Fuzzing XED

For the starting set of input files, afl-fuzz works best with a limited number (5 to 20) of small valid files (smaller than, say, 10 kB, and can be as small as 5 to 10 bytes) that are different from each other. Invalid files can be included, but they are typically not useful.
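The size and count limits can be applied mechanically. The sketch below uses a hypothetical helper named select_seeds. It only filters by size and count, so you still have to check yourself that the files are valid and differ from each other.

```shell
# Hypothetical helper: copy at most 20 files smaller than 10240 bytes from $1 into $2.
# Files with the same name in different subdirectories would overwrite each other.
select_seeds() {
    src=$1; dst=$2
    mkdir -p "$dst"
    find "$src" -type f -size -10240c | sort | head -n 20 |
    while IFS= read -r f; do
        cp "$f" "$dst/"
    done
}

# Usage (directory names are placeholders):
# select_seeds candidates seeds
```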
The first time you run afl-fuzz, it may ask you to adjust system settings to allow it to run faster. Whenever those adjustments are easy to perform, it is recommended to follow its advice. For example, this command
```
echo core | sudo tee /proc/sys/kernel/core_pattern
```
is easy to run and is essentially required for AFL++ to work properly on Linux and WSL2.
When afl-fuzz is running, a red message indicating “no new paths” (or something similar) means that fuzzing is probably not working. The most common cause of that is that the fuzzed program (xed in our case) immediately exits with an error message.

The first thing to try in that case is to run the same xed command outside of afl-fuzz, and make sure that everything is working fine.

If it is, then there must be a difference between us running xed manually and afl-fuzz running it. The next two points could be causes for such difference.
By default afl-fuzz chooses a name and directory for the “input” files it creates. If this confuses xed, the problem can be worked around with the -f option of afl-fuzz (see afl-fuzz -h for details), which forces it to use a fixed name and path for input files.
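The manual check and the -f workaround can be sketched as follows. check_target is a hypothetical helper, and the seeds, findings, and current-input names are placeholders. The afl-fuzz line is left as a comment because it runs until interrupted.

```shell
# Hypothetical helper: run a command once, outside afl-fuzz, and report how it exited.
check_target() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok: exit status 0"
    elif [ "$status" -gt 128 ]; then
        # Shells usually report 128 + signal number when a program is killed by a signal.
        echo "probably killed by signal $((status - 128))"
    else
        echo "failed: exit status $status"
    fi
    return "$status"
}

# Usage (paths are placeholders):
# check_target obj/wkit/bin/xed -i seeds/example.o
#
# If xed works manually but not under afl-fuzz, try a fixed input file with -f.
# afl-fuzz replaces @@ with the input file path, here the one given to -f:
# afl-fuzz -i seeds -o findings -f "$PWD/current-input" -- obj/wkit/bin/xed -i @@
```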
If everything is running properly, afl-fuzz should find bugs within the first 2 to 5 minutes.
Once afl-fuzz is running, it keeps looking for new crashes until you stop it. At first, it is enough to stop it after 5 to 10 crashes.
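To see how many crashes have been saved so far, you can count the files in the output directory from another terminal. This sketch assumes the default AFL++ layout, in which crash files are named id:... under the default/crashes subdirectory of the output directory. count_crashes is a hypothetical helper name.

```shell
# Count crash files saved by afl-fuzz.
# Assumption: default AFL++ layout <output dir>/default/crashes/id:*
count_crashes() {
    out_dir=$1
    find "$out_dir/default/crashes" -type f -name 'id:*' 2>/dev/null | wc -l | tr -d ' '
}

# Usage (findings is a placeholder for your -o directory):
# count_crashes findings
```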
You need a varied set of input files to allow AFL++ to find crashes quickly. You can obtain such a set, for example, by compiling small programs on your own.
In order to get valid object files or executable files for xed to disassemble, you can either

E. Understanding crashes

One of the first steps, once you find a crash, is to pinpoint the exact location (in the source code) where it happens. You could obtain a stack trace at the point of the crash by using valgrind, but valgrind overloads malloc(), which affects pointer addresses, so your bug may not occur when running under valgrind. Debuggers do not have this drawback.
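On Linux or WSL2, gdb can print a stack trace non-interactively. The helper below is a sketch: xed_backtrace is a hypothetical name, and it assumes gdb is installed and xed reads its input with -i. Building with -ggdb (see section C) helps gdb show file names and line numbers.

```shell
# Hypothetical helper: print a stack trace for one crashing input using gdb.
# $1 = path to xed, $2 = path to the crash file found by afl-fuzz
xed_backtrace() {
    xed_bin=$1; crash_file=$2
    if [ ! -x "$xed_bin" ] || [ ! -f "$crash_file" ]; then
        echo "usage: xed_backtrace path/to/xed path/to/crash/file" >&2
        return 1
    fi
    # -batch: exit when done; -ex: execute a gdb command;
    # --args: everything after it is the program and its arguments
    gdb -batch -ex run -ex bt --args "$xed_bin" -i "$crash_file"
}

# Usage (paths are placeholders):
# xed_backtrace obj/wkit/bin/xed findings/default/crashes/<crash file>
```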

F. Hints specific to x86_64 Windows

Prefer WSL2 (over WSL1).
AFL++ Installation
1. Option 1 (easy)
  - apt-get install afl++
2. Option 2 (compile AFL++ from source)
  - Instructions are here: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md. As far as I have been told, the docker-based instructions don’t seem to work, so it is probably better to skip them.
  - Your WSL2 distribution probably has a version of LLVM more recent than 14. Therefore, ignore the instructions that install version 14 specifically, replacing
```
sudo apt-get install -y lld-14 llvm-14 llvm-14-dev clang-14 || sudo apt-get install -y lld llvm llvm-dev clang
```
    by
```
sudo apt-get install -y lld llvm llvm-dev clang
```

G. Hints specific to MacOS

Make sure that your operating system is fully updated.
Ensure that Homebrew is installed.
AFL++ Installation
1. Option 1 (easy)
  - brew install afl++
  - Note that there is no afl-clang-lto. That is fine; use afl-clang-fast instead.
  - By default, the afl binaries are in:
    - /opt/homebrew/bin/afl-clang-fast
    - /opt/homebrew/bin/afl-clang-fast++
    - /opt/homebrew/bin/afl-fuzz
2. Option 2 (compile AFL++ from source, not much harder)
  - The installation instructions for MacOS are here: https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/INSTALL.md#macos-on-x86_64-and-arm64 but they are a bit out of date, so before following them, first try this:
  - Just download AFL++ source and run make:
```
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make
```
  - If you get the message “LLVM mode could not be built … please install llvm-13”, then make sure that llvm is installed with brew:
```
brew install llvm
```
    Then find the llvm-config utility. It should be somewhere like /opt/homebrew/Cellar/llvm/19.1.3/bin/ (the exact path will vary depending on the version of LLVM installed).
    
    Then, try to run AFL++’s Makefile again, but specifying where it can find llvm-config, like:
```
LLVM_CONFIG=/opt/homebrew/Cellar/llvm/19.1.3/bin/llvm-config   make
```
  - You may get a message “LLVM LTO mode could not be built”. You can safely ignore this.
  - Test failure messages containing “assembler command failed” can be ignored as well, as long as you get afl-clang-fast and afl-clang-fast++.
  - Note the full paths of the afl-clang-fast and afl-clang-fast++ executable binaries. You will need them later.
If you see a message
```
ld: unsupported tapi file type '!tapi-tbd' in YAML file
```
when trying to compile, then something went wrong with your system configuration. Find out where the ld utility is located by running:
```
which ld
```
You should get /opt/homebrew/bin/ld or /usr/local/bin/ld (or possibly /usr/bin/ld) but not anything to do with conda. If you see conda in ld’s path, then you must disable Anaconda.
```
conda deactivate
```
Note that even if it is deactivated now, Anaconda may have interfered with how LLVM, clang and AFL++ were installed. You may have to re-run, for example, the following command with Anaconda deactivated:
```
brew install afl++
```
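The ld check described above can also be scripted. This is a sketch only: ld_from_conda is a hypothetical helper, and it simply looks for the string conda in the path.

```shell
# Report whether a linker path looks like it comes from Anaconda.
ld_from_conda() {
    case "$1" in
        *conda*) return 0 ;;   # path mentions conda (anaconda3, miniconda3, ...)
        *)       return 1 ;;
    esac
}

ld_path=$(command -v ld || true)
if ld_from_conda "$ld_path"; then
    echo "ld comes from conda ($ld_path): run 'conda deactivate' and check again"
else
    echo "ld path looks fine: ${ld_path:-not found}"
fi
```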
To get a stack trace, you can use lldb with the following command:
```
lldb path/to/bin/xed -o 'run -i path/to/crash/file' -o 'bt' -o 'quit'
```
where you need to adjust path/to/bin/xed and path/to/crash/file to your specific needs. Explanations:
- We pass to lldb the path to the binary executable (and, optionally, additional parameters).
- By default, lldb starts in an interactive mode in which we can type commands. The -o <command> command-line parameter is equivalent to typing <command> in interactive mode.
- The run command tells lldb to run the executable. After run, we type the parameters we want to pass to the executable. In our case, we want xed to read and disassemble a file, hence -i path/to/crash/file.
- The bt command tells lldb to print the stack trace.
- The quit command makes lldb exit instead of staying in interactive mode.
If you get an error message about llvm-ar not being found, you need to update your PATH environment variable so that it can be found.
- First make sure llvm is installed with brew:
```
brew install llvm
```
- Then, find the llvm-ar utility. It should be somewhere like /opt/homebrew/Cellar/llvm/19.1.3/bin/ (the exact path will vary depending on the version of LLVM installed). As a last resort, you can use the following commands to find the exact path (this will be a bit slow):
```
find /usr -name llvm-ar
find /opt -name llvm-ar
```
- Next, you can run mfile.py with the PATH variable overridden:
```
PATH=/opt/homebrew/Cellar/llvm/19.1.3/bin/:$PATH ./mfile.py
```
- You can make this change last for the rest of the current shell session by running:
```
export PATH=/opt/homebrew/Cellar/llvm/19.1.3/bin/:$PATH
```
- You can make this change permanent for all future shell sessions by adding the line above to your ~/.zshrc file. (However, remember that this will only apply to shell sessions opened after the change.)
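If you re-source your configuration file often, the same directory can end up in PATH several times. A guarded variant of the export line avoids that. This is a sketch: prepend_path is a hypothetical helper, not part of any tool mentioned here.

```shell
# Hypothetical helper: put a directory at the front of PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Usage (adapt the version number to your installation):
# prepend_path /opt/homebrew/Cellar/llvm/19.1.3/bin
```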
If you get an error message about MacOSX26.sdk not being found (or MacOSX14.sdk, or any other version, as long as it corresponds to the version of macOS you have installed), you can create the missing entry as a symbolic link. For example, if the message claims that /Library/Developer/CommandLineTools/SDKs/MacOSX26.sdk is not found, but you have /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk (without the version 26 in the file name), then you can create the missing link by running the following commands (adapt them to the version named in the error message):
```
cd /Library/Developer/CommandLineTools/SDKs
sudo ln -s MacOSX.sdk MacOSX26.sdk
# your password will be prompted here
```

If you get an error message about ld not supporting the -z option, you can force mfile.py to stop using that option by editing the xed_build_common.py file:

diff --git a/xed_build_common.py b/xed_build_common.py
index a0ba03f0..402adc45 100755
--- a/xed_build_common.py
+++ b/xed_build_common.py
@@ -136,13 +136,13 @@ def gnu_secured_build(env: dict) -> str:
             ### Position Independent Code ###
             # Generates position-independent code during the compilation phase
             flags += ' -fPIC'
-            if not env['debug']:
-                ### Stack and Heap Overlap Protection ###
-                # Enables Read-Only Relocation (RELRO) and Immediate Binding protections
-                env.add_to_var('LINKFLAGS','-Wl,-z,relro,-z,now')
-                ### Inexecutable Stack ###
-                # Specifies that the stack memory should be marked as non-executable
-                env.add_to_var('LINKFLAGS','-z noexecstack')
+            #if not env['debug']:
+            #    ### Stack and Heap Overlap Protection ###
+            #    # Enables Read-Only Relocation (RELRO) and Immediate Binding protections
+            #    env.add_to_var('LINKFLAGS','-Wl,-z,relro,-z,now')
+            #    ### Inexecutable Stack ###
+            #    # Specifies that the stack memory should be marked as non-executable
+            #    env.add_to_var('LINKFLAGS','-z noexecstack')
         else:  # Windows
             # Enables Data Execution Prevention (DEP) for executables.
             env.add_to_var('LINKFLAGS','-Wl,-nxcompat')

If you get an error about the cpuid.h header being for x86 only:

.../cpuid.h:14:2: error: this header is for x86 only
14 | #error this header is for x86 only
|  ^
.../cpuid.h:312:5: error: invalid output constraint '=a' in asm
312 |     __cpuid(__leaf, __eax, __ebx, __ecx, __edx);

then, you can fix it by applying the following changes to examples/xed-examples-util.c:

diff --git a/examples/xed-examples-util.c b/examples/xed-examples-util.c
index 79127a07..909a77c0 100644
--- a/examples/xed-examples-util.c
+++ b/examples/xed-examples-util.c
@@ -27,7 +27,7 @@ END_LEGAL */
# include <sys/types.h>
# include <sys/stat.h>
# include <fcntl.h>
-#include <cpuid.h>
+//#include <cpuid.h>
#endif
#include <ctype.h>
#include <stdlib.h>
@@ -1716,7 +1716,10 @@ void xed_print_intel_asm_emit(const xed_uint8_t* array, unsigned int olen) {
void get_cpuid(xed_uint32_t leaf, xed_uint32_t subleaf, 
        xed_uint32_t* eax, xed_uint32_t* ebx, xed_uint32_t* ecx, xed_uint32_t* edx) {
#if defined(XED_MAC) || defined(XED_LINUX) || defined(XED_BSD)
-    __cpuid_count(leaf, subleaf, eax, ebx, ecx, edx);
+    //__cpuid_count(leaf, subleaf, eax, ebx, ecx, edx);
+    (void)leaf;
+    (void)subleaf;
+    *eax = *ebx = *ecx = *edx = 0;
#else
int cpu_info[4];
__cpuidex(cpu_info, leaf, subleaf);

If afl-fuzz complains about shmget() failing or reports other issues at startup, you may have to run:
```
sudo afl-system-config
```
before running afl-fuzz again.

Software Engineering - Fall 2025