Breaking the Handshake: Network Protocol Fuzzing

If you have ever run a fuzzer like AFL++ on a file parser (like libpng or ffmpeg), you know the drill: you throw mutated bits at a binary until it crashes. But try to point that same logic at a stateful network server, and you will hit a wall.

Network fuzzing is a systematic approach to discovering security and robustness flaws in software that communicates over structured protocols. It differs from file fuzzing or command-line fuzzing because network protocols involve timing, ordering, and interaction among multiple peers. Recent surveys cover the landscape of protocol fuzzing and stateful system testing in depth [1] [2].

In this post we’re going to look at why naïve random mutation fails against network protocols, and how modern stateful fuzzing architectures address this using structure modeling and deterministic harnessing. We will also look at practical examples targeting the MQTT protocol [3].

Here’s the flow:

Why network fuzzing fails with naïve mutation.
How to model syntax and enforce determinism.
Three practical MQTT setups: white-box, black-box, and gray-box.

The Problem: It’s Not Just About the Bytes

In file fuzzing, the input is static. In network fuzzing, the input is a conversation.

As defined in modern System Under Test (SUT) models, a network protocol is a state machine. If your fuzzer sends a valid GET_DATA packet before the HANDSHAKE is complete, the server won’t crash. It will just close the connection. The server logic is correct; the fuzzer is just “speaking out of turn.”
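
To make the “speaking out of turn” problem concrete, here is a minimal sketch of a two-message protocol. The ToyServer class, its states, and its reply strings are invented for illustration and do not correspond to any real server.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    READY = auto()
    CLOSED = auto()


class ToyServer:
    """Hypothetical protocol: HANDSHAKE must come before GET_DATA."""

    def __init__(self):
        self.state = State.INIT

    def handle(self, message: str) -> str:
        if self.state is State.CLOSED:
            return "ERR connection closed"
        if message == "HANDSHAKE" and self.state is State.INIT:
            self.state = State.READY
            return "OK handshake"
        if message == "GET_DATA" and self.state is State.READY:
            return "OK data"
        # Anything else is out of turn: drop the connection, no crash.
        self.state = State.CLOSED
        return "ERR closed"


def run(messages):
    """Replay a message sequence against a fresh server instance."""
    server = ToyServer()
    return [server.handle(m) for m in messages]


in_order = run(["HANDSHAKE", "GET_DATA"])
out_of_turn = run(["GET_DATA", "HANDSHAKE"])
print(in_order)
print(out_of_turn)
```

Both sequences contain the same two valid messages. Only the ordering differs, and the second one never gets past the connection setup.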

There are four specific challenges that make this difficult:

Statefulness: Each message is interpreted relative to prior state. Without valid handshakes, most messages are rejected, so random mutation remains stuck in the connection setup phase.
Structure: Messages contain complex fields whose values depend on each other. Checksums, length prefixes, and magic bytes create dependencies. If you flip a bit in the payload but don’t update the CRC32, the packet is dropped before it ever hits the vulnerable logic.
Asynchrony: Network I/O, timers, and multithreading introduce nondeterminism. A timeout may arise from scheduling rather than input semantics, producing noise in feedback.
Diversity: Implementations interpret the same specification differently. Vendors add extensions or tolerate errors differently, requiring fuzzers that remain robust under dialect variation.

Syntax Modeling

To get past the “Structure” challenge, we cannot simply cat /dev/urandom into a socket. We need to model the protocol syntax.

When a formal grammar isn’t available, we often use tools like Scapy (Python) to define the packet boundaries. This allows the fuzzer to mutate fields rather than bytes.

Here is an example of modeling a custom binary protocol using Python. Instead of fuzzing the raw stream, we fuzz the fields, and let the generator handle the “hard” constraints like Length and Checksum.

import struct
from zlib import crc32

from scapy.all import IntField, Packet, ShortField, StrLenField


class MyCustomProto(Packet):
    name = "MyProto"
    fields_desc = [
        # The fuzzer shouldn't mutate this randomly,
        # or the parser rejects it immediately.
        IntField("magic", 0xDEADBEEF),

        # Field dependencies: The length must match the payload
        ShortField("len", None),

        # This is where we want the fuzzer to go wild
        StrLenField("payload", "", length_from=lambda x: x.len),

        # Checksum must be recalculated on every mutation
        IntField("checksum", None)
    ]

    def post_build(self, p, pay):
        # Automatically fix length and checksum before sending
        if self.len is None:
            # Payload length = total minus magic (4), len (2), and checksum (4)
            l = len(p) - 10
            p = p[:4] + struct.pack("!H", l) + p[6:]
        if self.checksum is None:
            # CRC32 over everything that precedes the checksum field
            ck = crc32(p[:-4]) & 0xFFFFFFFF
            p = p[:-4] + struct.pack("!I", ck)
        return p + pay

By using a model like this, we ensure that generated test cases keep a valid magic value, length, and checksum, so they pass the initial packet validation checks. This sharply reduces “wasted executions” and lets the fuzzer reach semantic logic much faster.
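
To get a feel for the difference, here is a small standard-library sketch (no Scapy) that compares byte-level and field-level mutation. It assumes the same layout as the model above (magic, length, payload, CRC32), and the accepts function is a hypothetical stand-in for the checks a real parser would perform first.

```python
import random
import struct
import zlib

MAGIC = 0xDEADBEEF


def build(payload: bytes) -> bytes:
    """Serialize a packet and compute the dependent fields."""
    body = struct.pack("!IH", MAGIC, len(payload)) + payload
    return body + struct.pack("!I", zlib.crc32(body))


def accepts(packet: bytes) -> bool:
    """Hypothetical up-front validation: magic, length, then checksum."""
    if len(packet) < 10:
        return False
    magic, length = struct.unpack("!IH", packet[:6])
    if magic != MAGIC or length != len(packet) - 10:
        return False
    (checksum,) = struct.unpack("!I", packet[-4:])
    return checksum == zlib.crc32(packet[:-4])


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    out = bytearray(data)
    pos = rng.randrange(len(out))
    out[pos] ^= 1 << rng.randrange(8)
    return bytes(out)


rng = random.Random(1234)
seed_payload = b"hello world"
trials = 1000

# Byte-level: mutate the finished packet, dependent fields go stale.
naive_ok = sum(
    accepts(flip_random_bit(build(seed_payload), rng)) for _ in range(trials)
)
# Field-level: mutate the payload, then rebuild length and checksum.
aware_ok = sum(
    accepts(build(flip_random_bit(seed_payload, rng))) for _ in range(trials)
)
print(f"byte-level: {naive_ok}/{trials}, field-level: {aware_ok}/{trials}")
```

In this toy setup every single-bit flip on the finished packet is rejected, because it lands in the magic, the length, the checksum, or in data covered by the CRC. Every field-level mutation is accepted. Real parsers are less clear-cut, but the direction of the effect is the same.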

The Harness: Enforcing Determinism

A major bottleneck in network fuzzing is the kernel. Opening a TCP socket, performing a handshake, and waiting for recv() is incredibly slow (in CPU time) and introduces timing noise. If a test case crashes the server, can you reproduce it? If the network was laggy, was the crash caused by the packet content or the delay?

Stateful fuzzers such as NSFuzz, AFLNet, and StateAFL each deal with this overhead and noise in their own way [4] [5] [6]. A common complementary trick is “desocketing”: using LD_PRELOAD hooks to replace network calls with memory buffers. Whatever the setup, lock down reproducibility as well: fix seeds, avoid background threads, and use deterministic timers where possible.

Here is a conceptual C++ harness compatible with LibFuzzer. Instead of running the server and connecting via localhost, we link directly against the server library and feed data directly to the parsing function.

// harness.cc
// server_lib.h and the server_* functions are placeholders for the
// target's own API.
#include "server_lib.h"
#include <stdint.h>
#include <stddef.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 1. Start every run from a clean slate so that a crash can be
    //    reproduced from the input alone.
    ServerState* SUT = server_init();

    // 2. We inject the data directly into the library
    server_process_buffer(SUT, data, size);

    // 3. Collect Feedback / Clean up
    server_teardown(SUT);

    return 0;
}

This harness controls Timing (no network waits), Isolation (fresh memory), and Synchronization (single-threaded execution).
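
The payoff of these three properties is reproducibility. As a rough Python analogue of the harness, the sketch below runs a tiny seeded campaign twice against a hypothetical in-process parser with a planted bug. All names here are made up for illustration.

```python
import random


class ParserState:
    """Per-connection state; recreated for every test case."""

    def __init__(self):
        self.messages_seen = 0


def process_buffer(state: ParserState, data: bytes) -> None:
    """Hypothetical parser with a planted bug on one header pattern."""
    state.messages_seen += 1
    if len(data) >= 2 and (data[0] & 0xF0) == 0x10 and data[1] > 0x7F:
        raise ValueError("planted bug: oversized length byte")


def harness(data: bytes) -> bool:
    """Return True if the input triggers the planted bug."""
    state = ParserState()  # isolation: fresh state on every run
    try:
        process_buffer(state, data)  # no sockets, no threads, no timers
    except ValueError:
        return True
    return False


def campaign(seed: int, runs: int = 2000):
    rng = random.Random(seed)  # fixed seed: same inputs on every replay
    crashes = []
    for _ in range(runs):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        if harness(data):
            crashes.append(data)
    return crashes


first = campaign(seed=42)
second = campaign(seed=42)
print(f"{len(first)} crashing inputs, identical on replay: {first == second}")
```

Because nothing in the loop depends on wall-clock time or the network, the two campaigns produce the same crashing inputs, and each saved input reproduces its crash when replayed on its own.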

The Feedback Loop

Once we have structured inputs and a deterministic harness, we need feedback. Classical greybox fuzzers use edge coverage. Protocol fuzzers extend this with state coverage [5] [6]:

Response-based state modeling: Clustering responses (e.g., 200 OK vs 500 Error) to infer the protocol state machine.
Variable-based state modeling: Instrumenting the SUT to watch key variables, like session_state.

The scheduler uses this info to prioritize inputs. If a specific sequence of messages transitions the server from INIT to AUTHENTICATED, the fuzzer saves that sequence and uses it as a prefix for future mutations.
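
Here is a minimal sketch of that idea, using response codes as a stand-in for protocol states. The toy_server function, its message names, and its codes are hypothetical, and real tools such as AFLNet use considerably more elaborate scheduling. The exploration is exhaustive rather than random to keep the example deterministic.

```python
MESSAGES = ["HELLO", "AUTH", "GET", "QUIT"]


def toy_server(sequence):
    """Hypothetical target: returns one response code per message."""
    state, responses = "INIT", []
    for msg in sequence:
        if state == "INIT" and msg == "HELLO":
            state, code = "GREETED", 220
        elif state == "GREETED" and msg == "AUTH":
            state, code = "AUTHENTICATED", 230
        elif state == "AUTHENTICATED" and msg == "GET":
            code = 200
        elif msg == "QUIT":
            state, code = "CLOSED", 221
        else:
            code = 500
        responses.append(code)
        if state == "CLOSED":
            break
    return tuple(responses)


def explore(max_rounds=4):
    corpus = [()]        # message sequences worth reusing as prefixes
    seen_edges = set()   # (previous code, next code) transitions observed
    for _ in range(max_rounds):
        new_entries = []
        for prefix in corpus:
            for msg in MESSAGES:
                candidate = prefix + (msg,)
                trace = (0,) + toy_server(candidate)  # 0 = before any reply
                edges = set(zip(trace, trace[1:]))
                if not edges <= seen_edges:  # new state transition seen
                    seen_edges |= edges
                    new_entries.append(candidate)
        corpus.extend(new_entries)
    return corpus, seen_edges


corpus, seen_edges = explore()
for sequence in corpus:
    print(" -> ".join(sequence) or "(empty)")
```

The sequence HELLO, AUTH is kept because it produces a transition no earlier input produced, and it then serves as the prefix that lets GET reach the authenticated code path.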

Here is a quick comparison of the three approaches:

Approach	Structure model	State model	Throughput	Setup effort
White-box (AFL++)	Optional	Optional	High	High
Black-box (Boofuzz)	Required	Required	Low	Medium
Gray-box (AFLNet)	Optional	Required	Medium	High

Fuzzing MQTT

To demonstrate these concepts, let’s look at practical approaches to fuzzing the MQTT protocol. The MQTT specification and its security guidance provide the baseline protocol behavior and constraints we are testing against [3] [7].

White-Box: AFL++ (Source Available)

In a white-box scenario, we have access to the source code (e.g., the Mosquitto broker). Our goal is to bypass the operating system’s networking stack entirely to increase throughput and determinism.

By using socketpair, we create a bidirectional communication channel that resides entirely in memory. The fuzzer writes mutated data to one end of the pair (sv[1]), and we trick the Mosquitto instance into reading from the other end (sv[0]). To Mosquitto, it looks like a standard network client, but to the kernel, it is just a memory copy. This allows us to use standard sanitizers (like ASAN) to detect memory corruption immediately, without the flakiness of timeouts or dropped packets.

#include <stdint.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/socket.h>
#include <fcntl.h>

extern "C" {
#include "mosquitto.h"
#include "mosquitto_internal.h"
#include "packet_mosq.h"
}

static int g_initialized = 0;

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    if (mosquitto_lib_init() != MOSQ_ERR_SUCCESS) return 0;
    g_initialized = 1;
    return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (!g_initialized || size < 2 || size > 65535) return 0;

    // Create a fresh Mosquitto instance for every iteration
    struct mosquitto *mosq = mosquitto_new(NULL, true, NULL);
    if (!mosq) return 0;

    // Create a socket pair to mock the network connection
    // sv[0] is for the server, sv[1] is for the fuzzer
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        mosquitto_destroy(mosq);
        return 0;
    }

    // Write the fuzzer's mutation directly into the socket buffer
    if (write(sv[1], data, size) != (ssize_t)size) {
        close(sv[0]); close(sv[1]);
        mosquitto_destroy(mosq);
        return 0;
    }
    // Close the write side so a read past the end of the input sees EOF
    // instead of blocking on the socket
    shutdown(sv[1], SHUT_WR);

    // Assign the server-side socket to the mosquitto instance
    mosq->sock = sv[0];
    mosq->protocol = mosq_p_mqtt5;
    mosq->state = mosq_cs_connected;

    // Execute the protocol parsing logic synchronously
    packet__read(mosq);

    close(sv[0]); close(sv[1]);
    mosquitto_destroy(mosq);
    return 0;
}

Running this harness lets AFL++ exercise Mosquitto’s packet-handling code at roughly 8,000 executions per second on a single core in this run, as shown in the status screen below:

       american fuzzy lop ++4.35a {default} (./fuzz_mosquitto) [explore]
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│        run time : 0 days, 0 hrs, 3 min, 37 sec      │  cycles done : 30    │
│   last new find : 0 days, 0 hrs, 3 min, 28 sec      │ corpus count : 548   │
│last saved crash : 0 days, 0 hrs, 3 min, 11 sec      │saved crashes : 3     │
│ last saved hang : none seen yet                     │  saved hangs : 0     │
├─ cycle progress ─────────────────────┬─ map coverage┴──────────────────────┤
│  now processing : 78.97 (14.2%)      │    map density : 3.98% / 17.74%     │
│  runs timed out : 0 (0.00%)          │ count coverage : 2.81 bits/tuple    │
├─ stage progress ─────────────────────┼─ findings in depth ─────────────────┤
│  now trying : havoc                  │ favored items : 101 (18.43%)        │
│ stage execs : 73/100 (73.00%)        │  new edges on : 163 (29.74%)        │
│ total execs : 1.76M                  │ total crashes : 71 (3 saved)        │
│  exec speed : 8048/sec               │  total tmouts : 14 (0 saved)        │
├─ fuzzing strategy yields ────────────┴─────────────┬─ item geometry ───────┤
│   bit flips : 1/96, 1/95, 1/93                     │    levels : 2         │
│  byte flips : 0/12, 1/11, 2/9                      │   pending : 0         │
│ arithmetics : 1/812, 0/1344, 0/1032                │  pend fav : 0         │
│  known ints : 0/99, 2/386, 0/484                   │ own finds : 44        │
│  dictionary : 0/228, 0/247, 0/0, 0/0               │  imported : 425       │
│havoc/splice : 36/1.71M, 0/0                        │ stability : 100.00%   │
│py/custom/rq : unused, unused, unused, unused       ├───────────────────────┘
│    trim/eff : 0.66%/38.8k, 83.33%                  │          [cpu000: 31%]
└─ strategy: explore ────────── state: started :-) ──┘

However, this approach has limitations. It requires access to source code and often necessitates significant changes to the build process to link statically against the target library. It also mocks out the network layer, which means bugs specific to the actual socket handling (like epoll race conditions) might be missed.
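
If the socketpair trick feels abstract, it is easy to try in isolation. The sketch below is a Python illustration of the mechanism only, not part of the harness: one end plays the fuzzer, the other plays the target, and no TCP connection is involved. The function name is made up for this example.

```python
import socket


def feed_through_socketpair(data: bytes) -> bytes:
    """Send data into one end of a socket pair and read it from the other."""
    # On Unix this defaults to a connected AF_UNIX stream pair.
    target_end, fuzzer_end = socket.socketpair()
    try:
        # The input must fit in the socket buffer, since nothing is
        # reading concurrently; a few kilobytes is safe.
        fuzzer_end.sendall(data)
        # Closing the write side makes the reader see EOF instead of
        # blocking once the input is exhausted.
        fuzzer_end.shutdown(socket.SHUT_WR)
        received = bytearray()
        while chunk := target_end.recv(4096):
            received.extend(chunk)
        return bytes(received)
    finally:
        target_end.close()
        fuzzer_end.close()


echoed = feed_through_socketpair(b"\x10\x02\x00\x00")
print(echoed.hex(" "))
```

The code reading from target_end only ever sees an ordinary socket, which is why a parser that expects a file descriptor can be driven this way without modification.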

Black-Box: Boofuzz

When source code is unavailable, we cannot bypass the network stack. Instead, we use tools like Boofuzz to model the protocol grammar and state machine explicitly. Related black-box work like Snipuzz shows how far message-structure inference can go without source access [8].

In this script, we solve two problems:

Structure: We define a mqtt_varlen_encoder to handle MQTT’s variable-length integers automatically. If the fuzzer expands the payload, this encoder ensures the length field remains valid so the packet is accepted by the broker.
State: We use session.connect to define a graph. The fuzzer learns that it must successfully send a MQTT_CONNECT packet before it attempts to send a MQTT_PUBLISH packet.

from typing import Union

from boofuzz import Block, Byte, Bytes, Group, Request, Size, String, Word


def mqtt_varlen_encoder(value):
    # The encoder receives the rendered 4-byte Size field as bytes
    n = int.from_bytes(value, byteorder="big", signed=False) if value else 0
    if n < 0 or n > 268_435_455:
        raise ValueError(f"Remaining Length out of range for MQTT varint: {n}")
    out = bytearray()
    while True:
        encoded = n % 128
        n //= 128
        if n > 0:
            encoded |= 0x80  # continuation bit: more bytes follow
        out.append(encoded)
        if n == 0:
            break
    if len(out) > 4:
        raise ValueError("MQTT varint produced >4 bytes, which is invalid.")
    return bytes(out)


def build_mqtt_packet(name: str, control_header: Union[int, dict], variable_header_fields=None, payload_fields=None):
    variable_header_fields = variable_header_fields or []
    payload_fields = payload_fields or []

    def build_fields(field_defs):
        elements = []
        for f in field_defs:
            ftype, fname = f.get("type"), f.get("name")
            fval, fuzzable = f.get("value", 0), f.get("fuzzable", True)
            # ">" is boofuzz's big-endian marker, matching the Size fields below
            endian, max_len = f.get("endian", ">"), f.get("max_len", None)

            if ftype == "group":
                values, default_value = f.get("values", []), f.get("default_value", None)
                elements.append(Group(name=fname, values=values, default_value=default_value, fuzzable=fuzzable))
            elif ftype == "byte":
                elements.append(Byte(name=fname, default_value=fval, fuzzable=fuzzable))
            elif ftype == "word":
                elements.append(Word(name=fname, default_value=fval, endian=endian, fuzzable=fuzzable))
            elif ftype == "string":
                # MQTT strings carry a 2-byte big-endian length prefix
                elements.append(Size(name=f"{fname}_len", block_name=fname, length=2, endian=">", fuzzable=False))
                elements.append(String(name=fname, default_value=fval, fuzzable=fuzzable, max_len=max_len))
            elif ftype == "raw":
                elements.append(Bytes(name=fname, default_value=fval, fuzzable=fuzzable))

        return elements

    if isinstance(control_header, dict):
        fvalues = control_header.get("values", None)
        fdef = control_header.get("default_value", None)
        ffuzz = control_header.get("fuzzable", False)
        if fvalues is None and fdef is None:
            raise ValueError("At least values or default value has to be specified for the control header")
        ch = Group(name="ControlHeader", values=fvalues, default_value=fdef, fuzzable=ffuzz)
    else:
        ch = Byte(name="ControlHeader", default_value=control_header, fuzzable=False)

    return Request(name, children=(
        Block(name="FixedHeader", children=(
            ch, # Control Header
            # Size of "Body" rendered as 4 bytes, then re-encoded as an MQTT varint
            Block(name="RemainingLength", children=Size(name="RemainingLengthRaw", block_name="Body", fuzzable=False, length=4, endian=">"), encoder=mqtt_varlen_encoder, fuzzable=False)
        )),
        Block(name="Body", children=(
            Block(name="VariableHeader", children=build_fields(variable_header_fields)),
            Block(name="Payload", children=build_fields(payload_fields))
        ))
    ))

def build_connect_request():
    variable_header = [
        {"type": "string", "name": "ProtocolName", "value": "MQTT", "fuzzable": False},
        {"type": "byte", "name": "ProtocolLevel", "value": 5, "fuzzable": False},
        {"type": "byte", "name": "ConnectFlags", "value": 0x02, "fuzzable": False},
        {"type": "word", "name": "KeepAlive", "value": 60},
        {"type": "byte", "name": "PropertiesLength", "value": 0, "fuzzable": False},
    ]

    payload = [
        {"type": "string", "name": "ClientID", "value": "boofuzz", "max_len": 30}
    ]
    return build_mqtt_packet("MQTT_CONNECT", 0x10, variable_header, payload)

# ... ... ...
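
The variable-length integer is the part of this script that is easiest to get subtly wrong, so it is worth checking in isolation. The sketch below is a standalone encoder and decoder with a few boundary values. The function names are mine and independent of Boofuzz.

```python
from typing import Tuple

MAX_VARINT = 268_435_455  # largest value that fits in four bytes


def encode_varint(n: int) -> bytes:
    """Encode n as an MQTT Variable Byte Integer (7 data bits per byte)."""
    if not 0 <= n <= MAX_VARINT:
        raise ValueError(f"out of range for MQTT varint: {n}")
    out = bytearray()
    while True:
        digit = n % 128
        n //= 128
        if n > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, number of bytes consumed)."""
    value, multiplier = 0, 1
    for i, byte in enumerate(data[:4]):
        value += (byte & 0x7F) * multiplier
        if not byte & 0x80:
            return value, i + 1
        multiplier *= 128
    raise ValueError("malformed or truncated MQTT varint")


for n in (0, 20, 127, 128, 16_383, 16_384, MAX_VARINT):
    encoded = encode_varint(n)
    assert decode_varint(encoded) == (n, len(encoded))
    print(f"{n:>11} -> {encoded.hex(' ')}")
```

The interesting cases are the boundaries at 127/128 and 16,383/16,384, where the encoding grows by one byte. A length field that is off by one at these points is the kind of mismatch that makes a broker drop the packet before any deeper logic runs.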

The session.connect calls internally create a Finite State Machine representing the protocol evolution. This can be shown with session.render_graph_graphviz().create_png().
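
Conceptually, each edge in that graph says “this request may follow that one”, and every path from the root becomes a test scenario. The sketch below is not Boofuzz code. It is a plain-Python illustration of the path enumeration, and the edge list is an assumption based on the requests mentioned in this post.

```python
def enumerate_paths(edges, root="ROOT"):
    """Yield every path from the root through a directed acyclic graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    def walk(node, path):
        for nxt in graph.get(node, []):
            new_path = path + [nxt]
            yield new_path
            yield from walk(nxt, new_path)

    yield from walk(root, [])


# Hypothetical edges; the real graph depends on the script's connect calls.
EDGES = [
    ("ROOT", "MQTT_CONNECT"),
    ("MQTT_CONNECT", "MQTT_PUBLISH"),
    ("MQTT_CONNECT", "MQTT_SUBSCRIBE"),
]

paths = ["->".join(p) for p in enumerate_paths(EDGES)]
for p in paths:
    print(p)
```

For each path, the last request is the one being mutated, while the requests before it are sent with their default values to bring the server into the right state.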

My script produced the following graph:

[Figure: MQTT State Graph]

Boofuzz provides a web/TUI interface for tracking progress, which gives immediate feedback on how the protocol state graph is being traversed:

# NOTE: this is an extract of the TUI; the full output is longer and not
# useful to paste here

│[2025-12-29] Test Case: 1460: MQTT_CONNECT->MQTT_SUBSCRIBE:[MQTT_SUBSCRIBE.Body.Payload.TopicFilter:751]
│[2025-12-29]     Info: Type: String
│[2025-12-29]     Info: Opening target connection (127.0.0.1:1883)...
│[2025-12-29]     Info: Connection opened.
│[2025-12-29]    Test Step: Monitor CallbackMonitor#140523418095968[pre=[],post=[],restart=[],post_start_target=[]].pre_send()
│[2025-12-29]    Test Step: Transmit Prep Node 'MQTT_CONNECT'
│[2025-12-29]     Info: Sending 22 bytes...
│[2025-12-29]     Transmitted 22 bytes: 10 14 00 04 4d 51 54 54 05 02 00 3c 00 00 07 62 6f 6f 66 75 7a 7a b'\x10\x14\x00\x04MQTT\x05\x02\x00<\x00\x00\x07boofuzz'
│[2025-12-29]     Info: Receiving...
│[2025-12-29]     Received: 20 09 00 00 06 22 00 0a 21 00 14 b' \t\x00\x00\x06"\x00\n!\x00\x14'
│[2025-12-29]    Test Step: Callback function 'conn_callback'
│[2025-12-29]     Info: Received CONNACK: 200900000622000a210014
│[2025-12-29]    Test Step: Fuzzing Node 'MQTT_SUBSCRIBE'
│[2025-12-29]     Info: Sending 1000010 bytes...
│[2025-12-29]     Error!!!! SIGINT received ... exiting
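
It is a useful sanity check to decode the 22 transmitted CONNECT bytes by hand and confirm that the model produced what we intended. The sketch below does that with the standard library. It handles only this simple case (single-byte Remaining Length and single-byte Properties Length), and the helper names are mine.

```python
import struct

# Bytes copied from the "Transmitted 22 bytes" line in the log above.
CONNECT_HEX = "10 14 00 04 4d 51 54 54 05 02 00 3c 00 00 07 62 6f 6f 66 75 7a 7a"


def read_string(buf: bytes, offset: int):
    """Read a 2-byte big-endian length prefix followed by UTF-8 data."""
    (length,) = struct.unpack_from("!H", buf, offset)
    start = offset + 2
    return buf[start:start + length].decode("utf-8"), start + length


def parse_connect(packet: bytes) -> dict:
    if packet[1] & 0x80:
        raise ValueError("multi-byte Remaining Length not handled in this sketch")
    fields = {"packet_type": packet[0] >> 4, "remaining_length": packet[1]}
    fields["protocol_name"], offset = read_string(packet, 2)
    fields["protocol_level"] = packet[offset]
    fields["connect_flags"] = packet[offset + 1]
    (fields["keep_alive"],) = struct.unpack_from("!H", packet, offset + 2)
    # Properties Length is itself a varint; it is a single zero byte here.
    properties_length = packet[offset + 4]
    offset += 5 + properties_length
    fields["client_id"], offset = read_string(packet, offset)
    fields["bytes_consumed"] = offset
    return fields


packet = bytes.fromhex(CONNECT_HEX)
fields = parse_connect(packet)
for key, value in fields.items():
    print(f"{key:>17}: {value}")
```

The decoded values line up with build_connect_request: protocol name MQTT, level 5, flags 0x02, keep-alive 60, and client ID boofuzz. The Remaining Length of 20 equals the 22 transmitted bytes minus the 2-byte fixed header, which confirms the length encoder did its job for this packet.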

While Boofuzz is excellent for logic and state testing, it is significantly slower than AFL++ because it operates over the real network stack. It typically manages 10-50 executions per second compared to AFL++’s thousands. Furthermore, defining the protocol grammar manually is time-consuming and prone to errors if the specification is complex.

Gray-Box: AFLNet

//TBD

Conclusion

We covered why naïve mutation fails on protocols, how syntax modeling keeps inputs valid, and how deterministic harnesses reduce noise so feedback can guide exploration. The MQTT examples show the trade-offs: white-box harnesses buy speed, black-box frameworks buy reach, and gray-box fuzzers bridge the two. The practical takeaway is simple: start by making inputs valid enough to reach deep logic, then let coverage or state feedback do the exploration. LLM-guided approaches are emerging as another way to learn protocol structure and guide mutations [9].

References

[1] Zhang et al. (2024). A Survey of Protocol Fuzzing. ACM Computing Surveys.

[2] Daniele, C., et al. (2024). Fuzzers for Stateful Systems: Survey and Research Directions. ACM Computing Surveys.

[3] OASIS (2021). MQTT Version 5.0 Specification.

[4] Qin, S., et al. (2022). NSFuzz: Towards Efficient and State-Aware Network Service Fuzzing. Proceedings of the 1st International Fuzzing Workshop.

[5] Pham, V.-T., et al. (2020). AFLNet: A Greybox Fuzzer for Network Protocols. IEEE ICST.

[6] Natella, R. (2022). StateAFL: Greybox Fuzzing for Stateful Network Servers. Empirical Software Engineering.

[7] OASIS (2021). MQTT NIST Cybersecurity.

[8] Feng, X., et al. (2021). Snipuzz: Black-box Fuzzing of IoT Firmware via Message Snippet Inference. ACM CCS.

[9] Meng, R., et al. (2024). Large Language Model Guided Protocol Fuzzing. NDSS.