Unsafe at Any Speed: Tradeoffs and values in the Rust ecosystem

So, I’m building something dumb in Rust, and I need the dumb thing to talk to the Internet. This should be simple¹—I just want to send a handful of GETs and PUTs.

I guess I need an HTTP library. Lucky for me, Rust has an amazing ecosystem (Cargo and crates.io are easily some of my favorite language features), but this also means a dizzying set of options. A quick search reveals dozens of HTTP client libraries, and while some are certainly more popular than others, it’s hard to tell what I want.

So I ask friends for advice, and one of them points me at a great survey by Sergey Davidoff. I’ve never met Sergey, but we seem to share some values:

Simpler is better: Complex problems require complex solutions, but we should strive for simplicity in our software. Code is a liability—less code is less that could go wrong, and less to debug when things do go wrong! I want a single HTTPS connection. I don’t need persistent sessions with connection pools and cookies. I don’t need an async runtime.² I need a glass of scotch, a socket, and a few syscalls. In the same vein,
Reliability trumps performance: Fast code matters—I’ve spent most of my career working on hard-real-time systems that are useless unless they respond in a few microseconds. But correctness is at least as important as fastness, especially when navigating the radioactive hellscape we call the Internet. We should prefer code that obviously does what it’s supposed to over bespoke reimplementations of standard tools (memory allocation, string handling, etc).

Sergey wrote all of this in the ancient year of 2020. Let’s see what the modern world brings us.

Today, reqwest seems to be the most popular HTTP client library, just as it was back then. It has a simple, blocking³ API and a nice feature set, customizable via Cargo options. And like several other popular client crates, reqwest sits on top of hyper, a stack whose home page bills it as the HTTP implementation for Rust. The same site promises us that hyper is “no more complicated than it has to be.” I wonder if whoever wrote that has seen the source code.

Hyper contains a stunning amount of custom machinery. Whether you’re putting HTTP headers into a hash map, or even just skipping blank lines, hyper does things its own way. And its own way is often unsafe.

Writing a bunch of unsafe code isn’t inherently wrong, nor does it mean there must be bugs hiding inside. Hyper and friends all have thorough test suites, and unsafe is a vital escape hatch that all Rust code ultimately needs to talk to your system. But unsafe code requires an extra level of care, and even very careful people fuck up. This code is for the Internet! Malicious actors are the norm here—it’s a place where honest mistakes end with foreign mafias and bored teenagers using your computer as a timeshare. I’d expect authors who speak at length on the benefits of memory-safe web tooling to be a lot more hesitant to throw that safety out. And writing your own hash map and string splitting while claiming your code is “no more complicated than it has to be” comes off as at least a little disingenuous.
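For a sense of what the boring alternative looks like, here is a minimal sketch of header parsing using only the standard library. `parse_headers` is a hypothetical helper, not hyper's or any crate's API, and it assumes a simplified input: it keeps only the last value of a repeated header, does no validation, and ignores obsolete line folding. It is meant to show the shape of the safe version, not to be a compliant HTTP parser.

```rust
use std::collections::HashMap;

/// Parse a block of "Name: value" header lines into a map.
/// Hypothetical helper for illustration only.
fn parse_headers(block: &str) -> HashMap<String, String> {
    let mut headers = HashMap::new();
    // str::lines() splits on "\n" and also strips a trailing "\r".
    for line in block.lines() {
        if line.trim().is_empty() {
            continue; // skip blank lines
        }
        // Split on the first ':' only, since values may contain colons.
        if let Some((name, value)) = line.split_once(':') {
            // Header names are case-insensitive, so normalize to lowercase.
            headers.insert(name.trim().to_ascii_lowercase(), value.trim().to_string());
        }
        // Lines without a ':' are silently dropped in this sketch.
    }
    headers
}
```

No `unsafe`, no custom containers, and it fits on one screen. Whether it is fast enough is a separate question.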

What does this unsafe space magic buy me, anyway? Comments suggest a decent 5%–10% speedup. But even if it were twice as fast as boring standard library code, our buddy Amdahl reminds us that all this cleverness is a minuscule sliver of time compared to sending HTTP requests around the world, or even just around the datacenter. (See the usual latency numbers.)
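To put a rough number on the Amdahl argument, here is a small sketch. The timings are hypothetical, picked only for illustration and not measured from hyper or anything else: assume 10 microseconds of header handling inside a request that takes 50 milliseconds end to end, and assume the clever code doubles the speed of that one part.

```rust
/// Amdahl's law: overall speedup when a fraction `p` of the total time
/// is made `s` times faster.
fn overall_speedup(p: f64, s: f64) -> f64 {
    1.0 / ((1.0 - p) + p / s)
}

/// Hypothetical numbers, chosen only for illustration:
/// 10 microseconds of header handling in a 50 millisecond request.
fn example_speedup() -> f64 {
    let parse_us = 10.0; // assumed time spent in the optimized code
    let total_us = 50_000.0; // assumed end-to-end request time
    overall_speedup(parse_us / total_us, 2.0)
}
```

Under those assumptions the whole request gets about 0.01% faster. Real workloads will differ, but the network term has to shrink a great deal before the parsing term starts to matter.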

And does it even buy me that? Take HeaderMap: hyper is kind enough to provide benchmarks, and at first glance, its implementation does beat std::collections::HashMap. But the Rust standard library deliberately picks slower, randomized hashes to prevent denial-of-service attacks! Replace those with something like rustc-hash, and the standard library immediately starts beating HeaderMap in most tests.⁴ (Hyper’s implementation is also designed to be DoS-resistant, but I didn’t find any docs explaining how those abilities compare to the standard library’s, or if the benchmarks and test suite are meant to exercise them.)
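Swapping the hash function does not mean swapping the map. As a sketch of the mechanism, the standard HashMap takes the hasher as a third type parameter. The `Fnv1a` hasher below is a stand-in written for this example; it is not the algorithm rustc-hash uses, and `FastMap` and `demo_map` are hypothetical names. With a real crate you would plug in its hasher type instead.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// A tiny FNV-1a hasher, standing in for a fast, non-randomized hash.
/// This is NOT rustc-hash's algorithm, just an illustration.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Same std HashMap and same table implementation, different hasher.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Build a small map to show that the API is unchanged.
fn demo_map() -> FastMap<&'static str, &'static str> {
    let mut map = FastMap::default();
    map.insert("content-type", "text/plain");
    map.insert("host", "example.com");
    map
}
```

The tradeoff is the one described above: a fixed, unkeyed hash gives up the collision-flooding resistance of the default randomized hasher, so a fair benchmark has to say which of the two it is measuring.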

Back in 2020, Sergey closed with a dismal picture of HTTP in Rust:

“The place of the go-to Rust HTTP client is sadly vacant.”

Things certainly haven’t improved as much as I hoped.

I’m not sure what the lesson is here. Am I just missing something? Do huge swaths of Rust users value vanishingly small performance gains over memory safety, in a language that prides itself on being able to provide speed and safety? Or do most people just not care how the sausage is made?

I’m not sure what the answer is, but now I’m sad. I’ll be over here looking for an HTTP stack that doesn’t reinvent str::lines().⁵

¹ Daniel Stenberg and the decades of work he’s put into cURL might beg to differ! But I’d hope that in the Year of Our Lord 2024, we have some tools to simplify the task.
² Despite what I’ve said, I have nothing against Tokio and Big Async. It’s well-designed (within Rust’s design goals) and people are using it to make some amazing software. If I were serving ten thousand connections, it would be the perfect tool for the job. But I’m serving one, and I don’t need an event loop and a work-stealing thread pool.
³ …though the blocking API seems to just wrap the non-blocking one in wait loops with timeouts. I promise I don’t need Tokio for a single socket, really!
⁴ I’m told Hyper’s implementation predates the standard library switching to Swiss Tables. This would explain a lot, except for why hyper still isn’t using them.
⁵ ureq seems promising.