serde and borsh: Two Rust Libraries, Two Different Worlds

Why I'm Suddenly Staring at Two Serialization Libraries

I'm a few weeks into rewriting parts of my Solana MEV bot in Rust, and the same two crate names keep showing up in every file I touch: serde and borsh. They both have derive macros that look almost identical. They both turn structs into bytes and bytes back into structs. For a brief, confused afternoon I wonder why the Rust ecosystem needs two of these things at all.

Then I actually try to use them, and the difference becomes obvious within an hour. One of them is for talking to the outside world — REST APIs, config files, anything a human or another service needs to read. The other is for talking to a Solana program — raw bytes packed into an account, no metadata, byte-for-byte deterministic. Mixing them up is like trying to mail a paper letter through a fiber optic cable. They're both "data transfer," but the assumptions could not be further apart.

I want to write down what I'm figuring out as I go, because every Rust developer working on Solana hits this fork in the road, and almost nobody explains the two libraries together in a way that makes the choice feel intuitive.

What serde Actually Is

The official tagline at serde.rs calls it "a framework for serializing and deserializing Rust data structures efficiently and generically." The name itself is just SER plus DE — serialize, deserialize. No clever acronym, no marketing.

The thing that took me a minute to internalize is that serde is not a format. It's a framework. The official docs split the ecosystem into two groups: data structures that know how to describe themselves (Serialize, Deserialize traits), and data formats that know how to write/read bytes (Serializer, Deserializer traits). Serde is the translator sitting between them.

What that buys you is staggering. According to the official documentation, more than twenty-five data formats plug into the same trait system. You write #[derive(Serialize, Deserialize)] once on your struct and the same struct is suddenly compatible with all of them. It's the closest thing Rust has to a universal data adapter — like how a single USB-C port now charges your laptop, your phone, and your headphones with the same cable.

And serde does it without runtime reflection. The traits are wired up at compile time via derive macros, which is why the official site emphasizes that the compiler "can often completely optimize away interactions between data structures and formats." Java- or Python-style reflection has a tax; serde has almost none.
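
To convince myself the split between data structures and data formats is real, I sketched a toy model of it using only the standard library. The trait names Format and Describe, and the two toy formats, are my own simplified stand-ins and not serde's actual Serialize/Serializer definitions. The point is only that one description of Point can feed two unrelated outputs through generics resolved at compile time.

```rust
// Toy model of the data-structure / data-format split.
// These traits are simplified stand-ins, NOT serde's real traits.

/// A format knows how to write the pieces it is handed.
trait Format {
    fn begin(&mut self);
    fn field_i32(&mut self, name: &str, value: i32);
    fn end(&mut self);
}

/// A data structure knows how to describe itself to any format.
trait Describe {
    fn describe<F: Format>(&self, format: &mut F);
}

struct Point {
    x: i32,
    y: i32,
}

impl Describe for Point {
    fn describe<F: Format>(&self, format: &mut F) {
        format.begin();
        format.field_i32("x", self.x);
        format.field_i32("y", self.y);
        format.end();
    }
}

/// Text format: keeps field names, JSON-like output.
#[derive(Default)]
struct JsonLike {
    out: String,
    first: bool,
}

impl Format for JsonLike {
    fn begin(&mut self) {
        self.out.push('{');
        self.first = true;
    }
    fn field_i32(&mut self, name: &str, value: i32) {
        if !self.first {
            self.out.push(',');
        }
        self.first = false;
        self.out.push_str(&format!("\"{}\":{}", name, value));
    }
    fn end(&mut self) {
        self.out.push('}');
    }
}

/// Binary format: ignores field names, writes little-endian bytes.
#[derive(Default)]
struct RawBytes {
    out: Vec<u8>,
}

impl Format for RawBytes {
    fn begin(&mut self) {}
    fn field_i32(&mut self, _name: &str, value: i32) {
        self.out.extend_from_slice(&value.to_le_bytes());
    }
    fn end(&mut self) {}
}

fn to_json_like<T: Describe>(value: &T) -> String {
    let mut format = JsonLike::default();
    value.describe(&mut format);
    format.out
}

fn to_raw<T: Describe>(value: &T) -> Vec<u8> {
    let mut format = RawBytes::default();
    value.describe(&mut format);
    format.out
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", to_json_like(&p)); // {"x":1,"y":2}
    println!("{:?}", to_raw(&p)); // [1, 0, 0, 0, 2, 0, 0, 0]
}
```

Because describe is generic over F, the compiler generates a specialized copy per format, which is roughly the mechanism behind the "no reflection" claim. The real serde traits are considerably richer than this sketch.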

A serde Example That Looks Like Magic

The ergonomics genuinely surprise me. Per the official derive guide, the entire setup in Cargo.toml is two lines:

serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

And then a full round trip looks like this:

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let serialized = serde_json::to_string(&point).unwrap();
    // {"x":1,"y":2}
    let deserialized: Point = serde_json::from_str(&serialized).unwrap();
}

That's it. The struct definition and the JSON output share field names. No mapping layer, no schema files, no manual parsers. Coming from years of writing JSON.parse(...).then(validate) chains in TypeScript, I keep waiting for the catch. There isn't really one — at least not for typical web-style payloads.

The serde_json crate, currently at version 1.0.149 per docs.rs, gives you both ends of the spectrum. If you know your shape, you parse straight into a typed struct. If you don't, you parse into the untyped Value enum:

enum Value {
    Null,
    Bool(bool),
    Number(Number),
    String(String),
    Array(Vec<Value>),
    Object(Map<String, Value>),
}

For my bot, the typed path is what I want everywhere. Untyped JSON in a hot loop is how you end up with bugs that only show up on Tuesday afternoons in production.

serde's Customization Surface

What keeps surprising me is how much real-world ugliness serde already has answers for. The Shuttle blog's serde walkthrough lays out the attribute system that solves most of the problems I've hit so far.

Upstream API uses camelCase and Rust insists on snake_case?

#[derive(Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct MyStruct {
    my_message: String,  // (de)serializes as "myMessage"
}

A field named type that conflicts with a Rust keyword?

#[serde(rename = "type")]
kind: String,

Want defaults when a field is missing from the JSON?

#[serde(default)]
my_field: String,

Want to scream loudly if the JSON has fields you didn't expect (great for catching upstream API drift)?

#[serde(deny_unknown_fields)]
pub struct MyStruct { /* ... */ }

For enums, serde handles both external tagging (the default) and internal tagging:

#[derive(Deserialize, Serialize)]
#[serde(tag = "type")]
enum MyEnum {
    Data { id: String, data: Value },
    SomeOtherData { id: i32, name: String },
}

That last one alone has saved me hours of writing custom adapters for upstream APIs that mix discriminator fields into their payloads. Compared with hand-rolling a parser, it's the difference between buying groceries at a supermarket versus growing your own vegetables.

What borsh Actually Is

Now the other library. The official borsh website spells out the acronym: Binary Object Representation Serializer for Hashing. The phrase "for Hashing" is the part that explains everything else.

Borsh, currently at version 1.6.1 per the borsh-rs README, is maintained by the NEAR Protocol team and requires Rust 1.77+. Unlike serde, it is not a framework over many formats. It is exactly one binary format with one specification. The official site lists its design goals in priority order: consistency, safety, specification, speed.

Consistency comes first deliberately. The site says the format guarantees a "bijective mapping between objects and their binary representations" — meaning the same struct serializes to the same bytes every single time, on every machine, in every language that implements the spec. That property is what makes the bytes safe to hash, sign, and compare.

The official documentation also makes a striking statement that took me a while to appreciate: borsh "achieves high performance in Rust by opting out from Serde." This is unusual. Most binary formats in the Rust world (bincode, postcard, MessagePack via rmp-serde) are built on serde. Borsh deliberately is not. The trade-off is intentional: by not implementing the generic Serializer/Deserializer machinery, borsh produces smaller code and gains a couple of features of its own. The two the README calls out are an init hook that runs after deserialization (written #[borsh(init = ...)] in the 1.x releases, borsh_init in older ones), which serde has no equivalent for, and #[borsh(skip)] for leaving a field out of the encoding.

How borsh Encodes Bytes

Reading the borsh spec is the moment the philosophy clicks for me. There are no surprises in the bytes — and that's the whole point.

From the official spec and the excellent RareSkills walkthrough of Solana borsh, the encoding rules are:

  • Integers: little-endian byte order.
  • Dynamic containers (vectors, strings, maps): a u32 length prefix written before the values.
  • Unordered containers (HashMap, HashSet): keys sorted lexicographically before serialization, so the bytes are deterministic.
  • Structs: fields written sequentially in declaration order, no padding.
  • Enums: a u8 variant index followed by the variant's data.
  • Option<T>: 0x00 for None, 0x01 followed by the value for Some.
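
To check my reading of these rules, I wrote them out by hand with nothing but the standard library. This is a sketch of the layout as described in the list above, not the borsh crate's implementation; the helper names and the Side enum are hypothetical, and in real code the derive macros do all of this.

```rust
use std::collections::HashMap;

/// Integers: fixed width, little-endian.
fn encode_u32(value: u32, out: &mut Vec<u8>) {
    out.extend_from_slice(&value.to_le_bytes());
}

/// Strings: u32 byte length, then the raw UTF-8 bytes.
fn encode_str(value: &str, out: &mut Vec<u8>) {
    encode_u32(value.len() as u32, out);
    out.extend_from_slice(value.as_bytes());
}

/// Option<u32>: 0x00 for None, 0x01 followed by the value for Some.
fn encode_option_u32(value: Option<u32>, out: &mut Vec<u8>) {
    match value {
        None => out.push(0),
        Some(inner) => {
            out.push(1);
            encode_u32(inner, out);
        }
    }
}

/// Hypothetical enum, used only to show the u8 variant index.
enum Side {
    Buy,
    Sell { size: u32 },
}

/// Enums: u8 variant index (declaration order), then the variant's data.
fn encode_side(value: &Side, out: &mut Vec<u8>) {
    match value {
        Side::Buy => out.push(0),
        Side::Sell { size } => {
            out.push(1);
            encode_u32(*size, out);
        }
    }
}

/// HashMap<String, u32>: u32 entry count, then entries sorted by key.
fn encode_map(map: &HashMap<String, u32>, out: &mut Vec<u8>) {
    let mut entries: Vec<(&String, &u32)> = map.iter().collect();
    entries.sort_by(|a, b| a.0.cmp(b.0));
    encode_u32(entries.len() as u32, out);
    for (key, value) in entries {
        encode_str(key, out);
        encode_u32(*value, out);
    }
}

fn main() {
    let mut out = Vec::new();
    encode_option_u32(None, &mut out);
    encode_option_u32(Some(7), &mut out);
    assert_eq!(out, [0, 1, 7, 0, 0, 0]);

    let mut out = Vec::new();
    encode_side(&Side::Buy, &mut out);
    encode_side(&Side::Sell { size: 5 }, &mut out);
    assert_eq!(out, [0, 1, 5, 0, 0, 0]);

    // Insertion order does not matter: keys are sorted before writing.
    let mut map = HashMap::new();
    map.insert("b".to_string(), 2);
    map.insert("a".to_string(), 1);
    let mut out = Vec::new();
    encode_map(&map, &mut out);
    println!("{:?}", out);
}
```

The map case is the one worth staring at: HashMap iteration order varies from run to run, and the sort is what turns it into stable bytes.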

The RareSkills post has a tiny example I keep coming back to. Encoding the string "hi" produces:

[0x02, 0x00, 0x00, 0x00, 0x68, 0x69]

Four bytes for the length (2, in little-endian u32), then the two UTF-8 bytes for h and i. That's it. No quotes, no field names, no separators, no schema markers. If you don't know that this blob represents a string, you can't tell. You just have to know.

A struct with multiple fields lays out exactly the same way — bytes flow sequentially with no gaps:

struct UserData {
    active: bool,    // 1 byte
    age: u8,         // 1 byte
    name: String,    // 4 bytes (length) + UTF-8
    scores: Vec<u8>, // 4 bytes (length) + values
}

Serialized: [active][age][name_len_4][name_utf8...][scores_len_4][scores...]. Every byte has a job. Nothing is decorative.
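
Here is the same layout written out by hand for one concrete value, as a minimal standard-library sketch. The function encode_user_data and the sample values are mine; the derive macro would normally produce the equivalent bytes for you.

```rust
struct UserData {
    active: bool,
    age: u8,
    name: String,
    scores: Vec<u8>,
}

/// Writes each field in declaration order, with no padding between them.
fn encode_user_data(user: &UserData) -> Vec<u8> {
    let mut out = Vec::new();
    out.push(user.active as u8); // bool: one byte, 0 or 1
    out.push(user.age); // u8: one byte
    out.extend_from_slice(&(user.name.len() as u32).to_le_bytes()); // name length
    out.extend_from_slice(user.name.as_bytes()); // name UTF-8 bytes
    out.extend_from_slice(&(user.scores.len() as u32).to_le_bytes()); // scores length
    out.extend_from_slice(&user.scores); // scores values
    out
}

fn main() {
    let user = UserData {
        active: true,
        age: 30,
        name: "al".to_string(),
        scores: vec![9, 8],
    };
    let bytes = encode_user_data(&user);
    // [active][age][name_len_4][name_utf8][scores_len_4][scores]
    assert_eq!(bytes, [1, 30, 2, 0, 0, 0, 0x61, 0x6c, 2, 0, 0, 0, 9, 8]);
    println!("{} bytes: {:?}", bytes.len(), bytes);
}
```

Fourteen bytes in total for this value, and none of them say which field they belong to.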

Using borsh in Code

The Rust API mirrors serde's ergonomics, just with different trait names. Straight from the borsh-rs README:

use borsh::{BorshSerialize, BorshDeserialize, from_slice, to_vec};

#[derive(BorshSerialize, BorshDeserialize, PartialEq, Debug)]
struct A {
    x: u64,
    y: String,
}

fn main() {
    let a = A { x: 3301, y: "liber primus".to_string() };
    let encoded = to_vec(&a).unwrap();
    let decoded = from_slice::<A>(&encoded).unwrap();
    assert_eq!(a, decoded);
}

If you cover up the trait names, the only structural difference between borsh and serde is which trait you derive. Everything else — the attribute style, the trait-based design, the compile-time generation — feels deeply familiar.

That's no accident, in my read. NEAR's team clearly studied serde's API ergonomics and kept what works while throwing out the parts that don't fit a hash-deterministic binary world.

Why Solana Picks borsh

This is where the story stops being abstract for me. I'm building a Solana bot, so borsh isn't really a choice: it's the format that Anchor and most of the programs I talk to use for on-chain account data and instruction payloads. The runtime itself treats account data as opaque bytes and doesn't enforce any encoding, but the Solana Cookbook serialization guide and the RareSkills tutorial converge on the same point: borsh is the de facto standard for Solana programs.

Three reasons jump out as I read.

First, determinism for hashing. When the runtime computes hashes over account data or transaction payloads, it needs the bytes to be identical no matter who produced them. Most JSON serializers don't guarantee this — different libraries put fields in different orders, different float representations, etc. Borsh's spec is strict enough that two clients written in different languages will produce the exact same bytes for the same struct.
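
A tiny standard-library illustration of the problem, under my own assumptions: the two JSON strings are hand-written to mimic what two different libraries might emit, and encode_fixed_layout is a hypothetical helper that follows the declaration-order, little-endian rule rather than calling the borsh crate.

```rust
struct Point {
    x: i32,
    y: i32,
}

/// Fixed layout: x then y, each as 4 little-endian bytes.
fn encode_fixed_layout(p: &Point) -> [u8; 8] {
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&p.x.to_le_bytes());
    out[4..].copy_from_slice(&p.y.to_le_bytes());
    out
}

fn main() {
    // Two JSON documents that any parser reads as the same object...
    let json_a = r#"{"x":1,"y":2}"#;
    let json_b = r#"{"y":2,"x":1}"#;
    // ...but the bytes differ, so a hash over them would differ too.
    assert_ne!(json_a.as_bytes(), json_b.as_bytes());

    // With a fixed layout there is only one valid encoding of the value.
    let first = encode_fixed_layout(&Point { x: 1, y: 2 });
    let second = encode_fixed_layout(&Point { x: 1, y: 2 });
    assert_eq!(first, second);
    println!("{:?}", first);
}
```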

Second, storage cost. Solana charges rent in proportion to account size. Every extra byte costs lamports. Self-describing formats like JSON pay a tax for every field name they embed; borsh pays nothing because the layout is implicit. For a bot that creates thousands of accounts over its lifetime, the difference adds up like the difference between flying carry-on and checking three suitcases on every Southwest flight.
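
To put rough numbers on the field-name tax, here is a byte count for a made-up payload. The Position struct and its values are hypothetical, the JSON is rendered by hand rather than by serde_json, and I'm only counting bytes, not converting to lamports.

```rust
use std::mem::size_of;

/// Hypothetical account payload, invented for this comparison.
struct Position {
    owner_id: u64,
    lamports: u64,
    is_open: bool,
}

/// Bytes a borsh-style layout needs: fixed-width fields, no names.
fn binary_size() -> usize {
    size_of::<u64>() + size_of::<u64>() + size_of::<bool>()
}

/// Hand-written JSON rendering of the same data, field names included.
fn json_text(p: &Position) -> String {
    format!(
        "{{\"owner_id\":{},\"lamports\":{},\"is_open\":{}}}",
        p.owner_id, p.lamports, p.is_open
    )
}

fn main() {
    let position = Position {
        owner_id: 42,
        lamports: 1_000_000,
        is_open: true,
    };
    let json = json_text(&position);
    println!("json:   {} bytes  {}", json.len(), json); // 49 bytes
    println!("binary: {} bytes", binary_size()); // 17 bytes
}
```

For this value the text form is nearly three times the size, and most of the difference is field names and punctuation.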

Third, cross-language compatibility. There are borsh implementations in Rust, JavaScript/TypeScript, Python, Java, and Go. My bot's hot path runs in Rust, but the off-chain tooling reads the same account data from a TypeScript dashboard. Both sides agree on the byte layout because there's a formal specification both implementations follow.

Anchor: borsh by Default

Almost every Solana program written in the last couple of years uses Anchor on top of native Solana, and Anchor leans on borsh implicitly. The official Anchor program-structure documentation describes four core macros: declare_id!, #[program], #[derive(Accounts)], and #[account].

The one I care about today is #[account]. Anchor's docs put it plainly: "Account data is automatically serialized and deserialized as the account type" via this macro, and the underlying format is borsh. So when I write:

#[account]
pub struct NewAccount {
    data: u64,
}

Anchor wires up borsh serialization and deserialization plus an 8-byte account discriminator (the first 8 bytes of SHA256("account:NewAccount")). The code Anchor generates checks that prefix when it loads the account, which lets the program detect when the wrong account type is passed in. That's why every Anchor space allocation looks like space = 8 + <data bytes>: the 8 is for the discriminator, and the rest is the borsh-serialized payload.

The discriminator is a small but clever guardrail. Without it, two different account types with similar layouts could be silently confused. It's the same kind of "check the magic bytes" pattern you see in file format parsers — a safety net against type confusion that a non-self-describing format would otherwise lack.
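
The guardrail is easy to picture in plain Rust. This is my own sketch of the idea, not Anchor's generated code: check_discriminator is a hypothetical helper, and the two tag constants are placeholder bytes rather than real SHA256 prefixes.

```rust
const DISCRIMINATOR_LEN: usize = 8;

/// Space for an account whose payload is a single u64: 8 + 8.
const NEW_ACCOUNT_SPACE: usize = DISCRIMINATOR_LEN + std::mem::size_of::<u64>();

// Placeholder tags, NOT real hashes of "account:<Name>".
const NEW_ACCOUNT_TAG: [u8; DISCRIMINATOR_LEN] = [1; DISCRIMINATOR_LEN];
const OTHER_TAG: [u8; DISCRIMINATOR_LEN] = [2; DISCRIMINATOR_LEN];

/// Splits off the leading tag and rejects data tagged for another type.
fn check_discriminator<'a>(
    data: &'a [u8],
    expected: &[u8; DISCRIMINATOR_LEN],
) -> Result<&'a [u8], &'static str> {
    if data.len() < DISCRIMINATOR_LEN {
        return Err("account data too short");
    }
    let (tag, payload) = data.split_at(DISCRIMINATOR_LEN);
    if tag != &expected[..] {
        return Err("discriminator mismatch");
    }
    Ok(payload)
}

fn main() {
    // Account bytes: tag first, then the u64 payload in little-endian.
    let mut data = NEW_ACCOUNT_TAG.to_vec();
    data.extend_from_slice(&42u64.to_le_bytes());
    assert_eq!(data.len(), NEW_ACCOUNT_SPACE);

    // Right type: the payload comes back and decodes to the stored value.
    let payload = check_discriminator(&data, &NEW_ACCOUNT_TAG).unwrap();
    let mut buf = [0u8; 8];
    buf.copy_from_slice(payload);
    assert_eq!(u64::from_le_bytes(buf), 42);

    // Wrong type: same length and layout, but the tag gives it away.
    assert!(check_discriminator(&data, &OTHER_TAG).is_err());
}
```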

Performance: Where the Caveats Live

I keep seeing benchmark posts on Reddit that put borsh "in the middle of the pack" — faster than JSON, slower than rkyv or bitcode. A community-maintained benchmark, tested on rustc 1.97.0-nightly, measures serialize/deserialize time, output size, zero-copy access, and compression.

A few headline numbers I find useful as anchors: bitcode reaches 138.84 µs on log-style data, while rkyv hits 1.2462 ns for zero-copy access. Borsh isn't the absolute fastest in either category. The official borsh.io claim is more measured — "faster than bincode in some cases" — and it's tested on standard cloud instances using Criterion against blockchain-relevant objects: blocks, headers, transactions.

I've stopped caring about which library wins synthetic benchmarks, honestly. For my use case, the question isn't "what's fastest in microseconds." It's "what does the runtime require, and what's deterministic enough to hash." Borsh wins both of those automatically. Asking whether rkyv is faster is like asking whether a Tesla is faster than a delivery van — true, but irrelevant if your job is delivering packages.

Side-by-Side: When to Reach for Each

After a few weeks of using both, I've boiled it down to a two-line rule that I now write at the top of every new file:

  • External world → serde. REST APIs, config files, logs I might want a human to read, anything that crosses my bot's boundary into a non-Rust system that's easier to debug as text.
  • On-chain world → borsh. Solana account data, instruction payloads, anything that needs to hash deterministically or live inside a compute-budget-constrained program.

The libraries don't compete; they cover different territories. In my bot's codebase, both serde and borsh show up in Cargo.toml, and the same struct sometimes derives both — Serialize + Deserialize for logging it as JSON, BorshSerialize + BorshDeserialize for sending it to the chain. That's not a hack, that's the intended pattern.

A quick checklist I now run mentally before adding a derive macro:

  • Does this data ever cross the boundary into an on-chain program (account data, instruction payloads)? → borsh.
  • Does a human ever need to read it without a debugger? → serde + JSON or TOML.
  • Will it be hashed or signed? → borsh, full stop.
  • Will it be sent over HTTP to a service I don't control? → serde.
  • Both? → derive both. They don't conflict.

What This Means for Building on Solana

The broader implication, looking past my own bot, is that the Rust serialization story for blockchain isn't a single tool — it's a layered stack. Serde handles the off-chain plumbing (loading config, talking to RPC providers via JSON-RPC, writing logs in structured formats). Borsh handles the on-chain reality (account layout, instruction encoding, hashing). Anchor sits on top of borsh and adds the discriminator + macro ergonomics that turn raw bytes into something type-safe.

If you're learning Solana development right now and feeling overwhelmed by all the crate names, the mental model that helped me was: serde is for everything humans touch, borsh is for everything the chain touches. Once that line is clear, the rest of the ecosystem starts making sense — Anchor's #[account] macro stops being magic, the try_from_slice calls scattered across Solana programs stop being mysterious, and the cross-language client SDKs (TypeScript, Python) suddenly fit into a coherent picture.

I still have plenty to learn. My bot is hitting cases where I'm hand-encoding instruction data because the program I'm interacting with isn't Anchor and doesn't ship an IDL, and that's a different rabbit hole. But the foundation feels solid for the first time. The two libraries no longer look like duplicated effort. They look like two specialized tools that happen to share an ergonomic style — a chef's knife and a paring knife sitting on the same magnetic strip.

Key Takeaways

  • serde is a format-agnostic framework, not a format itself, and it's the de-facto standard for any external data work in Rust — APIs, configs, logs.
  • borsh is a single binary format optimized for determinism, compactness, and hashing — purpose-built for blockchain account data.
  • The two libraries can — and often do — coexist on the same struct. Deriving Serialize, Deserialize, BorshSerialize, BorshDeserialize is a common, intentional pattern.
  • Much of Solana's ecosystem (Anchor, many hand-written Rust programs, multi-language clients) standardizes on borsh because hash-determinism and on-chain space efficiency are non-negotiable.
  • For new Solana developers: external world = serde, on-chain world = borsh. That single rule cuts through most of the early confusion.
