Prebuild Optimization: One Reorder, Roughly 14x Faster Restart

The restart that wouldn't get out of bed

My bot restarts more than I'd like. RPC connections die under load. Blockhashes go stale after a minute or so. A leader rotation makes the previous tip-routing assumptions wrong. I push a config tweak. Half the time, "restart" is just the cheapest way to reset everything to a known state.

Which means restart time matters in a way that, for most software, it really doesn't. A web server can take 30 seconds to come up; nobody notices. An MEV bot that's offline for 30 seconds has missed every opportunity in roughly 75 Solana slots: block time on Solana is around 400 milliseconds, per Solana's official documentation, and 30 seconds at 0.4 seconds per slot is 75 slots. It's not graceful degradation. It's a blackout.
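For anyone who wants the math spelled out, here is a minimal sketch. The 400 ms figure is the approximate slot time quoted above, and the function name is purely illustrative.

```python
SLOT_TIME_MS = 400  # approximate Solana slot time, as quoted above


def slots_missed(downtime_seconds, slot_time_ms=SLOT_TIME_MS):
    """Whole slots that elapse while the bot is offline."""
    return int(downtime_seconds * 1000 // slot_time_ms)


print(slots_missed(30))  # 75
print(slots_missed(2))   # 5
```

Even a two-second restart costs a handful of slots, which is why the rest of this post cares about seconds.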

So I started timing my own cold start. The numbers were embarrassing. Not embarrassing in absolute terms — restart was finishing in a perfectly reasonable amount of time, the kind of time a normal application would brag about. But every second the bot was waking up was a second it couldn't see the chain. And I had a database full of prebuilt transaction templates that were supposed to make this fast.

The punchline of this post is that I didn't speed up any individual operation. I didn't add a cache, optimize a query, parallelize a loop, or rewrite anything in a faster language. I changed the order in which the bot pulled things out of the database during warm-up. That single reorder pulled cold-start time down by roughly 14x in my own measurements. Same operations. Same data. Same total work. Just done in the right sequence.

What "prebuilt transaction template" actually means

If you're new to the series, here's the shape of the problem. A Solana arbitrage transaction isn't a small thing. Per the Solana Foundation's documentation on transactions, the on-the-wire size limit is 1,232 bytes — derived from the IPv6 minimum MTU of 1,280 bytes minus 48 bytes of network headers. Inside that envelope you have to fit signatures (64 bytes each, Ed25519), a header, account addresses, a recent blockhash, and compiled instructions. Legacy transactions cap at 32 accounts. Versioned transactions with Address Lookup Tables expand that to 64.

A real arbitrage hop touches a lot of accounts. Pool state accounts, token mints, vault accounts, the routing program, the user's token accounts on both sides of every leg. Without lookup tables, you don't fit. With lookup tables, every account address compresses from 32 bytes to a 1-byte index — a 32:1 compression ratio, again per the Solana Foundation's Address Lookup Tables guide. A single lookup table can hold up to 256 addresses. So the real currency of a fast bot isn't "a transaction" — it's a transaction plus a set of resolved lookup table accounts the runtime needs to decompress those indices.
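To make the size pressure concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above. It counts address bytes alone and deliberately ignores signatures, the header, the blockhash, instruction data, and whatever per-table overhead a lookup table reference adds, so treat the numbers as illustrative rather than as an exact wire-format calculation.

```python
IPV6_MIN_MTU = 1280
NETWORK_HEADER_BYTES = 48
MAX_TX_BYTES = IPV6_MIN_MTU - NETWORK_HEADER_BYTES  # 1232

PUBKEY_BYTES = 32        # a full account address
LOOKUP_INDEX_BYTES = 1   # an index into a lookup table


def address_bytes(n_accounts, via_lookup_table):
    """Bytes spent on account references alone (simplified model)."""
    per_account = LOOKUP_INDEX_BYTES if via_lookup_table else PUBKEY_BYTES
    return n_accounts * per_account


for n in (20, 40):
    inline = address_bytes(n, via_lookup_table=False)
    indexed = address_bytes(n, via_lookup_table=True)
    print(f"{n} accounts: inline={inline}B indexed={indexed}B "
          f"inline fits in {MAX_TX_BYTES}B: {inline <= MAX_TX_BYTES}")
```

At 40 accounts the inline addresses alone already exceed the whole packet in this simplified model, before a single signature or instruction byte is counted.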

A prebuilt template, in my world, is a structured object that bundles together: the compiled message body, the lookup table account references it depends on, the compute-budget instructions sitting at the front, the signer set, and a chunk of metadata that lets the bot turn the template into a wire-ready transaction in microseconds when an opportunity appears. The template is the result of work the bot doesn't want to redo every time a candidate trade pops up. Building it is expensive. Reusing it is cheap. That's the whole point.
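As a rough picture of that bundle, here is a hypothetical Python sketch. The class name, field names, and types are all assumptions made for illustration; they mirror the list above and are not the bot's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class PrebuiltTemplate:
    """Illustrative shape only; every field name here is hypothetical."""
    template_id: str
    message_body: bytes          # compiled message body
    lookup_table_keys: tuple     # lookup table accounts the message depends on
    compute_budget_ixs: tuple    # compute-budget instructions at the front
    signer_keys: tuple           # the signer set
    metadata: dict = field(default_factory=dict)

    def dependencies(self):
        """What must already be resident before this template is usable."""
        return self.lookup_table_keys


example = PrebuiltTemplate(
    template_id="tmpl-1",
    message_body=b"\x00" * 8,
    lookup_table_keys=("alt-A", "alt-B"),
    compute_budget_ixs=(b"\x02", b"\x03"),
    signer_keys=("payer",),
)
print(example.dependencies())
```

The point of giving the object an explicit dependencies() accessor is that the restore code can ask a template what it needs without walking the message body.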

And because building it is expensive, the bot persists templates to a local database. On restart, instead of rebuilding from scratch, it rehydrates. That's where the order question lives.

A pipeline that already taught me this lesson

Before I get to my mistake, look at the system the templates are designed for. Solana's runtime processes a transaction through an eight-stage pipeline documented by the Solana Foundation: receive and deserialize, signature verification, sanitize, compute-budget and age and status checks, nonce and fee-payer validation, account loading, instruction execution, and finally commit or rollback.

The stages aren't a suggestion. The Foundation's documentation states the dependencies plainly. Fee-payer validation must precede account loading. Compute-budget parsing must precede fee calculation. Nonce validation must precede blockhash-age checks. Account loading must precede instruction execution. Instruction execution must precede post-condition verification. There is no version of the pipeline where you load accounts first and validate the fee payer afterward. Try it and the runtime won't politely reorder your inputs — it will reject the transaction or charge you for the failure. The base fee is 5,000 lamports per signature, and the docs are clear that fees are charged even when transactions fail.
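Those five precedence rules can be written down as data and checked mechanically. The sketch below is only an illustration of the stated dependencies using Python's standard graphlib module; the stage names are my own shorthand, and this is not how the runtime itself is implemented.

```python
from graphlib import TopologicalSorter

# stage -> stages that must run before it, taken from the rules above
MUST_COME_FIRST = {
    "account_loading": {"fee_payer_validation"},
    "fee_calculation": {"compute_budget_parsing"},
    "blockhash_age_check": {"nonce_validation"},
    "instruction_execution": {"account_loading"},
    "post_condition_verification": {"instruction_execution"},
}


def respects_dependencies(order):
    """True if every prerequisite appears before the stage that needs it."""
    position = {stage: i for i, stage in enumerate(order)}
    for stage, prerequisites in MUST_COME_FIRST.items():
        for pre in prerequisites:
            if pre not in position or stage not in position:
                return False
            if position[pre] > position[stage]:
                return False
    return True


valid_order = list(TopologicalSorter(MUST_COME_FIRST).static_order())

# Swap two dependent stages to show that a reordered pipeline is rejected.
broken = list(valid_order)
i = broken.index("fee_payer_validation")
j = broken.index("account_loading")
broken[i], broken[j] = broken[j], broken[i]

print(respects_dependencies(valid_order))  # True
print(respects_dependencies(broken))       # False
```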

This is a system that has internalized the idea that order is part of the algorithm. The runtime doesn't think of "validate fee payer" and "load accounts" as two interchangeable tasks scheduled by a queue. It thinks of them as two links in a chain, where reversing the order isn't slow — it's wrong.

The lesson, which took me a humiliating amount of time to internalize, is that if my bot is restoring templates that are designed to feed this pipeline, my restoration order has to mirror the dependencies the pipeline assumes. Otherwise I'm reconstructing the inputs in a sequence the consumers can't actually use.

What I had been doing

My original restore loop was, in retrospect, a piece of code that had been written for a programmer's convenience rather than the runtime's. The first thing it did was pull the highest-level objects out of the database — the compiled message bodies, the ones that look most like "a transaction." Each one of those bodies referenced a set of lookup tables by their public keys. The loop walked the message, hit a reference to a lookup table account it didn't have yet, paused, fetched the lookup table data (sometimes from local storage, sometimes from the network), parsed it, and then continued.

It was, structurally, a depth-first restore. Every parent walked, every dependency resolved on demand. It looked clean. It was readable. It was also pessimal in exactly the way that depth-first cache traversal is always pessimal: every parent triggers a sub-walk, every sub-walk does a fetch, every fetch does a parse, every parse re-touches structures that an earlier parent already touched.

Messages share lookup tables. Many templates reference the same handful of frequently used tables. In a depth-first restore, the first message resolves a lookup table from disk. The second message references the same table, but because the resolution path lives on the message, the cache layer above it doesn't reliably intercept the repeat lookup, particularly when resolution goes through a network client to confirm the table's current state or re-deserializes a multi-hundred-byte address list. The result was a lot of redundant work that showed up plainly in measurements and was invisible to anyone just reading the code.

The bot did this for every template. The larger the database, the longer the restart. The more popular a lookup table, the more times it got rehydrated. None of this was incorrect. The bot eventually came up. It just came up in the worst possible order.
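A toy model of that loop shows the shape of the problem. This is a simplified sketch, not the original code: the dictionaries stand in for database rows, the keys are made up, and the "fetch and parse" step is reduced to a counter so the repeated work is visible.

```python
from collections import Counter

# Hypothetical stand-ins for rows on disk.
TABLE_ROWS = {"alt-A": ["acct1", "acct2"], "alt-B": ["acct3"]}
TEMPLATE_ROWS = {
    "tmpl-1": ["alt-A", "alt-B"],
    "tmpl-2": ["alt-A"],
    "tmpl-3": ["alt-A", "alt-B"],
}


def restore_depth_first(template_rows, table_rows):
    """Walk each template and resolve its tables on demand, with no sharing."""
    parse_counts = Counter()
    restored = {}
    for template_id, table_keys in template_rows.items():
        resolved = {}
        for key in table_keys:
            parse_counts[key] += 1               # one fetch + parse per reference
            resolved[key] = list(table_rows[key])
        restored[template_id] = resolved
    return restored, parse_counts


_, counts = restore_depth_first(TEMPLATE_ROWS, TABLE_ROWS)
print(dict(counts))  # {'alt-A': 3, 'alt-B': 2}
```

Two tables, five parses. The most widely shared table is the one that gets rehydrated most often.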

What the reorder looks like

The fix isn't subtle once you see it. Restore the dependencies first, in their own pass. Then restore the things that depend on them, in a second pass.

In concrete terms: the first pass walks the database and pulls out everything that looks like a leaf node — the lookup table accounts, the static account metadata, the compute-budget pieces that don't depend on anything else. These get restored into in-memory structures with no resolution work to do, because there's nothing for them to point at that hasn't been loaded. The second pass walks the templates that point at those leaves. Every reference resolves immediately, against an in-memory map, with no fallback to disk or to the network.

This is the same total work. Same number of database rows read. Same number of objects allocated. Same final state in memory. The difference is that in the depth-first version, every parent-child traversal is a potential cache miss, a potential re-parse, a potential round-trip. In the dependency-ordered version, the first pass is a tight scan of small objects with high locality, and the second pass is a tight scan of larger objects whose every pointer hits something already resident. The cache is warm before it's asked to do anything interesting.
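The two-pass version can be sketched in a few lines. As before, this is an illustration under assumptions: the dictionaries are hypothetical stand-ins for database rows, and parsing is reduced to a list copy plus a counter.

```python
# Hypothetical stand-ins for rows on disk.
TABLE_ROWS = {"alt-A": ["acct1", "acct2"], "alt-B": ["acct3"]}
TEMPLATE_ROWS = {
    "tmpl-1": ["alt-A", "alt-B"],
    "tmpl-2": ["alt-A"],
    "tmpl-3": ["alt-A", "alt-B"],
}


def restore_two_pass(template_rows, table_rows):
    """Pass 1 loads every leaf once; pass 2 resolves templates from memory."""
    parses = 0

    # Pass 1: leaves have nothing to point at, so there is nothing to resolve.
    tables = {}
    for key, addresses in table_rows.items():
        tables[key] = list(addresses)
        parses += 1

    # Pass 2: every reference is a dictionary lookup. A KeyError here means
    # a dependency was never loaded, which is a bug worth failing loudly on.
    templates = {}
    for template_id, table_keys in template_rows.items():
        templates[template_id] = {key: tables[key] for key in table_keys}

    return templates, parses


templates, parses = restore_two_pass(TEMPLATE_ROWS, TABLE_ROWS)
print(parses)  # 2
```

Each table is parsed exactly once, and templates that share a table end up pointing at the same in-memory object rather than at separate copies.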

In my own measurements, the difference came out at roughly 14x. I'd rather not stake my reputation on that exact ratio holding for everyone — the absolute number is going to depend on database size, hardware, and how many lookup tables are shared across templates — but the shape of the speedup is robust and unsurprising in hindsight. It's the same shape every dependency-ordered initialization produces.
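If you want to check the ratio on your own setup, a small harness along these lines is enough to get started. It is a generic sketch, not the measurement code behind the number above, and the two workloads are placeholders you would replace with your own before and after restore functions.

```python
import time


def time_call(fn, repeats=5):
    """Best-of-N wall-clock time for fn(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(before_seconds, after_seconds):
    """How many times faster the 'after' run is."""
    return before_seconds / after_seconds


# Placeholder workloads; swap in the real restore paths.
def restore_before():
    return sum(range(200_000))


def restore_after():
    return sum(range(20_000))


ratio = speedup(time_call(restore_before), time_call(restore_after))
print(f"speedup: {ratio:.1f}x")
```

One caveat: repeating a call inside the same process warms caches, so for a true cold-start number each run probably wants a fresh process.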

This is not a Solana insight

The humbling thing about debugging this was realizing that approximately every mature framework has solved exactly this problem already, and I had walked past their solutions for years.

Spring Boot, the workhorse of an enormous swath of American enterprise Java, has @DependsOn, @AutoConfigureOrder, @AutoConfigureAfter, @AutoConfigureBefore — an entire vocabulary for declaring "this bean needs that bean to exist first." Per a documentation source on Spring Boot configuration order, when configurations depend on each other, controlling the order of configuration becomes crucial. That's not just a performance hint. It's a correctness hint that doubles as a performance hint, because if you let beans initialize in an arbitrary order, the framework is going to do a lot of redundant lazy resolution.

Google's Android team built App Startup on the same observation. The library exists because phone apps were lighting up dozens of independent initializers in arbitrary order, paying the cost of every one of them serially before the user saw a screen. App Startup's dependencies() method is, almost word for word, the same idea I had to learn the hard way for my bot: declare the order, then initialize the leaves, then the branches, then the trunk. Same total work. Faster perceived launch. Google's own documentation describes it as a more performant way to initialize components at app startup and explicitly define their dependencies.

Go a step further. The ACM Programming Languages paper on initialize-once startup describes GraalVM Native Image, where Java initialization code can run at build time and the runtime starts with a pre-populated heap. The reported speedup is up to two orders of magnitude — roughly 100x — versus the Java HotSpot VM. The algorithm doesn't change. The data structures don't change. The work moves from runtime to build time, and the runtime starts with state already in the right shape. Order — when something happens — is the optimization.

In high-frequency trading, the published research is even more explicit. A peer-reviewed Springer Nature paper on reordering transaction execution to boost HFT applications describes a technique called PARE (Pipeline-Aware Reordered Execution) that improves application performance by rearranging statements in order of their degrees of contention, with a separate algorithm preserving serializability. Same statements. Same outputs. Different order. Measurable speedup.

If you've ever written a SQL query and learned that filtering before sorting beats sorting before filtering, you've already used the principle. The optimizer reorders your operators behind your back to get there. When you control the loop yourself, nobody reorders for you.
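The filter-versus-sort point is easy to demonstrate outside a database. The sketch below is a plain Python analogy, not a model of any real query planner: both pipelines return the same rows, and the counter shows how many rows each one had to push through the sort.

```python
def filter_then_sort(rows, keep, key):
    return sorted((r for r in rows if keep(r)), key=key)


def sort_then_filter(rows, keep, key):
    return [r for r in sorted(rows, key=key) if keep(r)]


def run_counting_sort_keys(pipeline, rows, keep):
    """Run a pipeline and count how many rows the sort had to look at."""
    calls = 0

    def key(row):
        nonlocal calls
        calls += 1
        return row

    return pipeline(rows, keep, key), calls


rows = list(range(1000, 0, -1))


def keep(r):
    return r % 100 == 0


fast_result, fast_calls = run_counting_sort_keys(filter_then_sort, rows, keep)
slow_result, slow_calls = run_counting_sort_keys(sort_then_filter, rows, keep)

print(fast_result == slow_result)  # True
print(fast_calls, slow_calls)      # 10 1000
```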

A few American analogies that helped me see it

When I'm trying to convince myself that a piece of insight is real and not a coincidence, I look for the same shape in places that have nothing to do with code.

Thanksgiving dinner is the cleanest one. Anybody who has cooked for a crowd knows the difference between a stressed cook and a calm one is almost entirely about ordering. The turkey takes hours, so it goes in first. The mashed potatoes can hold under foil, so they finish in the middle. The gravy needs the turkey drippings, so it comes after the bird. The biscuits are quick and want to hit the table hot, so they go in last. The recipe is identical for both cooks. The reason one of them is on the porch with a glass of wine at 5:45 and the other is sweating at 7:15 is the order they decided to start things.

A NASCAR pit stop is a more aggressive version of the same idea. Four tires changed, fuel added, and the driver back on track in seconds. Nobody is moving faster than the laws of physics; everybody is moving in the right order. The fueler doesn't wait for the rear tires. The jack is on the car before the gun is on the lug nut. The rehearsal is the optimization. The same crew, doing the same work in a less-practiced sequence, would lose half the field by the time the car re-entered the track.

An assembly line, going back to the early-twentieth-century Detroit innovation that made cars affordable in America, is the industrial-scale version of this. Nobody on a Ford line in 1913 was a faster machinist than a craftsman in a wagon shop. Everybody was a slower machinist. The reason a Model T came off the line in a fraction of the time the wagon shop took is that the order was fixed and every station did one thing while the previous station's output was already where it needed to be.

None of these examples involved "working harder" or "faster tools." They involved a person looking at the workflow and asking which step had to come before which other step, and then refusing to start anything out of order.

Why the order question hides on Solana specifically

There's a reason this lesson was easy for me to miss. Solana sells itself, fairly, on parallelism. The Sealevel runtime executes transactions across multiple cores when their account-access patterns don't conflict. Solana documentation and community materials emphasize independence — independent transactions, independent accounts, independent execution. If your mental model of the chain is "things happen in parallel," you can convince yourself that order doesn't matter very much.

Order does matter within a transaction. Per the transaction pipeline documentation, instructions execute sequentially in the order they appear in the message. The maximum instruction trace length is 64. Compute-budget instructions, by convention, sit at the start of the message because the runtime needs to know your compute limit before it starts executing. Versioned transactions and Address Lookup Tables similarly impose ordering: for instance, lookup tables modified earlier in the same bundle cannot be used later in that bundle, a constraint the Address Lookup Tables guide makes explicit. None of this is parallel. All of it is a strict pipeline.

My mistake was assuming that because the chain runs concurrently at the macro level, my bot could rehydrate its state concurrently at the micro level. The chain isn't concurrent at the micro level. Within a single transaction's lifetime, every step has a predecessor, and skipping the predecessor is a guaranteed re-do. My bot's prebuilt templates are objects designed to feed that strict pipeline. Restoring them in the wrong order is asking the in-memory structures to behave like the pipeline doesn't exist. The pipeline always wins.

The mental shift: software is a checklist

The single most useful reframing I picked up from this debugging session was this: a lot of high-performance software is a checklist that has to be executed in order, and the runtime is the auditor that catches you when you skip a box.

Solana's transaction pipeline is a checklist. The Address Lookup Tables system is a checklist (extend before reference, freeze before use, don't modify and consume in the same atomic group). Bundle submission protocols generally require that transactions be fully signed before they're submitted. Spring Boot's @DependsOn is a checklist annotation. App Startup's dependencies() is a checklist API. SQL query planning is a checklist your database optimizer runs for you because if it didn't, you'd write the steps in the wrong order.

My bot's database restore is a checklist too. I just hadn't been treating it like one. I had been treating it like a loop. A loop that walks parent objects and resolves their children on demand is not a checklist; it's a tree traversal that pretends the parent and the child are the same level of priority. They aren't. The child is the precondition. The parent only makes sense once the child is resident.

Once you start seeing initialization paths as checklists, the optimizations write themselves. You don't ask "how do I make this loop faster." You ask "what is the minimum set of things that must exist before this loop's body can run without re-fetching, and have I already loaded all of them?" If the answer is no, you've found your speedup.
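That question can be turned into a literal preflight check. The following is a hypothetical sketch with made-up names: it computes which prerequisites are not yet resident and refuses to run the loop until the list is empty.

```python
def missing_preconditions(templates, resident_tables):
    """Lookup table keys that templates need but that are not loaded yet."""
    required = {key for keys in templates.values() for key in keys}
    return sorted(required - set(resident_tables))


def hydrate_all(templates, resident_tables):
    """Resolve every template from memory, or fail before doing any work."""
    missing = missing_preconditions(templates, resident_tables)
    if missing:
        raise RuntimeError(f"load these before hydrating templates: {missing}")
    return {
        template_id: [resident_tables[key] for key in keys]
        for template_id, keys in templates.items()
    }


templates = {"tmpl-1": ["alt-A", "alt-B"], "tmpl-2": ["alt-A"]}

print(missing_preconditions(templates, {"alt-A": ["acct1"]}))  # ['alt-B']
```

A non-empty list is the checklist telling you which box was skipped. An empty one means the loop body can run without ever falling back to disk or the network.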

What this changes for me going forward

I'm not pretending this fix has solved restart. The bot still has plenty of cold-start cost I haven't touched. Network handshakes. Subscription warm-up. Whatever the JIT is doing with my hot paths during the first few thousand cycles. There's a long tail of non-trivial work that no amount of clever ordering is going to make disappear, and I'm not going to chase the next factor of two by reorganizing more loops.

But the broader stance I'm taking from this incident is to look at every restart-time path through the lens of "what's the dependency order, and am I respecting it." Whenever I add a new prebuilt artifact to the database, the first design question is now what it depends on, not what it produces. The schema is starting to grow weakly typed dependency hints — not a full DAG, not yet, but enough that the restore order becomes obvious from the data rather than something I have to remember while staring at the code.
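Here is one way dependency hints could drive the restore order, sketched with Python's standard graphlib. The artifact kinds and the hint table are assumptions for illustration, not the actual schema, and this goes a step further toward a DAG than the hints described above.

```python
from graphlib import TopologicalSorter

# artifact kind -> kinds it depends on (hypothetical hint table)
DEPENDS_ON = {
    "lookup_table": [],
    "account_metadata": [],
    "compute_budget": [],
    "template": ["lookup_table", "account_metadata", "compute_budget"],
}


def restore_passes(hints):
    """Group artifact kinds into passes; each pass only needs earlier passes."""
    sorter = TopologicalSorter(hints)
    sorter.prepare()  # raises graphlib.CycleError if the hints are circular
    passes = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        passes.append(ready)
        sorter.done(*ready)
    return passes


for number, kinds in enumerate(restore_passes(DEPENDS_ON), start=1):
    print(number, kinds)
```

With these hints the output is two passes, leaves first and templates second, and adding a new artifact kind means adding a row of hints rather than editing the restore loop.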

The other shift is more honest. I had assumed, going in, that any meaningful speedup from this point in the project would come from algorithmic work — better routing, smarter cycle detection, faster math. It turned out the cheapest, biggest single speedup I'd hit in weeks was a structural one I could have made on day one if I'd been thinking about dependency order. That's a humbling thing to learn about your own taste in optimizations. It also suggests that the next big win is probably a similar shape, sitting somewhere I haven't bothered to look because I assumed the obvious shape of the work was already correct.

I'm not going to predict where it'll be. I've already been wrong about that today.

Key Takeaways

A Solana MEV bot's restart cost is dominated by warm-up, not by steady-state work — and warm-up is shaped almost entirely by the order in which prebuilt state is restored from the database, not by any individual operation's speed.
Solana's transaction pipeline itself is documented as a strict ordering of stages with hard predecessor relationships; restoration code that ignores those dependencies forces redundant work even when the data on disk is fully prebuilt.
The pattern is universal: Spring Boot's @DependsOn annotations, Android App Startup's dependencies() API, GraalVM Native Image's build-time initialization, and PARE-style transaction reordering in HFT all express the same principle that order is part of the algorithm.
Useful reframing: software initialization is a checklist, not a loop. Restore leaves before trunks, dependencies before consumers, and the cache stays warm by the time anything interesting touches it.
The biggest single speedup in my own restart path came from changing nothing about the work and everything about the sequence. Roughly 14x in my own measurements — same data, same operations, different order.
