Compute Unit Optimization: Use Less, Hit Harder

The Lever I Had Been Ignoring

I have been running my arbitrage bot for weeks now and the same pattern keeps showing up in the logs. I find what looks like a clean opportunity, I sign the transaction, I send it with what I think is a competitive tip, and I still land behind somebody else who paid roughly the same. For a long time my reflex was the obvious one — pay more. Tip harder. Out-bid the room.

This week I finally sat down and stared at the priority math, and I realized I have been thinking about this like a guy who tries to win a NASCAR race by buying a bigger fuel tank. The bigger tank is fine. But the cars that actually win are the ones that are lighter. On Solana, the equivalent of a lighter car is a transaction that asks for fewer Compute Units. So that is what I am working on now: spending the same money, but asking the network for less compute, and seeing whether the scheduler rewards me for it.

Compute Units, In Plain English

A Compute Unit (CU) is Solana's metering currency for work. Every instruction your transaction executes (a token transfer, a signature verification, a cross-program invocation, a log statement) costs some amount of CU. The runtime tallies it up. If you exceed the limit you declared, the transaction fails. If you stay under, your priority fee is still charged on whatever you declared. Think of it less like a gas tank and more like a reservation at a restaurant: you reserve a table for eight, and the restaurant charges you for eight whether you show up with four people or eight.

The baseline numbers, per the Solana Foundation's compute budget docs:

Default per-instruction allocation: 200,000 CU
Maximum per-transaction limit: 1,400,000 CU
Maximum per-block: 60,000,000 CU
Maximum write-locked per account per block: 12,000,000 CU

That last number is the one that quietly punishes hot pools. If a popular AMM account is being hammered, the block scheduler can only spend 12M CU writing to it per block. Everyone trying to touch the same pool is competing for the same shared budget. Lighter transactions fit into that budget more times.
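To put a rough number on "fit into that budget more times", here is a minimal sketch. It assumes, as a simplification, that each transaction is charged against the per-account budget at a single CU figure and that no other block limit applies; the function name is made up for illustration.

```python
# Per-account write-lock budget per block, from the list above.
MAX_WRITABLE_ACCOUNT_CU = 12_000_000


def max_txs_per_hot_account(cu_per_tx: int) -> int:
    """Upper bound on how many transactions of a given CU size can write
    to one account in a single block (simplified: ignores other limits)."""
    if cu_per_tx <= 0:
        raise ValueError("cu_per_tx must be positive")
    return MAX_WRITABLE_ACCOUNT_CU // cu_per_tx


heavy = max_txs_per_hot_account(300_000)  # 40 transactions
light = max_txs_per_hot_account(100_000)  # 120 transactions
print(heavy, light)
```

Under these assumptions, a transaction a third of the size gets three times as many chances to fit against the same hot pool.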

The other piece of context that took me a while to internalize: the priority fee is calculated from the CU limit you declare, not the CU you actually consume. The official compute budget documentation states this directly — "The priority fee is determined by the requested compute unit limit on the transaction, not the actual number of compute units used." If I declare 1,000,000 CU and use 80,000, I pay for the million. The slack is wasted twice: once in my fee bill, and once in the scheduler's view of how expensive my transaction is.

The Scheduler Math That Changed My Mind

The formula that did the actual convincing is in the Solana fee structure docs:

Priority = reward * 1,000,000 / (cost + 1)

reward is roughly the priority fee I'm paying. cost is roughly the CU I'm requesting. The scheduler sorts transactions by this number and processes the high-priority ones first.

Look at what this means. Imagine I'm paying the same priority fee as a competitor — same reward in the numerator. If their cost is twice mine, my Priority value is roughly twice theirs. I jump them without spending another lamport. The lever isn't "pay more." The lever is "weigh less for the same payment."
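The comparison can be checked with a few lines of Python. This is only a sketch of the formula as quoted above: the integer division and the example numbers (a 10,000-lamport reward, 200,000 versus 400,000 CU) are my own assumptions, not values from the docs.

```python
def priority(reward_lamports: int, cost_cu: int) -> int:
    """Priority = reward * 1,000,000 / (cost + 1), as quoted above.
    Integer division is an assumption made for this sketch."""
    return reward_lamports * 1_000_000 // (cost_cu + 1)


# Same reward, but the competitor requests twice the compute.
mine = priority(reward_lamports=10_000, cost_cu=200_000)
theirs = priority(reward_lamports=10_000, cost_cu=400_000)
ratio = mine / theirs

print(mine, theirs, round(ratio, 2))
```

With identical rewards, the ratio comes out at almost exactly 2: halving the cost doubles the score.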

Anza's developer relations team made the same point in a July 2025 post: "higher transaction costs reduce priority for identical fees." That sentence is dry, but it's basically the entire thesis of CU optimization. Two transactions, same tip, the lighter one wins.

The priority fee math itself is straightforward, also per the fee structure docs:

prioritization_fee = ceil(compute_unit_price * compute_unit_limit / 1,000,000)

So if I cut my CU limit in half and double my unit price, I pay the same total fee, but my effective "price per CU" is now twice as high, and by the scheduler formula my Priority value roughly doubles. Same money, sharper elbows.
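A small sketch of the fee formula makes the trade concrete. The unit prices and limits below are illustrative numbers I picked, not recommendations.

```python
def prioritization_fee(cu_price_micro_lamports: int, cu_limit: int) -> int:
    """ceil(compute_unit_price * compute_unit_limit / 1,000,000), in lamports.
    Integer ceiling division avoids float rounding."""
    return (cu_price_micro_lamports * cu_limit + 999_999) // 1_000_000


# Fat limit at a low unit price versus half the limit at double the price.
before = prioritization_fee(cu_price_micro_lamports=50_000, cu_limit=400_000)
after = prioritization_fee(cu_price_micro_lamports=100_000, cu_limit=200_000)

print(before, after)  # both are 20,000 lamports
```

The total fee is identical in both cases; only the requested compute, and therefore the denominator the scheduler sees, has changed.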

What Actually Costs CU

Before I could optimize anything I had to figure out what was eating my budget. Solana's documentation and the official cu_optimizations repo on GitHub publish benchmark numbers, and the gap between "naive code" and "thoughtful code" turns out to be enormous.

A short list of the published before/after numbers, drawn from the official optimization guide:

Logging a pubkey as a Base58 string: 11,962 CU. Logging it via .key().log(): 262 CU. That is roughly a 97% reduction for a single log line. If your program logs three pubkeys for debugging, you might be burning 35,000 CU on stuff nobody reads in production.
Deriving a PDA via find_program_address: 12,136 CU. Calling create_program_address with the bump cached on the account: 1,651 CU. Roughly an 86% reduction. The first version makes the runtime brute-force search for the canonical bump; the second one just verifies the answer you handed it.
Borsh deserialization (full account): 2,600 CU. Zero-copy deserialization: 1,254 CU. About a 52% reduction. Zero-copy reads bytes in place instead of copying them into a fresh struct.
A six-element Vec<u64>: 357 CU. The same data as Vec<u8>: 211 CU. Smaller types do less work even when the values would fit in either.
System program transfer via CPI: 2,215 CU. A direct lamport mutation: 251 CU. A roughly 89% reduction, though this one comes with caveats — you'd only do it inside a program you own, where you control the accounts being mutated.
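The percentages quoted in the list can be recomputed from the before/after figures. The sketch below only does arithmetic on the numbers already given; it does not measure anything on-chain.

```python
# (label, CU before, CU after), copied from the list above.
BENCHMARKS = [
    ("pubkey log: Base58 string -> .key().log()", 11_962, 262),
    ("PDA: find_program_address -> create_program_address", 12_136, 1_651),
    ("deserialization: Borsh -> zero-copy", 2_600, 1_254),
    ("six-element Vec<u64> -> Vec<u8>", 357, 211),
    ("transfer: System program CPI -> direct lamport mutation", 2_215, 251),
]


def reduction_pct(before: int, after: int) -> float:
    """Percentage of CU saved going from `before` to `after`."""
    return (before - after) / before * 100


reductions = [round(reduction_pct(b, a), 1) for _, b, a in BENCHMARKS]
for (label, _, _), pct in zip(BENCHMARKS, reductions):
    print(f"{pct:5.1f}%  {label}")
```

The Vec<u64> to Vec<u8> change, which the list leaves without a percentage, works out to about 41%.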

The headline savings on logging are almost embarrassing. Most production Solana programs have at least one or two debug log statements that nobody pruned. That's free money sitting on the table. If I removed three Base58 pubkey logs from a hot path, I'd save roughly 35,000 CU per call, which is more than the entire compute budget of many simple transactions.

The other surprise was framework overhead. Dean Little's Accelerate 2025 talk on writing optimized programs reported a benchmark where the same logical operation cost 649 CU under Anchor versus 109 CU under the Pinocchio framework — an 83% reduction. The same talk reported program size dropping from 76 kilobytes to under 1 kilobyte. I am not about to rewrite my program in raw assembly, but the lesson stuck: convenience layers cost compute, and the cost compounds across every instruction the runtime processes.

Dean Little also said something in that talk that I keep coming back to: "Optimization is actually not always capital efficient." Translated: the time you spend shaving 50 CU off a function might earn you less than spending that same time on a feature that actually finds more opportunities. Optimize the things that matter. Ignore the rest.

The Hidden Overheads I Didn't Know About

The Anza post from July 2025 broke transaction cost into five components, and a few of them are easy to miss:

Execution CUs: the part everybody thinks about.
Loaded account data cost: 8 CU per 32 KB of loaded account data. The default loaded-data limit is 64 MB, which works out to roughly 16,000 CU of overhead before your code even runs.
Write lock cost: 300 CU per write-locked account.
Signature cost: 720 CU per signature.
Data bytes cost: 1 CU per 4 bytes of instruction data.

The loaded account data line is the one that surprised me. If my transaction doesn't actually need 64 MB of headroom (and most don't), I can call setLoadedAccountsDataSizeLimit with something much smaller and recover that overhead. Anza's post calls this out explicitly as one of the easy wins. It is the Solana equivalent of canceling a gym membership you don't use.
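Here is a rough sketch of how much that overhead shrinks. It uses the 8 CU per 32 KB rate from the list above; rounding partial chunks up and the 2 MB example limit are my own assumptions.

```python
CU_PER_CHUNK = 8          # 8 CU per 32 KB chunk, from the list above
CHUNK_BYTES = 32 * 1024


def loaded_data_cost(limit_bytes: int) -> int:
    """CU charged for a given loaded-accounts data size limit.
    Partial chunks are rounded up (an assumption in this sketch)."""
    chunks = (limit_bytes + CHUNK_BYTES - 1) // CHUNK_BYTES
    return chunks * CU_PER_CHUNK


default_cost = loaded_data_cost(64 * 1024 * 1024)  # 64 MB default
tight_cost = loaded_data_cost(2 * 1024 * 1024)     # hypothetical 2 MB limit

print(default_cost, tight_cost, default_cost - tight_cost)
```

At the default the cost is 16,384 CU; at a 2 MB limit it drops to 512 CU, a saving of 15,872 CU before any program code is touched.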

The signature and write-lock costs are smaller per item but they multiply. A transaction with three signers and seven writable accounts is paying 2,160 CU in signature cost and another 2,100 CU in write locks before any of my logic runs. None of that is logic I can optimize — those are fixed costs of the shape of my transaction. The only knob I have is account count: fewer signers, fewer writables, fewer fixed costs.
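The fixed costs of a transaction's shape are simple enough to compute up front. A minimal sketch using the per-item rates quoted above; the slimmer one-signer, four-writable shape is a hypothetical comparison.

```python
SIGNATURE_CU = 720    # per signature
WRITE_LOCK_CU = 300   # per write-locked account


def fixed_shape_cost(num_signatures: int, num_writable_accounts: int) -> int:
    """CU charged for signatures and write locks, before any logic runs."""
    return num_signatures * SIGNATURE_CU + num_writable_accounts * WRITE_LOCK_CU


current = fixed_shape_cost(num_signatures=3, num_writable_accounts=7)
slimmer = fixed_shape_cost(num_signatures=1, num_writable_accounts=4)

print(current, slimmer)
```

The three-signer, seven-writable shape costs 4,260 CU; the slimmer one costs 1,920 CU.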

The Simulate-Measure-Set Loop

Knowing what costs CU is half the battle. The other half is actually measuring my transaction and asking for the right amount. The Solana Foundation's guide on requesting optimal compute budget lays out the workflow:

1. Build the transaction with the CU limit cranked up to 1,400,000, high enough that simulation can't fail for capacity reasons.
2. Call simulateTransaction() and read unitsConsumed from the response.
3. Multiply by 1.1 to add a 10% safety buffer.
4. Prepend two Compute Budget Program instructions at position 0 of the real transaction: SetComputeUnitLimit(measured * 1.1) and SetComputeUnitPrice(microLamports).
5. Send.
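The measure-and-buffer part of the workflow above reduces to a small calculation. The sketch below leaves out the RPC call entirely: units_consumed stands in for the unitsConsumed value read from the simulation response, and the helper name and the cap at the per-transaction maximum are my own assumptions.

```python
MAX_TX_CU = 1_400_000  # per-transaction ceiling


def buffered_cu_limit(units_consumed: int, buffer_pct: int = 10) -> int:
    """Simulated consumption plus a percentage buffer, rounded up and
    capped at the per-transaction maximum."""
    if units_consumed < 0:
        raise ValueError("units_consumed must be non-negative")
    padded = (units_consumed * (100 + buffer_pct) + 99) // 100  # ceiling
    return min(padded, MAX_TX_CU)


print(buffered_cu_limit(80_000))     # 88000
print(buffered_cu_limit(83_457))     # 91803
print(buffered_cu_limit(1_350_000))  # capped at 1400000
```

The result is the value that would go into SetComputeUnitLimit on the real transaction.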

This is the move I had been skipping. My early bot just slammed every transaction with a fat fixed CU limit and trusted that I'd never overshoot. That's safe but stupid — every wasted CU above what I actually use is wasted scheduler priority. Simulating before sending costs me one extra RPC round trip, and in return I get a tight CU budget that gives the scheduler a much better story about how cheap my transaction is.

The 10% buffer is worth pausing on. Why 10% and not zero? Because the simulation is run against the current state of the network and your transaction will execute slightly later — by which point the accounts you read might have changed, your branches might have taken slightly different paths, and your actual consumption might creep up a few percent. Ten percent is the conventional cushion. Several infrastructure providers reference the same multiplier in their documentation. If I undershoot, I don't just pay less — my transaction fails entirely and I eat the base fee. Five thousand lamports per signature, charged whether the transaction succeeds or fails, per the fee structure docs.

There is one subtlety in step 4 that I almost missed: the two Compute Budget instructions are not regular instructions that do work in sequence; they are configuration the runtime reads before it executes anything else. The guide places them at the very start of the transaction, and I treat that as the rule rather than a suggestion. Position 0 and position 1, before anything else, every time.

What the Validator Data Says

One thing that kept me honest while reading all of this was a June 2025 analysis from Rated on whether CU consumption actually moves the needle for validators. Its headline finding is that CU consumption explains only about 2% of the variance in validator fee rewards. Two percent. The other 98% comes from order flow, network conditions, stake weight, and a dozen other things outside any single transaction's control.

That number could be read as "CU optimization doesn't matter much." I read it the other way. Most validators are already running their blocks at 78–82% CU utilization, clustered tight against the practical cap. The block is full. What that means for a transaction submitter is that the marginal seat at the table is genuinely contested — every transaction the scheduler sorts into the block is bumping another transaction out of the block. In that environment, anything that improves my scheduler rank for the same fee is a real edge. Two percent of validator-side variance might still be a much larger share of the variance in whether my specific transaction lands.

Rated's analysis also noted that demand spikes during U.S. Eastern working hours, roughly 8 a.m. to 6 p.m. That maps onto my own experience: the windows when I find the most opportunities are also the windows when blocks are most crowded. Optimization matters most exactly when it's hardest to land, which is exactly when I want it to matter most.

What I Am Actually Going To Change

If I had to rank the things I'm planning to do in order of expected payoff per hour of work, it would look something like this:

Tier 1 — embarrassingly cheap wins.

Walk every code path in my hot transactions and rip out every Base58 pubkey log. If I need to log a pubkey for debugging, use the syscall-friendly approach that costs a few hundred CU instead of twelve thousand. While I'm in there, look for any signature or write lock that isn't strictly necessary and see whether I can restructure the transaction to drop accounts.

Tier 2 — measurement infrastructure.

Stop using a fat fixed CU limit on production sends. Move to the simulate-measure-set workflow with a 10% buffer. This is one extra RPC call per send, in exchange for tight CU budgets that materially improve scheduler rank. I also want to start logging unitsConsumed from the simulation alongside my own internal estimates, so I can see when my code drifts toward burning more CU than I expect.

Tier 3 — boring wins inside my program.

Wherever I'm using find_program_address repeatedly inside an on-chain program, cache the bump on the account and switch to create_program_address. By the published numbers that's roughly a sevenfold reduction per derivation (12,136 CU down to 1,651). For account deserialization, where the data is large or the path is hot, evaluate zero-copy.

Tier 4 — set the loaded account data limit.

Most of my transactions don't need anywhere near the 64 MB default. Calling setLoadedAccountsDataSizeLimit with a realistic value reclaims a chunk of CU overhead I was paying for unused headroom.

Tier 5 — things I am explicitly not doing yet.

I am not rewriting in a lower-level framework right now. I am not chasing the 50-CU shaves that Dean Little warned about. I am not going to rebuild my deserialization in unsafe Rust on a Tuesday afternoon for a 5% gain. Premature optimization in this domain looks exactly like premature optimization everywhere else: an enormous amount of effort buying a small fraction of the wins that bigger structural changes would deliver in a fraction of the time.

Why This Lever Compounds

There is one more reason I am taking CU optimization seriously, and it's specific to the kind of work I'm doing. When the same tip can land or fail depending on milliseconds of scheduling priority, anything that systematically improves my rank is a compounding edge. If I improve my Priority score by, say, 30% on average — same fee, just less CU — that 30% applies to every transaction I send. Not just the lucky ones. Not just the big ones. Every one.
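It helps to translate that 30% into a CU target. Since the scheduler formula makes Priority inversely proportional to cost at a fixed reward, a minimal sketch (ignoring the +1 in the denominator, which is negligible at these sizes) looks like this:

```python
def cost_fraction_for_gain(priority_gain: float) -> float:
    """Fraction of the original cost needed for a relative priority gain
    at a fixed fee, assuming priority is proportional to 1 / cost."""
    if priority_gain <= -1:
        raise ValueError("priority_gain must be greater than -1")
    return 1 / (1 + priority_gain)


fraction = cost_fraction_for_gain(0.30)
cut_pct = (1 - fraction) * 100

print(f"keep {fraction:.1%} of the original cost, cut {cut_pct:.1f}%")
```

A 30% better Priority score at the same fee means requesting about 23% less compute, not 30% less.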

The Solana fee structure docs note that priority fees are distributed entirely to the producing validator. The base fee is half-burned, half to the validator. That has a quiet implication: the network is structurally set up to reward the validator for picking the highest-Priority transactions, because that's what maximizes their take. The scheduler is on my side as long as I give it a transaction that's cheap to schedule.
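As a sketch of that split, using the 5,000-lamport base fee per signature and the half-burned rule described above (the function name and the 20,000-lamport priority fee are illustrative assumptions):

```python
BASE_FEE_PER_SIGNATURE = 5_000  # lamports


def fee_split(num_signatures: int, priority_fee_lamports: int) -> dict:
    """Split a transaction's fees into the burned part and the validator's part:
    half of the base fee is burned, the rest plus the whole priority fee
    goes to the producing validator."""
    base_fee = num_signatures * BASE_FEE_PER_SIGNATURE
    burned = base_fee // 2
    validator = (base_fee - burned) + priority_fee_lamports
    return {"burned": burned, "validator": validator}


split = fee_split(num_signatures=1, priority_fee_lamports=20_000)
print(split)  # {'burned': 2500, 'validator': 22500}
```

Almost everything the validator earns from this transaction comes from the priority fee, which is why the ordering incentive lines up with the Priority score.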

The naive read of all of this is "pay a bigger tip." That works, for a while. The slightly less naive read — and the one I'm trying to internalize this week — is "pay the right tip, in the right shape, on the right transaction." The right shape is the small one.

Key Takeaways

Priority fee is calculated from your declared CU limit, not your actual consumption. Asking for slack you don't use is paying for slack you don't use, twice — in fee and in scheduler rank.
The scheduler formula Priority = reward * 1,000,000 / (cost + 1) rewards lighter transactions directly. Same tip plus half the CU equals roughly double the priority, with no extra spend.
The biggest cheap wins are mundane: prune Base58 pubkey logs, cache PDA bumps, tighten loaded-account data limits, drop unnecessary accounts. Published benchmarks show 80–97% reductions on individual operations.
Use the simulate-measure-set workflow with a 10% buffer. One extra RPC call buys a much tighter CU budget than any fixed default.
Don't over-rotate on micro-optimizations. Validator-side data suggests CU explains only a small slice of total reward variance — the gains are real but bounded. Spend the optimization budget where the structural wins live, not on shaving five CU off a function nobody calls.
