Real-time chains expose a subtle correctness trap: when a transaction is "accepted" by the API but silently dropped before commit, naïve nonce tracking diverges from chain truth and every subsequent transaction stalls. This page documents how Subtick's demo wallet eliminates that class of failure — with a strict per-tx confirmation gate, a contiguous-window assignment invariant, and event-driven commit detection. The numbers are reproducible against the public testnet.
A naïve "demo wallet" assigns the next nonce, sends the transaction, and increments a local counter. If anything between the API and the executor silently drops that transaction (mempool eviction under back-pressure, future-nonce TTL expiry, transient executor stall — none of which raise a rejection), the chain account stays put while the local counter races ahead. Every later transaction is assigned a future nonce; the chain holds them as out-of-order and eventually drops them too.
The visible symptom is a wallet that returns accepted: true forever while chain.nonce never moves. We hit this on Subtick's own demo, reproduced it deterministically, and rebuilt the gate to make it structurally impossible.
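The divergence is easy to reproduce in a toy model. A minimal sketch, assuming a chain that commits only the exact next nonce and an API that answers "accepted" regardless of what happens afterwards (NaiveWallet, Chain, and submit are illustrative names, not Subtick APIs):

```rust
/// Toy chain: commits a transaction only when its nonce is the account's next nonce.
struct Chain {
    nonce: u64,
}

impl Chain {
    /// Returns the API-level answer, which is "accepted" even if the
    /// transaction is silently dropped or carries a future nonce.
    fn submit(&mut self, nonce: u64, silently_dropped: bool) -> bool {
        if !silently_dropped && nonce == self.nonce {
            self.nonce += 1;
        }
        true
    }
}

/// Naive wallet: trusts its own counter and never consults the chain.
struct NaiveWallet {
    next_nonce: u64,
}

impl NaiveWallet {
    fn send(&mut self, chain: &mut Chain, silently_dropped: bool) -> bool {
        let accepted = chain.submit(self.next_nonce, silently_dropped);
        self.next_nonce += 1; // advances even though nothing may have committed
        accepted
    }
}
```

After a single dropped send, every later call still returns true while chain.nonce stays where it was, which is the stall described above.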
The wallet operates on three rules.
Rule A: chain truth. Local nonce state is a cache, never an authority. Every assignment reads chain.nonce first; if the chain has moved, the local last_committed is lifted. Local state is allowed to lead the chain (in-flight transactions), never to contradict it.

Rule B: contiguous window. The set of currently in-flight nonces must form a contiguous range [last_committed, last_committed + N), where N = in_flight.len(). The next nonce handed out is always last_committed + N. Holes are forbidden by construction: we never assign past a missing nonce.

Rule C: circuit-breaker freeze. If a commit poll times out (the chain didn't move past our assigned nonce within the deadline), the wallet enters a frozen state immediately. This is a circuit breaker, not a flag we check later. The very next request reads chain truth, clears the in-flight set, and resumes assignment from chain.nonce. The abandoned slot is reused by this recovery request, which is what closes the chain's nonce gap.

Resulting invariant: max_assigned − last_committed ≤ K, where K is the configured concurrency permit count. No holes; no future-nonce pile-up; no silent drift.
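The contiguous-window rule can be checked mechanically. A minimal sketch, assuming the in-flight set is a BTreeMap keyed by nonce; the helper names next_nonce and window_is_contiguous are illustrative and not taken from the Subtick source:

```rust
use std::collections::BTreeMap;
use std::time::Instant;

/// Next nonce to hand out: always the first slot past the in-flight window.
fn next_nonce(last_committed: u64, in_flight: &BTreeMap<u64, Instant>) -> u64 {
    last_committed + in_flight.len() as u64
}

/// True when the in-flight keys are exactly last_committed, last_committed + 1, ...
/// BTreeMap iterates keys in ascending order, so a single pass is enough.
fn window_is_contiguous(last_committed: u64, in_flight: &BTreeMap<u64, Instant>) -> bool {
    in_flight
        .keys()
        .zip(last_committed..)
        .all(|(&nonce, expected)| nonce == expected)
}
```

A check like this is cheap enough to run as a debug assertion after every assignment.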
The gate has three concrete components, all in subtick/src/api/mod.rs:

- Concurrency permits: K=1 for single-sender demos.
- Nonce-state lock: not held across any .await boundary that involves I/O.
- Commit detection: subscribe to the BatchExecuted broadcast before mempool insert, then tokio::select! between event arrivals and a hard deadline. No 5 ms polling loops.

```rust
// Sketch; full source in subtick/src/api/mod.rs
use std::collections::BTreeMap;
use std::time::Instant;

struct NonceState {
    last_committed: u64,
    in_flight: BTreeMap<u64, Instant>,
    frozen: bool,
    initialized: bool,
}

// Acquire-and-assign protocol:
async fn acquire_demo_slot(...) -> Result<Slot, Error> {
    let permit = wallet.permits.acquire_owned().await?;
    let mut g = wallet.nonce.lock().await;
    sync_chain_truth(&mut g, ...);                 // Rule A
    if g.frozen { rebase(&mut g); }                // Rule C
    let assigned = g.last_committed + g.in_flight.len() as u64;
    g.in_flight.insert(assigned, Instant::now());  // Rule B
    Ok(Slot { assigned, permit, ... })
}

// On the way out: finalize based on commit poll outcome.
match executed_ms {
    Some(_) => {
        g.in_flight.remove(&assigned);
        g.last_committed = assigned + 1;
    }
    None => { g.frozen = true; }                   // circuit-breaker
}
```
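The event-versus-deadline wait can be illustrated without an async runtime. The sketch below swaps in std::sync::mpsc and recv_timeout for the tokio broadcast channel and tokio::select! that the gate uses; the BatchExecuted field and the wait_for_commit helper are hypothetical shapes chosen for the example:

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical event shape: our sender's chain nonce after a batch executed.
struct BatchExecuted {
    sender_nonce_after: u64,
}

/// Block until an event shows the chain moved past `assigned`, or the deadline passes.
/// Returns elapsed milliseconds on commit, None on timeout (the caller then freezes).
fn wait_for_commit(
    events: &Receiver<BatchExecuted>,
    assigned: u64,
    deadline: Duration,
) -> Option<u128> {
    let start = Instant::now();
    loop {
        // Remaining budget; once it is spent the wait counts as a timeout.
        let remaining = deadline.checked_sub(start.elapsed())?;
        match events.recv_timeout(remaining) {
            Ok(ev) if ev.sender_nonce_after > assigned => {
                return Some(start.elapsed().as_millis());
            }
            Ok(_) => continue, // a batch that did not include our transaction
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return None;
            }
        }
    }
}
```

The ordering matters more than the primitive: the receiver has to exist before the transaction is submitted, so that an event fired immediately after the insert is buffered rather than missed.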
Reproducible: 200 sequential and 50 parallel POST /demo/transfer
against https://subtick.dev, with and without 10% server-side
drop injection. Methodology is in the project repo's
web/_bench.mjs + web/_canary_inject.mjs.
Without drop injection:

| scenario | p50 | p95 | p99 | success | drift end |
|---|---|---|---|---|---|
| sequential ×200 | 109 ms | 137 ms | 140 ms | 100.0% | 0 |
| parallel ×50 (K=1) | 2977 ms | 5445 ms | 5623 ms | 100.0% | 0 |
The parallel-50 latency is not chain latency — it's queue time through the gate. K=1 serializes all transactions from the shared demo sender; throughput converges to the single-sender chain commit rate (~7-8 tx/s).
With 10% server-side drop injection:

| scenario | p50 | success | injected | timeout | reconcile |
|---|---|---|---|---|---|
| sequential ×200 | 110 ms | 90.0% | 20 | 20 | 20 |
Self-heal is exact and 1:1. Every injected drop produces
exactly one timeout, one reconcile event, and the gate clears the gap
before the next request lands. Success rate matches
1 − injection_rate precisely; the chain commits exactly
the un-dropped transactions; drift ends at 0 every time.
Every run asserts three conditions: drift = 0 at end-of-run, reconcile == timeout, and alice.nonce_delta == metric_committed. Any deviation halts the build.
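The 1:1 relationship between drops, timeouts, and reconciles follows directly from the freeze-and-rebase rule. A toy, single-threaded model makes that visible; it assumes a chain that commits only the exact next nonce and replaces the random 10% injection with a deterministic "every tenth request" drop. Gate, Chain, and run are names invented for this sketch, and its counts come from the model, not from the bench harness:

```rust
use std::collections::BTreeSet;

/// Toy chain: commits a transaction only when its nonce equals the account nonce.
struct Chain {
    nonce: u64,
}

#[derive(Default)]
struct Gate {
    last_committed: u64,
    in_flight: BTreeSet<u64>,
    frozen: bool,
    timeouts: u64,
    reconciles: u64,
    committed: u64,
}

impl Gate {
    /// One sequential request; `dropped` models a silent drop before commit.
    fn request(&mut self, chain: &mut Chain, dropped: bool) -> bool {
        // Chain truth: lift the local view if the chain has moved.
        self.last_committed = self.last_committed.max(chain.nonce);
        // Frozen gate: rebase on chain truth and reuse the abandoned slot.
        if self.frozen {
            self.in_flight.clear();
            self.last_committed = chain.nonce;
            self.frozen = false;
            self.reconciles += 1;
        }
        // Contiguous window: next nonce is the first slot past the window.
        let assigned = self.last_committed + self.in_flight.len() as u64;
        self.in_flight.insert(assigned);
        if !dropped && assigned == chain.nonce {
            chain.nonce += 1;
            self.in_flight.remove(&assigned);
            self.last_committed = assigned + 1;
            self.committed += 1;
            true
        } else {
            // Deadline passed without the chain moving: trip the breaker.
            self.frozen = true;
            self.timeouts += 1;
            false
        }
    }
}

/// Run `n` sequential requests, dropping every `drop_every`-th one.
/// Returns the gate and the final chain nonce.
fn run(n: u64, drop_every: u64) -> (Gate, u64) {
    let mut chain = Chain { nonce: 0 };
    let mut gate = Gate::default();
    for i in 0..n {
        gate.request(&mut chain, i % drop_every == 0);
    }
    (gate, chain.nonce)
}
```

In this model, run(200, 10) ends with 20 timeouts, 20 reconciles, 180 commits, and a local window that lines up exactly with the chain nonce, the same shape as the injected row in the table.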
The wallet's state is exposed as plain JSON for ops + dev tools:
```
// GET https://subtick.dev/demo/state.wallet
{
  "last_committed": 42,
  "in_flight_current": 0,
  "oldest_in_flight_age_ms": 0,
  "frozen": false,
  "max_in_flight": 1
}

// GET https://subtick.dev/demo/metrics
{
  "tx_assigned_total": 42,
  "tx_committed_total": 42,
  "tx_timeout_total": 0,
  "tx_mempool_reject_total": 0,
  "tx_injected_drop_total": 0,
  "reconcile_total": 0,
  "in_flight_current": 0,
  "oldest_in_flight_age_ms": 0,
  "frozen": false,
  "max_in_flight": 1,
  "inject_fail_pct": 0
}
```
Two hard alarms wire to these without any framework: frozen=true
lasting more than ~10 s with no committed advance, or
oldest_in_flight_age_ms exceeding the commit deadline.
Either signal indicates the chain itself has stalled, not the gate.
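Both alarms reduce to a comparison over polled samples. A minimal sketch, assuming the caller polls /demo/metrics, stamps each sample with a local clock, and remembers the first sample of the current frozen streak; the Sample struct, the Alarm enum, and the commit_deadline_ms parameter are illustrative, and 10 000 ms stands in for the "~10 s" threshold:

```rust
/// Subset of the /demo/metrics fields used by the alarms, plus a local timestamp.
#[derive(Clone, Copy)]
struct Sample {
    at_ms: u64, // local clock at poll time (assumption: not part of the payload)
    frozen: bool,
    tx_committed_total: u64,
    oldest_in_flight_age_ms: u64,
}

#[derive(Debug, PartialEq)]
enum Alarm {
    FrozenWithoutProgress,
    InFlightPastDeadline,
}

/// `frozen_since` is the earliest sample of the current frozen streak, if any.
fn check(frozen_since: Option<Sample>, now: Sample, commit_deadline_ms: u64) -> Vec<Alarm> {
    let mut alarms = Vec::new();
    if let (true, Some(first)) = (now.frozen, frozen_since) {
        let stuck_ms = now.at_ms.saturating_sub(first.at_ms);
        // Frozen for more than ~10 s and the committed counter has not advanced.
        if stuck_ms > 10_000 && now.tx_committed_total == first.tx_committed_total {
            alarms.push(Alarm::FrozenWithoutProgress);
        }
    }
    if now.oldest_in_flight_age_ms > commit_deadline_ms {
        alarms.push(Alarm::InFlightPastDeadline);
    }
    alarms
}
```

The function is pure, so it can sit behind a cron job, a shell loop, or an existing monitoring agent without extra machinery.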
Single-sender K=1 was both safe and optimal because the chain enforces strict per-account nonce ordering: any K>1 from one account just queued behind itself. Throughput was therefore capped at the single-sender chain commit rate, ~7-8 tx/s.
The unlock was structural — distribute requests across a pool of pre-funded sender accounts, each with its own independent gate. Different accounts have independent nonce sequences; the chain can commit them in parallel batches. Per-account ordering still holds, all four invariants from §2 still hold per-sender, and throughput rises with the pool size.
| config | p50 | p99 | elapsed | throughput |
|---|---|---|---|---|
| single sender (K=1) | 2977 ms | 5623 ms | 5969 ms | 7.9 req/s |
| pool of 4 (K=1/sender) | 802 ms | 1894 ms | 2118 ms | 23.6 req/s |
≈3× throughput · 3.7× lower p50 · 3× lower p99. The gate's per-sender invariants are unchanged — every sender enforces its own contiguous-window assignment, its own circuit-breaker freeze on first timeout, its own self-heal on the next request. There is no shared mutable state between senders besides the round-robin counter, which only assigns the slot.
At pool size 4, raising K above 1 gives no additional throughput under burst load — the chain's commit rate per account is the ceiling, and queueing K transactions on the same account just shifts latency from the queue to the wait. The structural response is the same as before: more independent accounts (a bigger pool, or per-visitor ephemeral keypairs in a future batch). Pool=4, K=1 is the sweet spot today.
Three demo endpoints (/demo/transfer,
/pixel/place, /auction/bid) accept an
optional x-session-id header. Same payloads as
before. With the header present the request sticks to one sender
via SHA-256(session_id) % pool_size; absent, it round-robins
across the pool. No body changes — existing clients continue to
work and just load-balance.
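The selection rule is small enough to sketch in full. To stay dependency-free, the example below uses the standard library's DefaultHasher as a stand-in for SHA-256, so the indices it produces will not match the server's; SenderPool, pick, and digest are hypothetical names:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in digest. The server uses SHA-256(session_id); DefaultHasher is used
/// here only so the sketch compiles with the standard library alone.
fn digest(session_id: &str) -> u64 {
    let mut h = DefaultHasher::new();
    session_id.hash(&mut h);
    h.finish()
}

struct SenderPool {
    pool_size: usize,
    round_robin: AtomicUsize,
}

impl SenderPool {
    /// With a session id the request sticks to one sender; without one it round-robins.
    fn pick(&self, session_id: Option<&str>) -> usize {
        match session_id {
            Some(id) => (digest(id) % self.pool_size as u64) as usize,
            None => self.round_robin.fetch_add(1, Ordering::Relaxed) % self.pool_size,
        }
    }
}
```

The round-robin counter only selects a slot; all nonce state stays behind the chosen sender's own gate.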
Two endpoints surface the multi-sender state in real time:
```
// per-sender breakdown:
GET https://subtick.dev/demo/state
{
  "alice": { ... },  // legacy: pool[0]
  "bob": { ... },    // shared recipient
  "pool": [
    { "address": "f0f2…7244", "chain_nonce": 142, "in_flight_current": 0, "frozen": false, ... },
    { "address": "8ba9…b5a7", "chain_nonce": 138, ... },
    { "address": "3e73…f777", ... },
    { "address": "e0aa…7c23", ... }
  ],
  "pool_size": 4,
  "sessions_active": 0
}

// aggregate counters + per-sender selection counts:
GET https://subtick.dev/demo/metrics
```
Unchanged under the pool: drift = 0 at end-of-burst, reconcile == timeout on every fault, and the contiguous-window invariant. The pool didn't relax correctness; it parallelised it.
Senders in the pool are spread deterministically across the
executor's 4 shards (by shard_of(pubkey) = pubkey[0] % 4).
The shared recipient (Bob, on a single shard) almost always lives on
a different shard than the sender. Before this batch, the executor's
per-shard commit thread credited the recipient on its own
ShadowState regardless of shard_of(recipient)
— landing the credit as a phantom entry on the sender's shard
instead of the recipient's. The visible symptom: ~30% of cross-shard
transfers reflected in Bob's balance, the other ~70% scattered
invisibly.
The fix is a per-shard inbox: NUM_SHARDS bounded
crossbeam channels, one inbox per shard. When the sender's commit
thread sees shard_of(recipient) ≠ self.shard_id, it
buffers the credit and try_sends it to the recipient
shard's inbox after releasing its inner lock. Each shard's commit
thread drains its inbox at the top of every cycle, before
processing its next batch group. Eventual consistency: the
recipient credit trails the sender debit by ≤ one commit cycle on
the recipient's shard.
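The inbox mechanics can be sketched single-threaded. The version below uses std::sync::mpsc::sync_channel as a stand-in for the bounded crossbeam channels and omits the commit threads and the inner lock; the Shard and Credit types and the build_shards helper are hypothetical, and the counter fields borrow the metric names for readability:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

const NUM_SHARDS: usize = 4;
type Pubkey = [u8; 32];

fn shard_of(pubkey: &Pubkey) -> usize {
    pubkey[0] as usize % NUM_SHARDS
}

struct Credit {
    to: Pubkey,
    amount: u64,
}

struct Shard {
    shard_id: usize,
    balances: HashMap<Pubkey, u64>,
    inbox: Receiver<Credit>,
    outboxes: Vec<SyncSender<Credit>>, // one handle per shard inbox
    cross_shard_sent: u64,
    cross_shard_applied: u64,
    cross_shard_dropped: u64,
}

impl Shard {
    /// Top of every commit cycle: apply credits that other shards queued for us.
    fn drain_inbox(&mut self) {
        while let Ok(c) = self.inbox.try_recv() {
            *self.balances.entry(c.to).or_insert(0) += c.amount;
            self.cross_shard_applied += 1;
        }
    }

    /// Commit a transfer whose sender lives on this shard.
    fn commit_transfer(&mut self, from: Pubkey, to: Pubkey, amount: u64) -> bool {
        let balance = self.balances.entry(from).or_insert(0);
        if *balance < amount {
            return false;
        }
        *balance -= amount;
        let dest = shard_of(&to);
        if dest == self.shard_id {
            *self.balances.entry(to).or_insert(0) += amount;
        } else if self.outboxes[dest].try_send(Credit { to, amount }).is_ok() {
            self.cross_shard_sent += 1; // credit lands on the recipient's next drain
        } else {
            self.cross_shard_dropped += 1; // inbox full or closed
        }
        true
    }
}

/// Build NUM_SHARDS shards, each owning one bounded inbox (capacity must be >= 1).
fn build_shards(inbox_capacity: usize) -> Vec<Shard> {
    let (senders, receivers): (Vec<_>, Vec<_>) = (0..NUM_SHARDS)
        .map(|_| sync_channel::<Credit>(inbox_capacity))
        .unzip();
    receivers
        .into_iter()
        .enumerate()
        .map(|(shard_id, inbox)| Shard {
            shard_id,
            balances: HashMap::new(),
            inbox,
            outboxes: senders.clone(),
            cross_shard_sent: 0,
            cross_shard_applied: 0,
            cross_shard_dropped: 0,
        })
        .collect()
}
```

The credit never touches the sender shard's balance map, which is the property the original bug violated; it becomes visible only after the recipient shard drains its inbox.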
| signal | before fix | after fix |
|---|---|---|
| cross-shard delivery | ~30% (3 of 10) | 100% (Bob Δ = 30 × 100 exact) |
| cross_shard_dropped | n/a | 0 |
| FROZEN events under burst | occasional | 0 |
| drift (assigned vs committed) | visible | 0 |
Three new aggregate counters surface the propagation health on
/demo/metrics:
cross_shard_sent,
cross_shard_applied,
cross_shard_dropped. In steady state the first two
track within ~1–2 (the drain cadence), and the third stays at 0.
A watchdog polls /demo/metrics every 60 s.
If cross_shard_dropped > 0,
frozen = true, or
(assigned − committed − in_flight) > 0 at any tick,
it forces SUBTICK_DEMO_POOL_SIZE=1 and restarts the
service before the issue can compound. With pool=4 already at the
safe baseline, the signal is logged without action. Restart cooldown:
5 minutes.
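The polling rule above amounts to one predicate plus a cooldown check. A hedged sketch follows: the Metrics struct mirrors a subset of the /demo/metrics field names, while Action, decide, and the already_at_baseline flag are invented for the example. Logging without action during the cooldown window is also an assumption, since the text only gives the cooldown length:

```rust
/// Subset of the /demo/metrics fields the check reads.
#[derive(Clone, Copy)]
struct Metrics {
    tx_assigned_total: u64,
    tx_committed_total: u64,
    in_flight_current: u64,
    cross_shard_dropped: u64,
    frozen: bool,
}

#[derive(Debug, PartialEq)]
enum Action {
    Nothing,
    LogOnly,
    FallBackAndRestart, // force SUBTICK_DEMO_POOL_SIZE=1 and restart the service
}

const RESTART_COOLDOWN_MS: u64 = 5 * 60 * 1000;

fn unhealthy(m: &Metrics) -> bool {
    // Signed arithmetic so a momentarily lagging counter cannot underflow.
    let drift = m.tx_assigned_total as i128
        - m.tx_committed_total as i128
        - m.in_flight_current as i128;
    m.cross_shard_dropped > 0 || m.frozen || drift > 0
}

fn decide(
    m: &Metrics,
    already_at_baseline: bool,
    last_restart_ms: Option<u64>,
    now_ms: u64,
) -> Action {
    if !unhealthy(m) {
        return Action::Nothing;
    }
    let cooling_down =
        last_restart_ms.map_or(false, |t| now_ms.saturating_sub(t) < RESTART_COOLDOWN_MS);
    if already_at_baseline || cooling_down {
        Action::LogOnly
    } else {
        Action::FallBackAndRestart
    }
}
```

Keeping the decision separate from the restart side effect makes the rule testable against recorded metrics snapshots.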
Browser-side signing (per-visitor ephemeral keypairs, server in
relay-only mode) is the next architectural step. The wallet
abstraction that sits behind the pool today —
SessionWallet — is already the unit that swaps
out: the handlers, the gate logic, and the correctness
invariants do not change when the backend goes from
"pre-funded shared pool" to "per-session ephemeral keypair".
If you're integrating Subtick — or building anything that talks to a real-time chain over HTTP — a wallet that diverges silently is indistinguishable from a working one until your customer screenshots a frozen UI. The bug we hit and fixed here is generic to any "server-managed nonce" pattern: token-bucket pacers, payment retry loops, atomic counter wallets. The protocol above (chain truth · contiguous window · circuit-breaker freeze · event-driven detection) ports to all of them.
Source for the demo wallet — including the failure injection, the
recovery test, and the bench harness — is in the
public repo
under subtick/src/api/mod.rs and web/_bench.mjs.