Bug: MemSafe::new panics under concurrent load due to VirtualLock quota exhaustion (Windows) #28

Open
opened 2026-03-14 10:41:36 +00:00 by CleverWild · 0 comments
Member

Description

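MemSafe::new(...) on Windows calls VirtualLock, which pins memory pages in RAM to prevent
sensitive data from being swapped to disk. Each process has a Working Set Quota: the number of
pages it may lock is capped by its minimum working set size (minus a small overhead), and
VirtualLock always locks whole pages, so even a 32-byte key pins at least one page.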

When multiple concurrent tasks call MemSafe::new(...).unwrap() simultaneously, they can
collectively exhaust this quota. Once exhausted, VirtualLock returns an error, MemSafe::new
returns Err, and .unwrap() panics.

How it was discovered

RUST_TEST_THREADS = "1" in .cargo/config.toml masks the problem: running tests sequentially
creates no concurrent pressure on the quota. cargo nextest also does not reproduce it, because it
isolates each test in a separate process with its own quota.

In production the server is a single process — all concurrent requests compete for the same quota.
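
The masking effect can be reproduced without Windows by modelling the quota as a shared page counter. This is a simplified model, not the MemSafe implementation: LockQuota, LockedPage, and lock_page are hypothetical stand-ins for the OS quota, a live MemSafe value, and MemSafe::new, and the sketch assumes each live value pins exactly one page.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Hypothetical stand-in for the per-process lock quota, counted in pages.
pub struct LockQuota {
    limit: usize,
    used: AtomicUsize,
}

/// Hypothetical stand-in for a live `MemSafe` value: pins one page until dropped.
pub struct LockedPage {
    quota: Arc<LockQuota>,
}

impl LockQuota {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }
}

/// Models `MemSafe::new`: succeeds only while the quota still has a free page.
pub fn lock_page(quota: &Arc<LockQuota>) -> Result<LockedPage, &'static str> {
    quota
        .used
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
            if used < quota.limit { Some(used + 1) } else { None }
        })
        .map(|_| LockedPage { quota: Arc::clone(quota) })
        .map_err(|_| "lock quota exhausted")
}

impl Drop for LockedPage {
    fn drop(&mut self) {
        // Releasing the page returns it to the quota.
        self.quota.used.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Sequential callers: each page is released before the next one is requested.
pub fn failures_sequential(quota: &Arc<LockQuota>, callers: usize) -> usize {
    (0..callers).filter(|_| lock_page(quota).is_err()).count()
}

/// Concurrent callers: a barrier keeps every page alive until all callers have tried.
pub fn failures_concurrent(quota: &Arc<LockQuota>, callers: usize) -> usize {
    let barrier = Arc::new(Barrier::new(callers));
    let handles: Vec<_> = (0..callers)
        .map(|_| {
            let (quota, barrier) = (Arc::clone(quota), Arc::clone(&barrier));
            thread::spawn(move || {
                let page = lock_page(&quota);
                barrier.wait();
                page.is_err()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .filter(|failed| *failed)
        .count()
}
```

With a quota of 8 pages and 20 callers, failures_sequential reports 0 failures while failures_concurrent reports 12, because no caller releases its page until all 20 have tried. This is the same difference as between RUST_TEST_THREADS = "1" and a single production process under load.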

Failure points

All .unwrap() calls on MemSafe::new in paths that can be concurrent:

| File | Location | Data |
|------|----------|------|
| crates/arbiter-server/src/actors/keyholder/mod.rs:228 | try_unseal | ciphertext from DB |
| crates/arbiter-server/src/actors/keyholder/mod.rs:282 | decrypt | ciphertext from DB |
| crates/arbiter-server/src/actors/keyholder/encryption/v1.rs:64 | KeyCell::try_from | 32-byte key |
| crates/arbiter-server/src/actors/keyholder/encryption/v1.rs:76 | KeyCell::new_secure_random | 32-byte key |
| crates/arbiter-server/src/actors/keyholder/encryption/v1.rs:152 | derive_seal_key | 32-byte key |
| crates/arbiter-server/src/evm/safe_signer.rs:47 | generate | 32-byte key |
| crates/arbiter-server/src/evm/safe_signer.rs:78 | SafeSigner::new | SigningKey |
| crates/arbiter-server/src/actors/evm/mod.rs:104 | sign path | plaintext key bytes |
| crates/arbiter-server/src/actors/user_agent/session.rs:236 | unseal flow | seal key buffer |

Production failure scenario

N concurrent EVM sign-transaction requests arrive. Each goes through:

  1. keyholder.decrypt() → MemSafe::new(ciphertext).unwrap() + MemSafe::new(Key) inside KeyCell
  2. evm::sign_transaction → MemSafe::new(key_bytes).unwrap() + SafeSigner::new → another MemSafe

At N ≈ 20–50 (depending on data size and system quota), the quota is exhausted. The next .unwrap()
panics — the tokio task crashes, or if the panic propagates past the actor framework boundary, the
process terminates.
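
The dependence on data size and quota can be made concrete with a small estimate. This is a rough sketch under a stated assumption: every MemSafe buffer is a separate page-aligned allocation, so it pins at least one full page. The function names, the 4096-byte page size, and the numbers in the usage note are illustrative, not measured values.

```rust
/// Pages pinned by one locked buffer, assuming each buffer is its own
/// page-aligned allocation (an assumption of this sketch, not a verified
/// property of `MemSafe`).
pub fn pages_for(bytes: usize, page_size: usize) -> usize {
    bytes.div_ceil(page_size).max(1)
}

/// Largest number of in-flight requests whose locked buffers fit in `quota_pages`.
/// `buffer_sizes` lists the byte size of every locked buffer one request holds.
pub fn max_in_flight(quota_pages: usize, page_size: usize, buffer_sizes: &[usize]) -> usize {
    let pages_per_request: usize = buffer_sizes
        .iter()
        .map(|&bytes| pages_for(bytes, page_size))
        .sum();
    if pages_per_request == 0 {
        // A request that locks nothing is never limited by the quota.
        usize::MAX
    } else {
        quota_pages / pages_per_request
    }
}
```

For the sign path above, one request holds four locked buffers (ciphertext, the key inside KeyCell, the plaintext key bytes, and the SigningKey). With a hypothetical 256-byte ciphertext and three 32-byte buffers, max_in_flight(100, 4096, &[256, 32, 32, 32]) gives 25 requests for an assumed quota of 100 lockable pages.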

Fix options

  1. Propagate the error: replace .unwrap() with ? / map_err and return
    Error::MemSafeAllocation, so the client receives an error instead of the task panicking
  2. Increase quota at startup: call SetProcessWorkingSetSize with headroom during server
    initialization (workaround, does not scale)
  3. Pooling: maintain a pool of pre-allocated MemSafe buffers of fixed size, reuse across
    requests
  4. Limit keyholder concurrency: cap concurrent decrypt operations via Semaphore so the number
    of live MemSafe allocations at any given moment stays within a safe bound
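
Options 1 and 4 can be sketched together as follows. This is a minimal, dependency-free illustration, not the server's code: locked_alloc is a hypothetical stand-in for MemSafe::new, the String payload on Error::MemSafeAllocation is an assumption, and the blocking Semaphore written here only mimics the role an async semaphore (for example tokio's) would play in the actual actors.

```rust
use std::fmt;
use std::sync::{Condvar, Mutex};

/// Service-level error. `MemSafeAllocation` is the variant named in option 1;
/// its `String` payload is an assumption made for this sketch.
#[derive(Debug, PartialEq)]
pub enum Error {
    MemSafeAllocation(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::MemSafeAllocation(cause) => write!(f, "locked allocation failed: {}", cause),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical stand-in for `MemSafe::new`: fails when no quota is left.
fn locked_alloc(data: Vec<u8>, quota_left: usize) -> Result<Vec<u8>, String> {
    if quota_left == 0 {
        Err("lock quota exhausted".to_string())
    } else {
        Ok(data)
    }
}

/// Option 1: convert the failure with `map_err` and propagate it with `?`.
pub fn decrypt(ciphertext: Vec<u8>, quota_left: usize) -> Result<Vec<u8>, Error> {
    let locked = locked_alloc(ciphertext, quota_left).map_err(Error::MemSafeAllocation)?;
    Ok(locked)
}

/// Option 4: a minimal blocking counting semaphore built on std primitives.
pub struct Semaphore {
    permits: Mutex<usize>,
    released: Condvar,
}

/// Returned by `acquire`; gives the permit back when dropped.
pub struct Permit<'a> {
    semaphore: &'a Semaphore,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), released: Condvar::new() }
    }

    /// Blocks until a permit is free, then takes it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut free = self.permits.lock().unwrap_or_else(|e| e.into_inner());
        while *free == 0 {
            free = self.released.wait(free).unwrap_or_else(|e| e.into_inner());
        }
        *free -= 1;
        Permit { semaphore: self }
    }

    /// Number of permits currently free.
    pub fn available(&self) -> usize {
        *self.permits.lock().unwrap_or_else(|e| e.into_inner())
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.semaphore.permits.lock().unwrap_or_else(|e| e.into_inner()) += 1;
        self.semaphore.released.notify_one();
    }
}

/// Both options together: the permit is taken before the locked buffer exists
/// and released only after the buffer has been dropped.
pub fn with_decrypted<T>(
    limit: &Semaphore,
    ciphertext: Vec<u8>,
    quota_left: usize,
    use_plaintext: impl FnOnce(&[u8]) -> T,
) -> Result<T, Error> {
    let _permit = limit.acquire();
    let locked = decrypt(ciphertext, quota_left)?;
    Ok(use_plaintext(&locked))
}
```

Sizing the semaphore below the number of requests the quota can hold keeps allocation failures rare, while the propagated error still covers the case where the bound is set too high or other code paths consume the quota.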
CleverWild added the Reviewed/Pending, Kind/Bug, and Priority/High labels 2026-03-14 10:41:36 +00:00
Reference: MarketTakers/arbiter#28