Stress Scenarios
Stress scenarios are parameterized test programs that subject an extension function to adversarial conditions that rarely appear in normal unit tests. Run them with ext_memcheck.run_scenario(). Refer to the API documentation for usage details.
growth_benchmark
Invokes the target SQL for a configurable number of iterations and tracks the per-MemoryContext used bytes at log-spaced checkpoints (iteration 1, 10, 100, …, up to a maximum of 8 checkpoints). Contexts that grow monotonically across at least two checkpoints and whose total growth is at least pg_ext_memcheck.bloat_min_bytes (default 8 KiB) are reported as ctx_bloat violations.
SELECT ext_memcheck.run_scenario(scenario_name := 'growth_benchmark', iterations := 1000, workload := 'SELECT 1');
How it works:
- A private AllocSet context (pg_ext_memcheck bench) is created to own the bookkeeping data; this context is excluded from analysis.
- Checkpoint iteration numbers are computed as 1, 10, 100, … (powers of 10, capped at iterations). Up to 8 checkpoints are used.
- After each checkpoint iteration, context_walker snapshots the full MemoryContext tree and records used bytes (allocated − freed) for every context, identified by (name, depth, parent hash).
- Contexts that appear mid-run are back-filled with 0; contexts that disappear are carried forward at their last observed value.
- After all iterations complete, each tracked context is analyzed:
  - Must have grown from first to last checkpoint.
  - Must show monotonic (non-decreasing) growth.
  - Must have increased at 2+ checkpoints (single-shot spikes are ignored).
  - Total growth must be ≥ bloat_min_bytes.
- Growth shape is classified as superlinear when the per-iteration rate in the last checkpoint interval exceeds 1.5× the rate in the first interval; otherwise linear.
- Violations are written to the ring buffer as ctx_bloat. The generic before/after diff is skipped for this scenario to avoid double-reporting the same growth as context_leak.
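The analysis steps above can be modeled outside the extension. The following Python sketch is illustrative only and is not the extension's C implementation: the function names are hypothetical, it assumes that a run whose length is not a power of 10 gets its final checkpoint at iterations itself, and it treats "increased at a checkpoint" as a positive delta relative to the previous checkpoint.

```python
from typing import List, Optional

MAX_CHECKPOINTS = 8          # from the text: at most 8 checkpoints
BLOAT_MIN_BYTES = 8 * 1024   # documented default of bloat_min_bytes


def checkpoint_iterations(iterations: int) -> List[int]:
    """Powers of 10 below `iterations`, then `iterations` itself.

    Assumption: the last checkpoint is clamped to `iterations`.
    """
    points: List[int] = []
    step = 1
    while step < iterations and len(points) < MAX_CHECKPOINTS:
        points.append(step)
        step *= 10
    if len(points) < MAX_CHECKPOINTS and iterations >= 1:
        points.append(iterations)
    return points


def fill_series(observed: List[Optional[int]]) -> List[int]:
    """Back-fill leading gaps with 0 and carry the last value forward."""
    filled: List[int] = []
    last = 0
    for value in observed:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def classify(points: List[int], used: List[int],
             min_bytes: int = BLOAT_MIN_BYTES) -> Optional[str]:
    """Return 'linear' or 'superlinear' for a bloat violation, else None."""
    if len(used) < 2:
        return None
    deltas = [b - a for a, b in zip(used, used[1:])]
    total = used[-1] - used[0]
    if total <= 0:
        return None                      # did not grow overall
    if any(d < 0 for d in deltas):
        return None                      # not monotonic
    if sum(1 for d in deltas if d > 0) < 2:
        return None                      # single-shot spike
    if total < min_bytes:
        return None                      # below the reporting threshold
    first_rate = deltas[0] / (points[1] - points[0])
    last_rate = deltas[-1] / (points[-1] - points[-2])
    return "superlinear" if last_rate > 1.5 * first_rate else "linear"
```

A context that gains a constant number of bytes per iteration has equal rates in every interval and is classified as linear; one whose per-iteration cost rises over the run crosses the 1.5× threshold.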
Severity:
| Level | Base condition | Escalation |
|---|---|---|
| ERROR | Total growth > 1 MiB | or WARNING + superlinear growth |
| WARNING | Total growth > 64 KiB | or INFO + superlinear growth |
| INFO | Total growth ≥ bloat_min_bytes | None |
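Read as a rule, the table assigns a base level from total growth and raises it by one level when growth is superlinear. The sketch below is one way to express that reading in Python; the function name and the None return for sub-threshold growth are assumptions, not part of the extension.

```python
from typing import Optional

KIB = 1024
MIB = 1024 * KIB
LEVELS = ["INFO", "WARNING", "ERROR"]


def bloat_severity(total_growth: int, superlinear: bool,
                   bloat_min_bytes: int = 8 * KIB) -> Optional[str]:
    """Map total growth (bytes) and growth shape to a severity level."""
    if total_growth < bloat_min_bytes:
        return None                      # below threshold: nothing reported
    if total_growth > MIB:
        level = 2                        # ERROR base condition
    elif total_growth > 64 * KIB:
        level = 1                        # WARNING base condition
    else:
        level = 0                        # INFO base condition
    if superlinear:
        level = min(level + 1, 2)        # escalate one step, capped at ERROR
    return LEVELS[level]
```

For example, 100 KiB of linear growth maps to WARNING, while the same growth classified as superlinear maps to ERROR.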
What it catches: Slow cumulative leaks that are invisible in single-call tests; monotonic context bloat; accelerating (superlinear) allocation patterns that will exhaust memory under sustained load.
Tuning:
-- Raise sensitivity: report any context that grew by at least 4 KiB
SET pg_ext_memcheck.bloat_min_bytes = 4096;
-- Run 1000 iterations; checkpoints will be at 1, 10, 100, 1000
SELECT ext_memcheck.run_scenario('growth_benchmark', 1000, 'SELECT your_ext.fn()');
SELECT ext_memcheck.flush_violations();
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'ctx_bloat';
tx_abort_loop
Invokes the target function inside a savepoint, then rolls back to the savepoint before releasing it, and repeats this cycle for the configured number of iterations. Exposes resources that are only released on COMMIT, not ROLLBACK.
SELECT ext_memcheck.run_scenario(scenario_name := 'tx_abort_loop', iterations := 100, workload := 'SELECT 1');
How it works: Similar to growth_benchmark, but wraps each invocation in a SAVEPOINT / ROLLBACK TO SAVEPOINT block. The context walker compares pre/post snapshots across the entire loop, so any resources that accumulate across iterations without being cleaned up on rollback show up as a leak.
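To see why a rollback loop surfaces this class of bug, consider a toy model in Python. Everything here is hypothetical (ToyBackend, the ext_cache context name, the callback lists); it only mimics the shape of the scenario: run the workload, roll back, and diff snapshots taken before and after the whole loop.

```python
from typing import Callable, Dict, List


class ToyBackend:
    """Hypothetical stand-in for a backend's memory and callback state."""

    def __init__(self) -> None:
        self.context_bytes: Dict[str, int] = {}
        self.on_commit: List[Callable[[], None]] = []
        self.on_abort: List[Callable[[], None]] = []

    def snapshot(self) -> Dict[str, int]:
        return dict(self.context_bytes)

    def rollback_to_savepoint(self) -> None:
        # Only abort callbacks fire; pending commit callbacks are dropped.
        callbacks, self.on_abort = self.on_abort, []
        self.on_commit = []
        for callback in callbacks:
            callback()


def _allocate(backend: ToyBackend, size: int) -> Callable[[], None]:
    backend.context_bytes["ext_cache"] = (
        backend.context_bytes.get("ext_cache", 0) + size)

    def release() -> None:
        backend.context_bytes["ext_cache"] -= size
    return release


def leaky_workload(backend: ToyBackend) -> None:
    release = _allocate(backend, 128)
    backend.on_commit.append(release)    # bug: no cleanup registered for abort


def clean_workload(backend: ToyBackend) -> None:
    release = _allocate(backend, 128)
    backend.on_commit.append(release)
    backend.on_abort.append(release)     # cleanup also runs on rollback


def tx_abort_loop(workload: Callable[[ToyBackend], None],
                  iterations: int) -> Dict[str, int]:
    """Return per-context growth across the whole rollback loop."""
    backend = ToyBackend()
    before = backend.snapshot()
    for _ in range(iterations):
        workload(backend)                # runs "inside the savepoint"
        backend.rollback_to_savepoint()
    after = backend.snapshot()
    return {name: after[name] - before.get(name, 0)
            for name in after if after[name] > before.get(name, 0)}
```

The leaky workload accumulates 128 bytes per iteration because its only cleanup path is tied to commit, while the clean workload leaves no growth behind.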
What it catches: Context leaks that only manifest on transaction abort; resources tied to transaction callbacks that are never called on abort.
shmem_sentinel_probe
Plants 0xDE sentinel bytes just past the declared boundary of each registered shared memory segment, runs the workload, then verifies all sentinels are intact. A corrupted sentinel means the extension under test wrote past its declared shmem boundary.
SELECT ext_memcheck.run_scenario('shmem_sentinel_probe', 10, 'SELECT 1');
-- Inspect any overrun violations
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'shmem_overrun';
-- Reset registry between test runs
SELECT ext_memcheck.clear_shmem_registry();
How it works: shmem_probe.c allocates a ProbeRegistry in shared memory. The scenario calls probe_register(seg_name, alloc_size, data_end) for the ViolationLog and DsmTrackerState segments. Both are allocated with an extra byte in _PG_init, so alloc_size = sizeof(struct) + 1 and the sentinel is placed at offset data_end = sizeof(struct). After the workload loop, probe_check_all() reads each sentinel byte. Any value other than 0xDE logs a shmem_overrun ERROR violation.
To probe your own extension's shared memory segment, call ext_memcheck.register_shmem_probe(seg_name, allocated_size) before running this scenario.
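The sentinel technique itself is small enough to model in a few lines. This Python sketch uses a bytearray in place of a shared memory segment; the ToyProbeRegistry class and its method names are hypothetical and only loosely mirror probe_register and probe_check_all.

```python
from typing import List, Tuple

SENTINEL = 0xDE  # sentinel byte value named in the text


class ToyProbeRegistry:
    """Hypothetical in-process model of a sentinel registry."""

    def __init__(self) -> None:
        self._probes: List[Tuple[str, bytearray, int]] = []

    def register(self, seg_name: str, segment: bytearray,
                 data_end: int) -> None:
        # The segment must be allocated one byte larger than its data.
        segment[data_end] = SENTINEL
        self._probes.append((seg_name, segment, data_end))

    def check_all(self) -> List[str]:
        """Return the names of segments whose sentinel was overwritten."""
        return [name for name, segment, end in self._probes
                if segment[end] != SENTINEL]


STRUCT_SIZE = 16
registry = ToyProbeRegistry()

good = bytearray(STRUCT_SIZE + 1)
bad = bytearray(STRUCT_SIZE + 1)
registry.register("good_seg", good, STRUCT_SIZE)
registry.register("bad_seg", bad, STRUCT_SIZE)

good[0:STRUCT_SIZE] = b"\x00" * STRUCT_SIZE          # stays in bounds
bad[0:STRUCT_SIZE + 1] = b"\x00" * (STRUCT_SIZE + 1)  # off-by-one overrun

overrun_segments = registry.check_all()
```

One limitation of any single-byte sentinel is visible in this model: an overrun that happens to write 0xDE at the sentinel offset goes undetected.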
What it catches: Off-by-one writes past a segment's declared boundary; memset or memcpy calls whose length is computed incorrectly.
wrong_context_probe
Runs the target workload SQL a configurable number of times, then diffs the MemoryContext tree before and after to detect allocations that landed in long-lived global contexts (TopMemoryContext, CacheMemoryContext). Unlike the generic executor-hook path, this scenario skips the context_leak diff entirely and runs only check_wrong_context_alloc, so results are focused exclusively on wrong-context violations without noise from ordinary leak detection.
SELECT ext_memcheck.run_scenario('wrong_context_probe', 50, 'SELECT your_ext.fn()');
SELECT ext_memcheck.flush_violations();
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'wrong_ctx_alloc';
How it works:
- Before the first iteration, context_walker snapshots the full MemoryContext tree.
- The workload SQL is executed iterations times via SPI.
- After the last iteration, a second snapshot is taken and check_wrong_context_alloc is called with the before/after pair.
- check_wrong_context_alloc runs two passes:
  - Growth pass: any named global context (TopMemoryContext, CacheMemoryContext) whose totalAllocated increased since the pre-run snapshot emits a wrong_ctx_alloc WARNING with the byte delta.
  - New-child pass: any context that appeared in the post-run snapshot but not the pre-run snapshot and whose parent is a known global emits a wrong_ctx_alloc WARNING identifying the newly created child.
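The two passes described above can be sketched as a pure function over a pair of snapshots. This Python model is an illustration under assumptions: snapshots are represented as dictionaries keyed by context name, and the Ctx type, the function name, and the message format are hypothetical rather than taken from the extension.

```python
from typing import Dict, List, NamedTuple, Optional

GLOBAL_CONTEXTS = {"TopMemoryContext", "CacheMemoryContext"}


class Ctx(NamedTuple):
    parent: Optional[str]
    total_allocated: int


def sketch_wrong_context_check(before: Dict[str, Ctx],
                               after: Dict[str, Ctx]) -> List[str]:
    findings: List[str] = []
    # Growth pass: a known global context grew during the run.
    for name in sorted(GLOBAL_CONTEXTS):
        if name in before and name in after:
            delta = after[name].total_allocated - before[name].total_allocated
            if delta > 0:
                findings.append(
                    f"WARNING wrong_ctx_alloc: {name} grew by {delta} bytes")
    # New-child pass: a context created under a known global context.
    for name in sorted(after):
        if name not in before and after[name].parent in GLOBAL_CONTEXTS:
            findings.append(
                f"WARNING wrong_ctx_alloc: new child {name} "
                f"under {after[name].parent}")
    return findings


before = {
    "TopMemoryContext": Ctx(None, 1000),
    "CacheMemoryContext": Ctx("TopMemoryContext", 500),
}
after = {
    "TopMemoryContext": Ctx(None, 1000),
    "CacheMemoryContext": Ctx("TopMemoryContext", 756),
    "my_ext cache": Ctx("TopMemoryContext", 8192),
    "ExecutorState": Ctx("PortalContext", 4096),
}
findings = sketch_wrong_context_check(before, after)
```

In this example the new ExecutorState context is ignored because its parent is not a global context, which is the behavior that keeps ordinary query-local allocations out of the report.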
What it catches: Extension functions that allocate in long-lived contexts (TopMemoryContext, CacheMemoryContext) instead of query-local contexts — a common source of per-backend memory growth that accumulates silently across sessions.
When to use over the default executor hook: The executor hook also calls check_wrong_context_alloc, but it does so alongside context-leak detection and only for queries passing through the hook. Use wrong_context_probe when you want a clean, isolated signal across N controlled repetitions without unrelated leak noise.
use_after_reset
Runs the use_after_reset crash scenario inside a BGWorker process. The worker calls elog(FATAL) to simulate a use-after-reset crash, then exits with a non-zero exit code. The calling backend detects the crash via WorkerSlot.exit_code and logs the result.
SELECT ext_memcheck.run_scenario('use_after_reset', 1, 'SELECT 1');
SELECT ext_memcheck.flush_violations();
How it works:
- launch_crash_isolation_worker("use_after_reset") fills a shared-memory WorkerSlot with the scenario name and current database.
- A BackgroundWorker is registered and launched. The caller blocks on WaitForBackgroundWorkerShutdown().
- The worker calls run_use_after_reset_in_worker(), which calls elog(FATAL), a clean proc_exit(1) that does not trigger postmaster crash recovery.
- The calling backend reads exit_code != 0 from the slot and reports the confirmed crash.
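The same isolation pattern (run the risky code in a separate process, block until it exits, then inspect the exit code) can be demonstrated with ordinary OS processes. The Python sketch below is an analogy only: it uses subprocess instead of a BackgroundWorker and a return code instead of a shared-memory WorkerSlot, and run_isolated is a hypothetical helper.

```python
import subprocess
import sys


def run_isolated(child_code: str, timeout: float = 30.0) -> int:
    """Run `child_code` in a separate Python process; return its exit code.

    The caller blocks until the child exits, loosely like waiting on
    worker shutdown, and is unaffected by the child's failure.
    """
    result = subprocess.run([sys.executable, "-c", child_code],
                            timeout=timeout)
    return result.returncode


def crash_confirmed(exit_code: int) -> bool:
    """A non-zero exit code is treated as a confirmed crash."""
    return exit_code != 0
```

For example, crash_confirmed(run_isolated("import sys; sys.exit(1)")) is true while the calling process keeps running, which is the property the scenario validates for the real worker infrastructure.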
What it catches: Verifies that a known use-after-reset bug produces a crash signal without terminating the test session. Used to validate that crash-isolation infrastructure works correctly.
oom_simulation
Allocates 1 MiB chunks via palloc_extended(MCXT_ALLOC_NO_OOM) inside a BGWorker until the allocator returns NULL (or 256 MiB are consumed), then exits with elog(FATAL). The calling backend detects the crash via exit code.
SELECT ext_memcheck.run_scenario('oom_simulation', 1, 'SELECT 1');
SELECT ext_memcheck.flush_violations();
How it works: Mirrors use_after_reset but exercises the OOM allocation path. MCXT_ALLOC_NO_OOM suppresses the normal ERROR and returns NULL instead, so the loop can drain available memory to a controlled limit (capped at 256 MiB for CI environments with Linux memory overcommit).
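The control flow of the drain loop, with its two exit conditions, can be sketched in Python. To keep the example cheap to run, it uses a hypothetical fake allocator that hands out tokens against a byte budget instead of reserving real memory; the names drain and make_fake_allocator are illustrative.

```python
from typing import Callable, List, Optional, Tuple

CHUNK = 1024 * 1024      # 1 MiB per allocation, as in the text
CAP = 256 * CHUNK        # 256 MiB ceiling, as in the text


def make_fake_allocator(budget: int) -> Callable[[int], Optional[object]]:
    """Allocator stand-in that returns None once `budget` bytes are used."""
    remaining = [budget]

    def alloc(size: int) -> Optional[object]:
        if size > remaining[0]:
            return None              # models the NULL return under NO_OOM
        remaining[0] -= size
        return object()              # token standing in for a memory block
    return alloc


def drain(alloc: Callable[[int], Optional[object]],
          chunk: int = CHUNK, cap: int = CAP) -> Tuple[int, str]:
    """Allocate until the allocator fails or the cap is reached."""
    held: List[object] = []          # keep blocks alive, as the worker would
    consumed = 0
    while consumed < cap:
        block = alloc(chunk)
        if block is None:
            return consumed, "allocator_returned_null"
        held.append(block)
        consumed += chunk
    return consumed, "cap_reached"
```

The cap matters on systems with memory overcommit, where an allocator may keep succeeding long after physical memory is exhausted; bounding the loop keeps the scenario's footprint predictable.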
What it catches: Confirms that an extension's OOM behavior (or simulated OOM) is crash-detected in isolation without affecting the test session or other backends.
dsm_lifecycle_check (Phase 2)
Runs the target function inside a DSM segment allocation/deallocation cycle and verifies that the extension correctly attaches and detaches.
SELECT ext_memcheck.run_scenario('dsm_lifecycle_check', target => 'my_ext.fn', iterations => 20);
What it catches: Leaked DSM segment handles; missing dsm_detach() calls.
context_reset_storm (Phase 2)
Rapidly and repeatedly resets the current MemoryContext between invocations of the target function. Reveals extensions that hold raw pointers across context boundaries.
SELECT ext_memcheck.run_scenario('context_reset_storm', target => 'my_ext.fn', iterations => 200);
| Parameter | Default | Description |
|---|---|---|
| iterations | 50 | Number of invocations |
| reset_frequency | 1 | Reset context every N invocations |
What it catches: Use-after-reset dereferences; context-local data that survives a reset by accident.
concurrent_backends (Phase 2)
Spawns multiple background workers that simultaneously invoke the target function. Stresses shared data structures and LWLock usage.
SELECT ext_memcheck.run_scenario('concurrent_backends', target => 'my_ext.fn', iterations => 50);
| Parameter | Default | Description |
|---|---|---|
| workers | 4 | Number of concurrent backend workers |
| iterations | 50 | Invocations per worker |
What it catches: Shmem overruns under concurrent access; DSM lifecycle races; LWLock starvation.