Stress Scenarios

Stress scenarios are parameterized test programs that subject an extension function to adversarial conditions that rarely appear in normal unit tests. Run them with ext_memcheck.run_scenario(). Refer to the API documentation for usage details.

growth_benchmark

Runs the target SQL for a configurable number of iterations and records each MemoryContext's used bytes at log-spaced checkpoints (iterations 1, 10, 100, …, up to a maximum of 8 checkpoints). A context that grows monotonically, increases at two or more checkpoints, and grows by at least pg_ext_memcheck.bloat_min_bytes (default 8 KiB) in total is reported as a ctx_bloat violation.

SELECT ext_memcheck.run_scenario(scenario_name := 'growth_benchmark', iterations := 1000, workload := 'SELECT 1');

How it works:

A private AllocSet context (pg_ext_memcheck bench) is created to own the bookkeeping data — this context is excluded from analysis.
Checkpoint iteration numbers are computed as 1, 10, 100, … (powers of 10, capped at iterations). Up to 8 checkpoints are used.
After each checkpoint iteration, context_walker snapshots the full MemoryContext tree and records used bytes (allocated − freed) for every context, identified by (name, depth, parent hash).
Contexts that appear mid-run are back-filled with 0; contexts that disappear are carried forward at their last observed value.
After all iterations complete, each tracked context is analyzed. It is reported only if it meets all four of the following criteria:
- Must have grown from first to last checkpoint.
- Must show monotonic (non-decreasing) growth.
- Must have increased at 2+ checkpoints (single-shot spikes are ignored).
- Total growth must be ≥ bloat_min_bytes.
Growth shape is classified as superlinear when the per-iteration rate in the last checkpoint interval exceeds 1.5× the rate in the first interval; otherwise linear.
Violations are written to the ring buffer as ctx_bloat. The generic before/after diff is skipped for this scenario to avoid double-reporting the same growth as context_leak.
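
The checkpoint schedule and the reporting criteria above can be sketched in standalone C. This is a minimal illustration, not the extension's source: the names compute_checkpoints, analyze_series, and GrowthShape are hypothetical, the sketch assumes that "capped at iterations" means powers of 10 that do not exceed iterations, and it operates on a plain array of used-bytes values (already back-filled) rather than on real MemoryContext snapshots.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHECKPOINTS 8

/* Hypothetical result type for this sketch. */
typedef enum { SHAPE_NONE, SHAPE_LINEAR, SHAPE_SUPERLINEAR } GrowthShape;

/* Fill out[] with 1, 10, 100, ... not exceeding iterations; return the count. */
int
compute_checkpoints(uint64_t iterations, uint64_t out[MAX_CHECKPOINTS])
{
    int      n = 0;
    uint64_t cp = 1;

    while (n < MAX_CHECKPOINTS && cp <= iterations)
    {
        out[n++] = cp;
        cp *= 10;
    }
    return n;
}

/*
 * Apply the four criteria to one context's series, where used[i] is the
 * used-bytes value recorded at checkpoint iters[i]. Returns SHAPE_NONE when
 * the context should not be reported.
 */
GrowthShape
analyze_series(const uint64_t *iters, const uint64_t *used, int n,
               uint64_t bloat_min_bytes)
{
    int    increases = 0;
    double first_rate;
    double last_rate;

    if (n < 2 || used[n - 1] <= used[0])
        return SHAPE_NONE;              /* did not grow from first to last */

    for (int i = 1; i < n; i++)
    {
        if (used[i] < used[i - 1])
            return SHAPE_NONE;          /* shrank somewhere: not monotonic */
        if (used[i] > used[i - 1])
            increases++;
    }
    if (increases < 2)
        return SHAPE_NONE;              /* single-shot spike */
    if (used[n - 1] - used[0] < bloat_min_bytes)
        return SHAPE_NONE;              /* below the reporting threshold */

    /* Per-iteration growth rate in the first and last checkpoint intervals. */
    first_rate = (double) (used[1] - used[0]) /
                 (double) (iters[1] - iters[0]);
    last_rate = (double) (used[n - 1] - used[n - 2]) /
                (double) (iters[n - 1] - iters[n - 2]);

    return last_rate > 1.5 * first_rate ? SHAPE_SUPERLINEAR : SHAPE_LINEAR;
}
```

Under these assumptions, a context that leaks a fixed amount per call has the same rate in every interval and classifies as linear, while a context whose per-call cost keeps rising classifies as superlinear.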

Severity:

Level	Base condition	Escalation
`ERROR`	Total growth > 1 MiB	or `WARNING` + superlinear growth
`WARNING`	Total growth > 64 KiB	or `INFO` + superlinear growth
`INFO`	Total growth ≥ `bloat_min_bytes`	—
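
One way to read the severity table is as a base level chosen from total growth, raised by one level when growth is superlinear. The sketch below encodes that reading; bloat_severity and BloatSeverity are hypothetical names, and the single-step escalation (INFO never jumps straight to ERROR) is an assumption about how the table is meant to be applied.

```c
#include <stdbool.h>
#include <stdint.h>

#define KIB ((uint64_t) 1024)
#define MIB (1024 * KIB)

/* Hypothetical severity type for this sketch, ordered from low to high. */
typedef enum { SEV_NONE, SEV_INFO, SEV_WARNING, SEV_ERROR } BloatSeverity;

BloatSeverity
bloat_severity(uint64_t growth, uint64_t bloat_min_bytes, bool superlinear)
{
    BloatSeverity sev;

    if (growth < bloat_min_bytes)
        return SEV_NONE;                /* below the reporting threshold */

    /* Base condition from the table. */
    if (growth > 1 * MIB)
        sev = SEV_ERROR;
    else if (growth > 64 * KIB)
        sev = SEV_WARNING;
    else
        sev = SEV_INFO;

    /* Escalation: superlinear growth raises the base level by one step. */
    if (superlinear && sev < SEV_ERROR)
        sev = (BloatSeverity) (sev + 1);

    return sev;
}
```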

What it catches: Slow cumulative leaks that are invisible in single-call tests; monotonic context bloat; accelerating (superlinear) allocation patterns that will exhaust memory under sustained load.

Tuning:

-- Raise sensitivity: report any context that grew by at least 4 KiB
SET pg_ext_memcheck.bloat_min_bytes = 4096;

-- Run 1000 iterations; checkpoints will be at 1, 10, 100, 1000
SELECT ext_memcheck.run_scenario('growth_benchmark', 1000, 'SELECT your_ext.fn()');
SELECT ext_memcheck.flush_violations();
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'ctx_bloat';

tx_abort_loop

Invokes the target function inside a savepoint, rolls back to the savepoint, and then releases it, repeating for the configured number of iterations. Exposes resources that are released only on COMMIT and not on ROLLBACK.

SELECT ext_memcheck.run_scenario(scenario_name := 'tx_abort_loop', iterations := 100, workload := 'SELECT 1');

How it works: Similar to growth_benchmark, but wraps each invocation in a SAVEPOINT / ROLLBACK TO SAVEPOINT block. The context walker compares pre/post snapshots across the entire loop, so any resources that accumulate across iterations without being cleaned up on rollback will show up as a leak.
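
The class of bug this loop targets can be shown with a toy model in standalone C. Nothing here is PostgreSQL API: ToyExt, toy_invoke, toy_xact_end, and toy_abort_loop are hypothetical, and the "extension" is reduced to a counter of open handles whose cleanup runs either on commit only (the bug) or on both commit and abort (the fix).

```c
#include <stdbool.h>

/* Hypothetical transaction-end events for the toy model. */
typedef enum { TOY_COMMIT, TOY_ABORT } ToyXactEvent;

typedef struct ToyExt
{
    int  open_handles;       /* resources acquired and not yet released */
    bool cleanup_on_abort;   /* false models the commit-only cleanup bug */
} ToyExt;

/* Each invocation of the target function acquires one resource. */
void
toy_invoke(ToyExt *ext)
{
    ext->open_handles++;
}

/* End-of-transaction hook: releases resources only for events it handles. */
void
toy_xact_end(ToyExt *ext, ToyXactEvent ev)
{
    if (ev == TOY_COMMIT || ext->cleanup_on_abort)
        ext->open_handles = 0;
}

/*
 * Invoke and abort repeatedly, then compare the pre-loop and post-loop
 * counts. A non-zero return value is what accumulated across the aborts.
 */
int
toy_abort_loop(ToyExt *ext, int iterations)
{
    int before = ext->open_handles;

    for (int i = 0; i < iterations; i++)
    {
        toy_invoke(ext);
        toy_xact_end(ext, TOY_ABORT);
    }
    return ext->open_handles - before;
}
```

With cleanup_on_abort set to false, every iteration leaves one handle behind, so the difference equals the iteration count. With it set to true, the difference is zero.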

What it catches: Context leaks that only manifest on transaction abort; resources tied to transaction callbacks that are never called on abort.

shmem_sentinel_probe

Plants 0xDE sentinel bytes just past the declared boundary of each registered shared memory segment, runs the workload, then verifies all sentinels are intact. A corrupted sentinel means the extension under test wrote past its declared shmem boundary.

SELECT ext_memcheck.run_scenario('shmem_sentinel_probe', 10, 'SELECT 1');

-- Inspect any overrun violations
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'shmem_overrun';

-- Reset registry between test runs
SELECT ext_memcheck.clear_shmem_registry();

How it works: shmem_probe.c allocates a ProbeRegistry in shared memory. The scenario calls probe_register(seg_name, alloc_size, data_end) for the ViolationLog and DsmTrackerState segments — both allocated with an extra byte in _PG_init, so alloc_size = sizeof(struct) + 1 and the sentinel is placed at data_end = sizeof(struct). After the workload loop, probe_check_all() reads each sentinel byte. Any value other than 0xDE logs a shmem_overrun ERROR violation.
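
The sentinel technique itself is small enough to sketch in standalone C. This is an illustration under stated assumptions, not the contents of shmem_probe.c: SketchProbe and the sketch_ functions are hypothetical names, and an ordinary stack buffer stands in for a shared memory segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SENTINEL_BYTE 0xDE

/* Hypothetical registry entry: one probed segment. */
typedef struct SketchProbe
{
    const char    *seg_name;
    unsigned char *base;        /* start of the allocation */
    size_t         alloc_size;  /* declared size plus the spare byte */
    size_t         data_end;    /* offset of the sentinel (declared size) */
} SketchProbe;

/* Plant the sentinel in the spare byte just past the declared data. */
void
sketch_plant_sentinel(SketchProbe *p)
{
    assert(p->data_end < p->alloc_size);
    p->base[p->data_end] = SENTINEL_BYTE;
}

/* Return true when the sentinel byte is still intact. */
bool
sketch_sentinel_intact(const SketchProbe *p)
{
    return p->base[p->data_end] == SENTINEL_BYTE;
}

/* Demonstration: a memset whose length is one byte too long trips the probe. */
bool
sketch_demo_off_by_one(void)
{
    unsigned char seg[16 + 1];          /* 16 declared bytes + 1 spare byte */
    SketchProbe   p = {"demo", seg, sizeof(seg), 16};

    sketch_plant_sentinel(&p);
    memset(seg, 0, 16);                 /* in bounds: sentinel survives */
    if (!sketch_sentinel_intact(&p))
        return false;
    memset(seg, 0, 17);                 /* off by one: sentinel clobbered */
    return !sketch_sentinel_intact(&p);
}
```

A single spare byte only detects writes that land on that byte. An overrun that happens to write the value 0xDE there goes unnoticed.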

To probe your own extension's shared memory segment, call ext_memcheck.register_shmem_probe(seg_name, allocated_size) before running this scenario.

What it catches: Off-by-one writes past a segment's declared boundary; memset or memcpy calls whose length is computed incorrectly.

wrong_context_probe

Runs the target workload SQL a configurable number of times, then diffs the MemoryContext tree before and after to detect allocations that landed in long-lived global contexts (TopMemoryContext, CacheMemoryContext). Unlike the generic executor-hook path, this scenario skips the context_leak diff entirely — it runs only check_wrong_context_alloc, so results are focused exclusively on wrong-context violations without noise from ordinary leak detection.

SELECT ext_memcheck.run_scenario('wrong_context_probe', 50, 'SELECT your_ext.fn()');
SELECT ext_memcheck.flush_violations();
SELECT * FROM ext_memcheck.violation_log WHERE check_type = 'wrong_ctx_alloc';

How it works:

Before the first iteration, context_walker snapshots the full MemoryContext tree.
The workload SQL is executed iterations times via SPI.
After the last iteration, a second snapshot is taken and check_wrong_context_alloc is called with the before/after pair.
check_wrong_context_alloc runs two passes:
- Growth pass — any named global context (TopMemoryContext, CacheMemoryContext) whose totalAllocated increased since the pre-run snapshot emits a wrong_ctx_alloc WARNING with the byte delta.
- New-child pass — any context that appeared in the post-run snapshot but not the pre-run snapshot and whose parent is a known global emits a wrong_ctx_alloc WARNING identifying the newly created child.
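
The two passes can be sketched over a simplified snapshot in standalone C. This is an assumption-laden illustration rather than the real check_wrong_context_alloc: CtxSnap, is_global_ctx, find_ctx, and count_wrong_ctx are hypothetical names, and contexts are matched by name only instead of by (name, depth, parent hash).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot row: one context at one point in time. */
typedef struct CtxSnap
{
    const char *name;
    const char *parent;            /* parent context name, NULL for the root */
    uint64_t    total_allocated;
} CtxSnap;

bool
is_global_ctx(const char *name)
{
    return name != NULL &&
           (strcmp(name, "TopMemoryContext") == 0 ||
            strcmp(name, "CacheMemoryContext") == 0);
}

const CtxSnap *
find_ctx(const CtxSnap *snap, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(snap[i].name, name) == 0)
            return &snap[i];
    return NULL;
}

/*
 * Count wrong-context findings between two snapshots and sum the byte
 * delta observed by the growth pass into *grown_bytes.
 */
size_t
count_wrong_ctx(const CtxSnap *before, size_t nb,
                const CtxSnap *after, size_t na, uint64_t *grown_bytes)
{
    size_t findings = 0;

    *grown_bytes = 0;
    for (size_t i = 0; i < na; i++)
    {
        const CtxSnap *prev = find_ctx(before, nb, after[i].name);

        if (prev != NULL)
        {
            /* Growth pass: a known global context got bigger. */
            if (is_global_ctx(after[i].name) &&
                after[i].total_allocated > prev->total_allocated)
            {
                findings++;
                *grown_bytes += after[i].total_allocated - prev->total_allocated;
            }
        }
        else if (is_global_ctx(after[i].parent))
        {
            /* New-child pass: a new context hangs off a global parent. */
            findings++;
        }
    }
    return findings;
}
```

In this sketch, growth in a non-global context and new contexts under a non-global parent are ignored, which mirrors the scenario's focus on long-lived contexts only.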

What it catches: Extension functions that allocate in long-lived contexts (TopMemoryContext, CacheMemoryContext) instead of query-local contexts — a common source of per-backend memory growth that accumulates silently across sessions.

When to use over the default executor hook: The executor hook also calls check_wrong_context_alloc, but it does so alongside context-leak detection and only for queries passing through the hook. Use wrong_context_probe when you want a clean, isolated signal across N controlled repetitions without unrelated leak noise.

use_after_reset

Runs the use_after_reset crash scenario inside a BGWorker process. The worker calls elog(FATAL) to simulate a use-after-reset crash, then exits with a non-zero exit code. The calling backend detects the crash via WorkerSlot.exit_code and logs the result.

SELECT ext_memcheck.run_scenario('use_after_reset', 1, 'SELECT 1');
SELECT ext_memcheck.flush_violations();

How it works:

launch_crash_isolation_worker("use_after_reset") fills a shared-memory WorkerSlot with the scenario name and current database.
A BackgroundWorker is registered and launched. The caller blocks on WaitForBackgroundWorkerShutdown().
The worker calls run_use_after_reset_in_worker(), which calls elog(FATAL) — a clean proc_exit(1) that does not trigger postmaster crash recovery.
The calling backend reads exit_code != 0 from the slot and reports the confirmed crash.
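
The isolation pattern (run the dangerous code in a separate process, wait for it, inspect its exit code) can be illustrated with plain POSIX fork and waitpid. This is an analogy only: it does not use the BackgroundWorker API or the WorkerSlot structure, and run_isolated, simulated_fatal, and well_behaved are hypothetical names introduced for the sketch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run scenario() in a child process and return its exit code. Returns -1
 * if the child could not be started, could not be waited for, or was
 * killed by a signal instead of exiting.
 */
int
run_isolated(void (*scenario)(void))
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        scenario();
        _exit(0);               /* scenario returned normally */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                  /* terminated by a signal */
}

/* Stands in for the worker's elog(FATAL), which ends in a clean exit(1). */
void
simulated_fatal(void)
{
    _exit(1);
}

/* A scenario that completes without incident. */
void
well_behaved(void)
{
}
```

The parent survives in both cases and sees exit code 1 for the simulated crash and 0 for the well-behaved scenario, which is the same signal the calling backend reads from the slot.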

What it catches: Verifies that a known use-after-reset bug produces a crash signal without terminating the test session. Used to validate that crash-isolation infrastructure works correctly.

oom_simulation

Allocates 1 MiB chunks via palloc_extended(MCXT_ALLOC_NO_OOM) inside a BGWorker until the allocator returns NULL or 256 MiB have been consumed, then exits with elog(FATAL). The calling backend detects the crash via the worker's exit code.

SELECT ext_memcheck.run_scenario('oom_simulation', 1, 'SELECT 1');
SELECT ext_memcheck.flush_violations();

How it works: Mirrors use_after_reset but exercises the OOM allocation path. MCXT_ALLOC_NO_OOM suppresses the normal ERROR and makes the allocator return NULL instead, so the loop can drain available memory up to a controlled limit. The 256 MiB cap bounds the loop in CI environments where Linux memory overcommit may keep allocations succeeding well past the memory that is actually available.
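
The drain loop has two exit conditions, and a standalone C sketch with an injectable allocator makes both easy to exercise. The names drain_until_null, fake_alloc, and fake_chunks_available are hypothetical, and the function pointer stands in for palloc_extended with MCXT_ALLOC_NO_OOM so that the sketch never touches real memory.

```c
#include <stddef.h>

#define CHUNK_BYTES ((size_t) 1 << 20)      /* 1 MiB */
#define CAP_BYTES   ((size_t) 256 << 20)    /* 256 MiB */

typedef void *(*alloc_fn)(size_t size);

/*
 * Request chunks until the allocator returns NULL or the next chunk would
 * exceed the cap. Returns the number of bytes obtained. Chunks are never
 * freed, matching a worker that exits right after the loop.
 */
size_t
drain_until_null(alloc_fn alloc, size_t chunk, size_t cap)
{
    size_t total = 0;

    while (total + chunk <= cap)
    {
        if (alloc(chunk) == NULL)
            break;              /* simulated out-of-memory */
        total += chunk;
    }
    return total;
}

/* Test double: succeeds fake_chunks_available times, then returns NULL. */
int fake_chunks_available = 3;

void *
fake_alloc(size_t size)
{
    static char slot;           /* any non-NULL address will do */

    (void) size;
    if (fake_chunks_available == 0)
        return NULL;
    fake_chunks_available--;
    return &slot;
}
```

When the fake allocator runs dry after three chunks, the loop stops at 3 MiB. When it has plenty left, the loop stops at the cap instead, which is the path that matters on hosts with memory overcommit.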

What it catches: Confirms that an extension's OOM behavior (or simulated OOM) is crash-detected in isolation without affecting the test session or other backends.

dsm_lifecycle_check (Phase 2)

Runs the target function inside a DSM segment allocation/deallocation cycle and verifies that the extension correctly attaches and detaches.

SELECT ext_memcheck.run_scenario('dsm_lifecycle_check', target => 'my_ext.fn', iterations => 20);

What it catches: Leaked DSM segment handles; missing dsm_detach() calls.

context_reset_storm (Phase 2)

Rapidly and repeatedly resets the current MemoryContext between invocations of the target function. Reveals extensions that hold raw pointers across context boundaries.

SELECT ext_memcheck.run_scenario('context_reset_storm', target => 'my_ext.fn', iterations => 200);

Parameter	Default	Description
`iterations`	`50`	Number of invocations
`reset_frequency`	`1`	Reset context every N invocations

What it catches: Use-after-reset dereferences; context-local data that survives a reset by accident.

concurrent_backends (Phase 2)

Spawns multiple background workers that simultaneously invoke the target function. Stresses shared data structures and LWLock usage.

SELECT ext_memcheck.run_scenario('concurrent_backends', target => 'my_ext.fn', iterations => 50);

Parameter	Default	Description
`workers`	`4`	Number of concurrent backend workers
`iterations`	`50`	Invocations per worker

What it catches: Shmem overruns under concurrent access; DSM lifecycle races; LWLock starvation.