On Codex-OAuth sessions, normal turns can use the Codex runtime while compaction fallback can still resolve to the plain openai provider. If there is no direct OpenAI API key for that path, compaction fails at the moment the context is full.
Executive Summary
Recent issues also show stale thread bindings, early preflight compactions, untracked Codex runtime overhead, event-loop stalls, and stuck locks.
origin/main contains several fixes after the stable branch, including direct Codex compaction routing and Codex boundary hardening.
The local source is not on origin/main and has dirty changes in the exact compaction runtime-context files. However, the dirty local patch is in the right place: it maps the context-engine runtime context from openai to openai-codex when the harness runtime is Codex.
Failure Mechanism
The concrete failure chain reported in the newest issues looks like this:
Session grows
Long tool-heavy or chat-heavy session reaches preflight or provider context pressure.
Native compact attempted
Codex app-server compaction needs an existing thread/session binding.
Binding missing
If the binding is missing or stale, older paths fall back to the context engine.
Provider mismatch
The fallback can resolve openai/gpt-5.5 instead of openai-codex/gpt-5.5.
Compaction fails
The direct OpenAI path asks for a normal API key and fails, or the session stalls around locks/timeouts.
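The provider mismatch at the end of this chain can be modeled in a few lines. This is a minimal sketch, not OpenClaw code: `AuthStore`, `AuthResult`, and `resolveAuth` are hypothetical names, and the profile id is a placeholder. It only illustrates why a session with Codex OAuth credentials and no direct API key succeeds on normal turns but fails once the fallback resolves the plain openai provider.

```typescript
// Hypothetical model of the auth split; none of these names come from the codebase.
type Provider = "openai" | "openai-codex";

interface AuthStore {
  openaiApiKey?: string; // direct OpenAI API key, if one is configured
  codexOAuthProfileId?: string; // e.g. an "openai-codex:..." OAuth profile
}

type AuthResult =
  | { ok: true; provider: Provider; via: "api-key" | "oauth" }
  | { ok: false; provider: Provider; reason: string };

function resolveAuth(provider: Provider, store: AuthStore): AuthResult {
  if (provider === "openai") {
    // The plain provider only accepts a direct API key.
    return store.openaiApiKey
      ? { ok: true, provider, via: "api-key" }
      : { ok: false, provider, reason: "missing direct OpenAI API key" };
  }
  // The Codex provider authenticates through the OAuth profile.
  return store.codexOAuthProfileId
    ? { ok: true, provider, via: "oauth" }
    : { ok: false, provider, reason: "missing Codex OAuth profile" };
}

// A Codex-OAuth session: OAuth profile present, no direct API key.
const oauthOnly: AuthStore = { codexOAuthProfileId: "openai-codex:example" };

// Normal turns resolve the Codex provider and succeed.
const normalTurn = resolveAuth("openai-codex", oauthOnly);

// The compaction fallback resolves the plain provider and fails.
const fallbackCompaction = resolveAuth("openai", oauthOnly);
```

In this model the same credentials produce two different outcomes depending only on which provider string the caller resolved, which is the shape of the failure described above.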
Newest Relevant Issues
| Issue | Signal | Why it matters here | Status read |
|---|---|---|---|
| #86820 | primary Codex OAuth fallback tries direct OpenAI | Matches the observed symptom: compaction reaches fallback, then fails because the plain OpenAI provider has no API key. | Open, updated 2026-05-26. |
| #86373 | routing embedded compaction fallback target mismatch | Describes the provider/auth split directly: provider=openai with authProfileId=openai-codex:.... | Recent; connected to the same fix train. |
| #86470 | doctor rewrites Codex profiles | Explains how a valid openai-codex/* route can be normalized into an apparently valid but compaction-breaking openai/* setup. | Still relevant on stable v2026.5.22. |
| #86819 | accounting untracked runtime overhead | /context detail can account for only a small fraction of reported context, leaving about 62k tokens as provider/runtime overhead. | Open; exact live Codex baseline still needs proof. |
| #86358 | runtime event-loop starvation | Compaction can stall the Node event loop long enough that unrelated fetches time out, making a recovering session look broken. | Open, P1-class behavior. |
| #85712 | preflight tiny context after compact-only route | Preflight can decide to compact only, then continue with a tiny assembled context and no user-facing warning. | Open; likely adjacent to repeated compaction reports. |
| #81178 | stale state repeated early preflight compactions | After a successful compact, stale pre-compaction usage can trigger another premature compact. | Recent comments, regression-shaped. |
| #70334 | stuck session processing lock remains | Older but still explains the user-visible “it went quiet” pattern after context overflow handling. | Historical but aligned. |
Last Four Days of Relevant Code Changes
c08400ea7d Fix context pressure preflight for tool-heavy sessions
Introduces stronger preflight pressure estimation and routes such as compact_only, truncate_tool_results_only, and compact_then_truncate. This is useful, but it means compaction can now be triggered before the model call by OpenClaw's own estimator.
46de078b2a Bound embedded compaction write locks
Targets a known failure class where compaction/session locks can remain held too long and block later session progress.
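One common way to bound a lock is to attach a lease to it, so a holder that never releases cannot block later work forever. The sketch below illustrates that general technique only; `BoundedSessionLock`, its method names, and the lease semantics are assumptions, and the actual change in 46de078b2a may bound the lock differently. Time is passed in explicitly to keep the example deterministic.

```typescript
// Hypothetical lease-based lock; not the implementation from the commit.
interface LockState {
  holder: string;
  expiresAtMs: number;
}

class BoundedSessionLock {
  private state: LockState | undefined = undefined;
  private readonly maxHoldMs: number;

  constructor(maxHoldMs: number) {
    this.maxHoldMs = maxHoldMs;
  }

  // Succeeds when the lock is free or the previous lease has expired.
  tryAcquire(holder: string, nowMs: number): boolean {
    if (this.state !== undefined && nowMs < this.state.expiresAtMs) {
      return false; // still held under a valid lease
    }
    this.state = { holder, expiresAtMs: nowMs + this.maxHoldMs };
    return true;
  }

  // Only the current holder may release; a holder whose lease was
  // taken over gets `false` and learns it lost the lock.
  release(holder: string): boolean {
    if (this.state === undefined || this.state.holder !== holder) {
      return false;
    }
    this.state = undefined;
    return true;
  }
}

const lock = new BoundedSessionLock(1000);
const compactionGotLock = lock.tryAcquire("compaction", 0);
const turnBlockedEarly = lock.tryAcquire("turn", 500);
const turnAfterExpiry = lock.tryAcquire("turn", 1000);
```

The trade-off is that a slow but healthy compaction can lose its lease, so the holder has to check the result of `release` (or re-validate the lease) before committing its write.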
dd47e479ae Fail Codex compaction at the Codex boundary
Important hardening: when the harness runtime is Codex, missing or stale native compaction binding should not silently fall through into the wrong context-engine path.
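A boundary check of this kind might look like the following. It is a sketch under stated assumptions: `ThreadBinding`, `CodexCompactionBindingError`, and `requireCodexBinding` are hypothetical names invented for illustration, not identifiers from dd47e479ae. The point is that a missing or stale binding raises a specific, named error instead of letting the caller drift into another compaction path.

```typescript
// Hypothetical types and names; the real binding shape is not shown in this report.
interface ThreadBinding {
  threadId: string;
  stale: boolean;
}

class CodexCompactionBindingError extends Error {
  readonly condition: "missing" | "stale";

  constructor(condition: "missing" | "stale", sessionId: string) {
    super(
      `Codex native compaction unavailable: thread binding is ${condition} for session ${sessionId}`,
    );
    this.name = "CodexCompactionBindingError";
    this.condition = condition;
  }
}

// Returns a usable binding or throws; it never falls through silently.
function requireCodexBinding(
  sessionId: string,
  binding: ThreadBinding | undefined,
): ThreadBinding {
  if (binding === undefined) {
    throw new CodexCompactionBindingError("missing", sessionId);
  }
  if (binding.stale) {
    throw new CodexCompactionBindingError("stale", sessionId);
  }
  return binding;
}
```

A caller can then surface `condition` to the user directly, rather than the later and more confusing missing-API-key message.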
f0061ddc54 Preserve partial summary on mid-chain chunk failure
Improves recovery when chunked compaction fails partway through, reducing all-or-nothing loss.
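The idea of keeping what was already summarized can be sketched as a chained fold that stops at the first failing chunk. Everything here is illustrative: `ChainResult`, `summarizeChain`, and the synchronous `summarizeChunk` callback are assumptions, and the real code in f0061ddc54 is presumably asynchronous and more involved.

```typescript
// Hypothetical result shape for a chained (rolling) summary.
interface ChainResult {
  summary: string;
  completedChunks: number;
  totalChunks: number;
  partial: boolean;
  error?: string;
}

function summarizeChain(
  chunks: readonly string[],
  summarizeChunk: (previousSummary: string, chunk: string) => string,
): ChainResult {
  let summary = "";
  for (let i = 0; i < chunks.length; i += 1) {
    try {
      // Each step folds the next chunk into the running summary.
      summary = summarizeChunk(summary, chunks[i]);
    } catch (err) {
      // Keep the summary built so far instead of discarding the whole chain.
      return {
        summary,
        completedChunks: i,
        totalChunks: chunks.length,
        partial: true,
        error: err instanceof Error ? err.message : String(err),
      };
    }
  }
  return {
    summary,
    completedChunks: chunks.length,
    totalChunks: chunks.length,
    partial: false,
  };
}

// Toy summarizer: concatenates chunks and fails on the third one.
const chainResult = summarizeChain(["a", "b", "c", "d"], (prev, chunk) => {
  if (chunk === "c") throw new Error("boom");
  return prev + chunk;
});
```

Here the first two chunks survive in `summary`, and `partial` plus `completedChunks` tell the caller how much of the transcript still needs handling.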
f4cfa012e1 Route compaction through Codex auth provider
The direct embedded compaction path now maps OpenAI + Codex runtime/auth to openai-codex before resolving model auth. This directly addresses the missing direct OpenAI API key failure for that path.
bcde7b138a Handle preflight compaction no-op budgets
Targets repeated/no-op compaction behavior after the preflight estimator believes compaction is needed but the effective budget situation has not improved.
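A no-op guard can be expressed as a comparison of token counts before and after a compaction pass. The sketch below is an assumption-laden illustration: the type names, the three actions, and the 10% default threshold are invented here and are not taken from bcde7b138a. It also assumes `tokensAfter` is measured on the post-compaction context, not read from stale pre-compaction usage.

```typescript
// Hypothetical post-compaction decision; names and threshold are illustrative.
type PostCompactionAction = "continue" | "retry_compaction" | "escalate";

interface CompactionOutcome {
  tokensBefore: number;
  tokensAfter: number; // must be measured after compaction, not reused from before
  budgetTokens: number;
}

function decideAfterCompaction(
  outcome: CompactionOutcome,
  minReductionRatio: number = 0.1,
): PostCompactionAction {
  // Context now fits: proceed with the model call.
  if (outcome.tokensAfter <= outcome.budgetTokens) {
    return "continue";
  }
  const reduction =
    outcome.tokensBefore > 0
      ? (outcome.tokensBefore - outcome.tokensAfter) / outcome.tokensBefore
      : 0;
  // Barely any reduction: another identical pass would be a no-op loop.
  if (reduction < minReductionRatio) {
    return "escalate";
  }
  // Meaningful progress, but still over budget: one more pass may help.
  return "retry_compaction";
}

const noOp = decideAfterCompaction({ tokensBefore: 100000, tokensAfter: 98000, budgetTokens: 80000 });
const fits = decideAfterCompaction({ tokensBefore: 100000, tokensAfter: 40000, budgetTokens: 80000 });
const progress = decideAfterCompaction({ tokensBefore: 100000, tokensAfter: 85000, budgetTokens: 80000 });
```

What "escalate" means in practice (truncating tool results, warning the user, or failing the turn) is left open; the sketch only shows where the loop would be broken.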
Local Branch Read
Branch state
Local source is on stable-v2026.5.22-guest, not a clean fast-forward ancestor of origin/main. A blind merge would mix guest branch changes with the current upstream fix train.
Dirty files are relevant
Two dirty files are exactly in the compaction runtime-context area: compaction-runtime-context.ts and its test.
What the dirty patch appears to fix
It resolves the harness policy, maps openai + Codex runtime to openai-codex for context-engine runtime context, and preserves openai-codex:... auth profiles only when the provider is deliberately changed to the Codex provider. That covers a gap still visible in origin/main where buildEmbeddedCompactionRuntimeContext returns provider: resolved.provider.
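The mapping described above could be sketched as follows. This is not the dirty patch itself: `buildRuntimeContextSketch`, `ResolvedModel`, and `RuntimeContextSketch` are hypothetical stand-ins, and the rule for non-Codex auth profiles (pass them through unchanged) is an assumption. The sketch keeps an openai-codex:... profile only when the final provider is openai-codex, so it never emits the provider=openai plus Codex-profile combination reported in #86373.

```typescript
// Hypothetical stand-ins for the real runtime-context types.
interface ResolvedModel {
  provider: string;
  model: string;
}

interface RuntimeContextSketch {
  provider: string;
  model: string;
  authProfileId?: string;
}

const CODEX_PROVIDER = "openai-codex";

function buildRuntimeContextSketch(
  resolved: ResolvedModel,
  harnessRuntime: "codex" | "other",
  authProfileId?: string,
): RuntimeContextSketch {
  // Map openai + Codex runtime to the Codex provider; leave everything else alone.
  const provider =
    resolved.provider === "openai" && harnessRuntime === "codex"
      ? CODEX_PROVIDER
      : resolved.provider;

  const context: RuntimeContextSketch = { provider, model: resolved.model };

  const isCodexProfile =
    authProfileId !== undefined && authProfileId.startsWith(`${CODEX_PROVIDER}:`);

  // Assumption: non-Codex profiles pass through; Codex profiles are kept
  // only when the provider is the Codex provider.
  if (authProfileId !== undefined && (!isCodexProfile || provider === CODEX_PROVIDER)) {
    context.authProfileId = authProfileId;
  }
  return context;
}

const codexContext = buildRuntimeContextSketch(
  { provider: "openai", model: "gpt-5.5" },
  "codex",
  "openai-codex:example",
);
const plainContext = buildRuntimeContextSketch(
  { provider: "openai", model: "gpt-5.5" },
  "other",
  "openai-codex:example",
);
```

With the Codex runtime the result carries provider openai-codex and the profile; without it the provider stays openai and the mismatched profile is dropped.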
Recommended Next Actions
1. Test the dirty runtime-context patch
Add a minimal Codex-OAuth compaction fallback repro and run the targeted test file. This patch is likely not cosmetic; it covers the remaining context-engine runtime-context mismatch.
2. Do not run doctor fixes blindly
Until #86470 is resolved, avoid automatic rewrites that turn openai-codex/* into openai/* without proving compaction still routes through Codex auth.
3. Fail visibly at the right boundary
If Codex native compaction has no valid thread binding, report that specific condition instead of falling into a misleading direct OpenAI API-key failure.
4. Separate runtime overhead in reports
/context detail should label Codex native/cache/runtime overhead separately so users do not chase AGENTS.md, tools, or memory ghosts for a 62k-token residual.
5. Guard preflight no-op loops
After compaction, next preflight should use post-compaction active replay, not stale transcript records. No-op compactions need an escalation path.
6. Watch event-loop stalls
Large compaction should not starve Telegram/API fetches. Keep timer-delay diagnostics and consider offloading CPU-heavy summary assembly or token accounting.
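For the event-loop recommendation, one option short of moving work off the main thread is to process CPU-heavy items in time-bounded slices and hand control back to the event loop between slices. The helper below is a hypothetical sketch of that technique, not existing OpenClaw code; `processSlice` and its injected `now` clock are assumptions made so the behavior is deterministic.

```typescript
// Processes items starting at `start` until the time budget is spent.
// Always handles at least one item so progress is guaranteed.
// Returns the index to resume from on the next slice.
function processSlice<T>(
  items: readonly T[],
  start: number,
  work: (item: T) => void,
  budgetMs: number,
  now: () => number,
): number {
  const deadline = now() + budgetMs;
  let i = start;
  while (i < items.length) {
    work(items[i]);
    i += 1;
    if (now() >= deadline) {
      break; // budget spent: stop so pending I/O callbacks can run
    }
  }
  return i;
}

// Deterministic demo with a fake clock: each item "costs" 10 ms.
let fakeClockMs = 0;
const sliceItems = [1, 2, 3, 4, 5, 6, 7];
const costlyWork = (_item: number): void => {
  fakeClockMs += 10;
};
const clock = (): number => fakeClockMs;

const afterFirstSlice = processSlice(sliceItems, 0, costlyWork, 25, clock);
const afterSecondSlice = processSlice(sliceItems, afterFirstSlice, costlyWork, 25, clock);
const afterThirdSlice = processSlice(sliceItems, afterSecondSlice, costlyWork, 25, clock);
```

In a real Node process the caller would pass `Date.now` or `performance.now` as the clock and schedule each subsequent slice with `setImmediate` or `setTimeout`, so network fetches get a turn between slices.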
Evidence Used
- GitHub issues queried with gh issue list and gh issue view: #86820, #86819, #86373, #86470, #86358, #85712, #81178, #70334.
- Source read from origin/main: src/agents/pi-embedded-runner/compact.queued.ts:52-65,140-167,388-435.
- Source read from origin/main: src/agents/pi-embedded-runner/compaction-runtime-context.ts:105-143.
- Source read from origin/main: src/agents/pi-embedded-runner/compact.ts:494-524,558-615.
- Source read from origin/main: src/agents/openai-codex-routing.ts:217-258.
- Local diff read from /home/hakalya/openclaw: src/agents/pi-embedded-runner/compaction-runtime-context.ts and compaction-runtime-context.test.ts.