Describe the bug
When EventsCompactionConfig is configured on an App, the compaction mechanism correctly summarizes old events and appends compaction events to the session. However, Runner._get_or_create_session still calls get_session without any GetSessionConfig, loading the full event history on every invocation.
This means compaction only reduces the LLM context window — it does nothing to reduce the session loading overhead. As conversations grow longer, get_session latency degrades linearly, even though compaction has already produced summaries that could make the old raw events unnecessary for the Runner.
The architectural disconnect
```
EventsCompactionConfig (configured on App)
    → Compacts events for LLM context ✅
    → Does NOT reduce get_session load ❌

Runner._get_or_create_session (runners.py L375)
    → Calls get_session() WITHOUT GetSessionConfig
    → Loads ALL events every time
    → Latency grows linearly with conversation length
```
The GetSessionConfig class already supports num_recent_events and after_timestamp filtering, and all session service implementations (InMemorySessionService, DatabaseSessionService, VertexAiSessionService) honor these parameters. But the Runner never uses them.
To Reproduce
- Create an `App` with `EventsCompactionConfig`:

```python
app = App(
    name="my_app",
    root_agent=agent,
    events_compaction_config=EventsCompactionConfig(
        compaction_interval=5,
        overlap_size=1,
        summarizer=LlmEventSummarizer(llm=my_llm),
    ),
)
```

- Use `DatabaseSessionService` (PostgreSQL) or any persistent session service
- Have a multi-turn conversation (10+ rounds)
- Observe that `get_session` becomes progressively slower on each new invocation
Real-world numbers
In our production setup with a 6-stage sequential pipeline agent (Science Navigator), each conversation turn generates ~480 events (streaming chunks, tool calls, tool responses, text finals). After 10 turns:
- ~4,800 events accumulated per session
- `get_session` takes 70+ seconds (via `HttpSessionService` → PostgreSQL `DatabaseSessionService`)
- Compaction events exist in the session, but the Runner still loads all 4,800 raw events
Expected behavior
When `EventsCompactionConfig` is configured, the Runner should be aware that compacted summaries exist and avoid loading the full event history. Possible approaches:

- **Runner should pass `GetSessionConfig` to `get_session`**: after compaction has run, the Runner knows which events are summarized, so it should load only the compaction event(s) plus the recent un-compacted events.
- **`EventsCompactionConfig` should inform session loading**: the compaction config could derive appropriate `GetSessionConfig` parameters (e.g., only load events after the last compaction timestamp).
- **At minimum, expose `GetSessionConfig` via `RunConfig`**: as proposed in #3562 (Allow limiting num. of Session events fetched when calling `Runner.run_async`) and PR #3662 (feat(runners): Add `get_session_config` property to `RunConfig`), let users control how many events are loaded. Ideally, though, the framework should do this automatically when compaction is configured.
Environment
- ADK version: 1.26.0
- Python: 3.12
- Session service: `DatabaseSessionService` (PostgreSQL, accessed via HTTP microservice)
- Agent type: multi-stage `SequentialAgent` with `ParallelAgent` sub-stages
Related issues
- #3039: `VertexAiSessionService.get_session` becomes super slow when there are higher numbers of session events
- #3562: Allow limiting num. of Session events fetched when calling `Runner.run_async`
- PR #3662: feat(runners): Add `get_session_config` property to `RunConfig` (still open)
Analysis
The root cause is that compaction and session loading are designed as two independent systems:
- Compaction (in `apps/compaction.py`) runs post-invocation and appends summary events
- Session loading (in `runners.py`) runs pre-invocation and loads everything
There is no feedback loop between them. Even after compaction has run successfully, the next invocation still loads all original events + the compaction event.
PR #3662 would help as a workaround (letting users manually set num_recent_events), but the ideal fix is for the Runner to automatically leverage compaction metadata to optimize session loading — making EventsCompactionConfig a true end-to-end optimization rather than just an LLM context optimization.