Add replay-contract test for compacted vs raw event prompt equivalence#4668
davidahmann wants to merge 1 commit into google:main
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical need for regression testing around event compaction behavior. It introduces a new contract test to rigorously verify that the process of compacting event streams does not alter the effective prompt semantics when reconstructing prompts. This ensures replay determinism and prevents silent changes to prompt reconstruction logic due to compaction.
Code Review
This pull request adds a valuable contract test to ensure that the effective prompt remains equivalent for both raw and compacted event streams. The implementation is correct and achieves its goal. I have one suggestion to improve the maintainability of the new test by reducing some code duplication.
  def test_replay_contract_compacted_and_raw_events_match_effective_prompt(
      self,
  ):
    raw_events = [
        self._create_event(1.0, 'inv1', 'User asks about weather'),
        self._create_event(2.0, 'inv1', 'Agent asks clarifying question'),
        self._create_event(3.0, 'inv2', 'User clarifies location'),
        self._create_event(4.0, 'inv2', 'Agent proposes plan'),
        self._create_event(5.0, 'inv3', 'User asks for final answer'),
    ]

    compacted_events = [
        self._create_compacted_event(
            1.0,
            4.0,
            (
                'User asks about weather\n'
                'Agent asks clarifying question\n'
                'User clarifies location\n'
                'Agent proposes plan'
            ),
            appended_ts=4.5,
        ),
        self._create_event(5.0, 'inv3', 'User asks for final answer'),
    ]

    raw_prompt = '\n'.join(
        part.text
        for content in contents._get_contents(None, raw_events)
        for part in content.parts
        if part.text
    )
    compacted_prompt = '\n'.join(
        part.text
        for content in contents._get_contents(None, compacted_events)
        for part in content.parts
        if part.text
    )

    self.assertEqual(
        compacted_prompt,
        raw_prompt,
        'Compaction should preserve deterministic replay prompt semantics.',
    )
To improve maintainability and reduce code duplication, you can extract the logic for reconstructing the prompt from a list of events into a local helper function. This makes the test cleaner and avoids repeating the same generator expression.
  def test_replay_contract_compacted_and_raw_events_match_effective_prompt(
      self,
  ):
    raw_events = [
        self._create_event(1.0, 'inv1', 'User asks about weather'),
        self._create_event(2.0, 'inv1', 'Agent asks clarifying question'),
        self._create_event(3.0, 'inv2', 'User clarifies location'),
        self._create_event(4.0, 'inv2', 'Agent proposes plan'),
        self._create_event(5.0, 'inv3', 'User asks for final answer'),
    ]
    compacted_events = [
        self._create_compacted_event(
            1.0,
            4.0,
            (
                'User asks about weather\n'
                'Agent asks clarifying question\n'
                'User clarifies location\n'
                'Agent proposes plan'
            ),
            appended_ts=4.5,
        ),
        self._create_event(5.0, 'inv3', 'User asks for final answer'),
    ]

    def _reconstruct_prompt(events_list):
      return '\n'.join(
          part.text
          for content in contents._get_contents(None, events_list)
          for part in content.parts
          if part.text
      )

    raw_prompt = _reconstruct_prompt(raw_events)
    compacted_prompt = _reconstruct_prompt(compacted_events)
    self.assertEqual(
        compacted_prompt,
        raw_prompt,
        'Compaction should preserve deterministic replay prompt semantics.',
    )
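One property of the flattening helper worth keeping in mind: it joins part texts with newlines and skips parts without text, so the comparison is text-only and does not see content boundaries. The sketch below illustrates this with hypothetical stand-in objects (`make_content` and the `SimpleNamespace` shapes are assumptions for illustration, not the library's types), and it assumes a raw stream yields one content per event while a compacted stream yields a single content carrying the summary.

```python
from types import SimpleNamespace


def make_content(*texts):
  """Hypothetical stand-in for a content object holding a list of parts."""
  return SimpleNamespace(parts=[SimpleNamespace(text=t) for t in texts])


def reconstruct_prompt(contents_list):
  """Flattens contents into one string, skipping parts without text."""
  return '\n'.join(
      part.text
      for content in contents_list
      for part in content.parts
      if part.text
  )


# Four single-line contents, as a raw stream might produce (assumption).
raw_contents = [make_content(line) for line in ('a', 'b', 'c', 'd')]
# One content whose single part carries the same lines joined by newlines.
compacted_contents = [make_content('a\nb\nc\nd')]
# A part with no text is ignored by the flattening.
with_empty_part = [make_content('a', None, 'b'), make_content('c\nd')]

assert reconstruct_prompt(raw_contents) == reconstruct_prompt(compacted_contents)
assert reconstruct_prompt(with_empty_part) == 'a\nb\nc\nd'
```

If a stricter contract were wanted, one option might be to compare per-content tuples rather than a single joined string, at the cost of a test that is more sensitive to how compaction groups its output.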
Implemented the scoped fix for #4667 with a replay-contract test that validates effective prompt equivalence between raw and compacted events. Validation run:
Current CI snapshot: triage bot, CLA, and header checks are green. This contribution was informed by patterns from Wrkr. Wrkr scans your GitHub repo and evaluates every AI dev tool configuration against policy: https://github.com/Clyra-AI/wrkr
Problem
Compaction behavior lacked a direct regression assertion that replay prompt semantics remain equivalent between raw and compacted event streams.
Why now
Replay determinism needs a clear contract test so compaction changes cannot silently alter effective prompt reconstruction.
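As a rough illustration of the contract, here is a self-contained toy model. `ToyEvent` and `effective_prompt` are hypothetical names invented for this sketch and do not reflect the library's actual event type or reconstruction logic; the only assumption carried over from the PR is that a compaction event summarizes a timestamp range and replaces the events it covers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ToyEvent:
  """Hypothetical stand-in for an event; not the library's event type."""
  timestamp: float
  text: str
  # Set only on compaction events: inclusive (start, end) range summarized.
  compacted_range: Optional[Tuple[float, float]] = None


def effective_prompt(events: List[ToyEvent]) -> str:
  """Rebuilds prompt text, letting summaries replace the events they cover."""
  ranges = [e.compacted_range for e in events if e.compacted_range is not None]
  lines = []
  for event in sorted(events, key=lambda e: e.timestamp):
    covered = event.compacted_range is None and any(
        start <= event.timestamp <= end for start, end in ranges
    )
    if covered:
      continue  # A summary already stands in for this raw event.
    lines.append(event.text)
  return '\n'.join(lines)


texts = [
    'User asks about weather',
    'Agent asks clarifying question',
    'User clarifies location',
    'Agent proposes plan',
    'User asks for final answer',
]
raw = [ToyEvent(float(i + 1), text) for i, text in enumerate(texts)]
summary = ToyEvent(4.5, '\n'.join(texts[:4]), compacted_range=(1.0, 4.0))
compacted = [summary, raw[-1]]

# The contract: both streams reconstruct to the same prompt text.
assert effective_prompt(compacted) == effective_prompt(raw)
```

In this toy model the equality holds only because the summary text is exactly the newline-joined text of the events it covers; a lossy summary would break it, which is the kind of drift a contract test is meant to surface.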
What changed
Added test_replay_contract_compacted_and_raw_events_match_effective_prompt in tests/unittests/apps/test_compaction.py. The test compares the effective prompt text reconstructed via contents._get_contents for the raw event stream and for the compacted event stream.
Validation
uv run pyink --check --diff tests/unittests/apps/test_compaction.py
uv run pytest -q tests/unittests/apps/test_compaction.py -k replay_contract_compacted_and_raw_events_match_effective_prompt
Refs #4667