feat: auto-detect Engine API GetPayload version for Osaka fork#3113

Open
tac0turtle wants to merge 1 commit into main from feat/auto-detect-osaka-getpayload-version

Conversation

@tac0turtle

@tac0turtle tac0turtle commented Feb 27, 2026

GetPayload now automatically selects between engine_getPayloadV4 (Prague) and engine_getPayloadV5 (Osaka) by caching the last successful version and retrying with the alternative on "Unsupported fork" errors (code -38005).

This handles Prague chains, Osaka-at-genesis chains, and time-based Prague-to-Osaka upgrades with zero configuration. At most one extra RPC call occurs at the fork transition point.
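
The cache-and-retry behavior described above can be sketched roughly as follows. This is a minimal model, not the PR's code: `payloadClient`, `callFunc`, `rpcError`, and `newOsakaNode` are hypothetical stand-ins for the real RPC client and node, and the payload is reduced to a string.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

const (
	methodV4            = "engine_getPayloadV4" // Prague
	methodV5            = "engine_getPayloadV5" // Osaka
	codeUnsupportedFork = -38005
)

// rpcError is a hypothetical stand-in for a JSON-RPC error carrying a code.
type rpcError struct {
	code int
	msg  string
}

func (e *rpcError) Error() string  { return e.msg }
func (e *rpcError) ErrorCode() int { return e.code }

// callFunc is a hypothetical transport that invokes one RPC method.
type callFunc func(method string) (string, error)

type payloadClient struct {
	call  callFunc
	useV5 atomic.Bool // caches the last version that succeeded
}

func isUnsupportedFork(err error) bool {
	var coded interface{ ErrorCode() int }
	return errors.As(err, &coded) && coded.ErrorCode() == codeUnsupportedFork
}

// GetPayload tries the cached version first and retries once with the
// alternative only when the node answers "Unsupported fork".
func (c *payloadClient) GetPayload() (string, error) {
	method, alt := methodV4, methodV5
	if c.useV5.Load() {
		method, alt = methodV5, methodV4
	}
	res, err := c.call(method)
	if err == nil {
		return res, nil
	}
	if !isUnsupportedFork(err) {
		return "", fmt.Errorf("%s: %w", method, err)
	}
	res, err = c.call(alt)
	if err != nil {
		return "", fmt.Errorf("%s after %s unsupported fork: %w", alt, method, err)
	}
	c.useV5.Store(alt == methodV5) // remember the version that worked
	return res, nil
}

// newOsakaNode returns a fake node that only accepts V5, plus its call log.
func newOsakaNode() (callFunc, *[]string) {
	var calls []string
	return func(method string) (string, error) {
		calls = append(calls, method)
		if method == methodV5 {
			return "payload", nil
		}
		return "", &rpcError{code: codeUnsupportedFork, msg: "Unsupported fork"}
	}, &calls
}

func main() {
	call, calls := newOsakaNode()
	c := &payloadClient{call: call}
	for i := 0; i < 2; i++ {
		if _, err := c.GetPayload(); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println(*calls) // [engine_getPayloadV4 engine_getPayloadV5 engine_getPayloadV5]
}
```

In this model the first call pays for one rejected V4 request, and every later call goes straight to V5.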

Overview

Summary by CodeRabbit

  • New Features

    • Implemented automatic fallback mechanism for payload fetching across different protocol versions
    • Added intelligent caching to optimize payload retrieval performance
  • Tests

    • Added comprehensive test suite for payload fetching behavior across various fork scenarios

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude

claude bot commented Feb 27, 2026

Claude finished @tac0turtle's task in 1m 46s.


Code Review

The implementation is clean, well-motivated, and uses the right concurrency primitive (atomic.Bool). Test coverage is solid. A few issues worth addressing:


Bug: Stale trace attribute in engine_rpc_tracing.go

execution/evm/engine_rpc_tracing.go:69

ctx, span := t.tracer.Start(ctx, "Engine.GetPayload",
    trace.WithAttributes(
        attribute.String("method", "engine_getPayloadV4"),  // ← always hardcoded to V4

The tracing wrapper hardcodes "engine_getPayloadV4" in the method span attribute. After an Osaka upgrade, all GetPayload traces will misreport the actual method used. Since tracedEngineRPCClient delegates to inner.GetPayload, it has no visibility into which version was ultimately called. Consider either removing the method attribute from this span, recording it as "engine_getPayloadV4_or_V5", or exposing the resolved method from the underlying client.


Concern: "At most one extra RPC call" is only true for sequential callers

execution/evm/engine_rpc_client.go:43-70

The PR description states: "At most one extra RPC call occurs at the fork transition point." This holds for sequential calls but not for concurrent ones. If multiple goroutines call GetPayload simultaneously at the fork boundary, each reads useV5=false, each tries V4, each gets -38005, and each then calls V5 — all before any of them writes the cached value. This is benign (just extra work, no corruption), but the documentation claim is slightly inaccurate for concurrent usage. If the call site is always sequential this is fine; otherwise consider clarifying in the comment.
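
The worst case for concurrent callers can be reproduced deterministically with a small model. In this sketch (all names are hypothetical, and the RPC is reduced to a function returning whether the method is supported), the fake node holds every V4 response until all callers have arrived, so every caller reads the flag before anyone can update it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// boundaryCalls simulates n goroutines running a GetPayload-style fallback at
// the fork boundary and returns how many V4 and V5 calls the fake node saw.
func boundaryCalls(n int) (v4, v5 int64) {
	var useV5 atomic.Bool
	var v4Calls, v5Calls atomic.Int64

	// Every V4 response is held back until all n callers have issued V4,
	// which forces the worst-case interleaving.
	var arrived sync.WaitGroup
	arrived.Add(n)

	call := func(method string) bool { // reports whether the method is supported
		if method == "engine_getPayloadV4" {
			v4Calls.Add(1)
			arrived.Done()
			arrived.Wait()
			return false // Osaka is active, so V4 is rejected
		}
		v5Calls.Add(1)
		return true
	}

	getPayload := func() {
		if useV5.Load() {
			call("engine_getPayloadV5")
			return
		}
		if !call("engine_getPayloadV4") && call("engine_getPayloadV5") {
			useV5.Store(true)
		}
	}

	var done sync.WaitGroup
	for i := 0; i < n; i++ {
		done.Add(1)
		go func() {
			defer done.Done()
			getPayload()
		}()
	}
	done.Wait()
	return v4Calls.Load(), v5Calls.Load()
}

func main() {
	v4, v5 := boundaryCalls(4)
	fmt.Printf("V4 calls: %d, V5 calls: %d\n", v4, v5) // V4 calls: 4, V5 calls: 4
}
```

With four callers the node sees four rejected V4 requests instead of one. Real interleavings will usually fall somewhere between one and n extra calls, and the cached flag still ends up correct.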


Minor: Method names should be constants

execution/evm/engine_rpc_client.go:44-48

The string literals "engine_getPayloadV4" and "engine_getPayloadV5" appear twice each (once in GetPayload, once in the isUnsupportedForkErr fallback logic) and are also present in engine_rpc_tracing.go. Extracting them to package-level constants would prevent typos and make future additions (e.g., V6) easier.


Testing: TestGetPayload_ForkUpgrade_SwitchesV4ToV5 doesn't verify method calls

execution/evm/engine_rpc_client_test.go:172-215

This test only asserts that no error occurred. Unlike the other tests, it doesn't track and verify which methods were actually called (V4 then V5 on first post-upgrade call, then V5-only on subsequent calls). Adding method-call tracking consistent with the other tests would make it more rigorous.


Nit: The altMethod == "engine_getPayloadV5" comparison is implicit

execution/evm/engine_rpc_client.go:68

e.useV5.Store(altMethod == "engine_getPayloadV5")

This works correctly today, but if the string assignment logic at the top of GetPayload ever changes, this will silently compute the wrong value. It would be clearer and more robust to flip the flag based on the current useV5 state:

e.useV5.Store(!e.useV5.Load())

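For sequential callers this is equivalent and directly expresses "flip to the other version." It is not equivalent under the concurrent scenario described above: two goroutines that both fall back would flip the flag twice and restore the stale value, so with concurrent callers it is safer to store a value derived from the method that actually succeeded.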
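
One caveat is worth checking before adopting the flip: it toggles whatever is cached at store time rather than recording which version succeeded. The sketch below replays two callers that both observed useV5=false and both fell back to V5. The function name and the sequential replay are illustrative assumptions, not code from the PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// afterTwoFallbacks replays two callers that both fell back from V4 to V5 and
// then update the cached flag. flip selects the update style; the function
// returns the final flag value.
func afterTwoFallbacks(flip bool) bool {
	var useV5 atomic.Bool
	for caller := 0; caller < 2; caller++ {
		if flip {
			useV5.Store(!useV5.Load()) // toggles whatever is cached right now
		} else {
			useV5.Store(true) // records the version that actually succeeded
		}
	}
	return useV5.Load()
}

func main() {
	fmt.Println(afterTwoFallbacks(true))  // false: the second flip undoes the first
	fmt.Println(afterTwoFallbacks(false)) // true
}
```

With a single caller both styles give the same result, so the difference only matters if GetPayload can be called concurrently.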


Summary

| Category | Status |
|---|---|
| Code quality | ✅ Clean, idiomatic Go |
| Concurrency safety | atomic.Bool correct (minor doc caveat) |
| Error handling | ✅ Proper errors.As usage |
| Testing | ⚠️ One test missing method-call assertions |
| Tracing | ❌ Stale method attribute post-upgrade |
| Performance | ✅ At most one extra RPC call per transition (sequential) |

The tracing bug is the most impactful issue. The rest are improvements to consider before merging.

@github-actions

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|---|---|---|---|---|
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Feb 27, 2026, 1:18 PM |

@coderabbitai

coderabbitai bot commented Feb 27, 2026

Walkthrough

The changes introduce dynamic version negotiation to the Engine RPC client. When the getPayload method receives an "Unsupported fork" error from one API version, it automatically switches to the alternate version and caches the successful choice for future calls, reducing redundant fallback attempts during network operations.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Engine RPC Client Implementation: execution/evm/engine_rpc_client.go | Adds engineErrUnsupportedFork error detection, useV5 atomic flag for version caching, and fallback logic to GetPayload that attempts V4 first, switches to V5 on unsupported fork errors, and caches the successful version. |
| Engine RPC Client Tests: execution/evm/engine_rpc_client_test.go | Comprehensive test suite with fake JSON-RPC server setup, covering Prague (V4), Osaka (V5), fork upgrades (V4→V5), error propagation, and isUnsupportedForkErr validation across multiple scenarios. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant EngineRPCClient
    participant Engine_V4 as Engine API V4
    participant Engine_V5 as Engine API V5
    
    Client->>EngineRPCClient: GetPayload()
    
    alt useV5 flag not set (initial)
        EngineRPCClient->>Engine_V4: engine_getPayloadV4
        Engine_V4-->>EngineRPCClient: Unsupported fork error (-38005)
        EngineRPCClient->>Engine_V5: engine_getPayloadV5
        Engine_V5-->>EngineRPCClient: Payload + set useV5=true
    else useV5 flag already set
        EngineRPCClient->>Engine_V5: engine_getPayloadV5
        Engine_V5-->>EngineRPCClient: Payload
    end
    
    EngineRPCClient-->>Client: Payload

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A hop, a skip, a version switch!
When forks cry "nay," we find the pitch.
V4 or V5, we'll seek the way,
Then cache our choice to save the day!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: auto-detecting between Engine API GetPayload versions for the Osaka fork with caching logic. |
| Description check | ✅ Passed | The description covers the key implementation details and use cases, but lacks formal section structure matching the template's Overview section requirement. |


@codecov

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.93%. Comparing base (c449847) to head (ce75825).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3113      +/-   ##
==========================================
- Coverage   60.97%   60.93%   -0.04%     
==========================================
  Files         113      113              
  Lines       11617    11617              
==========================================
- Hits         7083     7079       -4     
- Misses       3736     3739       +3     
- Partials      798      799       +1     
| Flag | Coverage Δ |
|---|---|
| combined | 60.93% <ø> (-0.04%) ⬇️ |



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
execution/evm/engine_rpc_client.go (1)

52-64: Wrap RPC errors with call context to improve diagnosability.

The current returns lose method-level context when failures bubble up. Wrapping here will make operational debugging much easier.

♻️ Proposed patch
 import (
 	"context"
 	"errors"
+	"fmt"
 	"sync/atomic"
@@
 	err := e.client.CallContext(ctx, &result, method, payloadID)
 	if err == nil {
 		return &result, nil
 	}
 
 	if !isUnsupportedForkErr(err) {
-		return nil, err
+		return nil, fmt.Errorf("%s failed for payload %x: %w", method, payloadID, err)
 	}
@@
 	err = e.client.CallContext(ctx, &result, altMethod, payloadID)
 	if err != nil {
-		return nil, err
+		return nil, fmt.Errorf("%s failed after %s unsupported fork for payload %x: %w", altMethod, method, payloadID, err)
 	}

As per coding guidelines "Wrap errors with context using fmt.Errorf".
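
One detail matters when applying this suggestion: the `%w` verb keeps the original error reachable through errors.As, so callers further up the stack can still inspect the error code, whereas `%v` flattens it to text. The sketch below illustrates the difference with a local stand-in error type. `codedError`, `wrapPreserving`, and `wrapFlattening` are hypothetical names, and the detection helper only approximates the PR's isUnsupportedForkErr.

```go
package main

import (
	"errors"
	"fmt"
)

const unsupportedForkCode = -38005

// codedError is a local stand-in for a JSON-RPC error that exposes its code.
type codedError struct {
	code int
	msg  string
}

func (e *codedError) Error() string  { return e.msg }
func (e *codedError) ErrorCode() int { return e.code }

// isUnsupportedForkErr walks the wrap chain looking for an error with the
// "Unsupported fork" code.
func isUnsupportedForkErr(err error) bool {
	var coded interface{ ErrorCode() int }
	return errors.As(err, &coded) && coded.ErrorCode() == unsupportedForkCode
}

// wrapPreserving adds call context and keeps err in the chain.
func wrapPreserving(method string, err error) error {
	return fmt.Errorf("%s failed: %w", method, err)
}

// wrapFlattening adds call context but drops err from the chain.
func wrapFlattening(method string, err error) error {
	return fmt.Errorf("%s failed: %v", method, err)
}

func main() {
	base := &codedError{code: unsupportedForkCode, msg: "Unsupported fork"}
	fmt.Println(isUnsupportedForkErr(base))                                        // true
	fmt.Println(isUnsupportedForkErr(wrapPreserving("engine_getPayloadV4", base))) // true
	fmt.Println(isUnsupportedForkErr(wrapFlattening("engine_getPayloadV4", base))) // false
}
```

The proposed patch already uses `%w`, so this is a property to preserve rather than a change to make.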

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client.go` around lines 52 - 64, The RPC CallContext
error returns currently lose method context; when returning errors from the
calls to e.client.CallContext (both the primary call using method and the
fallback using altMethod) wrap the original error using fmt.Errorf to include
the method name and payloadID (e.g. fmt.Errorf("engine rpc %s payload %v: %w",
method, payloadID, err)), and do the same for the fallback call (include
altMethod in the message); update returns in the block around
isUnsupportedForkErr(err) and the subsequent fallback so errors preserve call
context for diagnosability.
execution/evm/engine_rpc_client_test.go (1)

96-215: Consider table-driving the GetPayload scenario tests.

These three scenario tests share setup/teardown and method-tracking logic; table-driven structure would reduce duplication and make new fork-transition cases easier to add.

As per coding guidelines "Use table-driven tests where appropriate".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client_test.go` around lines 96 - 215, The three
tests TestGetPayload_PragueChain_UsesV4, TestGetPayload_OsakaChain_FallsBackToV5
and TestGetPayload_ForkUpgrade_SwitchesV4ToV5 duplicate setup/teardown and
call-tracking; refactor them into a single table-driven test that iterates over
cases (name, server behavior closure for fakeEngineServer, expected call
sequences before/after cache or upgrade, and any state transitions like toggling
osakaActive), reuse NewEngineRPCClient and client.GetPayload(ctx,
engine.PayloadID{}) in a loop, and assert calledMethods for each subcase; keep
unique identifiers (fakeEngineServer, NewEngineRPCClient, GetPayload,
calledMethods, osakaActive) so each case can provide the server callback and
expected sequences, reducing duplication and making future fork scenarios easier
to add.
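
A table-driven layout for these scenarios might look roughly like the sketch below. It is a reduced model under stated assumptions: `fakeNode` and `client` are hypothetical in-memory stand-ins for fakeEngineServer and the real client, every rejection is treated as "Unsupported fork", and the loop runs in plain Go rather than under `testing` with `t.Run`.

```go
package main

import (
	"fmt"
	"reflect"
	"sync/atomic"
)

const (
	v4 = "engine_getPayloadV4"
	v5 = "engine_getPayloadV5"
)

// fakeNode records each method and accepts only the active fork's version.
type fakeNode struct {
	osakaActive bool
	called      []string
}

func (n *fakeNode) call(method string) bool {
	n.called = append(n.called, method)
	return (method == v5) == n.osakaActive
}

// client is a reduced model of the fallback-and-cache logic under test.
type client struct {
	node  *fakeNode
	useV5 atomic.Bool
}

func (c *client) getPayload() bool {
	method, alt := v4, v5
	if c.useV5.Load() {
		method, alt = v5, v4
	}
	if c.node.call(method) {
		return true
	}
	if !c.node.call(alt) {
		return false
	}
	c.useV5.Store(alt == v5)
	return true
}

// step is one getPayload call, optionally preceded by fork activation.
type step struct {
	activateOsaka bool
	want          []string
}

var cases = []struct {
	name           string
	osakaAtGenesis bool
	steps          []step
}{
	{name: "prague chain uses V4", steps: []step{{want: []string{v4}}, {want: []string{v4}}}},
	{name: "osaka chain falls back to V5", osakaAtGenesis: true,
		steps: []step{{want: []string{v4, v5}}, {want: []string{v5}}}},
	{name: "fork upgrade switches V4 to V5",
		steps: []step{{want: []string{v4}}, {activateOsaka: true, want: []string{v4, v5}}, {want: []string{v5}}}},
}

// runCases returns a description of the first mismatch, or "" if all pass.
func runCases() string {
	for _, tc := range cases {
		node := &fakeNode{osakaActive: tc.osakaAtGenesis}
		c := &client{node: node}
		for i, s := range tc.steps {
			if s.activateOsaka {
				node.osakaActive = true
			}
			node.called = nil // track calls per step
			if ok := c.getPayload(); !ok || !reflect.DeepEqual(node.called, s.want) {
				return fmt.Sprintf("%s, step %d: got %v, want %v", tc.name, i, node.called, s.want)
			}
		}
	}
	return ""
}

func main() {
	if msg := runCases(); msg != "" {
		fmt.Println("FAIL:", msg)
		return
	}
	fmt.Println("all cases pass")
}
```

Expressing each scenario as a list of steps with an expected call sequence also covers the cache-behavior assertion that the fork-upgrade test currently lacks.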

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c449847 and ce75825.

📒 Files selected for processing (2)
  • execution/evm/engine_rpc_client.go
  • execution/evm/engine_rpc_client_test.go

Comment on lines +172 to +215
func TestGetPayload_ForkUpgrade_SwitchesV4ToV5(t *testing.T) {
	var mu sync.Mutex
	osakaActive := false

	srv := fakeEngineServer(t, func(method string) (string, int, string) {
		mu.Lock()
		active := osakaActive
		mu.Unlock()

		if active {
			// Post-Osaka: V5 works, V4 rejected
			if method == "engine_getPayloadV5" {
				return minimalPayloadEnvelopeJSON, 0, ""
			}
			return "", -38005, "Unsupported fork"
		}
		// Pre-Osaka: V4 works, V5 rejected
		if method == "engine_getPayloadV4" {
			return minimalPayloadEnvelopeJSON, 0, ""
		}
		return "", -38005, "Unsupported fork"
	})
	defer srv.Close()

	client := NewEngineRPCClient(dialTestServer(t, srv.URL))
	ctx := context.Background()

	// Pre-upgrade: V4 works.
	_, err := client.GetPayload(ctx, engine.PayloadID{})
	require.NoError(t, err)

	// Simulate fork activation.
	mu.Lock()
	osakaActive = true
	mu.Unlock()

	// First post-upgrade call: V4 fails, falls back to V5, caches.
	_, err = client.GetPayload(ctx, engine.PayloadID{})
	require.NoError(t, err)

	// Subsequent calls: V5 directly.
	_, err = client.GetPayload(ctx, engine.PayloadID{})
	require.NoError(t, err)
}

⚠️ Potential issue | 🟡 Minor

Fork-upgrade test misses the cache-behavior assertion.

This test can still pass even if post-upgrade calls keep retrying V4 first; it currently validates success, not cached-method selection.

✅ Suggested assertion enhancement
 func TestGetPayload_ForkUpgrade_SwitchesV4ToV5(t *testing.T) {
 	var mu sync.Mutex
 	osakaActive := false
+	var calledMethods []string
 
 	srv := fakeEngineServer(t, func(method string) (string, int, string) {
 		mu.Lock()
+		calledMethods = append(calledMethods, method)
 		active := osakaActive
 		mu.Unlock()
@@
 	_, err := client.GetPayload(ctx, engine.PayloadID{})
 	require.NoError(t, err)
+	mu.Lock()
+	assert.Equal(t, []string{"engine_getPayloadV4"}, calledMethods)
+	calledMethods = nil
+	mu.Unlock()
@@
 	_, err = client.GetPayload(ctx, engine.PayloadID{})
 	require.NoError(t, err)
+	mu.Lock()
+	assert.Equal(t, []string{"engine_getPayloadV4", "engine_getPayloadV5"}, calledMethods)
+	calledMethods = nil
+	mu.Unlock()
 
 	// Subsequent calls: V5 directly.
 	_, err = client.GetPayload(ctx, engine.PayloadID{})
 	require.NoError(t, err)
+	mu.Lock()
+	assert.Equal(t, []string{"engine_getPayloadV5"}, calledMethods)
+	mu.Unlock()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client_test.go` around lines 172 - 215, The test
TestGetPayload_ForkUpgrade_SwitchesV4ToV5 currently only asserts success but not
that the client cached the switch to V5; instrument the fakeEngineServer handler
with counters (e.g., v4Calls, v5Calls) or a boolean flag to record how many
times engine_getPayloadV4 and engine_getPayloadV5 were invoked, call
client.GetPayload as already written, then after the first post-upgrade call
assert that v4Calls == 1 and v5Calls >= 1 (first fallback used V5), and after
the subsequent call assert v4Calls remains 1 and v5Calls increased (or v4Calls
did not increase) to prove the client cached and directly used V5; reference the
test function TestGetPayload_ForkUpgrade_SwitchesV4ToV5, the fakeEngineServer
handler, and client.GetPayload when adding these assertions.


@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

| Benchmark suite | Current: ce75825 | Previous: c449847 | Ratio |
|---|---|---|---|
| SpamoorSmoke - Batch.SetHeight (avg) | 13.424698795180722 us | 8.335329341317365 us | 1.61 |
| SpamoorSmoke - BlockExecutor.CreateBlock (avg) | 4.484939759036145 us | 2.4564564564564564 us | 1.83 |
| SpamoorSmoke - BlockExecutor.RetrieveBatch (avg) | 43.8222891566265 us | 28.236526946107784 us | 1.55 |
| SpamoorSmoke - ForcedInclusionRetriever.RetrieveForcedIncludedTxs (avg) | 7.286144578313253 us | 4.023952095808383 us | 1.81 |
| SpamoorSmoke - Sequencer.GetNextBatch (avg) | 33.725903614457835 us | 21.377245508982035 us | 1.58 |

This comment was automatically generated by workflow using github-action-benchmark.
