Enforce auction timeout in orchestrator wait logic by ChristianPavilonis · Pull Request #469 · IABTechLab/trusted-server

ChristianPavilonis · 2026-03-10T01:52:40Z

Summary

Enforce the configured auction timeout (settings.auction.timeout_ms) in the orchestrator's select() loop — previously, waits could extend to the backend's hardcoded 15s first_byte_timeout, ignoring the auction deadline.
Two complementary mechanisms: (1) each auction provider's backend first_byte_timeout is set to timeout_ms instead of 15s, and (2) the orchestrator checks elapsed time after each select() return, dropping remaining requests when the deadline is exceeded.
Mediator receives remaining time budget after the bidding phase, preventing mediation from extending the auction past the configured deadline.
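The second mechanism and the mediator budget can be illustrated with a simplified model. This is only a sketch, not the orchestrator's code: `run_bidding`, `BiddingOutcome`, and the idea of replaying scripted arrival times in place of real `select()` calls are all assumptions made for illustration. It also assumes that the response which has just returned is still kept, and only the requests still pending are dropped.

```rust
/// Outcome of the bidding phase in this simplified model (hypothetical type).
#[derive(Debug, PartialEq)]
struct BiddingOutcome {
    collected: Vec<&'static str>,
    dropped: Vec<&'static str>,
    mediator_budget_ms: u32,
}

/// `arrivals` pairs a provider name with the elapsed time (ms since auction
/// start) at which a simulated select() returns that provider's response,
/// listed in arrival order.
fn run_bidding(arrivals: &[(&'static str, u32)], timeout_ms: u32) -> BiddingOutcome {
    let mut collected = Vec::new();
    let mut dropped = Vec::new();
    let mut elapsed_ms = 0;
    let mut iter = arrivals.iter();

    while let Some(&(name, at_ms)) = iter.next() {
        // A simulated select() has returned with one ready response.
        elapsed_ms = at_ms;
        collected.push(name);

        // Deadline check after each select() return: drop whatever is pending.
        if elapsed_ms >= timeout_ms {
            dropped.extend(iter.by_ref().map(|&(n, _)| n));
            break;
        }
    }

    BiddingOutcome {
        collected,
        dropped,
        // The mediator only gets what is left of the auction budget.
        mediator_budget_ms: timeout_ms.saturating_sub(elapsed_ms),
    }
}
```

With a 500 ms timeout and arrivals at 100, 450, 520, and 900 ms, the fourth request is dropped and the mediator budget is 0; if every response arrives by 300 ms, the mediator is left with 200 ms.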

Changes

| File | Change |
| --- | --- |
| `crates/common/src/backend.rs` | Add configurable `first_byte_timeout` field to `BackendConfig` (default 15s), builder method, timeout-aware backend naming, and `from_url_with_first_byte_timeout()` convenience method |
| `crates/common/src/auction/orchestrator.rs` | Add deadline enforcement in `select()` loop: track `auction_start`, drop remaining requests when timeout exceeded, pass only remaining time to mediator |
| `crates/common/src/integrations/prebid.rs` | Use `from_url_with_first_byte_timeout()` with `context.timeout_ms` for bid requests |
| `crates/common/src/integrations/aps.rs` | Use `from_url_with_first_byte_timeout()` with `context.timeout_ms` for bid requests; rename `_context` → `context` |
| `crates/common/src/integrations/adserver_mock.rs` | Use `from_url_with_first_byte_timeout()` with `context.timeout_ms` for mediation requests |

Closes

Closes #405

Test plan

cargo test --workspace
cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --all -- --check
JS tests: cd crates/js/lib && npx vitest run
JS format: cd crates/js/lib && npm run format
Docs format: cd docs && npm run format
WASM build: cargo build --bin trusted-server-fastly --release --target wasm32-wasip1
Manual testing via fastly compute serve
Other:

Checklist

Changes follow CLAUDE.md conventions
No unwrap() in production code — use expect("should ...")
Uses tracing macros (not println!)
New code has tests
No secrets or credentials committed

Closes #405

- Pass remaining time budget to each provider instead of full timeout, so backend first_byte_timeout cannot exceed the auction deadline
- Extract remaining_budget_ms helper and add unit tests for it
- Simplify from_url_with_first_byte_timeout to take Duration (not Option)
- Fix misleading doc on ensure() to note timeout is not in backend name
- Update TODO comment with specific untested timeout paths

aram356

Staff-level review of auction timeout enforcement

Good work overall. The approach of threading remaining_budget_ms through the orchestrator and into backend first_byte_timeout is sound. The doc comments are thorough and the compute_name / ensure split in BackendConfig is a clean refactor. A few issues to address before merging, ranging from a semantic correctness concern to testing gaps.

Summary of findings

| Priority | Issue |
| --- | --- |
| P0 | Backend timeout is first-registration-wins; later providers in the same auction may inherit a stale timeout |
| P1 | Requests still launch when remaining budget is already 0 ms |
| P1 | `select()` blocking means wall-clock can exceed `timeout_ms` (documented, but consider mitigation) |
| P2 | Per-provider `timeout_ms()` trait method is ignored at the transport layer |
| P2 | Critical timeout paths remain untested (acknowledged in TODO, but risky to merge without) |
| P3 | `as u32` truncation in `remaining_budget_ms` |
| P3 | URL parsing logic triplicated across three methods |

aram356

Summary

This PR adds auction deadline enforcement to the orchestrator — providers now receive a shrinking time budget via remaining_budget_ms(), backends get a matching first_byte_timeout, and the select() loop checks the deadline after each response. The mediator is skipped when the budget is exhausted. The backend_name() trait method no longer registers backends as a side-effect.

Blocking

🔧 wrench

0ms budget guard missing in provider loop: requests still launch when remaining budget is 0ms — the mediator path checks for this but the provider loop does not (orchestrator.rs:284)
Backend timeout poisoning: first_byte_timeout is not part of the backend name, so first-registration-wins — if bidders and mediators share an origin, the mediator's tighter timeout is silently ignored (backend.rs:128)

❓ question

Can bidders and mediators share the same origin? This determines whether the timeout poisoning is a real issue in the current deployment topology (backend.rs:128)

Non-blocking

🤔 thinking

as u32 truncation: remaining_budget_ms casts u128 to u32 — safe in practice but technically unsound (orchestrator.rs:21)
Provider timeout_ms() unused: per-provider timeout config is ignored at the transport layer (orchestrator.rs:289)

♻️ refactor

Triplicated URL parsing: three methods have identical parsing logic — extract a helper (backend.rs:222)

🌱 seedling

Untested timeout paths: core behavioral change has no integration test coverage (orchestrator.rs:691)

🏕 camp site

Log ordering: "Running auction with strategy" log on line 83 fires after the auction completes — reads as though the auction is starting. Consider moving it before the if/else or changing to "Auction completed with strategy".

👍 praise

backend_name() side-effect removed: no longer registers a backend on read (prebid.rs:955)
Remaining-budget mediator design: clean early-return and fallback (orchestrator.rs:121)

…ate URL parsing
- Include first_byte_timeout in backend name to prevent first-registration-wins poisoning where later requests silently inherit an earlier timeout
- Add 0ms budget guard in provider loop (consistent with mediator path)
- Use min(provider.timeout_ms(), remaining_budget) to respect per-provider caps
- Fix as u32 truncation in remaining_budget_ms with try_from/unwrap_or
- Extract parse_origin helper to deduplicate URL parsing across 3 methods
- Update backend_name trait method to accept timeout_ms for correct name prediction
- Add comment documenting select() blocking dependency on first_byte_timeout
- Expand TODO with follow-up guidance for testability improvements

ChristianPavilonis · 2026-03-11T18:33:27Z

Review feedback addressed (`48fb712`)

Thanks for the thorough review! All findings have been addressed in the latest push. Here's a breakdown:

Blocking — resolved

🔧 0ms budget guard in provider loop (orchestrator.rs)
Added an `if effective_timeout == 0 { continue; }` guard after computing the remaining budget, consistent with the mediator path at line 123. Providers are now skipped with a warning log instead of launching a doomed request with a 0ms timeout.

🔧 Backend timeout poisoning (backend.rs)
first_byte_timeout is now part of the backend name (appended as _t{ms}), so different timeout values produce different backend registrations. This eliminates the first-registration-wins issue entirely. In practice bidders and mediators use distinct origins in the current deployment topology, but the name now encodes the timeout defensively.

The backend_name() trait method and backend_name_for_url() were updated to accept the timeout so the predicted name matches the actual registration. The orchestrator loop was reordered to compute effective_timeout before calling backend_name(effective_timeout).
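To make the first-registration-wins problem and the naming fix concrete, here is a minimal sketch. The `Registry` type, the `backend_` prefix, and both naming functions are hypothetical stand-ins, not the code in `backend.rs`; only the `_t{ms}` suffix comes from the description above.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical registry that mimics "first registration wins": registering a
/// name that already exists keeps the timeout stored the first time.
#[derive(Default)]
struct Registry {
    backends: HashMap<String, Duration>,
}

impl Registry {
    /// Returns the timeout actually in effect for `name` after registration.
    fn ensure(&mut self, name: String, first_byte_timeout: Duration) -> Duration {
        *self.backends.entry(name).or_insert(first_byte_timeout)
    }
}

/// Name derived from the host only (the behavior the review flagged).
/// The `backend_` prefix is illustrative.
fn name_without_timeout(host: &str) -> String {
    format!("backend_{}", host.replace('.', "_"))
}

/// Name with the timeout appended as `_t{ms}`, so different timeouts
/// produce different registrations.
fn name_with_timeout(host: &str, first_byte_timeout: Duration) -> String {
    format!(
        "{}_t{}",
        name_without_timeout(host),
        first_byte_timeout.as_millis()
    )
}
```

In this model, registering the same host first with 1000 ms and then with 200 ms under `name_without_timeout` leaves the second caller with the stale 1000 ms timeout, while `name_with_timeout` yields two separate entries and each caller gets the timeout it asked for.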

Non-blocking — resolved

🤔 as u32 truncation (orchestrator.rs:21)
Replaced `start.elapsed().as_millis() as u32` with `u32::try_from(start.elapsed().as_millis()).unwrap_or(u32::MAX)`; the `saturating_sub` then correctly produces 0 for any absurdly large elapsed time.
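A sketch of what such a helper might look like follows. The conversion expression is the one quoted above; the function signatures, and the split into a `_from` variant that takes the elapsed `Duration` directly (used here so the behavior can be checked deterministically), are assumptions.

```rust
use std::time::{Duration, Instant};

/// Milliseconds left before the auction deadline, saturating at 0.
/// Taking `elapsed` as a parameter is an assumption for testability.
fn remaining_budget_ms_from(elapsed: Duration, timeout_ms: u32) -> u32 {
    // as_millis() returns u128. `as u32` would silently wrap around for
    // values above u32::MAX; try_from reports the overflow instead, and
    // unwrap_or clamps it to u32::MAX.
    let elapsed_ms = u32::try_from(elapsed.as_millis()).unwrap_or(u32::MAX);
    timeout_ms.saturating_sub(elapsed_ms)
}

/// Same computation, measured from the auction start instant.
fn remaining_budget_ms(start: Instant, timeout_ms: u32) -> u32 {
    remaining_budget_ms_from(start.elapsed(), timeout_ms)
}
```

The wraparound case is the one that matters: an elapsed time of `u32::MAX + 1 + 100` ms would become 100 under a plain cast, leaving a 1000 ms auction with an apparent 900 ms of budget, whereas the clamped version reports 0.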

🤔 Provider timeout_ms() unused at transport layer (orchestrator.rs)
The effective timeout is now remaining_ms.min(provider.timeout_ms()), respecting both the auction deadline and the provider's own configured latency expectation.
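Combined with the 0ms guard, the provider loop can be sketched roughly as below. `Provider`, `plan_launches`, and the closure that supplies the remaining budget are hypothetical names used only for this illustration; the real loop launches requests rather than returning a plan, and logs a warning when it skips a provider.

```rust
/// Hypothetical stand-in for an auction provider; only what this sketch needs.
struct Provider {
    name: &'static str,
    timeout_ms: u32,
}

/// Returns (provider name, effective timeout) for each provider that would be
/// launched. `remaining_ms` is called once per provider, modelling a budget
/// that shrinks as the loop progresses.
fn plan_launches(
    providers: &[Provider],
    mut remaining_ms: impl FnMut() -> u32,
) -> Vec<(&'static str, u32)> {
    let mut launches = Vec::new();
    for provider in providers {
        // Respect both the auction deadline and the provider's own cap.
        let effective_timeout = remaining_ms().min(provider.timeout_ms);
        if effective_timeout == 0 {
            // Budget exhausted: skip instead of launching a doomed request.
            continue;
        }
        launches.push((provider.name, effective_timeout));
    }
    launches
}
```

For example, with provider caps of 800, 200, and 500 ms and remaining budgets of 1000, 600, and 0 ms at each step, the first two providers launch with 800 ms and 200 ms and the third is skipped.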

♻️ Triplicated URL parsing (backend.rs)
Extracted a private parse_origin() helper that from_url_with_first_byte_timeout and backend_name_for_url both call, eliminating the duplicated Url::parse → scheme() → host_str() → port() chain.

🌱 Untested timeout paths (orchestrator.rs)
Expanded the TODO comment with concrete follow-up guidance: introduce a trait abstraction over select() for unit-testability, and consider an #[ignore] Viceroy integration test. Also added the new provider-skip path to the list of untested paths.
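One possible shape for that trait seam is sketched below. Nothing here is taken from the codebase or from the follow-up issue: `PendingSelector`, `ScriptedSelector`, and `collect_until_deadline` are hypothetical names, and the elapsed time is passed back by the selector purely so a test double can script it.

```rust
use std::collections::VecDeque;

/// Hypothetical abstraction over select(): yields the next completed request.
trait PendingSelector {
    /// Returns the provider name and the elapsed ms at completion, or None
    /// when nothing is pending.
    fn select_next(&mut self) -> Option<(String, u32)>;
    /// Number of requests still pending.
    fn pending(&self) -> usize;
}

/// Test double that replays a scripted sequence of completions.
struct ScriptedSelector {
    script: VecDeque<(String, u32)>,
}

impl ScriptedSelector {
    fn new(script: &[(&str, u32)]) -> Self {
        Self {
            script: script.iter().map(|&(n, ms)| (n.to_string(), ms)).collect(),
        }
    }
}

impl PendingSelector for ScriptedSelector {
    fn select_next(&mut self) -> Option<(String, u32)> {
        self.script.pop_front()
    }

    fn pending(&self) -> usize {
        self.script.len()
    }
}

/// Collects responses until the deadline passes; returns the collected
/// provider names and how many pending requests were left to drop.
fn collect_until_deadline(
    selector: &mut dyn PendingSelector,
    timeout_ms: u32,
) -> (Vec<String>, usize) {
    let mut collected = Vec::new();
    while let Some((name, elapsed_ms)) = selector.select_next() {
        collected.push(name);
        if elapsed_ms >= timeout_ms {
            break;
        }
    }
    (collected, selector.pending())
}
```

Because the loop only depends on the trait, a unit test can drive the timeout path with a scripted selector and no network or Viceroy instance.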

Created issue #473

Comment about select() blocking (orchestrator.rs)
Added a note above the select() loop explaining that hard deadline enforcement depends on every backend's first_byte_timeout being set to at most the remaining auction budget, which Phase 1 guarantees.

All changes pass cargo fmt, cargo clippy -D warnings, and cargo test --workspace (477 tests).

Enforce auction timeout in orchestrator wait logic

26f1cc7

Closes #405

ChristianPavilonis self-assigned this Mar 10, 2026

ChristianPavilonis requested review from aram356 and prk-Jr March 10, 2026 16:33

Merge branch 'main' into fix/auction-timeout

410eb63

aram356 reviewed Mar 11, 2026

aram356 requested changes Mar 11, 2026

ChristianPavilonis mentioned this pull request Mar 11, 2026

Introduce trait abstraction over select() to enable unit-testing timeout enforcement #473

Open

ChristianPavilonis requested a review from aram356 March 11, 2026 19:05

ChristianPavilonis wants to merge 4 commits into main from fix/auction-timeout

2 participants
