
feat(core): integrate RetryManager into SegmentDestination upload pipeline #1160

Open
abueide wants to merge 13 commits into tapi/retry-manager-tests from tapi/segment-destination

Conversation

@abueide (Contributor) commented Mar 10, 2026

Summary

  • Wire RetryManager into SegmentDestination for TAPI-compliant retry handling
  • Add uploadBatch() with error classification, retry-after parsing, and configurable retry behavior
  • Add aggregateErrors() to separate batch results by status (success/429/transient/permanent)
  • Add updateRetryState() that delegates to RetryManager and returns whether limits were exceeded
  • Implement per-event age pruning via _queuedAt timestamps and pruneExpiredEvents() — events older than maxTotalBackoffDuration are dropped individually
  • Add canRetry() upload gate check before sending batches
  • Add X-Retry-Count header propagation to API requests
  • Replace QueueFlushingPlugin.dequeue() with dequeueByMessageIds() for messageId-based dequeue
  • Replace isPendingUpload flag with promise-based concurrent flush guard
  • Override shutdown() (instead of standalone destroy()) to integrate with plugin lifecycle — prevents auto-flush timer leak on client cleanup
  • Handle RetryResult.limit_exceeded: log warning and let per-event age pruning handle drops (events are NOT bulk-dropped on global counter reset)
  • Add comprehensive tests for pruning logic, retry integration, and API retry count
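As a rough illustration of the success/429/transient/permanent split described in the summary, the sketch below groups per-batch results by status category. The type names, the `aggregateByStatus` helper, and the default classification rule are assumptions made for this example, not the PR's actual `aggregateErrors()` implementation.

```typescript
// Hypothetical shapes for illustration only.
type Category = 'success' | 'rateLimited' | 'transient' | 'permanent';

interface BatchResult {
  status: number;
  messageIds: string[];
}

type Aggregated = Record<Category, string[]>;

// Assumed default rule: 2xx succeeds, 429 is rate limiting,
// 5xx responses are retryable, everything else is permanent.
function classify(status: number): Category {
  if (status >= 200 && status < 300) return 'success';
  if (status === 429) return 'rateLimited';
  if (status >= 500) return 'transient';
  return 'permanent';
}

// Collect the messageIds of every batch under its status category.
function aggregateByStatus(results: BatchResult[]): Aggregated {
  const out: Aggregated = {
    success: [],
    rateLimited: [],
    transient: [],
    permanent: [],
  };
  for (const r of results) {
    out[classify(r.status)].push(...r.messageIds);
  }
  return out;
}
```

Keeping messageIds per category is what would let a messageId-based dequeue remove only the events that succeeded or failed permanently.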

PR 5 of 5 in the TAPI backoff/retry stack. Depends on #1159.
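The promise-based concurrent flush guard mentioned in the summary could look roughly like the sketch below: a second `flush()` call made while an upload is in progress receives the same promise instead of starting another upload. The `FlushGuard` class and its `doFlush` callback are hypothetical names for this example.

```typescript
// Hypothetical guard; not the PR's actual code.
class FlushGuard {
  private inFlight: Promise<void> | undefined = undefined;
  private doFlush: () => Promise<void>;

  constructor(doFlush: () => Promise<void>) {
    this.doFlush = doFlush;
  }

  flush(): Promise<void> {
    // Reuse the pending promise so concurrent callers share one upload.
    if (this.inFlight !== undefined) return this.inFlight;

    const clear = (): void => {
      this.inFlight = undefined;
    };
    this.inFlight = this.doFlush().then(clear, (err: unknown) => {
      clear();
      throw err; // keep the rejection visible to callers
    });
    return this.inFlight;
  }
}
```

Compared with a boolean flag, callers that arrive mid-flush can still await completion of the upload already under way.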

Test plan

  • All 42 SegmentDestination tests pass
  • All 461 total tests pass
  • shutdown() properly cleans up RetryManager timers via plugin lifecycle
  • Events only dropped by per-event _queuedAt age, not by global retry counter reset
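A minimal sketch of per-event age pruning follows, assuming each queued event carries a `_queuedAt` timestamp in milliseconds and that the age limit is also given in milliseconds. The unit of `maxTotalBackoffDuration` is not stated in this conversation, and the `pruneExpired` name and its return shape are hypothetical.

```typescript
// Hypothetical event shape; only _queuedAt is taken from the PR description.
interface QueuedEvent {
  messageId: string;
  _queuedAt?: number; // epoch milliseconds (assumed)
}

interface PruneResult {
  kept: QueuedEvent[];
  dropped: QueuedEvent[];
}

function pruneExpired(
  events: QueuedEvent[],
  maxAgeMs: number,
  now: number = Date.now()
): PruneResult {
  const kept: QueuedEvent[] = [];
  const dropped: QueuedEvent[] = [];
  for (const e of events) {
    // Assumption: events without a timestamp are kept rather than dropped.
    const expired = e._queuedAt !== undefined && now - e._queuedAt > maxAgeMs;
    (expired ? dropped : kept).push(e);
  }
  return { kept, dropped };
}
```

Because each event is judged on its own age, a reset of the global retry counter does not affect which events are dropped.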

🤖 Generated with Claude Code

@abueide abueide force-pushed the tapi/retry-manager-tests branch 2 times, most recently from ba89956 to aac6169 Compare March 12, 2026 14:57
@abueide abueide force-pushed the tapi/segment-destination branch 2 times, most recently from af03054 to 872790a Compare March 12, 2026 14:57
@abueide abueide force-pushed the tapi/retry-manager-tests branch from aac6169 to cf2abd2 Compare March 12, 2026 15:36
@abueide abueide force-pushed the tapi/segment-destination branch from 872790a to f6282d0 Compare March 12, 2026 15:42
@abueide abueide force-pushed the tapi/retry-manager-tests branch from cf2abd2 to 23ae940 Compare March 12, 2026 16:11
@abueide abueide force-pushed the tapi/segment-destination branch from f6282d0 to a49f9ec Compare March 12, 2026 16:11
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 23ae940 to 9f2575b Compare March 12, 2026 16:40
@abueide abueide force-pushed the tapi/segment-destination branch from 2b7adb3 to 3fb2dce Compare March 12, 2026 16:40
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 9f2575b to 6981059 Compare March 12, 2026 16:48
@abueide abueide force-pushed the tapi/segment-destination branch from 3fb2dce to ef94825 Compare March 12, 2026 16:48
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 6981059 to c49e640 Compare March 12, 2026 17:38
@abueide abueide force-pushed the tapi/segment-destination branch from ef94825 to 1be5046 Compare March 12, 2026 17:38
@abueide abueide force-pushed the tapi/segment-destination branch from 1be5046 to bd77e18 Compare March 18, 2026 21:14
@abueide abueide force-pushed the tapi/retry-manager-tests branch 2 times, most recently from 02cca5f to 42d7cd1 Compare March 18, 2026 22:12
@abueide abueide force-pushed the tapi/segment-destination branch from bd77e18 to 2f3fd0f Compare March 18, 2026 22:12
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 42d7cd1 to 6bf54b1 Compare March 18, 2026 22:32
@abueide abueide force-pushed the tapi/segment-destination branch 2 times, most recently from fa537b0 to fb0788b Compare March 19, 2026 16:02
@abueide abueide force-pushed the tapi/retry-manager-tests branch 2 times, most recently from f52a425 to e83d5ab Compare March 19, 2026 16:15
@abueide abueide force-pushed the tapi/segment-destination branch from fb0788b to 78b3bec Compare March 19, 2026 16:15
@abueide abueide force-pushed the tapi/retry-manager-tests branch from e83d5ab to 91c9ce3 Compare March 19, 2026 17:03
@abueide abueide force-pushed the tapi/segment-destination branch from 78b3bec to 556e28a Compare March 19, 2026 17:03
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 91c9ce3 to 7242cbf Compare March 19, 2026 17:56
@abueide abueide force-pushed the tapi/segment-destination branch from 6efe106 to 731d4ae Compare March 19, 2026 17:57
@abueide abueide force-pushed the tapi/retry-manager-tests branch from 7242cbf to b60f971 Compare March 19, 2026 18:22
@abueide abueide force-pushed the tapi/segment-destination branch from eecc8d3 to 34aeb65 Compare March 19, 2026 18:23
abueide and others added 13 commits March 19, 2026 13:28
…eline

Wire RetryManager into SegmentDestination for TAPI-compliant retry
handling: uploadBatch() with error classification, event pruning on
partial failures, retry count header propagation, and QueueFlushingPlugin
error type handling updates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Don't reset retry state on partial success when concurrent batches have
  429/transient errors
- Use if/else if so 429 takes precedence over transient error handling
- Remove redundant return Promise.resolve() in async function
- Fix duplicate keepalive property from master merge
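The precedence rules in this commit can be sketched as a single decision function: a 429 wins over transient errors, and retry state is reset only when nothing in the flush needs a retry. `FlushSummary`, `RetryUpdate`, and `chooseRetryUpdate` are illustrative names, not identifiers from the PR.

```typescript
// Hypothetical summary of one flush across concurrent batches.
interface FlushSummary {
  successCount: number;
  rateLimitedCount: number;
  transientCount: number;
}

type RetryUpdate = 'handle429' | 'handleTransient' | 'reset' | 'none';

function chooseRetryUpdate(s: FlushSummary): RetryUpdate {
  if (s.rateLimitedCount > 0) {
    return 'handle429'; // 429 takes precedence over transient errors
  } else if (s.transientCount > 0) {
    return 'handleTransient';
  } else if (s.successCount > 0) {
    return 'reset'; // only reset when no batch asked for a retry
  }
  return 'none';
}
```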

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use res.ok instead of res.status === 200 for 2xx success range
- Remove dead dequeue() method from QueueFlushingPlugin
- Add destroy() to SegmentDestination for RetryManager timer cleanup
- Reset retry state when queue is empty at flush time or after pruning
- Call handle429 per result instead of pre-aggregating, so
  RetryManager.applyRetryStrategy respects eager/lazy consolidation
- Simplify retryAfterSeconds fallback to ?? 60
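To illustrate two of the points above, the sketch below shows a status check equivalent to `res.ok` (true for 200 to 299) and a `?? 60` fallback for a missing or unparseable Retry-After value. The helper names are hypothetical, and ignoring the HTTP-date form of the header is an assumption of this example.

```typescript
const DEFAULT_RETRY_AFTER_SECONDS = 60;

// Same range that Response.ok covers in the Fetch API.
function isSuccessStatus(status: number): boolean {
  return status >= 200 && status <= 299;
}

// Returns the delay in seconds, or undefined when the header is absent
// or not a plain non-negative integer (HTTP-date values are ignored here).
function parseRetryAfterSeconds(header: string | null): number | undefined {
  if (header === null) return undefined;
  const trimmed = header.trim();
  if (!/^\d+$/.test(trimmed)) return undefined;
  return Number(trimmed);
}

function retryAfterOrDefault(header: string | null): number {
  // ?? only falls back on null/undefined, so a server-sent 0 is preserved.
  return parseRetryAfterSeconds(header) ?? DEFAULT_RETRY_AFTER_SECONDS;
}
```

Using `??` rather than `||` matters here: with `||`, a Retry-After of `0` would be replaced by the 60-second default.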

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ination

Extract pruneExpiredEvents() and updateRetryState() from sendEvents to
improve readability. Remove redundant/obvious comments, merge duplicate
switch cases, and simplify return statements throughout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… pruning

- Override shutdown() instead of standalone destroy() to integrate with
  the plugin lifecycle — prevents auto-flush timer leak on client cleanup
- Handle RetryResult 'limit_exceeded' from RetryManager: log warning and
  let per-event age pruning (pruneExpiredEvents via _queuedAt) handle
  event drops rather than dropping all retryable events on global counter
  reset
- Import RetryResult type for type-safe limit checking

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root cause: when the CDN settings response had no httpConfig field,
analytics.ts guarded the entire merge block with if (resJson.httpConfig),
so this.httpConfig stayed undefined. This prevented RetryManager creation
in SegmentDestination, disabling all retry features (error classification
overrides, canRetry() gating, retry counting, maxRetries enforcement).

Changes:
- Remove httpConfig guard in analytics.ts — always merge defaultHttpConfig
  as baseline, with CDN and config.httpConfig overrides on top
- Add httpConfig?: DeepPartial<HttpConfig> to Config type for client-side
  overrides (e.g. maxRetries from test harness)
- Drop retryable events when retry limits exceeded in SegmentDestination
  instead of leaving them in the queue indefinitely
- Update fetchSettings test to expect defaultHttpConfig when CDN has no
  httpConfig
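The layering described above (defaults as the baseline, then CDN values, then client overrides) can be sketched with a small deep merge. The `HttpConfig` shape and the `deepMerge` helper below are assumptions for illustration; the real `HttpConfig` fields and merge code are not shown in this conversation.

```typescript
type DeepPartial<T> = {
  [K in keyof T]?: T[K] extends object ? DeepPartial<T[K]> : T[K];
};

// Hypothetical config shape; only the maxRetries name appears in the PR text.
interface HttpConfig {
  retry: { enabled: boolean; maxRetries: number };
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function mergeTwo(base: unknown, override: unknown): unknown {
  if (override === undefined) return base; // missing layer or field: keep base
  if (!isPlainObject(base) || !isPlainObject(override)) return override;
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] = mergeTwo(base[key], value);
  }
  return out;
}

// Later layers win; undefined layers (e.g. a CDN response without httpConfig) are skipped.
function deepMerge<T>(base: T, ...layers: Array<DeepPartial<T> | undefined>): T {
  let result: unknown = base;
  for (const layer of layers) {
    result = mergeTwo(result, layer);
  }
  return result as T;
}

const defaults: HttpConfig = { retry: { enabled: true, maxRetries: 10 } };
const cdnHttpConfig: DeepPartial<HttpConfig> | undefined = undefined;
const clientOverride: DeepPartial<HttpConfig> = { retry: { maxRetries: 3 } };
const effective = deepMerge<HttpConfig>(defaults, cdnHttpConfig, clientOverride);
```

With the merge always running, a CDN response that lacks `httpConfig` still yields a complete config, so retry handling is never silently disabled.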

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tion tests

Remove autoFlushOnRetryReady check, setAutoFlushCallback call, and shutdown()
method. Add tests for CDN integrations validation (null, array, string, and
no-defaults scenarios).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire up the droppedEventCount counter (added in cli-flush-retry-loop)
at the two places SegmentDestination permanently removes events:
permanent errors and retry limit exceeded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When CDN returns a valid 200 with no integrations field (e.g. {}),
treat it as authoritative "no integrations configured" rather than
falling back to defaultSettings. This ensures SegmentDestination is
correctly disabled when the server has no integrations.
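One way to express this rule is sketched below. Treating a missing `integrations` field as an authoritative empty set follows the commit message; falling back to defaults when the field is present but malformed (null, array, string) is an assumption of this example, as are the `resolveIntegrations` name and the `Settings` shape.

```typescript
type Integrations = Record<string, unknown>;

// Hypothetical settings shape for illustration.
interface Settings {
  integrations: Integrations;
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function resolveIntegrations(cdnBody: unknown, defaults?: Settings): Integrations {
  // Assumption: a body that is not an object at all falls back to defaults.
  if (!isPlainObject(cdnBody)) return defaults?.integrations ?? {};

  // Valid response without the field: authoritative "no integrations".
  if (!('integrations' in cdnBody)) return {};

  const value = cdnBody.integrations;
  if (isPlainObject(value)) return value;

  // Assumption: malformed values (null, array, string) fall back to defaults,
  // or to an empty set when no defaults were provided.
  return defaults?.integrations ?? {};
}
```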

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… as drops

- Remove retryStrategy from Config type, defaultConfig, and RetryManager
  constructor — always use eager (shortest wait) for concurrent batch errors
- Count events pruned by maxTotalBackoffDuration toward droppedEventCount
- Add SDD-default config tests verifying all spec status codes against
  defaultHttpConfig overrides (408/410/460=retry, 501/505=drop)
- Update RetryManager tests for eager-only behavior
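The status-code overrides listed in this commit (408/410/460 retry, 501/505 drop) can be pictured as a lookup table consulted before a general rule. The table values come from the commit message; the fallback rule (429 and 5xx retry, other non-2xx codes drop) and the `actionFor` name are assumptions of this sketch.

```typescript
type Action = 'retry' | 'drop';

// Overrides named in the commit message above.
const statusOverrides: Record<number, Action> = {
  408: 'retry',
  410: 'retry',
  460: 'retry',
  501: 'drop',
  505: 'drop',
};

// Decide what to do with a failed (non-2xx) response.
function actionFor(
  status: number,
  overrides: Record<number, Action> = statusOverrides
): Action {
  const override = overrides[status];
  if (override !== undefined) return override; // explicit override wins
  // Assumed fallback: rate limiting and server errors retry, the rest drop.
  return status === 429 || status >= 500 ? 'retry' : 'drop';
}
```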

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Property declaration was lost during branch rebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@abueide abueide force-pushed the tapi/retry-manager-tests branch from b60f971 to e00cdf7 Compare March 19, 2026 18:29
@abueide abueide force-pushed the tapi/segment-destination branch from 34aeb65 to 0db00e3 Compare March 19, 2026 18:29
