feat: add CDP-based interactive live view for headless Chromium (Approach 2 probably what BB is doing)#176

Open
hiroTamada wants to merge 9 commits into main from headless-cdp-live-view

@hiroTamada hiroTamada commented Mar 9, 2026

Summary

  • Add a lightweight Go service (cdp-live-view) that streams browser frames via CDP Page.startScreencast and forwards mouse/keyboard input via Input.dispatchMouseEvent/Input.dispatchKeyEvent
  • Replaces the previous noVNC/Xvfb approach (PR #174, "feat: add interactive live view to headless Chromium image (approach 1 not truly headless)") with near-zero overhead while keeping Chromium in true headless mode (--headless=new --ozone-platform=headless)
  • Opt-in via ENABLE_LIVE_VIEW=true env var; serves an HTML5 canvas-based viewer on port 8080
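
To make the frame and input path above concrete, here is a minimal sketch of the JSON requests such a service could send over the CDP WebSocket. The `cdpRequest` type and the helper names are hypothetical and not taken from this PR; the method names and parameter keys follow the Chrome DevTools Protocol. The quality value, viewport size, and session ID are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cdpRequest is a hypothetical envelope for a CDP command.
// SessionID is only set when addressing an attached target (flat session mode).
type cdpRequest struct {
	ID        int            `json:"id"`
	Method    string         `json:"method"`
	Params    map[string]any `json:"params,omitempty"`
	SessionID string         `json:"sessionId,omitempty"`
}

// startScreencast builds a Page.startScreencast request asking for JPEG frames.
func startScreencast(id int, sessionID string, quality, maxW, maxH int) cdpRequest {
	return cdpRequest{
		ID: id, Method: "Page.startScreencast", SessionID: sessionID,
		Params: map[string]any{
			"format": "jpeg", "quality": quality,
			"maxWidth": maxW, "maxHeight": maxH,
		},
	}
}

// mouseEvent builds an Input.dispatchMouseEvent request.
// eventType is one of the CDP values such as "mousePressed" or "mouseReleased".
func mouseEvent(id int, sessionID, eventType string, x, y float64, button string, clickCount int) cdpRequest {
	return cdpRequest{
		ID: id, Method: "Input.dispatchMouseEvent", SessionID: sessionID,
		Params: map[string]any{
			"type": eventType, "x": x, "y": y,
			"button": button, "clickCount": clickCount,
		},
	}
}

func main() {
	// Illustrative values only; the PR does not state the quality or IDs it uses.
	for _, req := range []cdpRequest{
		startScreencast(1, "SESSION", 80, 1920, 1080),
		mouseEvent(2, "SESSION", "mousePressed", 100, 200, "left", 1),
	} {
		b, err := json.Marshal(req)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```

Each marshalled request would be written as a text message on the browser WebSocket; frames then arrive as Page.screencastFrame events that must be acknowledged with Page.screencastFrameAck.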

Changes

  • server/cmd/cdp-live-view/main.go: Go server that connects to CDP, discovers the browser WebSocket URL dynamically, tracks active targets via Target.setDiscoverTargets, streams screencast frames to connected viewers, and dispatches input events back to the browser
  • server/cmd/cdp-live-view/viewer.html: HTML5 canvas client that renders streamed JPEG frames and captures mouse/keyboard events for interactive control
  • images/chromium-headless/image/supervisor/services/cdp-live-view.conf: Supervisor config for the new service
  • images/chromium-headless/image/Dockerfile: Build and copy the cdp-live-view binary
  • images/chromium-headless/image/wrapper.sh: Conditional startup of cdp-live-view behind ENABLE_LIVE_VIEW=true
  • images/chromium-headless/run-docker.sh: Expose port 8080 and pass ENABLE_LIVE_VIEW env
  • images/chromium-headless/run-unikernel.sh: Expose port 443:8080/http+tls when live view is enabled
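
The description of main.go mentions discovering the browser WebSocket URL dynamically. One common way to do that, shown here as a sketch and not necessarily what this PR does, is to read the `webSocketDebuggerUrl` field from Chromium's `/json/version` HTTP endpoint. The function names, the sample payload, and the port 9222 in it are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
)

// versionInfo holds the one field we need from the /json/version response.
type versionInfo struct {
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// parseBrowserWSURL extracts the browser-level WebSocket URL from a
// /json/version response body.
func parseBrowserWSURL(body []byte) (string, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return "", fmt.Errorf("decode /json/version: %w", err)
	}
	if v.WebSocketDebuggerURL == "" {
		return "", errors.New("webSocketDebuggerUrl missing from /json/version")
	}
	return v.WebSocketDebuggerURL, nil
}

// discoverBrowserWSURL fetches /json/version from the DevTools HTTP endpoint
// at base (for example "http://127.0.0.1:9222", an assumed address).
func discoverBrowserWSURL(base string) (string, error) {
	resp, err := http.Get(base + "/json/version")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return parseBrowserWSURL(body)
}

func main() {
	// Offline demonstration with a made-up payload; a real service would
	// call discoverBrowserWSURL against the running browser instead.
	sample := []byte(`{"Browser":"HeadlessChrome","webSocketDebuggerUrl":"ws://127.0.0.1:9222/devtools/browser/abc"}`)
	u, err := parseBrowserWSURL(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Because the path component of the URL changes every time Chromium starts, resolving it at connect time avoids hard-coding a value that goes stale after a browser restart.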

Benchmark (unikernel, 1 vCPU / 3 GB)

| Metric | Baseline (v29) | CDP Live View | Delta |
| --- | --- | --- | --- |
| Memory used (after workload) | 365 MB | 406 MB | +41 MB |
| Screenshot latency | 199 ms | 65 ms | -67% |
| JS Evaluate | 1.6 ms | 1.7 ms | ~same |
| CPU during navigation | 70.2% | 64.8% | ~same |

The live view adds ~40 MB of memory overhead under load with negligible CPU impact. Screenshots, JS evaluation, and other CDP operations show no regression; screenshot latency was lower with the live view enabled in this run (65 ms vs. 199 ms).

Made with Cursor


Note

Medium Risk
Introduces a new network-exposed service that can drive browser input via CDP and adds new port mappings, so misconfiguration could widen access even though it’s opt-in via ENABLE_LIVE_VIEW.

Overview
Adds an opt-in interactive live view for headless Chromium by introducing a new Go service (cdp-live-view) that streams CDP screencast frames over WebSockets and forwards mouse/keyboard/navigation events back to the browser.

The container image now builds and ships the new binary, registers it under supervisord, and conditionally starts/stops it from wrapper.sh when ENABLE_LIVE_VIEW=true. Local Docker and Unikernel run scripts expose the viewer on port 8080 (and map 443 to 8080 for Unikernel when enabled).

Written by Cursor Bugbot for commit 2d9527e. This will update automatically on new commits.

Add a lightweight Go service (cdp-live-view) that streams browser frames
via CDP Page.startScreencast and forwards mouse/keyboard input via
Input.dispatchMouseEvent/dispatchKeyEvent. This replaces the previous
noVNC/Xvfb approach with near-zero overhead (~40 MB memory under load,
negligible CPU impact) while keeping Chromium in true headless mode
(--headless=new --ozone-platform=headless).

The service is opt-in via ENABLE_LIVE_VIEW=true and serves an HTML5
canvas-based viewer on port 8080.

Made-with: Cursor

@cursor cursor bot left a comment

Risk assessment: Medium-High.

Why:

  • Adds a new production runtime component (cdp-live-view) with substantial new behavior (~900 LOC) in server/cmd.
  • Introduces a new externally reachable live-view surface (/ + /ws) and browser input forwarding path (Input.dispatchMouseEvent / Input.dispatchKeyEvent).
  • Changes container/image startup and deployment scripts (Dockerfile, wrapper.sh, run-docker.sh, run-unikernel.sh) including additional port exposure and conditional service startup.

Decision:

  • Code review is required.
  • Assigned 2 reviewers with history in these codepaths.
  • Not self-approving due to the risk level.

@cursor cursor bot requested review from Sayan- and archandatta March 9, 2026 23:03
@hiroTamada hiroTamada changed the title feat: add CDP-based interactive live view for headless Chromium feat: add CDP-based interactive live view for headless Chromium (Approach 2 probably what BB is doing) Mar 10, 2026
- Protect sessionID/targetID/currentURL/pageTitle with sync.RWMutex
  to prevent data races between concurrent goroutines
- Capture sessionID before goroutine in screencast frame ack to avoid
  sending ack to wrong CDP session after target switch
- Add Chrome-like browser toolbar with back/forward/reload and address bar
- Fix double-typing by using rawKeyDown + char instead of keyDown + char
- Set Emulation.setDeviceMetricsOverride for full 1920x1080 viewport

Made-with: Cursor
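
The first two fixes in the commit above (guarding shared target state with a sync.RWMutex, and capturing the session ID before starting the ack goroutine) can be sketched as follows. The `targetState` type and its methods are hypothetical names for illustration; the PR's actual struct layout is not shown in this conversation.

```go
package main

import (
	"fmt"
	"sync"
)

// targetState is a hypothetical holder for the fields the commit protects.
type targetState struct {
	mu        sync.RWMutex
	sessionID string
	targetID  string
}

// set switches to a new target under the write lock.
func (s *targetState) set(sessionID, targetID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessionID, s.targetID = sessionID, targetID
}

// session returns the current session ID under the read lock.
func (s *targetState) session() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.sessionID
}

// ackFrame reads the session ID synchronously, then acknowledges the frame
// in a goroutine. Reading inside the goroutine instead could pick up a
// session that was switched in after the frame arrived.
// The returned channel is closed once send has returned.
func (s *targetState) ackFrame(frameID int, send func(sessionID string, frameID int)) <-chan struct{} {
	sid := s.session() // captured before the goroutine starts
	done := make(chan struct{})
	go func() {
		defer close(done)
		send(sid, frameID)
	}()
	return done
}

func main() {
	s := &targetState{}
	s.set("session-A", "target-A")

	var ackedOn string
	done := s.ackFrame(7, func(sid string, _ int) { ackedOn = sid })

	// A target switch racing with the ack must not redirect it.
	s.set("session-B", "target-B")
	<-done
	fmt.Println("ack sent on:", ackedOn)
}
```

In this sketch the `send` callback stands in for whatever issues Page.screencastFrameAck on the CDP connection.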
- Reset s.cdp to nil when CDP readLoop exits so connectCDP can
  reconnect on next viewer action instead of staying broken forever
- Clear sessions map and target state on disconnect
- Capture msgId before setTimeout closure in benchmark-local.mjs
  to avoid checking/deleting the wrong pending entry

Made-with: Cursor
- Return proper errors for CDP protocol errors instead of nil
- Drain all pending call channels when readLoop exits to prevent
  goroutine leaks on reconnection cycles
- Add nil checks for s.cdp in screencast ack goroutine and
  switchToTarget to prevent panics during concurrent disconnect

Made-with: Cursor
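
A minimal sketch of the pending-call handling described in the commit above: protocol errors are surfaced as Go errors, and every outstanding call is failed when the read loop exits so that no caller blocks forever. All type and function names here are hypothetical; only the behavior is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cdpError mirrors the "error" object of a CDP response.
type cdpError struct {
	Code    int
	Message string
}

// result is what a waiting caller receives.
type result struct {
	Value []byte
	Err   error
}

var errDisconnected = errors.New("cdp: connection closed")

// pendingCalls maps request IDs to the channel their caller waits on.
type pendingCalls struct {
	mu sync.Mutex
	m  map[int]chan result
}

func newPendingCalls() *pendingCalls {
	return &pendingCalls{m: make(map[int]chan result)}
}

// register records a call; the channel is buffered so that deliver and
// drain never block on a caller that has stopped listening.
func (p *pendingCalls) register(id int) <-chan result {
	ch := make(chan result, 1)
	p.mu.Lock()
	p.m[id] = ch
	p.mu.Unlock()
	return ch
}

// deliver routes a response to its caller. A protocol error is returned
// as a Go error instead of being dropped.
func (p *pendingCalls) deliver(id int, value []byte, perr *cdpError) {
	p.mu.Lock()
	ch, ok := p.m[id]
	delete(p.m, id)
	p.mu.Unlock()
	if !ok {
		return
	}
	if perr != nil {
		ch <- result{Err: fmt.Errorf("cdp error %d: %s", perr.Code, perr.Message)}
		return
	}
	ch <- result{Value: value}
}

// drain fails every outstanding call; intended to run when the read loop exits.
func (p *pendingCalls) drain() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for id, ch := range p.m {
		ch <- result{Err: errDisconnected}
		delete(p.m, id)
	}
}

func main() {
	p := newPendingCalls()
	a, b := p.register(1), p.register(2)
	p.deliver(1, nil, &cdpError{Code: -32000, Message: "example protocol error"})
	p.drain() // simulates the read loop exiting with call 2 still pending
	fmt.Println((<-a).Err)
	fmt.Println((<-b).Err)
}
```

Removing each entry at the moment it is answered guarantees at most one send per channel, which is why a buffer of one is enough.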
These were ad-hoc local/unikernel benchmark scripts not intended
for the repository.

Made-with: Cursor
- Hold viewer mutex for both frame_meta text and JPEG binary writes
  to prevent url_update messages from interleaving between the pair
- Also mutex-protect text writes in broadcastURLUpdate
- Clean up pending map entry on context cancellation in call/callSession
  to prevent memory leak from accumulated timed-out entries

Made-with: Cursor
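
The frame_meta and JPEG pairing fix above can be illustrated with a small sketch. The `messageWriter` interface, the `viewer` methods, and the `recorder` test double are hypothetical stand-ins for the PR's WebSocket connection; the point is only that one lock covers both writes of a frame and every other text write.

```go
package main

import (
	"fmt"
	"sync"
)

// WebSocket opcode values for text and binary messages (RFC 6455).
const (
	textMessage   = 1
	binaryMessage = 2
)

// messageWriter is a hypothetical abstraction over the viewer's WebSocket.
type messageWriter interface {
	WriteMessage(messageType int, data []byte) error
}

type viewer struct {
	mu   sync.Mutex
	conn messageWriter
}

// sendFrame writes the metadata and the JPEG under a single lock so that
// no other message can land between them.
func (v *viewer) sendFrame(meta, jpeg []byte) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	if err := v.conn.WriteMessage(textMessage, meta); err != nil {
		return err
	}
	return v.conn.WriteMessage(binaryMessage, jpeg)
}

// sendText is used for standalone messages such as url_update.
func (v *viewer) sendText(msg []byte) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.conn.WriteMessage(textMessage, msg)
}

// recorder logs what was written; it is only touched under viewer.mu.
type recorder struct{ log []string }

func (r *recorder) WriteMessage(messageType int, data []byte) error {
	if messageType == binaryMessage {
		r.log = append(r.log, "jpeg")
	} else {
		r.log = append(r.log, string(data))
	}
	return nil
}

// pairsIntact reports whether every frame_meta is immediately followed by a jpeg.
func pairsIntact(log []string) bool {
	for i, entry := range log {
		if entry == "frame_meta" && (i+1 >= len(log) || log[i+1] != "jpeg") {
			return false
		}
	}
	return true
}

func main() {
	rec := &recorder{}
	v := &viewer{conn: rec}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			_ = v.sendFrame([]byte("frame_meta"), []byte{0xFF, 0xD8})
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			_ = v.sendText([]byte("url_update"))
		}
	}()
	wg.Wait()
	fmt.Println("pairs intact:", pairsIntact(rec.log))
}
```

A client that expects the binary message right after frame_meta would otherwise misread an interleaved url_update as frame data, which is presumably the failure mode the commit guards against.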
Use the sessionId returned by Target.attachToTarget instead of
busy-waiting up to 5s polling the sessions map. Also remove the
unused sendBinary method on the viewer struct.

Made-with: Cursor
Only allow http/https schemes in the live view URL bar to prevent
file:// and javascript: URI abuse. Default ENABLE_LIVE_VIEW to false
in run-docker.sh so the feature is truly opt-in.

Made-with: Cursor
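
The scheme restriction in the commit above could look roughly like the following. This is a sketch using Go's standard net/url package; the function name `validateNavigateURL` and the extra empty-host check are assumptions, not code from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateNavigateURL accepts only absolute http/https URLs and returns the
// normalized form to pass on to the browser.
func validateNavigateURL(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", fmt.Errorf("invalid URL: %w", err)
	}
	switch u.Scheme { // url.Parse lowercases the scheme
	case "http", "https":
	default:
		return "", fmt.Errorf("scheme %q is not allowed", u.Scheme)
	}
	if u.Host == "" {
		return "", errors.New("URL has no host")
	}
	return u.String(), nil
}

func main() {
	for _, in := range []string{
		"https://example.com/a?b=1",
		"javascript:alert(1)",
		"file:///etc/passwd",
		"example.com", // no scheme: rejected in this sketch
	} {
		out, err := validateNavigateURL(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

An allow-list is used instead of blocking known-bad schemes so that anything not explicitly permitted (data:, chrome:, and so on) is rejected by default. Validating on the server side matters because the viewer page is untrusted input from the server's point of view.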
…ents

Only null s.cdp in onDisconnect if it still points to the disconnected
instance, preventing a late-firing callback from clobbering a newer
connection. Use e.detail for clickCount in mousedown/mouseup so the
browser sees correct double-click sequences, and remove the redundant
dblclick handler that sent phantom extra events.

Made-with: Cursor
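
The disconnect guard described in the commit above is a compare-before-clear pattern. A minimal sketch, with hypothetical `server` and `conn` types standing in for the PR's real ones:

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for a CDP connection; only its identity matters here.
type conn struct{ id int }

type server struct {
	mu  sync.Mutex
	cdp *conn
}

// setConn installs a (re)established connection.
func (s *server) setConn(c *conn) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cdp = c
}

// current returns the active connection, or nil if there is none.
func (s *server) current() *conn {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.cdp
}

// onDisconnect clears s.cdp only if it still points at the connection that
// just closed. It reports whether the field was cleared.
func (s *server) onDisconnect(closed *conn) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.cdp != closed {
		return false // a newer connection is already installed; leave it alone
	}
	s.cdp = nil
	return true
}

func main() {
	s := &server{}
	old := &conn{id: 1}
	s.setConn(old)

	// Reconnect happens before the old connection's callback fires.
	newer := &conn{id: 2}
	s.setConn(newer)

	fmt.Println("late callback cleared:", s.onDisconnect(old))
	fmt.Println("active connection id:", s.current().id)
}
```

Without the pointer comparison, the late callback from the old connection would set the field to nil and force an unnecessary reconnect on the next viewer action.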
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

hiroTamada added a commit that referenced this pull request Mar 11, 2026
Benchmark tool and results comparing CDP operation latency across four
image variants: headless baseline, Approach 1 (Xvfb/noVNC, PR #174),
Approach 2 (CDP screencast, PR #176), and headful.

Covers 40+ CDP operations across 9 categories (screenshot, JS eval, DOM,
input, network, page, emulation, target, composite) plus concurrent load
testing. Includes results from Docker (4 vCPU / 1 GB headless, 8 vCPU /
8 GB headful), Docker with constrained headful (4 vCPU / 1 GB), and
KraftCloud (Unikraft) environments.

Key findings:
- Approach 2 (CDP screencast) adds near-zero overhead vs baseline
- Approach 1 (Xvfb/noVNC) adds ~30% overhead on input/screenshot ops
- Headful under headless constraints is not viable (39% idle memory)
- On Unikraft, Approach 2 remains the better choice for live view

Made-with: Cursor