Add evaluator security note and remediation plan#115

Open
josusanmartin wants to merge 1 commit into gpu-mode:main from josusanmartin:evaluator-security-report

@josusanmartin
Summary

This PR adds a repository-level security note describing the in-process evaluator trust-boundary issue that affects multiple challenge families, along with an immediate/short-term/long-term remediation plan.

What this adds

  • EVALUATOR_SECURITY.md with:
    • a concise description of the evaluator issue
    • the evaluator families that share the pattern
    • a conservative record of what was directly verified on the live service
    • a quantitative note on implausible public timings such as matmul_v2 at 0.001 µs
    • a staged remediation proposal
  • a short pointer from README.md
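
To illustrate why a timing such as 0.001 µs is implausible, here is a minimal sketch of a lower-bound check. It assumes a dense matmul costing 2·m·n·k floating-point operations and compares the reported time against the time required at a deliberately generous peak throughput. The problem size (1024 in each dimension), the peak figure of 1e15 FLOP/s, and the function names are hypothetical and are not taken from the evaluator or from the actual matmul_v2 configuration.

```python
def min_time_us(m: int, n: int, k: int, peak_flops_per_s: float) -> float:
    """Lower bound on matmul runtime in microseconds at a given peak throughput."""
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    return flops / peak_flops_per_s * 1e6


def is_plausible(reported_us: float, m: int, n: int, k: int,
                 peak_flops_per_s: float) -> bool:
    """A reported time below the compute-bound floor cannot be a real measurement."""
    return reported_us >= min_time_us(m, n, k, peak_flops_per_s)


if __name__ == "__main__":
    # Hypothetical values: 1024^3 problem, 1e15 FLOP/s assumed as an upper ceiling.
    floor_us = min_time_us(1024, 1024, 1024, 1e15)
    print(f"compute-bound floor: {floor_us:.3f} us")
    print("0.001 us plausible?", is_plausible(0.001, 1024, 1024, 1024, 1e15))
```

Under these assumptions the floor is roughly 2 µs, so a 0.001 µs result would imply throughput about three orders of magnitude above the assumed ceiling. A check like this flags suspicious entries but does not by itself establish how they were produced.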

Scope

This PR is documentation-first. It does not include exploit payloads or attempt to land a large evaluator refactor in one change.

Why docs first

The current issue is architectural and spans multiple evaluator families (amd_202602, pmpp_v2, nvidia, amd, amd_distributed, helion, bioml, pmpp). A docs-first PR creates a clear remediation target without mixing disclosure and a partial code fix.

Proposed follow-up work

  1. add evaluator self-integrity checks as a short-term mitigation
  2. split trusted evaluator logic from untrusted submission.py execution in a follow-up refactor
  3. re-run affected leaderboards after patching
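
As a rough illustration of the first follow-up item, the sketch below compares SHA-256 digests of evaluator files against a manifest recorded ahead of time. The function names, the manifest layout (relative path mapped to hex digest), and the file name in the usage example are assumptions for illustration. They do not describe the repository's actual evaluator.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path, rel_paths: list[str]) -> dict[str, str]:
    """Record digests for the given files; meant to run in a trusted context."""
    return {rel: sha256_of(root / rel) for rel in rel_paths}


def find_tampered(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose digest no longer matches."""
    return [
        rel
        for rel, expected in manifest.items()
        if not (root / rel).is_file() or sha256_of(root / rel) != expected
    ]
```

A hypothetical usage: call `build_manifest(root, ["eval.py"])` before any submission runs, then call `find_tampered(root, manifest)` afterwards and reject the run if the list is non-empty. This only detects changes to files on disk. Code running in the same process can still alter in-memory state or the check itself, which is why the list above treats it as a short-term mitigation and still calls for separating trusted evaluator logic from submission.py.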
