Skip to content

EPIC: Support regex syntax /pattern/ in search queries #110

@shouze

Description

@shouze

Problem

GitHub's web UI supports regex search via /pattern/ syntax (e.g. /from.*axios/), but the GitHub REST API (api.github.com/search/code) does not — confirmed by spike testing.

Currently, passing a regex query like /from.*axios/ to the CLI returns No results found because the API silently ignores the pattern.

Goal

Make regex queries work transparently in github-code-search by:

  1. Detecting /pattern/ syntax in the query
  2. Automatically deriving a meaningful literal search term to send to the GitHub API (casting a wide net)
  3. Post-processing the raw API results locally with the original regex (narrowing to true matches)
  4. Displaying the regex mode clearly to the user

Real-world use cases driving this feature

  • Semver auditfilename:package.json /"axios":\s*"[~^]?[0-9]/ + --regex-hint axios — track which repos still use an old/vulnerable version of a dependency
  • Deprecated library hunt/require\(['"]old-lib['"]\)/ or /from.*['"]old-lib['"]/
  • TODO/FIXME triage/TODO|FIXME|HACK/ — surface all annotation types at once
  • Import pattern matching/from.*['"]axios/ — find all axios import variants

Architecture

This is implemented in two steps:

Key design decision: deterministic buildApiQuery()

The query sent to the GitHub API is derived algorithmically — no AI involved, fully deterministic and testable:

Input API query Logic
/from.*['"]axios/ axios Longest literal sequence outside [...], quantifiers, alternations
/TODO|FIXME|HACK/ TODO OR FIXME OR HACK Top-level alternation → GitHub OR operator
/require\(['"]old-lib['"]\)/ old-lib Longest literal
filename:package.json /["']axios["']:/ filename:package.json axios Qualifiers preserved, regex term extracted
/[~^]?[0-9]+\.[0-9]+/ ⚠️ warn + --regex-hint required No exploitable literal
/useState/ useState Trivial case

Known limitations (to be documented)

  • Regex filtering applies to the first 1 000 results only (GitHub API hard cap)
  • If buildApiQuery() produces a very broad term, the API returns a lot of noise before filtering — --regex-hint lets the user override
  • Flag g has no effect (GitHub doesn't return all inline occurrences); flag i is applied client-side

Definition of Done (EPIC)

The EPIC is considered done when:

Sub-issues

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions