Data-driven Clusters v4.1 page (11 clusters from Google Sheet) by LukasWallrich · Pull Request #720 · forrtproject/forrtproject.github.io

LukasWallrich · 2026-03-22T22:24:43Z

Summary

Replaces the 7 hardcoded cluster pages (v3) with a fully data-driven approach powered by the FORRT Clusters v4.1 Google Doc and a structured Google Sheet.

What changed

- 11 clusters (was 7), 93 sub-clusters, ~1300 publications with DOI-resolved APA references
- New parsing script (scripts/parse_clusters_to_sheet.py) that:
  - Fetches the Google Doc as plain text and parses the hierarchical structure
  - Resolves ~1050 DOIs via doi.org content negotiation for clean APA references + BibTeX
  - Writes structured data to a Google Sheet (3 tabs: Clusters, Sub-Clusters, Publications with data validation)
  - Exports data/clusters_v4.json for Hugo to consume at build time
- New Hugo shortcode (layouts/shortcodes/clusters_display.html) that renders all clusters from the JSON data with:
  - Sidebar navigation with collapsible cluster tree and colored arrows
  - Tabbed sub-clusters (matching the previous UI pattern) with wrapping support
  - Sub-cluster headings, italic descriptions, and bulleted reference lists
  - Full-text search across clusters, sub-clusters, and all references (with match highlighting and click-to-scroll)
  - DOI links rendered as clickable URLs; HTML formatting (e.g. <i> for italics) preserved from doi.org
  - Responsive layout (sidebar collapses on mobile with toggle button)
- Updated intro text to reflect 11 clusters (was 9)
- Deactivated old cluster1.md–cluster7.md (set active = false)
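
To make the export step more concrete, here is a minimal sketch of how three flat row lists (one per sheet tab) could be nested into a single tree before being written as JSON. The field names (`id`, `cluster_id`, `sub_cluster_id`, `reference`, `doi`) and the function `build_cluster_tree` are assumptions for illustration; the actual schema of data/clusters_v4.json may differ, and the sample rows are made up.

```python
import json
from collections import defaultdict


def build_cluster_tree(clusters, sub_clusters, publications):
    """Nest flat rows (hypothetical field names) into clusters -> sub-clusters -> publications."""
    # Group publications by the sub-cluster they belong to.
    pubs_by_sub = defaultdict(list)
    for pub in publications:
        pubs_by_sub[pub["sub_cluster_id"]].append(
            {"reference": pub["reference"], "doi": pub.get("doi", "")}
        )

    # Group sub-clusters by their parent cluster, attaching their publications.
    subs_by_cluster = defaultdict(list)
    for sub in sub_clusters:
        subs_by_cluster[sub["cluster_id"]].append(
            {
                "id": sub["id"],
                "title": sub["title"],
                "description": sub.get("description", ""),
                "publications": pubs_by_sub.get(sub["id"], []),
            }
        )

    # Keep clusters in their original row order.
    return [
        {
            "id": cluster["id"],
            "title": cluster["title"],
            "sub_clusters": subs_by_cluster.get(cluster["id"], []),
        }
        for cluster in clusters
    ]


# Made-up placeholder rows, not real FORRT data.
clusters = [{"id": 1, "title": "Example cluster"}]
sub_clusters = [
    {"id": "1.1", "cluster_id": 1, "title": "Example sub-cluster",
     "description": "Placeholder description"}
]
publications = [
    {"sub_cluster_id": "1.1",
     "reference": "Placeholder, A. (2020). Example title.",
     "doi": "10.1234/example"}
]

tree = build_cluster_tree(clusters, sub_clusters, publications)
print(json.dumps(tree, indent=2))
```

A nested layout like this would let a Hugo template walk clusters, sub-clusters, and publications with plain `range` loops, without joining tables at build time.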

Data pipeline

Google Doc (v4.1)
    ↓  parse_clusters_to_sheet.py
Google Sheet (3 tabs with data validation)
    ↓  --export-json flag
data/clusters_v4.json (committed to repo)
    ↓  Hugo build
clusters_display.html shortcode renders the page

The script supports --dry-run, --skip-doi, --json-only, and --export-json flags. DOI lookups are cached in scripts/doi_cache.json (gitignored) for fast reruns.
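
For readers unfamiliar with doi.org content negotiation, the cached lookup could look roughly like the sketch below. It is not the code in parse_clusters_to_sheet.py: the function names, the cache layout, and the lowercase cache key are assumptions. The two `Accept` values are the media types doi.org content negotiation uses for formatted citations and BibTeX.

```python
import json
from pathlib import Path
from urllib.request import Request, urlopen

APA_ACCEPT = "text/x-bibliography; style=apa"
BIBTEX_ACCEPT = "application/x-bibtex"


def fetch_from_doi_org(doi, accept):
    """Ask doi.org for one representation of a DOI via the Accept header."""
    request = Request(f"https://doi.org/{doi}", headers={"Accept": accept})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8").strip()


def resolve_doi(doi, cache, fetch=fetch_from_doi_org):
    """Return {'apa': ..., 'bibtex': ...}, hitting the network only on a cache miss."""
    key = doi.lower()  # DOIs are case-insensitive, so normalize the cache key
    if key not in cache:
        cache[key] = {
            "apa": fetch(doi, APA_ACCEPT),
            "bibtex": fetch(doi, BIBTEX_ACCEPT),
        }
    return cache[key]


def load_cache(path):
    """Read the JSON cache file, or start empty if it does not exist yet."""
    path = Path(path)
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def save_cache(cache, path):
    """Persist the cache so later runs can skip already-resolved DOIs."""
    Path(path).write_text(
        json.dumps(cache, indent=2, ensure_ascii=False), encoding="utf-8"
    )


# Offline demo with a stub fetcher (hypothetical), so nothing touches the network.
def stub_fetch(doi, accept):
    return f"[{accept}] {doi}"


demo_cache = {}
print(resolve_doi("10.1234/example", demo_cache, fetch=stub_fetch)["apa"])
```

Passing the fetcher as a parameter keeps the caching logic testable without network access; a real run would use the default `fetch_from_doi_org` and call `save_cache` once at the end.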

Screenshots

The page preserves the established tab-based UI for sub-clusters while adding sidebar navigation and full-text search. Each cluster section has an alternating pastel background color.

Test plan

- Run python3 scripts/parse_clusters_to_sheet.py --dry-run to verify parsing (expect 11 clusters, ~93 sub-clusters, ~1297 publications)
- Run hugo server and verify /clusters/ renders correctly
- Test tab switching within clusters
- Test sidebar navigation (expand clusters, click sub-clusters)
- Test full-text search (e.g. search for an author name, click result to scroll)
- Test on mobile viewport (sidebar toggle, content layout)
- Verify print view shows all tab content

🤖 Generated with Claude Code

Replace the 7 hardcoded cluster markdown files with a data-driven approach that reads from a generated JSON file (clusters_v4.json). The data originates from the FORRT Clusters v4.1 Google Doc and is parsed into a Google Sheet, then exported as JSON for Hugo to consume at build time.

Key changes:
- New script (parse_clusters_to_sheet.py) that parses the GDoc, resolves DOIs via doi.org for clean APA references + BibTeX, writes to Google Sheet, and exports JSON for Hugo
- New Hugo shortcode (clusters_display.html) renders all clusters with sidebar navigation, tabbed sub-clusters, and full-text search
- Updated intro text to reflect 11 clusters (was 9)
- Deactivated old cluster1-7.md files (replaced by data-driven rendering)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-22T22:25:07Z

👍 All image files/references (if any) are in webp format, in line with our policy.

LukasWallrich · 2026-03-22T22:28:11Z

✅ Staging Deployment Status

This PR has been successfully deployed to staging as part of an aggregated deployment.

Deployed at: 2026-03-23 23:50:03 UTC
Staging URL: https://staging.forrt.org

The staging site shows the combined state of all compatible open PRs.

The clusters page now has its own full-text search that covers clusters, sub-clusters, and all references. The site-wide Academic search is redundant and has been disabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-22T22:36:29Z

📝 Spell Check Results

Found 6 potential spelling issue(s) when checking 30 changed file(s):

📄 `static/js/clusters-page.js`

| Line | Issue |
| --- | --- |
| 80 | tabEl ==> table |
| 81 | tabEl ==> table |
| 83 | tabEl ==> table |
| 85 | tabEl ==> table |
| 474 | tabEl ==> table |
| 475 | tabEl ==> table |

ℹ️ How to address these issues:

Fix the typo: If it's a genuine typo, please correct it.
Add to whitelist: If it's a valid word (e.g., a name, technical term), add it to .codespell-ignore.txt
False positive: If this is a false positive, please report it in the PR comments.

_🤖 This check was performed by codespell_

This reverts commit 9ad47b9.

richarddushime · 2026-03-23T18:00:52Z

We now have two search boxes.
I'm proposing that we remove the custom search on the left and keep the search at the top of the clusters page.

Meanwhile, I will continue enhancing it; it would be good if you could check it as soon as possible.
@LukasWallrich @flavioazevedo

LukasWallrich · 2026-03-23T18:33:48Z

Thanks @richarddushime! I agree that we need to get rid of one of the searches.

There is also now too much going on in this area - too many boxes. Maybe the syllabus does not need to be in a box?

Can we also remove the outdated figure and really condense the text? I think the following is all we need above the clusters - unless @flavioazevedo disagrees (but Richard, please make the change so that he can look at a complete new draft)

Teaching Open and Reproducible Science shouldn't require educators to spend months sifting through a decade of literature. FORRT simplifies this process by providing a curated, expert-backed framework. Developed by over 50 scholars, our taxonomy organizes open scholarship into 11 distinct clusters, offering a clear pathway for integrating these tenets into your teaching and mentoring, regardless of your field or level of expertise.

richarddushime · 2026-03-23T20:58:14Z

I have just finished making some other adjustments:
I removed the left search and enhanced the remaining search. I limited it so that it does not go through references, because it was returning a lot of results from references and making users lose the relevant cluster text.
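
The restriction described above can be sketched as a filter that matches only cluster and sub-cluster text unless references are explicitly included. This is a Python illustration of the logic, not the code in static/js/clusters-page.js; the data layout, the function `search_clusters`, and the sample entries are all made up for the example.

```python
def search_clusters(tree, query, include_references=False):
    """Return (kind, text) hits; references are skipped unless requested."""
    needle = query.strip().casefold()
    if not needle:
        return []

    hits = []
    for cluster in tree:
        if needle in cluster["title"].casefold():
            hits.append(("cluster", cluster["title"]))
        for sub in cluster.get("sub_clusters", []):
            # Match against the sub-cluster title and description together.
            text = f'{sub["title"]} {sub.get("description", "")}'.casefold()
            if needle in text:
                hits.append(("sub-cluster", sub["title"]))
            if include_references:
                for pub in sub.get("publications", []):
                    if needle in pub["reference"].casefold():
                        hits.append(("reference", pub["reference"]))
    return hits


# Made-up placeholder data, not real FORRT content.
sample_tree = [
    {
        "title": "Example cluster",
        "sub_clusters": [
            {
                "title": "Preregistration basics",
                "description": "Placeholder description",
                "publications": [
                    {"reference": "Placeholder, A. (2020). Preregistration in practice."}
                ],
            }
        ],
    }
]

print(search_clusters(sample_tree, "preregistration"))
print(search_clusters(sample_tree, "preregistration", include_references=True))
```

With references excluded, a query produces at most one hit per cluster or sub-cluster, which keeps the result list short enough to scan.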

I would also like clarification about the paragraph below:

Teaching Open and Reproducible Science shouldn't require educators to spend months sifting through a decade of literature. FORRT simplifies this process by providing a curated, expert-backed framework. Developed by over 50 scholars, our taxonomy organizes open scholarship into 11 distinct clusters, offering a clear pathway for integrating these tenets into your teaching and mentoring, regardless of your field or level of expertise.

Do you mean that all the content before the FORRT syllabus and the figure should be removed and replaced by this paragraph?

About the figure, I think it's good to keep it while we wait for the updated one (maybe Flavio can push for its design quickly?).

richarddushime · 2026-03-23T23:53:36Z

Additionally, here is something I am proposing.

The latest commit introduces dedicated, indexable URLs for each FORRT cluster (/clusters/cluster-N/) alongside the existing taxonomy hub (/clusters/), so each cluster is a first-class page for search and sharing.

I added this because the clusters are currently covered by only one URL in the sitemap (the main clusters page); with this change, each cluster becomes indexable.

This gives us:
Canonical URLs per topic — One clear URL per cluster (and its sub-clusters in-page), instead of relying on a single long hub page or hash-only navigation for discovery.
Unique metadata per URL — Each cluster page can carry its own <title>, meta description, and Open Graph / Twitter fields from front matter, improving relevance for queries and snippet quality.
Structured data — Per-page JSON-LD (cluster_seo_jsonld) ties each URL to explicit taxonomy/entity signals for that cluster.
Topic-cluster information architecture — The hub remains the overview and entry point; cluster pages act as satellites with internal links between hub and subpages, supporting crawl paths and topical grouping.
Stable deep links — Shareable URLs (including hash targets for sub-clusters where used) support accurate social previews, backlinks, and citations to the right slice of the taxonomy.

You can check the preview at https://staging.forrt.org/clusters/cluster-N/, where N is the cluster number, e.g. https://staging.forrt.org/clusters/cluster-2/
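
As a rough illustration of the per-cluster URL and structured-data idea, the sketch below builds a cluster URL in the /clusters/cluster-N/ pattern and a JSON-LD object that links it back to the hub. The schema.org type `CollectionPage`, the chosen properties, and the helper names are assumptions; the actual cluster_seo_jsonld partial may emit something different, and a production build would use the live domain rather than the staging one.

```python
import json

# Staging host taken from the preview link in this thread.
BASE_URL = "https://staging.forrt.org"


def cluster_url(number):
    """Build the dedicated URL for one cluster."""
    return f"{BASE_URL}/clusters/cluster-{number}/"


def cluster_jsonld(number, title, description):
    """Describe one cluster page and tie it to the /clusters/ hub (assumed shape)."""
    return {
        "@context": "https://schema.org",
        "@type": "CollectionPage",
        "name": title,
        "description": description,
        "url": cluster_url(number),
        "isPartOf": {
            "@type": "CollectionPage",
            "url": f"{BASE_URL}/clusters/",
        },
    }


# Placeholder title and description, not real FORRT text.
example = cluster_jsonld(2, "Example cluster title", "Placeholder description")
print(json.dumps(example, indent=2))
```

The serialized object would be embedded in the page head inside a `<script type="application/ld+json">` element.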

LukasWallrich requested a review from a team as a code owner March 22, 2026 22:24

forrtproject deleted a comment from github-actions bot Mar 22, 2026

LukasWallrich and others added 3 commits March 22, 2026 22:42

Revert "Disable site-wide search (replaced by clusters page search)"

9b9bdc4

This reverts commit 9ad47b9.

Merge branch 'master' into data-driven-clusters-v4

30d65f9

enhancement:clusters page

671781c

rm: left custom search, enhancement of the UI

2f3a56b

richarddushime added 2 commits March 23, 2026 22:03

jump to active when tab clicked of sub cluster

476afad

SEO booster for clusters

d328ce1

LukasWallrich wants to merge 8 commits into master from data-driven-clusters-v4

2 participants
