Add Kaiju cluster support: pixi env, PETSc build script, and documentation #79

Open
jcgraciosa wants to merge 10 commits into underworldcode:development from jcgraciosa:development

@jcgraciosa
Contributor

Summary

  • Adds kaiju pixi feature and environment to pixi.toml (linux-64, conda-forge Python packages — sympy, scipy, pint, pydantic, gmsh, etc.)
  • Adds petsc-custom/build-petsc-kaiju.sh — PETSc configure script for Kaiju (spack OpenMPI, AMR tools, petsc4py)
  • Adds docs/developer/guides/kaiju-cluster-setup.md — full installation and usage guide for the Kaiju HPC cluster
  • Updates docs/developer/index.md to include the new guide in the toctree

Background
Kaiju is a Rocky Linux 8.10 HPC cluster (Slurm + spack) used by the Underworld development team. MPI-dependent packages (mpi4py, PETSc+AMR+petsc4py, h5py) must be built from source against spack's OpenMPI so that parallel jobs launched through Slurm use the cluster's MPI and its interconnect.

The architecture is:

pixi kaiju env → Python 3.12, sympy, scipy, pint, ... (conda-forge, no MPI)
spack → openmpi@4.1.6 (cluster MPI)
source build → mpi4py, PETSc+AMR+petsc4py, h5py (linked to spack MPI)
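
The three layers above could be assembled in roughly the following order. This is a minimal sketch, not the install script from this PR: the `run` and `setup_kaiju_stack` helpers and the `DRY_RUN` switch are hypothetical names, the way `build-petsc-kaiju.sh` is invoked is an assumption, and the h5py step is left out because its build flags are not stated here. It also assumes spack's shell integration has already been sourced so that `spack load` is available.

```shell
# Hypothetical helper: print a command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: set up the three layers in order.
setup_kaiju_stack() {
    # Layer 1: conda-forge Python stack from the pixi "kaiju" environment (no MPI).
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo '+ eval "$(pixi shell-hook -e kaiju)"'
    else
        eval "$(pixi shell-hook -e kaiju)"
    fi

    # Layer 2: the cluster MPI provided by spack.
    run spack load openmpi@4.1.6

    # Layer 3: MPI-dependent packages built from source against that MPI.
    # MPICC tells the mpi4py build which compiler wrapper to use.
    run env MPICC=mpicc python -m pip install --no-binary=mpi4py mpi4py

    # Assumed invocation of the PETSc build script added in this PR.
    run bash petsc-custom/build-petsc-kaiju.sh
}
```

Running `DRY_RUN=1; setup_kaiju_stack` prints the planned commands without touching the system, which is a cheap way to review the order before a long PETSc build.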

Key fixes included

  • build-petsc-kaiju.sh: passes -DUSE_SCOTCH=OFF in the MMG cmake arguments to fix a PARMMG configure failure with pixi's conda ld 14.x. That linker requires transitive shared library dependencies to be linked explicitly, so a libmmg.so built with SCOTCH makes the MMG_WORKS link test fail.
  • load_env(): builds LD_LIBRARY_PATH from all spack transitive dep prefixes via CMAKE_PREFIX_PATH — required for pixi's ld at link time
  • pixi shell-hook (not pixi shell) is used so the environment can be activated inside Slurm batch jobs
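
The `load_env()` fix can be illustrated with a small function that derives library directories from `CMAKE_PREFIX_PATH`. This is a sketch of the idea rather than the code in the PR: the function name `build_ld_path_from_prefixes` is hypothetical, and it assumes the prefixes are colon-separated (as spack exports them) and that libraries live in `lib` or `lib64` under each prefix.

```shell
# Hypothetical helper: turn a colon-separated list of install prefixes into a
# colon-separated list of existing lib/lib64 directories.
# Uses the first argument if given, otherwise $CMAKE_PREFIX_PATH.
build_ld_path_from_prefixes() {
    local prefixes="${1:-${CMAKE_PREFIX_PATH:-}}"
    local result="" prefix sub
    local IFS=':'   # split the prefix list on colons only

    for prefix in $prefixes; do
        [ -n "$prefix" ] || continue
        for sub in lib lib64; do
            # Keep only directories that actually exist.
            if [ -d "$prefix/$sub" ]; then
                result="${result:+$result:}$prefix/$sub"
            fi
        done
    done

    printf '%s\n' "$result"
}
```

A caller would typically prepend the result to any existing value, for example `export LD_LIBRARY_PATH="$(build_ld_path_from_prefixes)${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"`, so that the linker can resolve the transitive dependencies of each spack package.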

Test plan

  • Per-user install verified on Kaiju head node (verify_install passed)
  • Shared install deployed to /opt/cluster/software/underworld3, accessible via module load underworld3/development-12Mar26
  • Slurm job script verified (multi-node MPI run)
  • Docs build (pixi run docs-build) — verify new page appears in developer guides
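
The Slurm check in the test plan depends on activating the pixi environment inside a batch job. `pixi shell` starts a new interactive shell, which a non-interactive job cannot use, whereas `pixi shell-hook` prints an activation script that can be evaluated in the current shell. The sketch below only illustrates that pattern and is not the verified job script: the `emit_job_script` helper, the `#SBATCH` values, the `UW3_DIR` variable, the `model.py` script name, and the use of `srun` as launcher are all assumptions.

```shell
# Hypothetical helper: write an illustrative Slurm job script to stdout.
emit_job_script() {
    cat <<'EOF'
#!/bin/bash
#SBATCH --job-name=uw3-test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# Activate the pixi "kaiju" environment in this non-interactive shell.
# UW3_DIR is assumed to point at the checkout that contains pixi.toml.
eval "$(pixi shell-hook --manifest-path "$UW3_DIR/pixi.toml" -e kaiju)"

# Cluster MPI from spack (assumes spack's shell integration is set up).
spack load openmpi@4.1.6

# Launch one MPI rank per Slurm task.
srun python model.py
EOF
}
```

Redirecting the output to a file (`emit_job_script > job.sh`) gives a script that could be adapted and submitted with `sbatch job.sh`.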

Notes
Install scripts and Slurm job templates are maintained in the kaiju-admin-notes repo (admin/cluster-specific tooling kept separate from the framework).

Underworld development team with AI support from Claude Code

Juan Carlos Graciosa and others added 10 commits March 11, 2026 14:38
pixi's conda ld (14.3.0) requires explicit transitive shared lib deps.
libmmg.so built with SCOTCH caused MMG_WORKS link test to fail in
PARMMG's FindMMG.cmake because libscotch.so wasn't explicitly linked.
MMG's SCOTCH is only used for mesh renumbering (optional perf feature);
PARMMG uses ptscotch separately for parallel partitioning, unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add shared installation section (admin, Lmod module)
- Add troubleshooting entries from install experience:
  h5py replacing mpi4py, numpy ABI mismatch, PARMMG/pixi ld issue

Underworld development team with AI support from Claude Code
jcgraciosa requested a review from lmoresi as a code owner on March 12, 2026 03:13