Contributor Guide
Use this page when you are working on the flopscope repository itself rather than only consuming the published API.
You will learn:
- How the repository is organized across three packages
- How to set up your development environment and run tests
- How to work with client, server, and Docker workflows
- How auto-generated documentation is maintained
Repository layout
This repository contains three Python packages plus docs and Docker assets:
| Path | Purpose |
|---|---|
| src/flopscope/ | Core library backed by NumPy |
| flopscope-client/src/flopscope/ | Client proxy used in sandboxed participant environments |
| flopscope-server/src/flopscope_server/ | ZMQ server that executes the real library |
| tests/ | Core library test suite |
| flopscope-client/tests/ | Client unit, integration, and adversarial tests |
| flopscope-server/tests/ | Server unit tests |
| website/content/docs/ | Docs source for the published site |
| website/public/ops.json | Generated slim API operation index consumed by /docs/api |
| website/public/api-data/ops/*.json | Generated per-operation detail payloads for canonical operation pages |
| website/.generated/public-api-routes.json | Generated canonical route manifest for /docs/api/... pages |
| website/.generated/op-doc-imports.ts | Generated static import map for operation docs |
| website/.generated/symbol-doc-imports.ts | Generated static import map for public helper and object docs |
| website/.generated/public-api-symbols.json | Generated manifest of non-registry public API pages |
| scripts/generate_api_docs.py | Regenerates API route manifests, per-operation payloads, and public symbol docs |
| docker/ | Local client-server and hardened evaluation images |
Initial setup
For normal work on the core package, docs, and root test suite:
git clone https://github.com/AIcrowd/flopscope.git
cd flopscope
make install
make install runs uv sync --all-extras and configures the local git hooks.
Which environment to use
The root environment covers the core package, linting, docs, and the main test
suite. The client and server each also have their own pyproject.toml.
One important caveat: flopscope-server depends on the local flopscope
package, which is not resolved from a package index in a fresh source checkout.
For server development, run commands from the repository root with
PYTHONPATH=src:flopscope-server/src instead of relying on cd flopscope-server && uv run ....
Common commands
Core library
make lint
make test
make test-numpy-compat
make docs-build
make docs-serve
make ci
If you prefer direct uv commands:
uv run pytest
uv run mkdocs serve
When running the local docs site and you want flopscope error messages to link to your local copy instead of the hosted site, set:
export FLOPSCOPE_DOCS_ROOT=http://localhost:3000/docs
If FLOPSCOPE_DOCS_ROOT is unset, flopscope falls back to the hosted docs at
https://aicrowd.github.io/flopscope/docs.
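The fallback can be pictured with a minimal sketch, assuming the lookup is a plain environment-variable read with a default. The helper name resolve_docs_root is hypothetical and not taken from the flopscope source, and treating an empty value as unset is an assumption of this sketch.

```python
import os

# Hosted docs location stated in this guide.
HOSTED_DOCS_ROOT = "https://aicrowd.github.io/flopscope/docs"


def resolve_docs_root(environ=None):
    """Return the docs root, preferring FLOPSCOPE_DOCS_ROOT when it is set.

    Hypothetical helper for illustration; flopscope's real lookup may differ.
    """
    env = os.environ if environ is None else environ
    # Assumption: an empty value is treated the same as an unset one.
    root = env.get("FLOPSCOPE_DOCS_ROOT") or HOSTED_DOCS_ROOT
    # Drop a trailing slash so callers can append "/path" safely.
    return root.rstrip("/")


print(resolve_docs_root({}))
print(resolve_docs_root({"FLOPSCOPE_DOCS_ROOT": "http://localhost:3000/docs"}))
```

Passing a mapping instead of reading os.environ directly keeps the sketch easy to exercise without touching the real environment.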
Client package
The client package is independently installable, so its test suite can run via its own project file:
uv run --project flopscope-client pytest flopscope-client/tests
Client integration and adversarial tests start a real server subprocess using
the repository root .venv/bin/python, so run make install first.
Server package
Run server tests from the repository root so the local core package is on
PYTHONPATH:
PYTHONPATH=src:flopscope-server/src \
uv run --with pyzmq --with msgpack pytest flopscope-server/tests
To launch the server manually from a source checkout:
PYTHONPATH=src:flopscope-server/src \
uv run --with pyzmq --with msgpack \
python -m flopscope_server --url ipc:///tmp/flopscope.sock
Running client and server together without Docker
From a source checkout, use repo-root commands so both packages resolve correctly:
# Terminal 1
PYTHONPATH=src:flopscope-server/src \
uv run --with pyzmq --with msgpack \
python -m flopscope_server --url ipc:///tmp/flopscope.sock
# Terminal 2
export FLOPSCOPE_SERVER_URL=ipc:///tmp/flopscope.sock
PYTHONPATH=flopscope-client/src \
uv run --with pyzmq --with msgpack python your_script.py
See Running with Docker if you want the same split using containers.
Generated documentation
Do not hand-edit website/public/ops.json,
website/public/api-data/ops/*.json, website/.generated/public-api-routes.json,
website/.generated/op-doc-imports.ts, website/.generated/symbol-doc-imports.ts,
or website/.generated/public-api-symbols.json. The interactive API reference,
canonical API pages, and legacy redirect routes consume those generated
artifacts directly.
Instead, update scripts/generate_api_docs.py, the relevant source docstrings,
or the operation registry, then regenerate and verify:
uv run python scripts/generate_api_docs.py
uv run python scripts/generate_api_docs.py --verify
NumPy Compatibility Testing
flopscope's goal is NumPy API compatibility on the counted surface: import flopscope.numpy as np should work for supported functions. To verify this, we run NumPy's own test suite against flopscope.
How it works
A pytest conftest at tests/numpy_compat/conftest.py monkeypatches numpy functions with their flopscope equivalents at session start. When we point pytest at NumPy's installed test files using --pyargs, every test that calls np.sum(...), np.mean(...), etc. actually calls flopscope's version.
NumPy test file         conftest.py              flopscope
calls np.sum(x) ──────> np.sum = fnp.sum ──────> fnp.sum(x)
asserts result          (monkeypatch)            (FLOP-counted)
Avoiding infinite recursion
flopscope functions internally call numpy (for example, fnp.dot eventually delegates to _np.dot inside the implementation modules). Since _np is the numpy module, patching numpy.dot = fnp.dot without isolating those backend references would cause infinite recursion: fnp.dot → _np.dot → numpy.dot → fnp.dot → ...
We solve this by freezing numpy before patching: the conftest creates a snapshot of the numpy module (and its submodules like numpy.linalg, numpy.fft), then rebinds every flopscope module's _np reference to the frozen copy. Now flopscope's internal calls go to the original numpy functions, while the test suite sees flopscope's versions.
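To see why the order of operations matters, here is a toy version of the freeze-then-patch pattern. It uses a stand-in module rather than numpy, and every name in it (backend, wrapper, counted_dot, freeze) is hypothetical; the real conftest helpers may work differently.

```python
import types

# Stand-in for the numpy module (hypothetical; real numpy is not needed here).
backend = types.ModuleType("backend")
backend.dot = lambda a, b: sum(x * y for x, y in zip(a, b))

# Stand-in for a flopscope implementation module holding a `_np` reference.
wrapper = types.SimpleNamespace(_np=backend, flops=0)


def counted_dot(a, b):
    """Count FLOPs, then delegate to whatever `wrapper._np` points at."""
    wrapper.flops += 2 * len(a)
    return wrapper._np.dot(a, b)


def freeze(module):
    """Snapshot a module's current public attributes into a new module object."""
    frozen = types.ModuleType(module.__name__ + "_frozen")
    frozen.__dict__.update(
        {k: v for k, v in vars(module).items() if not k.startswith("__")}
    )
    return frozen


# Order matters: freeze, rebind the internal reference, then patch.
wrapper._np = freeze(backend)  # internals now reach the original functions
backend.dot = counted_dot      # callers of backend.dot get the counted version

result = backend.dot([1, 2, 3], [4, 5, 6])
print(result, wrapper.flops)
```

If the rebind step is skipped, wrapper._np still points at the patched module, so counted_dot calls itself until Python raises RecursionError.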
# Simplified flow in conftest.py:
frozen_np = freeze_numpy() # snapshot of original numpy
rebind_flopscope_np(frozen_np) # flopscope internals → frozen copy
patch_numpy() # np.sum = fnp.sum, etc.
# Now: test calls np.sum → fnp.sum → frozen_np.sum (original) ✓
What gets patched
Of flopscope's 508 registered functions, most non-ufunc functions are patched onto numpy during testing. The only categories skipped:
| Category | Count | Why skipped |
|---|---|---|
| Ufuncs | 101 | flopscope functions are plain callables, not ufuncs -- they lack .reduce, .accumulate, .outer, .nargs. Tests check these attributes at collection time. |
| Blacklisted | 32 | Intentionally unsupported |
| linalg.outer | 1 | fnp.linalg.outer delegates to np.outer (not np.linalg.outer), which has different validation behavior |
Everything else -- free ops, counted custom ops (dot, einsum, etc.), submodule functions (linalg, fft), reductions, and special functions -- is patched.
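The selection rule above can be sketched as a simple filter. The miniature registry, the kind labels, and the should_patch helper are all hypothetical and only illustrate the three skip categories; the real registry and conftest logic may be organized differently.

```python
# Hypothetical miniature registry: name -> kind. The real registry has 508 entries.
REGISTRY = {
    "add": "ufunc",
    "sqrt": "ufunc",
    "sum": "reduction",
    "dot": "counted_custom",
    "linalg.norm": "submodule",
    "linalg.outer": "submodule",
    "reshape": "free",
    "blacklisted_op": "blacklisted",  # placeholder name, not a real entry
}

SKIP_KINDS = {"ufunc", "blacklisted"}  # whole categories that are never patched
SKIP_NAMES = {"linalg.outer"}          # individually skipped functions


def should_patch(name, kind):
    """Return True when a registry entry would be patched onto numpy."""
    return kind not in SKIP_KINDS and name not in SKIP_NAMES


patched = sorted(n for n, k in REGISTRY.items() if should_patch(n, k))
print(patched)
```

If the three skipped categories do not overlap, the counts in the table leave 508 - 101 - 32 - 1 = 374 functions patched.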
Test suites
We run 7 NumPy test modules covering core math, ufuncs, numerics, linear algebra, FFT, polynomials, and random:
| Suite | Module | Passed | xfailed |
|---|---|---|---|
| Core math | numpy._core.tests.test_umath | 4,668 | 13 |
| Ufunc infrastructure | numpy._core.tests.test_ufunc | 795 | 7 |
| Numeric operations | numpy._core.tests.test_numeric | 1,560 | 20 |
| Linear algebra | numpy.linalg.tests.test_linalg | 48 | 255 |
| FFT | numpy.fft.tests.test_pocketfft | 114 | 34 |
| Polynomials | numpy.polynomial.tests.test_polynomial | 36 | 2 |
| Random | numpy.random.tests.test_random | 142 | 0 |
| Total | | 7,363 | 331 |
All failures are tracked as xfails in tests/numpy_compat/xfails.py.
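As a quick consistency check, the totals can be recomputed from the per-suite rows. The numbers are copied from the table; the dictionary layout and variable names are only for illustration.

```python
# (passed, xfailed) per suite, copied from the table in this section.
SUITES = {
    "test_umath": (4668, 13),
    "test_ufunc": (795, 7),
    "test_numeric": (1560, 20),
    "test_linalg": (48, 255),
    "test_pocketfft": (114, 34),
    "test_polynomial": (36, 2),
    "test_random": (142, 0),
}

total_passed = sum(passed for passed, _ in SUITES.values())
total_xfailed = sum(xfailed for _, xfailed in SUITES.values())
print(total_passed, total_xfailed)

# Share of linalg tests that are currently expected failures.
linalg_passed, linalg_xfailed = SUITES["test_linalg"]
linalg_share = linalg_xfailed / (linalg_passed + linalg_xfailed)
print(f"{linalg_share:.1%}")
```

The sums match the Total row (7,363 passed, 331 xfailed), and linalg alone accounts for 255 of the 331 xfails.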
Running the tests
Tests use pytest-xdist for parallel execution across all CPU cores.
# Run everything (recommended)
make test-numpy-compat
# Run a single suite
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -n auto -q
# Filter to specific functions
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -k "sqrt" -n auto -v
# Run without parallelism (for debugging)
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -v --tb=short
The numpy_compat tests are excluded from the default pytest run (via pyproject.toml addopts) to prevent the monkeypatch from contaminating the main test suite. They run as a separate step in CI.
Known divergences (xfails)
Tests that fail due to known, accepted differences are tracked in tests/numpy_compat/xfails.py. Each entry maps a test pattern to a categorized reason:
| Category | Meaning | Examples |
|---|---|---|
| NOT_IMPLEMENTED | Function exists but lacks a kwarg or edge case | Missing out=, where=, subok= kwargs |
| UNSUPPORTED_DTYPE | flopscope doesn't support this dtype | timedelta, object arrays |
| UFUNC_INTERNALS | Test relies on ufunc protocol | .reduce, __array_ufunc__ |
| BUDGET_SIDE_EFFECT | Test assumes no global state changes | Budget deduction during assertions |
| NUMPY_INTERNAL | Test uses numpy internals | _umath_tests, internal type tables |
The linalg suite has the most xfails (255) because flopscope's linalg wrappers don't support stacked/batched arrays, 0-size arrays, or some advanced kwargs that numpy's linalg tests exercise extensively.
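One way to picture the pattern-to-reason mapping is the sketch below. The category names and meanings come from the table above, but the Reason enum, the XFAILS dictionary, the glob-style patterns, and lookup_xfail are hypothetical; the actual layout of xfails.py may differ.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Reason(Enum):
    """Xfail categories, with meanings taken from the table in this section."""

    NOT_IMPLEMENTED = "Function exists but lacks a kwarg or edge case"
    UNSUPPORTED_DTYPE = "flopscope doesn't support this dtype"
    UFUNC_INTERNALS = "Test relies on ufunc protocol"
    BUDGET_SIDE_EFFECT = "Test assumes no global state changes"
    NUMPY_INTERNAL = "Test uses numpy internals"


# Hypothetical entries: test-ID glob pattern -> reason (illustrative only).
XFAILS = {
    "*TestExample::test_out_kwarg*": Reason.NOT_IMPLEMENTED,
    "*TestExample::test_timedelta*": Reason.UNSUPPORTED_DTYPE,
    "*TestExample::test_reduce*": Reason.UFUNC_INTERNALS,
}


def lookup_xfail(test_id):
    """Return the first matching Reason for a test ID, or None if none match."""
    for pattern, reason in XFAILS.items():
        if fnmatchcase(test_id, pattern):
            return reason
    return None


hit = lookup_xfail("numpy/x.py::TestExample::test_out_kwarg[float64]")
print(hit.name, "-", hit.value)
```

Keeping the reason as an enum rather than a free-form string makes it easy to group xfails by category when reviewing them.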
Triaging new failures
- Run a suite: uv run pytest tests/numpy_compat/ --pyargs <module> -n auto --tb=line
- Categorize each failure
- If it's a bug we should fix, create an issue
- If it's an accepted divergence, add it to xfails.py
Why monkeypatching (not subclassing)
We considered alternatives:
- Array subclass with __array_ufunc__: Would intercept ufunc calls, but flopscope arrays are plain numpy.ndarray by design -- no custom tensor class.
- Running tests with import flopscope as np: NumPy's test files import from numpy._core, numpy.testing, etc. -- can't redirect all internal imports.
- Monkeypatching with frozen numpy: Simple, works with NumPy's existing test infrastructure, tests exactly what users experience (same function signatures), and the frozen-numpy trick prevents infinite recursion.
Related pages
- Running with Docker — containerized client-server setup
- Client-Server Model — architecture overview