---
title: PDE Leaderboard (The Well)
emoji: 📈
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.47.1
app_file: app.py
pinned: false
---
# The Well – PDE Baselines: Reproducible Evaluation Guide

A clean, GPU-friendly recipe to evaluate **PDE surrogate models** from **The Well** on the **`acoustic_scattering_maze`** task and generate a `submit.json` ready for a team leaderboard or Hugging Face Space.

---

## TL;DR

```bash
# 0) create venv
python -m venv .venv && source .venv/bin/activate
pip install --upgrade pip

# 1) install PyTorch (adjust cu121 to your CUDA)
pip install --extra-index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio

# 2) get The Well (full repo with benchmark extras)
git clone https://github.com/PolymathicAI/the_well.git
cd the_well
pip install -e ".[benchmark]"
cd ..

# 3) place the evaluation script in this folder (eval_wdm_full.py)
python eval_wdm_full.py

# 4) see results
cat submit.json
```

---

## What you get

- A **fully reproducible** eval script (`eval_wdm_full.py`) that:
  - streams the dataset from HF (or reads it locally if you prefer),
  - assembles the **14-channel** input feature tensor expected by the FNO baseline,
  - applies **Z-score normalization** (same convention as the benchmark),
  - runs inference and produces a **VRMSE@1** summary in `submit.json` (the metric is sketched right after this list).
- An optional variant (`eval_wdm_full_psamples.py`) that also writes per-sample errors to `per_sample_vrmse.jsonl`.
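For reference, **VRMSE** here is the RMSE normalized by the variance of the target field, so a score of 1.0 means the prediction is no better than predicting the target's mean. A minimal sketch of the per-sample computation, mirroring the formula used in the scripts below:

```python
import numpy as np

def vrmse(y_pred: np.ndarray, y_true: np.ndarray, eps: float = 1e-12) -> float:
    """RMSE scaled by the target's standard deviation; ~1.0 = mean predictor."""
    mse = ((y_pred - y_true) ** 2).mean()
    return float((mse / (y_true.var() + eps)) ** 0.5)
```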
---

## Requirements

- Linux (recommended) with a CUDA GPU.
- **Python 3.10–3.12** recommended (3.13 can work, but some libraries lag behind).
- **CUDA/cuDNN** compatible with your installed PyTorch wheels (e.g., `cu121`); see the quick check below.
- Git; optionally Git LFS if you plan to clone large checkpoints.
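Before running anything heavy, it is worth confirming that the installed wheels actually see your GPU (plain PyTorch, no assumptions beyond the install below):

```python
# check_env.py - verify the Torch/CUDA pairing
import torch

print("torch:", torch.__version__)              # should match the wheel you installed
print("built for CUDA:", torch.version.cuda)    # e.g. "12.1" for cu121 wheels
print("GPU available:", torch.cuda.is_available())
```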
---

## Environment Setup

```bash
# create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# upgrade pip
pip install --upgrade pip

# install torch + CUDA wheels (change cu121 if needed)
pip install --extra-index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio
```

---

## Install The Well (from source, with benchmark extras)

> The PyPI package is trimmed; **use the GitHub repo** to get the benchmark utilities.

```bash
git clone https://github.com/PolymathicAI/the_well.git
cd the_well
pip install -e ".[benchmark]"
cd ..
```

Quick sanity check:

```bash
python - <<'PY'
from the_well.benchmark.train import WellDataModule
from the_well.benchmark.models import FNO
print("OK: imports work")
PY
```

---

## Scripts

Add these two files to your project folder (next to the `the_well` repo directory):

### 1) `eval_wdm_full.py` (clean, full test split)
```python
# eval_wdm_full.py
import json, warnings, random

import numpy as np
import torch
from torch.utils.data import DataLoader

from the_well.benchmark.models import FNO
from the_well.benchmark.train import WellDataModule
from the_well.data.normalization import ZScoreNormalization

# Reproducibility
random.seed(0); np.random.seed(0); torch.manual_seed(0)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

# Quiet some noisy warnings
warnings.filterwarnings("ignore", category=UserWarning, module="tltorch")
warnings.filterwarnings("ignore", category=UserWarning, module="neuralop")

device = "cuda" if torch.cuda.is_available() else "cpu"

def to_bchw_take_t0_any(t: torch.Tensor) -> torch.Tensor:
    if t.ndim == 5:    # [B,T,H,W,C]
        t = t[:, 0, ...].permute(0, 3, 1, 2).contiguous()
    elif t.ndim == 4:  # [B,H,W,C]
        t = t.permute(0, 3, 1, 2).contiguous()
    elif t.ndim == 3:  # [B,H,W]
        t = t.unsqueeze(1).contiguous()
    elif t.ndim == 2:  # [B,C] -> [B,C,1,1]
        t = t[:, :, None, None].contiguous()
    else:
        raise ValueError(f"Unsupported rank: {t.ndim}, shape={tuple(t.shape)}")
    return t.float()

def expand_scalar_plane(x: torch.Tensor, H: int, W: int):
    if x is None: return None
    if x.ndim == 2: x = x[:, 0]
    elif x.ndim != 1: raise ValueError("Expected (B,) or (B,T) for time scalars")
    x = x[:, None, None, None]
    return x.expand(-1, 1, H, W).float()

def assemble_x_14ch(b: dict) -> torch.Tensor:
    parts = []
    x_if = to_bchw_take_t0_any(b["input_fields"])  # ~3 channels
    B, _, H, W = x_if.shape
    parts.append(x_if)
    if "space_grid" in b and torch.is_tensor(b["space_grid"]):
        parts.append(to_bchw_take_t0_any(b["space_grid"]))  # +2
    if "constant_fields" in b and torch.is_tensor(b["constant_fields"]):
        parts.append(to_bchw_take_t0_any(b["constant_fields"]))  # +2
    if "boundary_conditions" in b and torch.is_tensor(b["boundary_conditions"]):
        bc = b["boundary_conditions"].view(B, -1)[:, :4]
        bc = bc[:, :, None, None].expand(B, 4, H, W).float()  # +4
        parts.append(bc)
    it = b.get("input_time_grid", None)
    ot = b.get("output_time_grid", None)
    if torch.is_tensor(it): parts.append(expand_scalar_plane(it, H, W))  # +1
    if torch.is_tensor(ot): parts.append(expand_scalar_plane(ot, H, W))  # +1
    x = torch.cat([p for p in parts if p is not None], dim=1)
    IN_CH = 14
    c = x.shape[1]
    if c < IN_CH:
        pad = torch.zeros(B, IN_CH - c, H, W, dtype=x.dtype)
        x = torch.cat([x, pad], dim=1)
    elif c > IN_CH:
        x = x[:, :IN_CH]
    return x

def zscore_per_channel(x: torch.Tensor, eps=1e-6) -> torch.Tensor:
    mu = x.mean(dim=(2,3), keepdim=True)
    sd = x.std(dim=(2,3), keepdim=True).clamp_min(eps)
    return (x - mu) / sd

def try_apply_dm_normalizer(dm, x, y):
    for name in ("normalization", "normalizer", "norm"):
        norm = getattr(dm, name, None)
        if norm is None:
            continue
        if hasattr(norm, "normalize_input") and hasattr(norm, "normalize_output"):
            return norm.normalize_input(x), norm.normalize_output(y)
        try:
            return norm(x, which="input"), norm(y, which="output")
        except Exception:
            pass
        try:
            return norm.forward(x, is_input=True), norm.forward(y, is_input=False)
        except Exception:
            pass
    return None

def main(out_path="submit.json"):
    dm = WellDataModule(
        well_base_path="hf://datasets/polymathic-ai/",
        well_dataset_name="acoustic_scattering_maze",
        batch_size=1,
        data_workers=0,
        use_normalization=True,
        normalization_type=ZScoreNormalization,
        n_steps_input=1, n_steps_output=1,
        boundary_return_type="padding",
    )
    loader = dm.test_dataloader() if hasattr(dm, "test_dataloader") \
        else DataLoader(getattr(dm, "test"), batch_size=1, shuffle=False)
    model = FNO.from_pretrained("polymathic-ai/FNO-acoustic_scattering_maze").to(device).eval()
    scores = []
    with torch.no_grad():
        for b in loader:
            x = assemble_x_14ch(b).to(device)
            y = to_bchw_take_t0_any(b["output_fields"]).to(device)
            applied = try_apply_dm_normalizer(dm, x, y)
            if applied is not None:
                x, y = applied
            else:
                x = zscore_per_channel(x)
                y = zscore_per_channel(y)
            yp = model(x)
            c = min(yp.shape[1], y.shape[1])
            yp_np, y_np = yp[:, :c].detach().cpu().numpy(), y[:, :c].detach().cpu().numpy()
            vrmse = float((((yp_np - y_np)**2).mean() / (y_np.var() + 1e-12))**0.5)
            scores.append(vrmse)
    result = {
        "task": "acoustic_scattering_maze",
        "dataset_repo": "polymathic-ai/acoustic_scattering_maze",
        "model_family": "FNO",
        "model_repo": "polymathic-ai/FNO-acoustic_scattering_maze",
        "seed": 0,
        "split": "test",
        "horizons": [1],
        "metrics": {
            "VRMSE@1": {
                "mean": float(np.mean(scores)),
                "std": float(np.std(scores)),
                "n": int(len(scores)),
            }
        },
    }
    with open(out_path, "w") as f:
        json.dump(result, f, indent=2)
    print("[ok] wrote", out_path)
    print(json.dumps(result, indent=2))

if __name__ == "__main__":
    main(out_path="submit.json")
```
### 2) `eval_wdm_full_psamples.py` (also logs per-sample VRMSE)

Use this variant if you also want `per_sample_vrmse.jsonl` for analyzing outliers.
```python
# eval_wdm_full_psamples.py
import json, warnings, random

import numpy as np
import torch
from torch.utils.data import DataLoader

from the_well.benchmark.models import FNO
from the_well.benchmark.train import WellDataModule
from the_well.data.normalization import ZScoreNormalization

random.seed(0); np.random.seed(0); torch.manual_seed(0)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
warnings.filterwarnings("ignore", category=UserWarning, module="tltorch")
warnings.filterwarnings("ignore", category=UserWarning, module="neuralop")

device = "cuda" if torch.cuda.is_available() else "cpu"

def to_bchw_take_t0_any(t):
    if t.ndim == 5: t = t[:, 0, ...].permute(0, 3, 1, 2).contiguous()
    elif t.ndim == 4: t = t.permute(0, 3, 1, 2).contiguous()
    elif t.ndim == 3: t = t.unsqueeze(1).contiguous()
    elif t.ndim == 2: t = t[:, :, None, None].contiguous()
    else: raise ValueError(f"Unsupported rank: {t.ndim}")
    return t.float()

def expand_scalar_plane(x, H, W):
    if x is None: return None
    if x.ndim == 2: x = x[:, 0]
    elif x.ndim != 1: raise ValueError("Expected (B,) or (B,T) for time scalars")
    x = x[:, None, None, None]
    return x.expand(-1, 1, H, W).float()

def assemble_x_14ch(b):
    parts = []
    x_if = to_bchw_take_t0_any(b["input_fields"])
    B, _, H, W = x_if.shape
    parts.append(x_if)
    if "space_grid" in b and torch.is_tensor(b["space_grid"]):
        parts.append(to_bchw_take_t0_any(b["space_grid"]))
    if "constant_fields" in b and torch.is_tensor(b["constant_fields"]):
        parts.append(to_bchw_take_t0_any(b["constant_fields"]))
    if "boundary_conditions" in b and torch.is_tensor(b["boundary_conditions"]):
        bc = b["boundary_conditions"].view(B, -1)[:, :4]
        bc = bc[:, :, None, None].expand(B, 4, H, W).float()
        parts.append(bc)
    it = b.get("input_time_grid", None); ot = b.get("output_time_grid", None)
    if torch.is_tensor(it): parts.append(expand_scalar_plane(it, H, W))
    if torch.is_tensor(ot): parts.append(expand_scalar_plane(ot, H, W))
    x = torch.cat([p for p in parts if p is not None], dim=1)
    if x.shape[1] < 14:
        pad = torch.zeros(B, 14 - x.shape[1], H, W, dtype=x.dtype); x = torch.cat([x, pad], dim=1)
    elif x.shape[1] > 14:
        x = x[:, :14]
    return x

def zscore_per_channel(x, eps=1e-6):
    mu = x.mean(dim=(2,3), keepdim=True)
    sd = x.std(dim=(2,3), keepdim=True).clamp_min(eps)
    return (x - mu) / sd

def try_apply_dm_normalizer(dm, x, y):
    for name in ("normalization", "normalizer", "norm"):
        norm = getattr(dm, name, None)
        if norm is None: continue
        if hasattr(norm, "normalize_input") and hasattr(norm, "normalize_output"):
            return norm.normalize_input(x), norm.normalize_output(y)
        try: return norm(x, which="input"), norm(y, which="output")
        except Exception: pass
        try: return norm.forward(x, is_input=True), norm.forward(y, is_input=False)
        except Exception: pass
    return None

def main(out_path="submit.json", per_sample_path="per_sample_vrmse.jsonl"):
    dm = WellDataModule(
        well_base_path="hf://datasets/polymathic-ai/",
        well_dataset_name="acoustic_scattering_maze",
        batch_size=1,
        data_workers=0,
        use_normalization=True,
        normalization_type=ZScoreNormalization,
        n_steps_input=1, n_steps_output=1,
        boundary_return_type="padding",
    )
    loader = dm.test_dataloader() if hasattr(dm, "test_dataloader") else DataLoader(getattr(dm, "test"), batch_size=1, shuffle=False)
    model = FNO.from_pretrained("polymathic-ai/FNO-acoustic_scattering_maze").to(device).eval()
    scores = []
    with torch.no_grad(), open(per_sample_path, "w") as fout:
        for i, b in enumerate(loader):
            x = assemble_x_14ch(b).to(device)
            y = to_bchw_take_t0_any(b["output_fields"]).to(device)
            applied = try_apply_dm_normalizer(dm, x, y)
            if applied is not None: x, y = applied
            else: x = zscore_per_channel(x); y = zscore_per_channel(y)
            yp = model(x)
            c = min(yp.shape[1], y.shape[1])
            yp_np, y_np = yp[:, :c].detach().cpu().numpy(), y[:, :c].detach().cpu().numpy()
            vrmse = float((((yp_np - y_np)**2).mean() / (y_np.var() + 1e-12))**0.5)
            scores.append(vrmse)
            fout.write(json.dumps({"index": i, "vrmse": vrmse}) + "\n")  # one JSON record per sample
    result = {
        "task": "acoustic_scattering_maze",
        "dataset_repo": "polymathic-ai/acoustic_scattering_maze",
        "model_family": "FNO",
        "model_repo": "polymathic-ai/FNO-acoustic_scattering_maze",
        "seed": 0,
        "split": "test",
        "horizons": [1],
        "metrics": {
            "VRMSE@1": {"mean": float(np.mean(scores)), "std": float(np.std(scores)), "n": int(len(scores))}
        },
    }
    with open(out_path, "w") as f:
        json.dump(result, f, indent=2)
    print("[ok] wrote", out_path, "and", per_sample_path)

if __name__ == "__main__":
    main()
```

---

## Run

```bash
python eval_wdm_full.py
# or
python eval_wdm_full_psamples.py
```

Artifacts:

- `submit.json` — summary for the leaderboard
- (optional) `per_sample_vrmse.jsonl` — one JSON record per line (sample index and VRMSE)
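To hunt for outliers, the per-sample file can be inspected with a few lines (this assumes the JSON-record format written by `eval_wdm_full_psamples.py` above):

```python
# worst_samples.py - list the ten highest per-sample VRMSE scores
import json

with open("per_sample_vrmse.jsonl") as f:
    records = [json.loads(line) for line in f]

for r in sorted(records, key=lambda r: r["vrmse"], reverse=True)[:10]:
    print(f"sample {r['index']}: VRMSE = {r['vrmse']:.4f}")
```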
---

## Faster runs (optional)

### Use local data instead of HF streaming

```bash
# Download only the needed split
the-well-download --base-path /data/the_well --dataset acoustic_scattering_maze --split test
```

Then, in the script, set:

```python
dm = WellDataModule(
    well_base_path="/data/the_well",  # <- local path now
    well_dataset_name="acoustic_scattering_maze",
    ...
)
```
### Increase throughput

- Try `batch_size=2..4` and `data_workers=2..4` if VRAM allows (see the sketch below).
- Keep the seeds and determinism flags if you need strict reproducibility.
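As a sketch, the same constructor as above with larger values; the numbers are illustrative, and the safe ceiling depends entirely on your GPU:

```python
# Illustrative throughput settings - raise batch_size/data_workers only if VRAM allows.
from the_well.benchmark.train import WellDataModule
from the_well.data.normalization import ZScoreNormalization

dm = WellDataModule(
    well_base_path="hf://datasets/polymathic-ai/",
    well_dataset_name="acoustic_scattering_maze",
    batch_size=4,       # was 1; larger batches amortize per-step overhead
    data_workers=4,     # parallel data-loading workers
    use_normalization=True,
    normalization_type=ZScoreNormalization,
    n_steps_input=1, n_steps_output=1,
    boundary_return_type="padding",
)
```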
---

## Docker (optional, for reproducibility)

**Dockerfile**:

```dockerfile
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y git python3 python3-pip && rm -rf /var/lib/apt/lists/*
RUN python3 -m pip install --upgrade pip

# Install torch (adjust cu121 if needed)
RUN pip install --extra-index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio

# Install The Well (benchmark extras)
RUN git clone https://github.com/PolymathicAI/the_well.git && \
    cd the_well && pip install -e ".[benchmark]"

WORKDIR /workspace
COPY eval_wdm_full.py /workspace/eval_wdm_full.py
CMD ["python3", "eval_wdm_full.py"]
```

Run:

```bash
docker build -t well-eval .
docker run --gpus all -v $PWD:/workspace well-eval
```
---

## Hugging Face Space (optional)

- Create a **Docker** Space and paste the Dockerfile above.
- Add your eval script(s).
- Optionally provide a minimal UI (Gradio/Streamlit) that triggers the script and displays the JSON; a sketch follows.
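For the Gradio route, a minimal `app.py` sketch (it assumes `eval_wdm_full.py` is in the image as built above; the component names reflect current Gradio, so check them against your pinned `sdk_version`):

```python
# app.py - one-button UI: run the eval script and show submit.json
import json
import subprocess

import gradio as gr

def run_eval() -> str:
    # Run the eval script as a subprocess, then return the resulting JSON as text.
    subprocess.run(["python3", "eval_wdm_full.py"], check=True)
    with open("submit.json") as f:
        return json.dumps(json.load(f), indent=2)

demo = gr.Interface(
    fn=run_eval,
    inputs=None,                      # no inputs: just a run button
    outputs=gr.Code(language="json"),
    title="PDE Leaderboard (The Well)",
)

if __name__ == "__main__":
    demo.launch()
```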
---

## Troubleshooting

- **`ModuleNotFoundError: the_well.benchmark.data`**
  You installed from PyPI. Install from the GitHub repo with `pip install -e ".[benchmark]"`.
- **Huge VRMSE (e.g., 1e5+)**
  The input features were not assembled/normalized the same way as in training. Use the provided 14-channel assembly and **Z-score normalization** (via the data module, or the per-channel fallback).
- **No `setup()` or `prepare_data()` on `WellDataModule`**
  Different versions expose different hooks. Use `dm.test_dataloader()` directly (as in the scripts).
- **GPU / VRAM errors**
  Lower `batch_size` to 1, reduce workers, and make sure your Torch and CUDA versions match.
- **Deprecation warnings from `timm`, `tltorch`, `neuralop`**
  Cosmetic for evaluation; silenced in the scripts.
---

## Contributing

- Add new tasks: duplicate `eval_wdm_full.py` and change `well_dataset_name` (see the sketch after this list).
- Add new models: swap `FNO.from_pretrained(...)` for another checkpoint.
- Keep scripts **stateless** and **deterministic** for fair comparisons.
- Open a PR with your changes, include a sample `submit.json`, and note the CUDA/PyTorch versions used.
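As an example of the first two bullets, a hypothetical adaptation; both IDs below are placeholders, not real Hub entries, so look up actual dataset and checkpoint names on the Hugging Face Hub first:

```python
# Hypothetical task/model swap - replace both placeholder IDs with real Hub entries.
import torch
from the_well.benchmark.models import FNO
from the_well.benchmark.train import WellDataModule
from the_well.data.normalization import ZScoreNormalization

device = "cuda" if torch.cuda.is_available() else "cpu"

DATASET = "another_well_dataset"                  # placeholder dataset name
CHECKPOINT = "polymathic-ai/FNO-another_dataset"  # placeholder checkpoint ID

dm = WellDataModule(
    well_base_path="hf://datasets/polymathic-ai/",
    well_dataset_name=DATASET,
    batch_size=1,
    data_workers=0,
    use_normalization=True,
    normalization_type=ZScoreNormalization,
    n_steps_input=1, n_steps_output=1,
    boundary_return_type="padding",
)
model = FNO.from_pretrained(CHECKPOINT).to(device).eval()
```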
---

## License

- Respect The Well’s license and any dataset/model licenses when sharing results.
- Include your project’s license here (e.g., MIT/Apache-2.0).