# Developer API Guide

Everything you need to use the IPS Unlabeled Learning codebase.
## Quick Start
From zero to results in 4 steps. The self-test method needs no trajectory labels — just unlabeled snapshots.
```bash
git clone https://github.com/ViskaWei/lips_unlabeled_data
cd lips_unlabeled_data && pip install -e .
```
```python
from core.potentials import HarmonicPotential, GaussianInteraction
from core.sde_simulator import SDESimulator
from lib.basis import get_basis
from lib.solvers import solve_selftest
from lib.eval import evaluate_kde

# 1. Simulate particle data
V = HarmonicPotential(k=2.0)
Phi = GaussianInteraction(A=1.0, sigma=0.8)
sim = SDESimulator(V=V, Phi=Phi, sigma=1.0, dt=0.001)
data, t_obs = sim.simulate(N=10, d=2, T=1.0, L=100, M=2000)

# 2. Learn from unlabeled data (no labels needed!)
build_V, build_Phi, K_V, K_Phi, _ = get_basis('oracle', 'model_e')
alpha, beta, info = solve_selftest(
    data, t_obs, sigma=1.0,
    build_V_fn=build_V, build_Phi_fn=build_Phi,
    K_V=K_V, K_Phi=K_Phi, reg='auto'
)

# 3. Evaluate
v_err, phi_err = evaluate_kde('model_e', d=2, alpha=alpha, beta=beta,
                              build_V_fn=build_V, build_Phi_fn=build_Phi)
print("V: %.1f%%, Phi: %.1f%%" % (100 * v_err, 100 * phi_err))
```

## Two Pipelines
### Basis Regression

Expand V and Φ in known basis functions. Linear least-squares with a closed-form solve — fast, with accuracy limited by the chosen basis.
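To illustrate what the closed-form solve amounts to, here is a minimal sketch of Tikhonov-regularized linear least squares. The helper `tikhonov_lstsq` is hypothetical, written only to show the normal-equations step; it is not the library's API.

```python
import numpy as np

def tikhonov_lstsq(G, b, lam=1e-6):
    """Closed-form solve of min_c ||G c - b||^2 + lam ||c||^2.

    Normal equations: c = (G^T G + lam I)^{-1} G^T b.
    """
    K = G.shape[1]
    return np.linalg.solve(G.T @ G + lam * np.eye(K), G.T @ b)

# Toy check: recover known basis coefficients from noiseless data
rng = np.random.default_rng(0)
G = rng.normal(size=(200, 3))        # 200 samples, 3 basis functions
c_true = np.array([2.0, -1.0, 0.5])
c_hat = tikhonov_lstsq(G, G @ c_true, lam=1e-10)
```

With a tiny `lam`, `c_hat` recovers `c_true` essentially exactly; larger `lam` trades fidelity for stability on ill-conditioned bases.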
`load_data → get_basis → solve_selftest → evaluate_kde`

### Neural Network
MLP for V, symmetric MLP for Φ. Gradient descent — flexible, no hand-specified oracle basis.

`load_data → create_networks → train_loop(selftest_loss) → evaluate`

## Available Potentials
All potentials implement `evaluate(x)` and `gradient(x)`.

| Class | Formula | Model |
|---|---|---|
| `HarmonicPotential(k)` | V(x) = k\|x\|²/2 | C, E |
| `QuadraticConfinement(α1, α2)` | V = α1\|x\|/2 + α2\|x\|² | A |
| `DoubleWellPotential()` | V = (\|x\|² − 1)²/4 | B, D |
| `GaussianInteraction(A, σ)` | Φ(r) = A exp(−r²/2σ²) | E |
| `PiecewiseInteraction(β1, β2)` | Smoothed indicator Φ | A |
| `InverseInteraction(γ)` | Φ(r) = γ/(r+1) | B |
| `LennardJonesPotential(ε, σ)` | 4ε[(σ/r)¹² − (σ/r)⁶] | C |
| `MorsePotential(D, a, r0)` | D(1 − e^(−a(r−r0)))² | D |
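To illustrate the `evaluate(x)`/`gradient(x)` contract, here is a standalone re-implementation of the harmonic case built from the formula in the table. This is a sketch, not the library's source; the class name is made up to avoid shadowing the real one.

```python
import numpy as np

class MiniHarmonicPotential:
    """Sketch of the potential interface: V(x) = k|x|^2 / 2."""
    def __init__(self, k=2.0):
        self.k = k

    def evaluate(self, x):
        # x: (..., d) array -> scalar potential per point
        return 0.5 * self.k * np.sum(x**2, axis=-1)

    def gradient(self, x):
        # grad V(x) = k * x, same shape as x
        return self.k * x

V = MiniHarmonicPotential(k=2.0)
x = np.array([[3.0, 4.0]])
print(V.evaluate(x))   # 0.5 * 2 * (9 + 16) = 25 per point
print(V.gradient(x))   # 2 * [3, 4] = [6, 8]
```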
## Solver API

### `solve_selftest(data, t_obs, sigma, build_V_fn, build_Phi_fn, K_V, K_Phi, reg='auto')`

Trajectory-free learning via the weak-form self-test. No labels needed.

- `data`: ndarray (M, L, N, d) — unlabeled snapshot ensemble
- `t_obs`: ndarray (L,) — observation times
- `sigma`: float — diffusion coefficient
- `reg`: `'auto'` | float — Tikhonov regularization (auto = Hansen L-curve)

### `solve_mle(data_labeled, t_obs, ...)`

Maximum likelihood estimation. Requires labeled trajectory pairs.
### `solve_sinkhorn(data_unlabeled, t_obs, ..., eps_factor=0.01)`

Optimal-transport label imputation followed by MLE. Takes unlabeled input, but accuracy degrades at large Δt.
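A quick sanity check of the array shapes the solvers expect, using placeholder arrays (values are not meaningful, only the shapes):

```python
import numpy as np

M, L, N, d = 2000, 100, 10, 2      # ensembles, snapshots, particles, dims
data = np.zeros((M, L, N, d))      # unlabeled snapshot ensemble
t_obs = np.linspace(0.0, 1.0, L)   # observation times

assert data.shape == (M, L, N, d)
assert t_obs.shape == (L,)
# spacing between consecutive observed snapshots
dt_obs = t_obs[1] - t_obs[0]
```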
## Evaluation

### `evaluate_kde(model_name, d, alpha, beta, build_V_fn, build_Phi_fn)`

Computes L²(ρ)-weighted errors against the true gradients on a 2000-point grid. Returns `(V_error, Phi_error)` as relative errors in [0, 1]; multiply by 100 for percentages.

## Neural Network Training
```python
import numpy as np
import torch

from core.nn_models import RadialNet, RadialInteractionNet
from core.selftest_loss import compute_selftest_loss_batch

# data (M, L, N, d), t_obs, and sigma come from the simulation step above
M = data.shape[0]
dt = t_obs[1] - t_obs[0]  # spacing between observed snapshots

# Create networks (C²-smooth activation; see Constraints & Gotchas)
V_net = RadialNet(hidden_dims=(64, 64, 64), activation='softplus').cuda()
Phi_net = RadialInteractionNet(d=2, hidden_dims=(64, 64, 64)).cuda()

# Train
optimizer = torch.optim.Adam(
    list(V_net.parameters()) + list(Phi_net.parameters()), lr=1e-3
)
for epoch in range(200):
    # Sample a batch of ensembles and form consecutive snapshot pairs
    idx = np.random.choice(M, 32)
    X_curr = torch.tensor(data[idx, :-1], device='cuda', dtype=torch.float32)
    X_next = torch.tensor(data[idx, 1:], device='cuda', dtype=torch.float32)
    loss = compute_selftest_loss_batch(V_net, Phi_net, X_curr, X_next, dt, sigma)
    optimizer.zero_grad()
    loss.mean().backward()
    optimizer.step()
```

## Constraints & Gotchas
- The self-test loss converges to a negative value, not zero: L(V*, Φ*) = −½ E[J_diss · Δt] < 0.
- L-curve regularization fails at dt_obs = 0.1 (Bakushinskii phenomenon). Use a fixed reg=1e-6 instead.
- NN activations must be C²-smooth (Softplus or Tanh, not ReLU) — second derivatives are required for AD-based Laplacians.
- KDE evaluation only works for radial models (model_a/b/lj/morse/e). Non-radial models (aniso/dipole) use grid evaluation.
- The mean-field energy's 1/(2N²) prefactor multiplies N(N−1) terms, so gradients are O(1), not O(1/N²); this is normalization, not signal suppression.
- Potentials are identifiable only up to additive constants, so equivalent shifted formulas may appear in the paper or visualizations. Reported quantitative errors are gradient errors.
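The C²-smoothness requirement above can be checked numerically. This finite-difference sketch is plain NumPy and independent of the codebase; it only verifies the analytic fact that Softplus has a well-defined second derivative while ReLU does not.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def second_derivative(f, x, h=1e-4):
    # Central finite-difference stencil for f''(x)
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

# Analytically, softplus''(x) = sigmoid(x) * (1 - sigmoid(x)),
# which at x = 0 equals 0.25 and is smooth everywhere.
approx = second_derivative(softplus, 0.0)

# ReLU, by contrast, has no classical second derivative at 0: the same
# stencil there evaluates to the mesh-dependent spike 1/h, not a stable value.
```

An AD-based Laplacian hits the same issue: differentiating ReLU twice yields zero almost everywhere, silently killing the diffusion term in the loss.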