Developer API Guide

Everything you need to use the IPS Unlabeled Learning codebase.

Quick Start

From zero to results in 4 steps. The self-test method needs no trajectory labels — just unlabeled snapshots.

Install & Import
git clone https://github.com/ViskaWei/lips_unlabeled_data
cd lips_unlabeled_data && pip install -e .

from core.potentials import HarmonicPotential, GaussianInteraction
from core.sde_simulator import SDESimulator
from lib.basis import get_basis
from lib.solvers import solve_selftest
from lib.eval import evaluate_kde

Generate → Learn → Evaluate
# 1. Simulate particle data
V = HarmonicPotential(k=2.0)
Phi = GaussianInteraction(A=1.0, sigma=0.8)
sim = SDESimulator(V=V, Phi=Phi, sigma=1.0, dt=0.001)
data, t_obs = sim.simulate(N=10, d=2, T=1.0, L=100, M=2000)

# 2. Learn from unlabeled data (no labels needed!)
build_V, build_Phi, K_V, K_Phi, _ = get_basis('oracle', 'model_e')
alpha, beta, info = solve_selftest(
    data, t_obs, sigma=1.0,
    build_V_fn=build_V, build_Phi_fn=build_Phi,
    K_V=K_V, K_Phi=K_Phi, reg='auto'
)

# 3. Evaluate
v_err, phi_err = evaluate_kde('model_e', d=2, alpha=alpha, beta=beta,
                              build_V_fn=build_V, build_Phi_fn=build_Phi)
print("V: %.1f%%, Phi: %.1f%%" % (100 * v_err, 100 * phi_err))

Two Pipelines

Basis Regression

Expand V and Φ in known basis functions and solve by linear least squares in closed form. Fast, but accuracy is limited by the chosen basis.

load_data → get_basis → solve_selftest → evaluate_kde
Modules: lib.config, lib.basis, lib.solvers, lib.eval. Time: ~seconds.
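The closed-form solve behind this pipeline is ordinary Tikhonov-regularized least squares. A minimal NumPy sketch of the idea (the design matrix, target, and `ridge_solve` helper here are illustrative, not the library's internals):

```python
import numpy as np

def ridge_solve(A, b, lam=1e-6):
    """Closed-form Tikhonov solve: theta = (A^T A + lam*I)^{-1} A^T b."""
    K = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(K), A.T @ b)

# Illustrative: recover coefficients of a 1D function in a polynomial basis
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
A = np.stack([np.ones_like(x), x, x**2], axis=1)   # basis evaluations
b = 2.0 * x**2 - 0.5 * x + 1.0                     # noiseless "observations"
theta = ridge_solve(A, b)
print(np.round(theta, 3))                          # ≈ [1.0, -0.5, 2.0]
```

Because the basis is fixed in advance, the whole fit is one matrix solve, which is why this pipeline runs in seconds.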

Neural Network

An MLP for V and a symmetric MLP for Φ, trained by gradient descent. Flexible: no hand-specified oracle basis is needed.

load_data → create_networks → train_loop(selftest_loss) → evaluate
Modules: core.nn_models, core.selftest_loss. Time: ~hours (GPU).

Available Potentials

All potentials implement evaluate(x) and gradient(x).

Class                          Formula                          Model
HarmonicPotential(k)           V(x) = k|x|²/2                   C, E
QuadraticConfinement(α1, α2)   V(x) = α1|x|/2 + α2|x|²          A
DoubleWellPotential()          V(x) = (|x|²-1)²/4               B, D
GaussianInteraction(A, σ)      Φ(r) = A exp(-r²/(2σ²))          E
PiecewiseInteraction(β1, β2)   Smoothed indicator Φ             A
InverseInteraction(γ)          Φ(r) = γ/(r+1)                   B
LennardJonesPotential(ε, σ)    Φ(r) = 4ε[(σ/r)¹² - (σ/r)⁶]      C
MorsePotential(D, a, r0)       Φ(r) = D(1 - e^(-a(r-r0)))²      D
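Every class in the table exposes the same two-method interface. A hypothetical minimal implementation of the harmonic case, to show the expected shapes (the real classes live in core.potentials and may differ in detail):

```python
import numpy as np

class MyHarmonicPotential:
    """Sketch of the evaluate/gradient interface: V(x) = k|x|^2 / 2."""
    def __init__(self, k=2.0):
        self.k = k

    def evaluate(self, x):
        # x: (..., d) array of positions -> (...,) potential values
        return 0.5 * self.k * np.sum(x**2, axis=-1)

    def gradient(self, x):
        # ∇V(x) = k * x, same shape as x
        return self.k * x

V = MyHarmonicPotential(k=2.0)
x = np.array([[1.0, 0.0], [0.0, 2.0]])
print(V.evaluate(x))   # [1. 4.]
print(V.gradient(x))   # [[2. 0.] [0. 4.]]
```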

Solver API

solve_selftest(data, t_obs, sigma, build_V_fn, build_Phi_fn, K_V, K_Phi, reg='auto')

Trajectory-free learning via the weak form self-test. No labels needed.

data: ndarray (M, L, N, d) — unlabeled snapshot ensemble
t_obs: ndarray (L,) — observation times
sigma: float — diffusion coefficient
reg: 'auto' | float — Tikhonov regularization (auto = Hansen L-curve)
Returns: alpha (K_V,), beta (K_Phi,), info dict

solve_mle(data_labeled, t_obs, ...)

Maximum likelihood estimation. Requires labeled trajectory pairs.

solve_sinkhorn(data_unlabeled, t_obs, ..., eps_factor=0.01)

Optimal transport label imputation + MLE. Unlabeled input, but degrades at large Δt.
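The imputation step couples two consecutive unlabeled snapshots with entropic optimal transport and then reads labels off the coupling. A self-contained sketch of that matching idea (the `sinkhorn_coupling` helper, cost, and parameters are illustrative, not the solver's actual code path):

```python
import numpy as np

def sinkhorn_coupling(X, Y, eps=0.05, n_iter=200):
    """Entropic OT coupling between unlabeled snapshots X, Y of shape (N, d)."""
    N = X.shape[0]
    C = np.sum((X[:, None, :] - Y[None, :, :])**2, axis=-1)  # squared-distance cost
    K = np.exp(-C / eps)
    a = b = np.ones(N) / N                                   # uniform marginals
    u = np.ones(N) / N
    for _ in range(n_iter):                                  # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]                       # coupling matrix P

# Small displacement between snapshots -> matching is the identity
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
Y = X + 0.01
P = sinkhorn_coupling(X, Y, eps=0.01)
print(P.argmax(axis=1))   # imputed labels: [0 1 2 3 4]
```

This also illustrates why the method degrades at large Δt: once particles move far between snapshots, the cost matrix no longer singles out the true assignment.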

Evaluation

evaluate_kde(model_name, d, alpha, beta, build_V_fn, build_Phi_fn)

Compute L²(ρ)-weighted errors against true gradients on a 2000-point grid.

Returns: (V_error, Phi_error) as relative errors in [0, 1]. Multiply by 100 for percentages.
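The reported numbers are relative, density-weighted gradient errors. Conceptually, that metric looks like the sketch below (the grid, weights, and gradient callables are illustrative stand-ins for the KDE machinery):

```python
import numpy as np

def relative_l2_error(grad_true, grad_est, x, weights):
    """Relative L²(ρ) error between two gradient fields on a weighted grid.
    x: (n, d) grid points; weights: (n,) density values ρ(x) (illustrative)."""
    num = np.sum(weights * np.sum((grad_est(x) - grad_true(x))**2, axis=-1))
    den = np.sum(weights * np.sum(grad_true(x)**2, axis=-1))
    return np.sqrt(num / den)

# Illustrative: estimated stiffness 2.1 vs true 2.0 gives a 5% gradient error
x = np.linspace(-2, 2, 2000)[:, None]
w = np.exp(-0.5 * x[:, 0]**2)                  # stand-in density weights
err = relative_l2_error(lambda x: 2.0 * x, lambda x: 2.1 * x, x, w)
print("%.1f%%" % (100 * err))                  # 5.0%
```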

Neural Network Training

Complete NN Pipeline
import numpy as np
import torch
from core.nn_models import RadialNet, RadialInteractionNet
from core.selftest_loss import compute_selftest_loss_batch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Create networks (C²-smooth activations; see Constraints & Gotchas)
V_net = RadialNet(hidden_dims=(64, 64, 64), activation='softplus').to(device)
Phi_net = RadialInteractionNet(d=2, hidden_dims=(64, 64, 64)).to(device)

# Train; data, M, sigma come from the Quick Start, dt is the observation spacing
optimizer = torch.optim.Adam(
    list(V_net.parameters()) + list(Phi_net.parameters()), lr=1e-3
)

for epoch in range(200):
    # Sample a mini-batch of consecutive snapshot pairs
    idx = np.random.choice(M, 32)
    X_curr = torch.tensor(data[idx, :-1], device=device, dtype=torch.float32)
    X_next = torch.tensor(data[idx, 1:], device=device, dtype=torch.float32)

    loss = compute_selftest_loss_batch(V_net, Phi_net, X_curr, X_next, dt, sigma)
    optimizer.zero_grad()
    loss.mean().backward()
    optimizer.step()

Constraints & Gotchas

CRITICAL

Self-test loss converges to a negative value, not zero. L(V*, Φ*) = -½ E[J_diss · Δt] < 0.

CRITICAL

L-curve regularization fails at dt_obs=0.1 (Bakushinskii phenomenon). Use fixed reg=1e-6 instead.

WARNING

NN activation must be C²-smooth (Softplus or Tanh, not ReLU) — second derivatives required for AD-based Laplacians.
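The underlying issue is that the loss differentiates the networks twice. A quick numeric illustration in plain NumPy (finite differences, not the library's autograd path): softplus has a strictly positive second derivative everywhere, while ReLU's vanishes almost everywhere, so ReLU networks yield degenerate Laplacians.

```python
import numpy as np

def second_derivative(f, x, h=1e-3):
    """Central finite-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

softplus = lambda x: np.log1p(np.exp(x))
relu = lambda x: np.maximum(x, 0.0)

x = np.array([-1.5, 0.5, 2.0])
print(second_derivative(softplus, x))  # sigmoid'(x) > 0 at every point
print(second_derivative(relu, x))      # 0 everywhere away from the kink
```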

WARNING

KDE evaluation only works for radial models (model_a/b/lj/morse/e). Non-radial models (aniso/dipole) use grid evaluation.

INFO

The mean-field energy carries a 1/(2N²) prefactor but sums N(N-1) interaction terms, so its gradients are O(1), not O(1/N²). The small prefactor is normalization, not signal suppression.

INFO

Potentials are identifiable only up to additive constants, so equivalent shifted formulas may appear in the paper or visualizations. Reported quantitative errors are gradient errors.
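A shifted potential produces identical forces, which is why the constant offset cannot be recovered and only gradient errors are comparable. A tiny illustration (hypothetical one-dimensional potentials, central-difference gradients):

```python
import numpy as np

# V(x) = x^2 and V(x) = x^2 + 7 generate the same forces,
# so particle dynamics cannot distinguish them.
grad = lambda V, x, h=1e-6: (V(x + h) - V(x - h)) / (2 * h)
V1 = lambda x: x**2
V2 = lambda x: x**2 + 7.0
x = np.linspace(-2, 2, 9)
print(np.allclose(grad(V1, x), grad(V2, x)))   # True
```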