Training#

Configuration reference for jaqmc solid train. This page shows the effective defaults for the train workflow preset. Use --dry-run to see the resolved config for your run, or add workflow.config.verbose=true to include field descriptions. Keys use the same dot notation as CLI overrides, such as train.run.iterations=5000. Defaults are resolved in this order: schema defaults, workflow preset, YAML config, then CLI overrides. For evaluation config, see Evaluation.
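
Since precedence can be surprising, it helps to preview the resolved config before launching. Using only the pieces described above (dotted overrides and `--dry-run`):

```shell
# Print the resolved config for a shorter run without executing it;
# CLI overrides take precedence over YAML and preset defaults.
jaqmc solid train train.run.iterations=5000 workflow.batch_size=2048 --dry-run
```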

Workflow (workflow.*)#

These keys control workflow-level settings shared across all stages.

workflow.seed

Default: None · Type: int | None

Fixed random seed.


workflow.batch_size

Default: 4096 · Type: int

Number of walkers (samples) to use in each iteration.


workflow.save_path

Default: '' · Type: str

Path to save checkpoints and logs.


workflow.restore_path

Default: '' · Type: str

Path to restore checkpoints from.


workflow.config.ignore_extra

Default: False · Type: bool

If True, silently ignore unrecognized config keys.


workflow.config.verbose

Default: False · Type: bool

If True, print the fully resolved config with field descriptions at startup.


workflow.disable_jit

Default: False · Type: bool

Disable JAX JIT compilation (for debugging).
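
A typical YAML fragment covering these keys might look like the sketch below; this assumes the YAML file nests keys along the same dot-separated paths as the CLI overrides, and the save path is illustrative:

```yaml
workflow:
  seed: 42                # fixed seed for reproducibility
  batch_size: 2048
  save_path: runs/lih     # hypothetical output directory
  disable_jit: false
```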

System (system.*)#

Defines the periodic solid system to simulate. The implementation is selected by system.module.

  • Default module selection: unset, so system.* is read directly as an arbitrary crystal config. Built-in choices are:

    • unset: arbitrary crystal config

    • rock_salt: rock-salt crystal generator

    • two_atom_chain: two-atom chain generator
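
An implementation is selected via system.module; for example, to use the rock-salt generator with its documented keys:

```shell
# Generate an LiH rock-salt cell instead of specifying atoms manually
jaqmc solid train system.module=rock_salt system.symbol_a=Li system.symbol_b=H
```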

Arbitrary crystals (default)#

system.atoms

Default: (required) · Type: list[Atom]

List of atoms in the system.


system.basis

Default: 'sto-3g' · Type: str | dict[str, str]

The basis set for Hartree-Fock pretrain.


system.ecp

Default: None · Type: str | dict[str, str] | None

Effective core potential specification.


system.electron_spins

Default: (required) · Type: tuple[int, int]

Tuple of two integers representing the number of up and down electrons.


system.fixed_spins_per_atom

Default: None · Type: list[tuple[int, int]] | None

Optional list of fixed spin configurations per atom.


system.electron_init_width

Default: 1.0 · Type: float

Width of the Gaussian distribution for initializing electron positions.


system.lattice_vectors

Default: [] · Type: list[list[float]]

Lattice vectors of the primitive cell, one vector per row.


system.supercell_matrix

Default: [[1, 0, 0], [0, 1, 0], [0, 0, 1]] · Type: list[list[int]]

Integer transformation matrix mapping the primitive cell to the simulation supercell.


system.twist

Default: [0.0, 0.0, 0.0] · Type: list[float]

Twist vector applied to the periodic boundary conditions of the simulation cell.

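Putting the required and periodic keys together, an arbitrary-crystal config might look like the sketch below. The serialized form of an Atom entry is not documented on this page, so the symbol/position mapping is an assumption, as is YAML nesting along the dotted key paths:

```yaml
system:
  # Atom entry format below is assumed, not taken from the schema
  atoms:
    - symbol: Li
      position: [0.0, 0.0, 0.0]
    - symbol: H
      position: [2.0, 2.0, 2.0]
  electron_spins: [2, 2]          # (n_up, n_down)
  lattice_vectors:
    - [4.0, 0.0, 0.0]
    - [0.0, 4.0, 0.0]
    - [0.0, 0.0, 4.0]
  twist: [0.0, 0.0, 0.0]
```
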
Rock salt (system.module=rock_salt)#

system.symbol_a

Default: 'Li' · Type: str

Symbol of the atom at the origin.


system.symbol_b

Default: 'H' · Type: str

Symbol of the atom at the body center.


system.lattice_constant

Default: 4.0 · Type: float

Lattice constant.


system.unit

Default: angstrom · Type: LengthUnit

Unit of the lattice constant (‘angstrom’ or ‘bohr’).


system.supercell

Default: None · Type: list[int] | None

Supercell dimensions [nx, ny, nz].


system.basis

Default: 'sto-3g' · Type: str

Basis set name for HF pretrain.


system.ecp

Default: None · Type: Any

Effective core potential specification.


system.electron_init_width

Default: 1.0 · Type: float

Width for electron position initialization.
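
For example, a 2x2x2 LiH supercell with the lattice constant given in bohr (the bracketed list syntax for CLI overrides is an assumption; a YAML config avoids the quoting):

```shell
jaqmc solid train system.module=rock_salt \
    system.lattice_constant=7.6 system.unit=bohr \
    "system.supercell=[2, 2, 2]"
```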

Two-atom chain (system.module=two_atom_chain)#

system.symbol

Default: 'H' · Type: str

Atomic symbol.


system.bond_length

Default: 1.8 · Type: float

Distance between atoms along the chain.


system.unit

Default: bohr · Type: LengthUnit

Unit of the bond length (‘angstrom’ or ‘bohr’).


system.supercell

Default: 1 · Type: int

Supercell expansion factor along the chain direction.


system.vacuum_separation

Default: 100.0 · Type: float

Lattice constant in y and z directions (in Bohr) to isolate the 1D chain.


system.spin

Default: 0 · Type: int

Total spin polarization (n_up - n_down) for the primitive cell.


system.basis

Default: 'sto-3g' · Type: str

Basis set name for HF pretrain.


system.ecp

Default: None · Type: Any

Effective core potential specification.


system.electron_init_width

Default: 1.0 · Type: float

Width for electron position initialization.
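
For example, a ten-cell hydrogen chain using only the keys above:

```shell
# 10 primitive cells of a two-atom H chain; symbol, bond_length, and
# unit repeat the defaults here for clarity
jaqmc solid train system.module=two_atom_chain \
    system.symbol=H system.bond_length=1.8 system.supercell=10
```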

Wavefunction (wf.*)#

  • Default module selection: solid, the only built-in choice; its effective defaults are listed below.

wf.hidden_dims_single

Default: [256, 256, 256, 256] · Type: list[int]

Hidden dimensions for single-electron streams.


wf.hidden_dims_double

Default: [32, 32, 32, 32] · Type: list[int]

Hidden dimensions for pairwise streams.


wf.ndets

Default: 16 · Type: int

Number of determinants.


wf.distance_type

Default: nu · Type: DistanceType

Method to compute distances (e.g., ‘nu’ for nearest image).


wf.envelope_type

Default: abs_isotropic · Type: EnvelopeType

Type of envelope function to use.


wf.sym_type

Default: minimal · Type: SymmetryType

Symmetry type for features.


wf.orbitals_spin_split

Default: True · Type: bool

If True, use separate orbital layer and envelope parameters for each spin channel.


wf.full_det

Default: True · Type: bool

Whether to use full determinants.

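For quick experiments it is common to shrink the network. A YAML fragment (assuming nesting mirrors the dotted key paths):

```yaml
wf:
  hidden_dims_single: [64, 64]   # shallower, narrower single streams
  hidden_dims_double: [16, 16]
  ndets: 4                       # fewer determinants
```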

Train Stage (train.*)#

The main VMC optimization loop. Samples electron configurations, computes energy, and updates wavefunction parameters.

Run options (train.run.*)#

train.run.check_vma

Default: True · Type: bool

Enable JAX validity checks during shard_map.


train.run.iterations

Default: 200000 · Type: int

Total number of iterations to run.


train.run.burn_in

Default: 100 · Type: int

Sampling iterations to discard before the main loop for MCMC equilibration.


train.run.save_time_interval

Default: 600 · Type: int

Minimum wall-clock seconds between checkpoint saves.


train.run.save_step_interval

Default: 1000 · Type: int

Save checkpoints only at steps that are multiples of this value.


train.run.stop_on_nan

Default: 'loss' · Type: bool | str

Abort training when NaN is detected in step statistics. True checks all stat keys; False disables the check; a comma-separated string of keys (e.g. "loss") monitors only those keys.

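For example, to monitor only selected statistics for NaNs and checkpoint more frequently (pmove is a stat key that appears in the default console fields):

```shell
jaqmc solid train "train.run.stop_on_nan=loss,pmove" \
    train.run.save_step_interval=500
```
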
Optimizer (train.optim.*)#

  • Default optimizer module: kfac. Effective defaults for the built-in optimizers are listed below.

KFAC options#

train.optim.learning_rate

Default: Standard · Type: swappable

The learning rate. Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.
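
Taken together, the rate, delay, and decay keys suggest the inverse-time decay schedule common in neural-network VMC codes. Assuming that form (an inference from the key names, not stated by the schema), the learning rate at step \(t\) would be

\(\eta(t) = \text{rate} \cdot \left(1 + t/\text{delay}\right)^{-\text{decay}}\)

so with the defaults (rate 0.05, delay 2000, decay 1), the rate falls to half its initial value around step 2000.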


train.optim.norm_constraint

Default: 0.001 · Type: float

The update is scaled down so that its approximate squared Fisher norm \(v^T F v\) is at most the specified value.


train.optim.curvature_ema

Default: 0.95 · Type: float

Decay factor used when calculating the covariance estimate moving averages.


train.optim.l2_reg

Default: 0.0 · Type: float

The L2 regularization coefficient that the optimizer should assume is applied to the loss.


train.optim.inverse_update_period

Default: 1 · Type: int

Number of steps in between updating the inverse curvature approximation.


train.optim.damping

Default: 0.001 · Type: float

Fixed damping parameter.

SR options#

train.optim.learning_rate

Default: Standard · Type: swappable

Step size (scalar or schedule). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.


train.optim.max_norm

Default: Constant · Type: swappable

Constrained update norm C (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.max_norm.module changes.

train.optim.max_norm.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.max_norm.rate

Default: 0.05 · Type: float

The constant rate.


train.optim.damping

Default: Constant · Type: swappable

Damping lambda (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.damping.module changes.

train.optim.damping.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.damping.rate

Default: 0.05 · Type: float

The constant rate.


train.optim.max_cond_num

Default: 10000000.0 · Type: float | None

Maximum condition number for adaptive damping.


train.optim.spring_mu

Default: Constant · Type: swappable

SPRING momentum coefficient mu (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.spring_mu.module changes.

train.optim.spring_mu.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.spring_mu.rate

Default: 0.05 · Type: float

The constant rate.


train.optim.march_beta

Default: Constant · Type: swappable

Decay factor for the MARCH variance accumulator (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.march_beta.module changes.

train.optim.march_beta.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.march_beta.rate

Default: 0.05 · Type: float

The constant rate.


train.optim.march_mode

Default: 'var' · Type: Literal[var, diff]

MARCH variance mode. "diff" uses update differences and "var" uses score variance along the batch axis.


train.optim.eps

Default: 1e-08 · Type: float

Small numerical constant for stability.


train.optim.mixed_precision

Default: True · Type: bool

Whether to use mixed precision for Gram factorization.


train.optim.score_chunk_size

Default: 128 · Type: int | None

Chunk size for score computation.


train.optim.score_norm_clip

Default: None · Type: float | None

Optional clip value for the mean absolute score per batch row.


train.optim.gram_num_chunks

Default: 4 · Type: int | None

Number of chunks for Gram matrix computation.


train.optim.gram_dot_prec

Default: 'F64' · Type: str | None

Precision mode for Gram matrix dot products.


train.optim.prune_inactive

Default: False · Type: bool

Whether to structurally prune inactive parameter leaves when forming the SR system.

Adam options#

train.optim.learning_rate

Default: Standard · Type: swappable

A global scaling factor, either fixed or evolving along iterations with a scheduler; see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.


train.optim.b1

Default: 0.9 · Type: float

Exponential decay rate to track the first moment of past gradients.


train.optim.b2

Default: 0.999 · Type: float

Exponential decay rate to track the second moment of past gradients.


train.optim.eps

Default: 1e-08 · Type: float

A small constant applied to the denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.


train.optim.eps_root

Default: 0.0 · Type: float

A small constant applied to the denominator inside the square root (as in RMSProp), to avoid dividing by zero when rescaling.

LAMB options#

train.optim.learning_rate

Default: Standard · Type: swappable

A global scaling factor, either fixed or evolving along iterations with a scheduler; see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.


train.optim.b1

Default: 0.9 · Type: float

Exponential decay rate to track the first moment of past gradients.


train.optim.b2

Default: 0.999 · Type: float

Exponential decay rate to track the second moment of past gradients.


train.optim.eps

Default: 1e-06 · Type: float

A small constant applied to the denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.


train.optim.eps_root

Default: 0.0 · Type: float

A small constant applied to the denominator inside the square root (as in RMSProp), to avoid dividing by zero when rescaling.


train.optim.weight_decay

Default: Constant · Type: swappable

Strength of the weight decay regularization. Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.weight_decay.module changes.

train.optim.weight_decay.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.weight_decay.rate

Default: 0.05 · Type: float

The constant rate.

Sampler (train.sampler.*)#

  • Default sampler module: mcmc, and its effective keys are listed below.

train.sampler.steps

Default: 10 · Type: int

Number of Metropolis-Hastings updates per sample draw.


train.sampler.initial_width

Default: 0.1 · Type: float

Initial width (stddev) of the Gaussian proposal.


train.sampler.adapt_frequency

Default: 100 · Type: int

Frequency of adaptive width updates.


train.sampler.pmove_range

Default: (0.5, 0.55) · Type: tuple[float, float]

Target range for acceptance rate.
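
For example, to decorrelate walkers more aggressively at the cost of extra wavefunction evaluations (assuming YAML nesting mirrors the dotted key paths):

```yaml
train:
  sampler:
    steps: 20           # more Metropolis-Hastings updates per draw
    initial_width: 0.2  # wider initial proposal
```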

Writers (train.writers.*)#

The train stage enables console, csv, and hdf5 writers by default.

Console writer (train.writers.console.*)#

train.writers.console.interval

Default: 1 · Type: int

Step interval for logging.


train.writers.console.fields

Default: 'pmove:.2f,energy=total_energy:.4f,variance=total_energy_var:.4f' · Type: str

Comma-separated list of field specs in the form alias=stat_key:format, where the alias and format spec are optional.

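Following the alias=stat_key:format pattern visible in the default value, a custom field list can be passed as a single override (the spec grammar beyond what the default demonstrates is an assumption):

```shell
# Log the energy at higher precision under the shorter name "E"
jaqmc solid train "train.writers.console.fields=pmove:.2f,E=total_energy:.6f"
```
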
CSV writer (train.writers.csv.*)#

train.writers.csv.path_template

Default: '{stage}_stats.csv' · Type: str

Output path template.

HDF5 writer (train.writers.hdf5.*)#

train.writers.hdf5.path_template

Default: '{stage}_stats.h5' · Type: str

Output path template.

Loss gradients (train.grads.*)#

Loss and gradient estimator. Computes the VMC loss and parameter gradients. See Loss and gradient for the mathematical derivation and outlier clipping details.

train.grads.loss_key

Default: 'total_energy' · Type: str

Key in prev_local_stats to use as the loss.


train.grads.clip_scale

Default: 5.0 · Type: float

Multiplier on the interquartile range (IQR) that sets the clipping window for local energies.


Pretrain Stage (pretrain.*)#

Initializes the neural network to approximate Hartree-Fock orbitals before VMC training. It uses the same run, sampler, and writer schemas as the train stage, but with a different optimizer default and a workflow-wired supervised loss.

Run options (pretrain.run.*)#

pretrain.run.check_vma

Default: True · Type: bool

Enable JAX validity checks during shard_map.


pretrain.run.iterations

Default: 2000 · Type: int

Total number of iterations to run.


pretrain.run.burn_in

Default: 100 · Type: int

Sampling iterations to discard before the main loop for MCMC equilibration.


pretrain.run.save_time_interval

Default: 600 · Type: int

Minimum wall-clock seconds between checkpoint saves.


pretrain.run.save_step_interval

Default: 1000 · Type: int

Save checkpoints only at steps that are multiples of this value.


pretrain.run.stop_on_nan

Default: 'loss' · Type: bool | str

Abort training when NaN is detected in step statistics. True checks all stat keys; False disables the check; a comma-separated string of keys (e.g. "loss") monitors only those keys.

Optimizer (pretrain.optim.*)#

  • Default optimizer module: optax:adam, and its effective keys are listed below.

Effective Adam defaults#

pretrain.optim.learning_rate

Default: Standard · Type: swappable

A global scaling factor, either fixed or evolving along iterations with a scheduler; see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when pretrain.optim.learning_rate.module changes.

pretrain.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

pretrain.optim.learning_rate.rate

Default: 0.0003 · Type: float

Initial learning rate.

pretrain.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

pretrain.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.


pretrain.optim.b1

Default: 0.9 · Type: float

Exponential decay rate to track the first moment of past gradients.


pretrain.optim.b2

Default: 0.999 · Type: float

Exponential decay rate to track the second moment of past gradients.


pretrain.optim.eps

Default: 1e-08 · Type: float

A small constant applied to the denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.


pretrain.optim.eps_root

Default: 0.0 · Type: float

A small constant applied to the denominator inside the square root (as in RMSProp), to avoid dividing by zero when rescaling.

Sampler (pretrain.sampler.*)#

  • Default sampler module: mcmc.

pretrain.sampler.steps

Default: 10 · Type: int

Number of Metropolis-Hastings updates per sample draw.


pretrain.sampler.initial_width

Default: 0.1 · Type: float

Initial width (stddev) of the Gaussian proposal.


pretrain.sampler.adapt_frequency

Default: 100 · Type: int

Frequency of adaptive width updates.


pretrain.sampler.pmove_range

Default: (0.5, 0.55) · Type: tuple[float, float]

Target range for acceptance rate.

Writers (pretrain.writers.*)#

The pretrain stage enables console, csv, and hdf5 writers by default.

Console writer (pretrain.writers.console.*)#

pretrain.writers.console.interval

Default: 1 · Type: int

Step interval for logging.


pretrain.writers.console.fields

Default: 'loss' · Type: str

Comma-separated list of field specs in the form alias=stat_key:format, where the alias and format spec are optional.

CSV writer (pretrain.writers.csv.*)#

pretrain.writers.csv.path_template

Default: '{stage}_stats.csv' · Type: str

Output path template.

HDF5 writer (pretrain.writers.hdf5.*)#

pretrain.writers.hdf5.path_template

Default: '{stage}_stats.h5' · Type: str

Output path template.

Loss gradients#

Pretraining does not use configurable pretrain.grads.* settings. The workflow wires a supervised Hartree-Fock orbital-matching loss directly.


Estimators (estimators.*)#

Energy estimators are configured programmatically by the workflow and are not typically overridden via config. The same definitions are used by Evaluation. For physics and derivations, see How Estimators Work. For the API, see Estimators.

PotentialEnergy uses Ewald summation for periodic Coulomb interactions and is always present. TotalEnergy automatically sums all energy:-prefixed components. Neither is configurable via a config key.

  • estimators.enabled.spin defaults to false.

Kinetic energy (estimators.energy.kinetic.*)#

Kinetic energy estimator. See Kinetic energy for physics details and Laplacian mode trade-offs.

estimators.energy.kinetic.mode

Default: forward_laplacian · Type: LaplacianMode

Laplacian computation strategy. forward_laplacian is the default on JAX 0.7.1 and later; scan is used for earlier versions.


estimators.energy.kinetic.sparsity_threshold

Default: 0 · Type: int

Sparsity threshold when using forward_laplacian mode.

ECP energy (estimators.energy.ecp.*)#

ECP (pseudopotential) energy estimator. Added automatically when ecp is set in the system config. See Pseudopotentials for physics details and quadrature options.

estimators.energy.ecp.max_core

Default: 2 · Type: int

Maximum number of nearest ECP atoms to consider per electron when evaluating nonlocal integrals.


estimators.energy.ecp.quadrature_id

Default: None · Type: str | None

Spherical quadrature rule used to evaluate nonlocal ECP integrals.


estimators.energy.ecp.electrons_field

Default: 'electrons' · Type: str

Name of electron position field in data.


estimators.energy.ecp.atoms_field

Default: 'atoms' · Type: str

Name of atom position field in data.