Diagnostics
Host-side diagnostics for sample quality and NRPT run health. These run in
numpy (no XLA compile) over the collected samples and the stats dict returned
by hamon.nrpt / hamon.tune_schedule.
hamon.report_nrpt_diagnostics(stats: dict, samples: Bool[Array, 'n_samples n_variables'] | None = None, *, tau_min: float = 0.01, efficiency_fail: float = 0.2, efficiency_warn: float = 0.35, rej_std_max: float = 0.15, entropy_frozen: float = 0.05, entropy_uniform: float = 0.95, min_attempts: int = 50, ess_warn: float = 0.1) -> NRPTHealthReport
Evaluate NRPT stats (and optionally samples) into a single verdict.
For NRPT, round-trip diagnostics are the primary quality signal: states must travel the full temperature ladder for tempering to work. Marginal-convergence checks are reported for information but never used as pass/fail criteria — when PT correctly samples multiple modes, the marginals shift between halves of the run and a naive convergence test produces false "NEED_MORE" verdicts.
Decision criteria (each threshold is a keyword argument):
- ISSUE: tau_observed < tau_min (no round trips; information is not flowing through the ladder).
- ISSUE/WARN: efficiency < efficiency_fail (ISSUE) or < efficiency_warn (WARN): the round-trip rate is below the ELE-optimal τ̄. The report sets efficiency_limiter to attribute the cause and point at the right knob: "schedule" when the ladder is not equalized (std(rejection_rates) > rej_std_max; tune further or add chains) or "local_exploration" when it is equalized (an ELE violation; raise gibbs_steps_per_round, or add chains as the alternative lever). A chain-count recommendation is included either way.
- ISSUE: std(rejection_rates) > rej_std_max (schedule not equalized).
- ISSUE: marginal_entropy < entropy_frozen (sampler frozen).
- WARN: marginal_entropy > entropy_uniform (β may be too low).
- WARN: a sharp peak in the λ(β) profile (barrier bottleneck).
- WARN: ess_fraction < ess_warn (the worst-mixing variable has a low effective sample size; informational, never a hard failure).
All of these statistics are noisy when few swaps were attempted: when
min(attempted) < min_attempts the would-be issues are demoted to
warnings and insufficient_data is set instead of condemning a short
tuning probe.
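The round-trip and schedule criteria above, together with the short-probe demotion, can be restated as a small sketch. This is not hamon's implementation: the function name sketch_verdict and the way inputs are passed as plain scalars and arrays are assumptions made for illustration, the entropy, barrier, and ESS checks are left out, and only the threshold names and defaults come from the signature above.

```python
import numpy as np


def sketch_verdict(tau_observed, efficiency, rejection_rates, attempted, *,
                   tau_min=0.01, efficiency_fail=0.2, efficiency_warn=0.35,
                   rej_std_max=0.15, min_attempts=50):
    """Hypothetical restatement of the documented criteria (not hamon code)."""
    issues, warnings = [], []
    rej_std = float(np.std(rejection_rates))

    # No round trips: information is not flowing through the ladder.
    if tau_observed < tau_min:
        issues.append("no round trips")

    # Low efficiency: attribute the cause to the schedule or to local exploration.
    limiter = None
    if efficiency < efficiency_warn:
        limiter = "schedule" if rej_std > rej_std_max else "local_exploration"
        bucket = issues if efficiency < efficiency_fail else warnings
        bucket.append(f"low round-trip efficiency (limiter: {limiter})")

    # Unequal rejection rates: the schedule is not equalized.
    if rej_std > rej_std_max:
        issues.append("schedule not equalized")

    # Too few swap attempts: demote would-be issues to warnings.
    insufficient = int(np.min(attempted)) < min_attempts
    if insufficient:
        warnings = issues + warnings
        issues = []

    return {
        "issues": issues,
        "warnings": warnings,
        "insufficient_data": insufficient,
        "efficiency_limiter": limiter,
    }
```

With equalized rejection rates and an efficiency of 0.1, this sketch reports one issue and names "local_exploration" as the limiter; the same inputs with fewer than min_attempts swap attempts on some pair yield no issues, only warnings, and insufficient_data set.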
Arguments:
stats: The stats dict returned by hamon.nrpt / hamon.tune_schedule.
samples: Optional node-ordered boolean samples (e.g. from hamon.nrpt_node_samples); enables the entropy and convergence sections.
Returns:
An NRPTHealthReport. Issues are logged at WARNING level.
hamon.NRPTHealthReport
Result of report_nrpt_diagnostics.
Attributes:
| Name | Type | Description |
|---|---|---|
| healthy | | |
| insufficient_data | | |
| issues | | Hard failures: the samples should not be trusted. |
| warnings | | Soft findings worth investigating. |
| acceptance_mean / rejection_std | | Swap-rate statistics. |
| total_round_trips | | Round-trip diagnostics (…). |
| barrier_identified | | |
| recommended_n_chains | | Suggested chain count when efficiency is low. |
| efficiency_limiter | | When round-trip efficiency is low, which knob to turn: "schedule" or "local_exploration". |
| barrier_peak_beta | | Midpoint β of a sharp barrier peak, if detected. |
| convergence_status / rank_stability / marginal_entropy | | Sample-based metrics (…). |
| min_ess / median_ess / ess_fraction | | Effective-sample-size summaries over the provided samples (…). |
summary() -> str
Human-readable multi-line summary.
hamon.effective_sample_size(samples: Shaped[Array, 'n_samples n_variables'] | np.ndarray) -> ESSReport
Estimate the effective sample size of an autocorrelated MCMC trace.
MCMC draws are autocorrelated, so n correlated samples carry the
information of fewer independent ones. ESS estimates that effective count:
the Monte-Carlo error of any estimate computed from the trace scales as
σ/√ESS, not σ/√n. For an iid trace ESS ≈ n; for a slowly mixing
one it can be far smaller. Pair ess_fraction (or min_ess) with the
run's wall-clock time to get ESS/second, the efficiency metric used to
compare schedules or chain counts.
Computed on the host with NumPy (FFT autocorrelation + Geyer
initial-positive-sequence; see _ess_1d), so there is no XLA compile
cost. Inputs may be JAX arrays; they are pulled to the host once.
For multimodal parallel tempering, a low single-marginal ESS can reflect the chain correctly jumping between modes (mode switches are long-range correlation), so read ESS alongside the round-trip diagnostics rather than instead of them.
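The estimator described above (FFT autocorrelation followed by Geyer's initial positive sequence) can be sketched for a single trace as follows. This is an independent illustration, not the library's _ess_1d: the function name ess_1d_sketch, the handling of constant traces, and the clamp of the autocorrelation time to at least 1 are all assumptions of this sketch.

```python
import numpy as np


def ess_1d_sketch(x):
    """ESS of one trace: FFT autocorrelation + Geyer initial positive sequence.

    Hypothetical helper for illustration; not hamon's _ess_1d.
    """
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    x = x - x.mean()
    var = np.dot(x, x) / n
    if n < 4 or var == 0.0:
        # Assumption of this sketch: a constant or tiny trace reports ESS = n.
        return float(n)

    # Zero-pad to a power of two >= 2n so the circular correlation is linear.
    nfft = 1 << (2 * n - 1).bit_length()
    f = np.fft.rfft(x, nfft)
    acov = np.fft.irfft(f * np.conj(f), nfft)[:n] / n
    rho = acov / acov[0]  # autocorrelation, rho[0] == 1

    # Geyer: sum adjacent pairs rho[2k] + rho[2k+1], stop at the first
    # non-positive pair.
    m = n // 2
    pairs = rho[0:2 * m:2] + rho[1:2 * m:2]
    nonpos = np.nonzero(pairs <= 0.0)[0]
    k = int(nonpos[0]) if nonpos.size else pairs.size

    # Integrated autocorrelation time: 1 + 2 * sum_{t>=1} rho[t].
    tau = max(2.0 * float(pairs[:k].sum()) - 1.0, 1.0)
    return n / tau
```

Applied column by column, for example `np.array([ess_1d_sketch(col) for col in samples.T])`, this gives one value per variable; the minimum of those values corresponds to the conservative figure described above. An iid trace should land near n, and a strongly autocorrelated one well below it.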
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | Shaped[Array, 'n_samples n_variables'] \| ndarray | array of shape (n_samples, n_variables). | required |
Returns:
| Type | Description |
|---|---|
| ESSReport | An ESSReport. |
hamon.ESSReport
Result of effective_sample_size.
Attributes:
| Name | Type | Description |
|---|---|---|
| per_variable | np.ndarray | per-column effective sample size, one value per variable. |
| min_ess | float | smallest ESS across variables (the worst-mixing variable; the conservative number to quote). |
| median_ess / mean_ess | float | summary ESS across variables. |
| ess_fraction | float | |
| n_samples | int | number of samples the estimate was computed from. |
__init__(per_variable: np.ndarray, min_ess: float, median_ess: float, mean_ess: float, ess_fraction: float, n_samples: int) -> None
hamon.diagnostics.sample_convergence(samples: Bool[Array, 'n_samples n_variables'], *, target_k: int = 15, drift_threshold: float = 0.01, jaccard_threshold: float = 0.8) -> ConvergenceReport
Measure stability of marginal probability estimates.
Splits samples into quartile checkpoints (25 %, 50 %, 75 %, 100 %), computes marginals at each checkpoint, and reports the L1 drift between consecutive checkpoints together with the rank stability of the top-k variables.
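A rough sketch of the procedure just described is shown below. Several details are assumptions, because the text does not pin them down: marginals are taken cumulatively over the first 25 %, 50 %, 75 %, and 100 % of samples, the L1 drift is averaged per variable, and the top-k set is the k variables with the highest marginal probability. The function name convergence_sketch is hypothetical and this is not the library's code.

```python
import numpy as np


def convergence_sketch(samples, target_k=15):
    """Quartile-checkpoint drift and top-k Jaccard (illustrative only)."""
    s = np.asarray(samples, dtype=np.float64)
    n, n_vars = s.shape

    # Cumulative marginals at the 25/50/75/100 % checkpoints (assumption).
    cuts = [max(1, (n * q) // 4) for q in (1, 2, 3, 4)]
    marginals = [s[:c].mean(axis=0) for c in cuts]

    # Mean absolute change per variable between consecutive checkpoints.
    drifts = [float(np.abs(b - a).mean())
              for a, b in zip(marginals, marginals[1:])]

    # Top-k variables by marginal probability in each half of the run.
    k = min(target_k, n_vars)
    half = n // 2

    def top_k(m):
        return set(np.argsort(-m, kind="stable")[:k].tolist())

    first, second = top_k(s[:half].mean(axis=0)), top_k(s[half:].mean(axis=0))
    jaccard = len(first & second) / len(first | second)
    return drifts, jaccard
```

Under these assumptions, the final entry of drifts would be compared against drift_threshold and jaccard against jaccard_threshold.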
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | Bool[Array, 'n_samples n_variables'] | boolean array of shape (n_samples, n_variables). | required |
| target_k | int | number of top variables to track for rank stability. | 15 |
| drift_threshold | float | maximum acceptable L1 drift per variable for the final checkpoint to be considered converged. | 0.01 |
| jaccard_threshold | float | minimum Jaccard similarity of top-k sets between halves for rank stability to be considered converged. | 0.8 |
Returns:
| Type | Description |
|---|---|
| ConvergenceReport | A ConvergenceReport. |
hamon.diagnostics.marginal_entropy(samples: Bool[Array, 'n_samples n_variables']) -> float
Normalized entropy of the empirical marginal distribution.
Computes the mean per-variable binary entropy, normalized to [0, 1]. A value near 0 means most variables are frozen (all True or all False); near 1 means each variable is near 50/50.
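The quantity described above can be written in a few lines of NumPy. This is a minimal sketch rather than the library's implementation; the name marginal_entropy_sketch is hypothetical, and it assumes the normalization comes from measuring entropy in bits, so that a 50/50 variable contributes exactly 1.

```python
import numpy as np


def marginal_entropy_sketch(samples):
    """Mean per-variable binary entropy in bits, in [0, 1] (illustrative)."""
    p = np.asarray(samples, dtype=np.float64).mean(axis=0)  # P(variable is True)
    q = 1.0 - p
    # 0 * log2(0) is treated as 0; silence the intermediate warnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        h_true = np.where(p > 0.0, p * np.log2(p), 0.0)
        h_false = np.where(q > 0.0, q * np.log2(q), 0.0)
    return float(-(h_true + h_false).mean())
```

A sample set in which every variable is constant gives 0, and one in which every variable is True exactly half the time gives 1.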
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | Bool[Array, 'n_samples n_variables'] | boolean array of shape (n_samples, n_variables). | required |
Returns:
| Type | Description |
|---|---|
| float | Scalar in [0, 1]. |
hamon.diagnostics.energy_balance(biases: Shaped[Array, ' n'], edges: Shaped[Array, 'm 2'], weights: Shaped[Array, ' m'], *, beta: float = 1.0, warn_low: float = 0.05, warn_high: float = 2.0) -> EnergyBalanceReport
Check whether bias and coupling energy scales are balanced.
Computes the energy contribution from biases vs couplings at the given
inverse temperature beta and reports their ratio. Logs a warning when the
ratio falls outside [warn_low, warn_high].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| biases | Shaped[Array, ' n'] | per-node bias array of shape (n,). | required |
| edges | Shaped[Array, 'm 2'] | integer index pairs of shape (m, 2). | required |
| weights | Shaped[Array, ' m'] | per-edge coupling of shape (m,). | required |
| beta | float | inverse temperature. | 1.0 |
| warn_low | float | ratio below which a warning is logged. | 0.05 |
| warn_high | float | ratio above which a warning is logged. | 2.0 |
Returns:
| Type | Description |
|---|---|
| EnergyBalanceReport | An EnergyBalanceReport. |