Diagnostics#

Host-side diagnostics for sample quality and NRPT run health. These run in numpy (no XLA compile) over the collected samples and the stats dict returned by hamon.nrpt / hamon.tune_schedule.

`hamon.report_nrpt_diagnostics(stats: dict, samples: Bool[Array, 'n_samples n_variables'] | None = None, *, tau_min: float = 0.01, efficiency_fail: float = 0.2, efficiency_warn: float = 0.35, rej_std_max: float = 0.15, entropy_frozen: float = 0.05, entropy_uniform: float = 0.95, min_attempts: int = 50, ess_warn: float = 0.1) -> NRPTHealthReport` #

Evaluate NRPT stats (and optionally samples) into a single verdict.

For NRPT, round-trip diagnostics are the primary quality signal: states must travel the full temperature ladder for tempering to work. Marginal-convergence checks are reported for information but never used as pass/fail criteria — when PT correctly samples multiple modes, the marginals shift between halves of the run and a naive convergence test produces false "NEED_MORE" verdicts.

Decision criteria (each threshold is a keyword argument):

ISSUE: tau_observed < tau_min — no round trips, information is not flowing through the ladder.
ISSUE/WARN: efficiency < efficiency_fail / < efficiency_warn — the round-trip rate is below the ELE-optimal τ̄. The report sets efficiency_limiter to attribute the cause and point at the right knob: "schedule" when the ladder is not equalized (std(rejection_rates) > rej_std_max — tune further / add chains) or "local_exploration" when it is equalized (an ELE violation — raise gibbs_steps_per_round, or add chains as the alternative lever). A chain-count recommendation is included either way.
ISSUE: std(rejection_rates) > rej_std_max — schedule not equalized.
ISSUE: marginal_entropy < entropy_frozen — sampler frozen.
WARN: marginal_entropy > entropy_uniform — β may be too low.
WARN: a sharp peak in the λ(β) profile (barrier bottleneck).
WARN: ess_fraction < ess_warn — worst-mixing variable has low effective sample size (informational; never a hard failure).

All of these statistics are noisy when few swaps were attempted: when min(attempted) < min_attempts the would-be issues are demoted to warnings and insufficient_data is set instead of condemning a short tuning probe.
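The efficiency criteria and the low-data demotion described above can be restated as a small numpy sketch. This is not hamon's implementation: the function name `classify_efficiency` is hypothetical, and the inputs are passed in directly because the key names inside the stats dict are not documented in this section. Only the thresholds and their defaults come from the signature above.

```python
import numpy as np


def classify_efficiency(rejection_rates, efficiency, attempted, *,
                        efficiency_fail=0.2, efficiency_warn=0.35,
                        rej_std_max=0.15, min_attempts=50):
    """Hypothetical helper: returns (level, limiter, insufficient_data)."""
    rejection_rates = np.asarray(rejection_rates, dtype=float)
    attempted = np.asarray(attempted)

    # Too few swap attempts on any pair makes every statistic noisy.
    insufficient = bool(attempted.min() < min_attempts)

    if efficiency < efficiency_fail:
        level = "issue"
    elif efficiency < efficiency_warn:
        level = "warn"
    else:
        level = None

    limiter = None
    if level is not None:
        # An equalized ladder points at the local kernel; otherwise the schedule.
        equalized = rejection_rates.std() <= rej_std_max
        limiter = "local_exploration" if equalized else "schedule"

    # Short probes are not condemned: would-be issues become warnings.
    if insufficient and level == "issue":
        level = "warn"
    return level, limiter, insufficient
```

Under these assumptions, an unevenly rejecting ladder with low efficiency is attributed to `"schedule"`, while a flat rejection profile with the same efficiency is attributed to `"local_exploration"`.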

Arguments:

stats: The stats dict returned by hamon.nrpt / hamon.tune_schedule.
samples: Optional node-ordered boolean samples (e.g. from hamon.nrpt_node_samples); enables the entropy and convergence sections.

Returns:

An `NRPTHealthReport`. Issues are logged at WARNING level.

`hamon.NRPTHealthReport` #

Result of `hamon.report_nrpt_diagnostics`.

Attributes:

Name	Type	Description
`healthy`		`True` when no issues were found and there was enough data to judge.
`insufficient_data`		`True` when swap-attempt counts were too low to apply the pass/fail criteria; would-be issues are demoted to warnings and `healthy` reflects only what could be checked.
`issues`		Hard failures — the samples should not be trusted.
`warnings`		Soft findings worth investigating.
`acceptance_mean` / `rejection_std`		Swap-rate statistics.
`total_round_trips`		Round-trip diagnostics (`None` when the run was made with `track_round_trips=False`).
`barrier_identified`		`False` when the index process did not round-trip (a stalled conveyor), so `Lambda` is a within-basin artifact and must not be trusted; add chains or equalize the ladder. `True` when round trips flowed; `None` when round-trip diagnostics were unavailable. See `hamon.round_trips.barrier_is_identified`.
`recommended_n_chains`		Suggested chain count when efficiency is low.
`efficiency_limiter`		When round-trip efficiency is low, which knob to turn: `"schedule"` (the ladder is not equalized: tune it further or add chains) or `"local_exploration"` (the ladder is equalized, so the local kernel is the bottleneck, an ELE violation; raise `gibbs_steps_per_round`, or increase N as the alternative lever). `None` when efficiency is healthy or unavailable.
`barrier_peak_beta`		Midpoint β of a sharp barrier peak, if detected.
`convergence_status` / `rank_stability` / `marginal_entropy`		Sample-based metrics (`None` when `samples` was not provided). For NRPT, convergence is informational only: correct multi-modal sampling shifts marginals between run halves, so a non-CONVERGED status is not treated as a failure.
`min_ess` / `median_ess` / `ess_fraction`		Effective-sample-size summaries over the provided samples (`None` when not provided). `ess_fraction` is `min_ess / n_samples` for the worst-mixing variable; a low value drives a warning (never a hard failure; see `effective_sample_size` on the multimodal caveat).

`summary() -> str` #

Human-readable multi-line summary.

`hamon.effective_sample_size(samples: Shaped[Array, 'n_samples n_variables'] | np.ndarray) -> ESSReport` #

Estimate the effective sample size of an autocorrelated MCMC trace.

MCMC draws are autocorrelated, so n correlated samples carry the information of fewer independent ones. ESS estimates that effective count: the Monte-Carlo error of any estimate computed from the trace scales as σ/√ESS, not σ/√n. For an iid trace ESS ≈ n; for a slowly mixing one it can be far smaller. Pair ess_fraction (or min_ess) with the run's wall-clock time to get ESS/second, the efficiency metric used to compare schedules or chain counts.

Computed on the host with numpy (FFT autocorrelation + Geyer initial-positive-sequence; see `_ess_1d`), so there is no XLA compile cost. Inputs may be jax arrays; they are pulled to the host once.

For multimodal parallel tempering, a low single-marginal ESS can reflect the chain correctly jumping between modes (mode switches are long-range correlation), so read ESS alongside the round-trip diagnostics rather than instead of them.
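To make the estimator concrete, here is a minimal numpy sketch of an FFT-based autocorrelation combined with Geyer's initial positive sequence truncation. It is an illustration under stated assumptions, not the library's `_ess_1d`: the names `ess_1d` and `ess_summary` are chosen for this example, the handling of constant traces is an arbitrary convention, and the integrated autocorrelation time is clipped at 1 so that the fraction stays in `[0, 1]`.

```python
import numpy as np


def ess_1d(x):
    """ESS of one trace: FFT autocorrelation + Geyer initial positive sequence."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x = x - x.mean()
    if n < 4 or not np.any(x):
        # Too short or constant: nothing to estimate.
        # Returning n is an arbitrary convention for this sketch.
        return float(n)

    # Zero-pad to at least 2n - 1 so the FFT gives a linear autocovariance.
    size = 1 << (2 * n - 1).bit_length()
    spectrum = np.fft.rfft(x, n=size)
    acov = np.fft.irfft(spectrum * np.conj(spectrum), n=size)[:n] / n
    rho = acov / acov[0]

    # Geyer: sum adjacent lag pairs and stop at the first non-positive pair.
    n_pairs = n // 2
    pairs = rho[0:2 * n_pairs:2] + rho[1:2 * n_pairs:2]
    nonpositive = np.flatnonzero(pairs <= 0.0)
    cut = nonpositive[0] if nonpositive.size else n_pairs

    # tau = 1 + 2 * sum_{t >= 1} rho_t = -1 + 2 * sum of the kept pairs.
    tau = -1.0 + 2.0 * pairs[:cut].sum()
    return float(n / max(tau, 1.0))


def ess_summary(samples):
    """Per-variable ESS plus the summaries described for ESSReport."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # a 1-D trace is a single variable
    per_variable = np.array([ess_1d(col) for col in samples.T])
    n = samples.shape[0]
    min_ess = float(per_variable.min())
    return {
        "per_variable": per_variable,
        "min_ess": min_ess,
        "median_ess": float(np.median(per_variable)),
        "ess_fraction": min_ess / n,
        "n_samples": n,
    }
```

For an AR(1) trace with coefficient 0.9, the integrated autocorrelation time is (1 + 0.9) / (1 − 0.9) = 19, so a sketch like this should report roughly n / 19 effective samples.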

Parameters:

Name	Type	Description	Default
`samples`	`Shaped[Array, 'n_samples n_variables'] \| ndarray`	array of shape `(n_samples, n_variables)` (a 1-D `(n_samples,)` trace is treated as a single variable). Boolean or numeric.	required

Returns:

Type	Description
`ESSReport`	An `ESSReport` with the per-variable and summary ESS values.

`hamon.ESSReport` #

Result of `hamon.effective_sample_size`.

Attributes:

Name	Type	Description
`per_variable`		per-column effective sample size, shape `(n_variables,)`.
`min_ess`		smallest ESS across variables (the worst-mixing variable — the conservative number to quote).
`median_ess` / `mean_ess`		summary ESS across variables.
`ess_fraction`		`min_ess / n_samples`, the efficiency of the worst-mixing variable, in `[0, 1]`.
`n_samples`		number of samples the estimate was computed from.

`hamon.diagnostics.sample_convergence(samples: Bool[Array, 'n_samples n_variables'], *, target_k: int = 15, drift_threshold: float = 0.01, jaccard_threshold: float = 0.8) -> ConvergenceReport` #

Measure stability of marginal probability estimates.

Splits samples into quartile checkpoints (25 %, 50 %, 75 %, 100 %), computes marginals at each checkpoint, and reports the L1 drift between consecutive checkpoints together with the rank stability of the top-k variables.
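The checkpoint procedure can be sketched in numpy as follows. Several details are assumptions made for illustration, since this section does not spell them out: the checkpoints are taken as cumulative prefixes of the run, the drift is the mean absolute difference of marginals across variables, the top-k variables are ranked by marginal probability, and the Jaccard similarity compares the first and second halves. The name `checkpoint_drift` is hypothetical, and the sketch expects at least two samples.

```python
import numpy as np


def checkpoint_drift(samples, target_k=15):
    """Marginal drift across quartile checkpoints and top-k set overlap."""
    samples = np.asarray(samples, dtype=float)
    n, n_vars = samples.shape

    # Cumulative checkpoints at 25 %, 50 %, 75 % and 100 % of the run.
    cuts = [max(1, (n * q) // 4) for q in (1, 2, 3, 4)]
    marginals = [samples[:c].mean(axis=0) for c in cuts]

    # Mean absolute change in the marginals between consecutive checkpoints.
    drifts = [float(np.abs(b - a).mean())
              for a, b in zip(marginals, marginals[1:])]

    # Top-k variables by marginal probability in each half of the run.
    k = min(target_k, n_vars)
    half = max(1, n // 2)
    top_first = set(np.argsort(-samples[:half].mean(axis=0))[:k].tolist())
    top_second = set(np.argsort(-samples[half:].mean(axis=0))[:k].tolist())
    jaccard = len(top_first & top_second) / len(top_first | top_second)

    return {"drifts": drifts, "jaccard": jaccard}
```

With thresholds like the defaults above, a run would count as stable when the last drift is at most `drift_threshold` and the overlap is at least `jaccard_threshold`.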

Parameters:

Name	Type	Description	Default
`samples`	`Bool[Array, 'n_samples n_variables']`	boolean array of shape `(n_samples, n_variables)`.	required
`target_k`	`int`	number of top variables to track for rank stability.	`15`
`drift_threshold`	`float`	maximum acceptable L1 drift per variable for the final checkpoint to be considered converged.	`0.01`
`jaccard_threshold`	`float`	minimum Jaccard similarity of top-k sets between halves for rank stability to be considered converged.	`0.8`

Returns:

Type	Description
`ConvergenceReport`	A `ConvergenceReport`.

`hamon.diagnostics.marginal_entropy(samples: Bool[Array, 'n_samples n_variables']) -> float` #

Normalized entropy of the empirical marginal distribution.

Computes the mean per-variable binary entropy, normalized to [0, 1]. A value near 0 means most variables are frozen (all True or all False); near 1 means each variable is near 50/50.
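As a quick illustration of the quantity described, the mean per-variable binary entropy can be computed in a few lines of numpy. This is a sketch, not the library's code; the name `mean_binary_entropy` is made up for the example. Using base-2 logarithms makes each variable's entropy at most 1, so no further normalization is needed.

```python
import numpy as np


def mean_binary_entropy(samples):
    """Mean per-variable binary entropy in bits, which lies in [0, 1]."""
    p = np.asarray(samples, dtype=float).mean(axis=0)  # P(variable is True)
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    # 0 * log(0) evaluates to nan; the entropy convention sets it to 0.
    h = np.nan_to_num(h, nan=0.0)
    return float(h.mean())
```

A variable that is always `True` (or always `False`) contributes 0, and a variable that is `True` exactly half the time contributes 1.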

Parameters:

Name	Type	Description	Default
`samples`	`Bool[Array, 'n_samples n_variables']`	boolean array of shape `(n_samples, n_variables)`.	required

Returns:

Type	Description
`float`	Scalar in [0, 1].

`hamon.diagnostics.energy_balance(biases: Shaped[Array, ' n'], edges: Shaped[Array, 'm 2'], weights: Shaped[Array, ' m'], *, beta: float = 1.0, warn_low: float = 0.05, warn_high: float = 2.0) -> EnergyBalanceReport` #

Check whether bias and coupling energy scales are balanced.

Computes the energy contribution from biases vs couplings at the given temperature and reports their ratio. Logs a warning when the ratio falls outside [warn_low, warn_high].

Parameters:

Name	Type	Description	Default
`biases`	`Shaped[Array, ' n']`	per-node bias array of shape `(n,)`.	required
`edges`	`Shaped[Array, 'm 2']`	integer index pairs of shape `(m, 2)`.	required
`weights`	`Shaped[Array, ' m']`	per-edge coupling of shape `(m,)`.	required
`beta`	`float`	inverse temperature.	`1.0`
`warn_low`	`float`	ratio below which a warning is logged.	`0.05`
`warn_high`	`float`	ratio above which a warning is logged.	`2.0`

Returns:

Type	Description
`EnergyBalanceReport`	An `EnergyBalanceReport`.
