tdhook.latent.dimension_estimation

Intrinsic dimension estimation methods.

Submodules

Classes

CaPcaDimensionEstimator

Curvature-adjusted intrinsic dimension estimation via local PCA [25].

LocalKnnDimensionEstimator

Local intrinsic dimension estimation via k-NN distances [20].

LocalPcaDimensionEstimator

Local intrinsic dimension estimation via PCA on k-NN neighborhoods [26].

TwoNnDimensionEstimator

Intrinsic dimension estimation via the Two NN algorithm [21].

Package Contents

class tdhook.latent.dimension_estimation.CaPcaDimensionEstimator(k='auto', in_key='data', out_key='dimension', eps=1e-05)

Bases: tensordict.nn.TensorDictModuleBase

Curvature-adjusted intrinsic dimension estimation via local PCA [25].

Extends local PCA by calibrating against a quadratic embedding instead of a flat unit ball, which accounts for manifold curvature. For each point, the estimator takes its k+1 nearest neighbors, forms the local covariance matrix, and selects the dimension d whose expected d-dimensional ball spectrum best matches the curvature-corrected eigenvalues.

Reads the data tensor stored under in_key from the input TensorDict. Expects shape (N, D) or (…, N, D), with N points in D ambient dimensions. Writes per-point dimension estimates of shape (…, N) under out_key.

Parameters:
  • k (Union[int, Literal['auto']])

  • in_key (str)

  • out_key (str)

  • eps (float)

k = 'auto'
in_key = 'data'
out_key = 'dimension'
eps = 1e-05
in_keys
out_keys
forward(td)
Parameters:

td (tensordict.TensorDict)

Return type:

tensordict.TensorDict

__repr__()
class tdhook.latent.dimension_estimation.LocalKnnDimensionEstimator(k='auto', in_key='data', out_key='dimension', eps=1e-05)

Bases: tensordict.nn.TensorDictModuleBase

Local intrinsic dimension estimation via k-NN distances [20].

For each point x, the estimate is d(x) = ln(2) / ln(R_2k / R_k), where R_k and R_2k are the distances from x to its k-th and 2k-th nearest neighbors, respectively.

Reads the data tensor stored under in_key from the input TensorDict. Expects shape (N, D) or (…, N, D), with N points in D ambient dimensions. Writes per-point dimension estimates of shape (…, N) under out_key.
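The formula follows from the neighbor count inside a ball of radius R growing like R^d: doubling the count from k to 2k gives 2 = (R_2k / R_k)^d. The following is a minimal standard-library sketch of that formula, not the class's implementation; the function name local_knn_dimension, the brute-force neighbor search, and the synthetic plane data are assumptions made for illustration.

```python
import math
import random


def local_knn_dimension(points, k):
    """Per-point estimate d(x) = ln(2) / ln(R_2k / R_k), brute-force search."""
    estimates = []
    for i, x in enumerate(points):
        # Sorted distances from x to every other point (x itself excluded).
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        r_k, r_2k = dists[k - 1], dists[2 * k - 1]
        # Assumes r_2k > r_k; exact ties are not guarded against in this sketch.
        estimates.append(math.log(2) / math.log(r_2k / r_k))
    return estimates


# Hypothetical data: a flat 2-D plane embedded in 3-D ambient space.
rng = random.Random(0)
plane = [(rng.random(), rng.random(), 0.0) for _ in range(500)]
dims = local_knn_dimension(plane, k=5)
median_dim = sorted(dims)[len(dims) // 2]
```

Individual estimates are noisy for small k, so a robust summary such as the median is a more stable way to read them; on this plane it should land near 2.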

Parameters:
  • k (Union[int, Literal['auto']])

  • in_key (str)

  • out_key (str)

  • eps (float)

k = 'auto'
in_key = 'data'
out_key = 'dimension'
eps = 1e-05
in_keys
out_keys
forward(td)
Parameters:

td (tensordict.TensorDict)

Return type:

tensordict.TensorDict

__repr__()
class tdhook.latent.dimension_estimation.LocalPcaDimensionEstimator(k='auto', criterion='maxgap', alpha=0.05, in_key='data', out_key='dimension', eps=1e-05)

Bases: tensordict.nn.TensorDictModuleBase

Local intrinsic dimension estimation via PCA on k-NN neighborhoods [26].

For each point, extracts its k+1 nearest neighbors (the point itself plus k neighbors), fits PCA to that neighborhood, and estimates the dimension from the eigenvalues using a configurable criterion ('maxgap' or 'ratio').

Reads the data tensor stored under in_key from the input TensorDict. Expects shape (N, D) or (…, N, D), with N points in D ambient dimensions. Writes per-point dimension estimates of shape (…, N) under out_key.
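As an illustration of an eigenvalue criterion, the sketch below shows one common reading of a max-gap rule: sort the local PCA eigenvalues and place the dimension at the largest ratio between consecutive values. Whether the 'maxgap' option of this class uses exactly this rule is an assumption, and maxgap_dimension is a hypothetical helper, not part of tdhook.

```python
def maxgap_dimension(eigenvalues):
    """Pick d at the largest ratio between consecutive sorted eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)
    # gaps[i] compares the (i+1)-th largest eigenvalue with the next one.
    # Zero eigenvalues are not handled in this sketch.
    gaps = [lam[i] / lam[i + 1] for i in range(len(lam) - 1)]
    # The estimate is the number of eigenvalues before the biggest drop.
    return gaps.index(max(gaps)) + 1


# Hypothetical local spectrum: two dominant directions, two noise directions.
spectrum = [4.0, 3.5, 0.1, 0.05]
estimated = maxgap_dimension(spectrum)  # ratios are about [1.14, 35.0, 2.0]
```

Here the largest drop sits between the second and third eigenvalues, so the neighborhood is read as two-dimensional.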

Parameters:
  • k (Union[int, Literal['auto']])

  • criterion (Literal['maxgap', 'ratio'])

  • alpha (float)

  • in_key (str)

  • out_key (str)

  • eps (float)

k = 'auto'
criterion = 'maxgap'
alpha = 0.05
in_key = 'data'
out_key = 'dimension'
eps = 1e-05
in_keys
out_keys
forward(td)
Parameters:

td (tensordict.TensorDict)

Return type:

tensordict.TensorDict

__repr__()
class tdhook.latent.dimension_estimation.TwoNnDimensionEstimator(in_key='data', out_key='dimension', return_xy=False, eps=1e-05)

Bases: tensordict.nn.TensorDictModuleBase

Intrinsic dimension estimation via the Two NN algorithm [21].

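Reads the data tensor stored under in_key from the input TensorDict. Expects shape (N, D) or (…, N, D). For (…, N, D) input, each (N, D) slice is treated as a separate dataset: the leading dimensions are flattened, one dimension estimate is computed per dataset, and the stacked results are reshaped to the original batch shape (all dimensions except the last two).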
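The Two NN idea can be sketched with the standard library alone. For each point, the ratio mu = r2 / r1 of its second to first nearest-neighbor distance follows, under locally uniform density, a Pareto law with shape d, so d can be estimated as N / sum(ln mu). This maximum-likelihood form is an assumption chosen for brevity; the class may instead use a linear fit, as the return_xy flag suggests. The name two_nn_dimension and the synthetic datasets are hypothetical.

```python
import math
import random


def two_nn_dimension(points):
    """Global estimate from ratios of 2nd to 1st nearest-neighbor distances."""
    log_mu_sum = 0.0
    for i, x in enumerate(points):
        # Sorted distances from x to every other point (x itself excluded).
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        # Assumes no duplicate points, so r1 > 0.
        log_mu_sum += math.log(r2 / r1)
    # Maximum-likelihood estimate of the Pareto shape parameter.
    return len(points) / log_mu_sum


# Hypothetical batch of two datasets in 3-D: a line and a plane.
rng = random.Random(0)
line = [(t, 2.0 * t, -t) for t in (rng.random() for _ in range(400))]
plane = [(rng.random(), rng.random(), 0.0) for _ in range(400)]

# One estimate per dataset, mirroring the batch behavior described above.
batch_dims = [two_nn_dimension(dataset) for dataset in (line, plane)]
```

The two estimates should come out close to 1 and 2 respectively, one scalar per dataset rather than one per point.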

Parameters:
  • in_key (str)

  • out_key (str)

  • return_xy (bool)

  • eps (float)

in_key = 'data'
out_key = 'dimension'
return_xy = False
eps = 1e-05
in_keys
out_keys
forward(td)
Parameters:

td (tensordict.TensorDict)

Return type:

tensordict.TensorDict

__repr__()