tdhook.latent.dimension_estimation.local_pca#

Local PCA dimension estimation via eigenvalues of local covariance.

Classes#

LocalPcaDimensionEstimator

Local intrinsic dimension estimation via PCA on k-NN neighborhoods [26].

Functions#

_local_pca(data, k, eps, criterion, alpha, pca_cls)

Compute the per-point local dimension via PCA. data has shape (N, D); returns a tensor of dimension estimates with shape (N,).

_dim_from_eigenvalues_maxgap(lambda_)

Estimate dimension from eigenvalues using the maximum gap criterion [27].

_dim_from_eigenvalues_ratio(lambda_, alpha)

Estimate dimension using ratio criterion [26].

Module Contents#

class tdhook.latent.dimension_estimation.local_pca.LocalPcaDimensionEstimator(k='auto', criterion='maxgap', alpha=0.05, in_key='data', out_key='dimension', eps=1e-05)[source]#

Bases: tensordict.nn.TensorDictModuleBase

Local intrinsic dimension estimation via PCA on k-NN neighborhoods [26].

For each point, takes its k+1 nearest neighbors (the point itself plus k neighbors), fits PCA to that neighborhood, and estimates the dimension from the resulting eigenvalues using a configurable criterion ('maxgap' or 'ratio').

Reads the data tensor stored under in_key in the input TensorDict. The tensor is expected to have shape (N, D) or (…, N, D). Writes per-point dimension estimates of shape (…, N) under out_key.
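The shape contract above (point clouds with optional leading batch dimensions in, one estimate per point out) can be illustrated with a small NumPy sketch. This is not the library's implementation: apply_over_batch and row_norms are hypothetical names, and row_norms only stands in for a real per-point estimator that maps (N, D) to (N,).

```python
import numpy as np


def row_norms(cloud):
    """Hypothetical placeholder for a per-point estimator: (N, D) -> (N,)."""
    return np.linalg.norm(cloud, axis=1)


def apply_over_batch(data, per_cloud_fn):
    """Apply a (N, D) -> (N,) function over any leading batch dimensions."""
    data = np.asarray(data)
    *batch, n, d = data.shape
    # Collapse all leading dimensions into one so each cloud is handled alone.
    flat = data.reshape(-1, n, d)
    out = np.stack([per_cloud_fn(cloud) for cloud in flat])
    # Restore the leading dimensions; the feature axis D is gone.
    return out.reshape(*batch, n)


batched = np.zeros((2, 3, 5, 4))   # (..., N, D) with N=5, D=4
single = np.zeros((5, 4))          # plain (N, D)
print(apply_over_batch(batched, row_norms).shape)
print(apply_over_batch(single, row_norms).shape)
```

Flattening the leading dimensions keeps the per-cloud function simple, since it only ever sees a single (N, D) array.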

Parameters:
  • k (Union[int, Literal['auto']])

  • criterion (Literal['maxgap', 'ratio'])

  • alpha (float)

  • in_key (str)

  • out_key (str)

  • eps (float)

k = 'auto'[source]#
criterion = 'maxgap'[source]#
alpha = 0.05[source]#
in_key = 'data'[source]#
out_key = 'dimension'[source]#
eps = 1e-05[source]#
in_keys[source]#
out_keys[source]#
forward(td)[source]#
Parameters:

td (tensordict.TensorDict)

Return type:

tensordict.TensorDict

__repr__()[source]#
tdhook.latent.dimension_estimation.local_pca._local_pca(data, k, eps, criterion, alpha, pca_cls)[source]#

Compute the per-point local dimension via PCA. data has shape (N, D); returns a tensor of dimension estimates with shape (N,).

Parameters:
  • data (torch.Tensor)

  • k (int)

  • eps (float)

  • criterion (Literal['maxgap', 'ratio'])

  • alpha (float)

  • pca_cls (type)

Return type:

torch.Tensor
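The per-point procedure can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not the function's actual code: local_pca_dims is a hypothetical name, neighbors are found by brute-force pairwise distances, an SVD of the centered neighborhood stands in for pca_cls, only a ratio-style criterion is applied, and the eps parameter is not modeled because its role is not documented here.

```python
import numpy as np


def local_pca_dims(data, k, alpha=0.05):
    """Hypothetical sketch: per-point dimension from PCA on k-NN neighborhoods."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    # Brute-force squared pairwise distances, shape (N, N).
    diff = data[:, None, :] - data[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # k+1 nearest indices per point; the point itself is among them (distance 0).
    idx = np.argsort(d2, axis=1)[:, : k + 1]
    dims = np.empty(n, dtype=np.int64)
    for i in range(n):
        nbrs = data[idx[i]]
        centered = nbrs - nbrs.mean(axis=0)
        # Squared singular values / k are the eigenvalues of the sample
        # covariance of the k+1 neighbors, already in descending order.
        s = np.linalg.svd(centered, compute_uv=False)
        lam = s**2 / k
        # Ratio-style criterion: count eigenvalues above alpha * largest.
        dims[i] = max(int(np.sum(lam > alpha * lam[0])), 1)
    return dims


# A straight line embedded in 3-D: every neighborhood should look 1-dimensional.
t = np.linspace(0.0, 1.0, 30)
line = np.outer(t, [1.0, 2.0, 3.0])
print(local_pca_dims(line, k=5))
```

The brute-force distance matrix costs O(N²) memory, which is fine for a sketch but would likely be replaced by a proper nearest-neighbor search for large N.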

tdhook.latent.dimension_estimation.local_pca._dim_from_eigenvalues_maxgap(lambda_)[source]#

Estimate dimension from eigenvalues using the maximum gap criterion [27].

d_e = argmax_i(lambda[i] / lambda[i+1]) + 1. The argmax index i is 0-based, so adding 1 yields a 1-based dimension.

Parameters:

lambda_ (numpy.ndarray)

Return type:

int
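A minimal sketch of the maximum gap rule, assuming the eigenvalues are non-negative and should be compared in descending order. The name dim_maxgap, the explicit sort, and the tiny floor guarding against division by zero are illustrative assumptions, not part of the documented signature.

```python
import numpy as np


def dim_maxgap(eigenvalues, tiny=1e-12):
    """Hypothetical sketch: dimension = position of the largest ratio gap."""
    # Sort descending so lam[i] >= lam[i+1] (assumed ordering).
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    if lam.size < 2:
        return 1  # a single eigenvalue has no gap to inspect
    # Ratio of each eigenvalue to its successor; floor avoids division by zero.
    ratios = lam[:-1] / np.maximum(lam[1:], tiny)
    # argmax is 0-based, so add 1 to get a 1-based dimension.
    return int(np.argmax(ratios)) + 1


# Ratios are 1.25, 40 and 2, so the largest gap follows the second eigenvalue.
print(dim_maxgap([5.0, 4.0, 0.1, 0.05]))  # -> 2
```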

tdhook.latent.dimension_estimation.local_pca._dim_from_eigenvalues_ratio(lambda_, alpha)[source]#

Estimate dimension using ratio criterion [26].

Counts the eigenvalues greater than alpha * lambda[0]. The result is clamped to at least 1.

Parameters:
  • lambda_ (numpy.ndarray)

  • alpha (float)

Return type:

int
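The ratio rule can be sketched as below, assuming lambda[0] is the largest eigenvalue (that is, the input is sorted in descending order). The name dim_ratio is hypothetical and the code only illustrates the description above.

```python
import numpy as np


def dim_ratio(eigenvalues, alpha=0.05):
    """Hypothetical sketch: count eigenvalues above alpha * the first one."""
    lam = np.asarray(eigenvalues, dtype=float)
    # lam[0] is assumed to be the largest eigenvalue.
    count = int(np.sum(lam > alpha * lam[0]))
    # Clamp so a degenerate neighborhood still reports dimension 1.
    return max(count, 1)


# Threshold is 0.05 * 10 = 0.5, so only 10.0 and 5.0 are counted.
print(dim_ratio([10.0, 5.0, 0.4, 0.1], alpha=0.05))  # -> 2
# All-zero spectrum: nothing exceeds the threshold, so the clamp applies.
print(dim_ratio([0.0, 0.0, 0.0]))  # -> 1
```

The clamp matters for degenerate inputs such as duplicated points, where every eigenvalue is zero and the strict comparison counts nothing.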