NeuralTSNE.Utils.Loaders.FileLoaders package

Submodules

NeuralTSNE.Utils.Loaders.FileLoaders.file_loaders module

load_npy_file(input_file: str, step: int, exclude_cols: List[int], variance_threshold: float) → Tensor

Load and preprocess data from a NumPy (.npy) file.

The function loads data from the specified NumPy file, subsamples it using the given step size, and drops the columns listed in exclude_cols, if any. It then applies a variance threshold for feature selection and returns the result as a torch.Tensor.

Parameters:
  • input_file (str) – The path to the input NumPy file (.npy).

  • step (int) – Step size for subsampling the data.

  • exclude_cols (List[int]) – A list of column indices to exclude from the data.

  • variance_threshold (float) – Threshold for variance-based feature selection.

Returns:

Processed data tensor.

Return type:

torch.Tensor
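
Example

The steps described for load_npy_file can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper name sketch_load_npy is hypothetical, the variance rule (keep only columns whose variance is strictly greater than the threshold) is an assumption, and the final conversion to torch.Tensor is omitted so the sketch does not depend on PyTorch.

```python
import os
import tempfile

import numpy as np


def sketch_load_npy(input_file, step, exclude_cols, variance_threshold):
    """Hypothetical stand-in mirroring the documented steps (not the library code)."""
    data = np.load(input_file)
    # Subsample: keep every `step`-th row.
    data = data[::step]
    # Drop the excluded columns, if any were given.
    if exclude_cols:
        data = np.delete(data, exclude_cols, axis=1)
    # Assumption: keep columns whose variance is strictly above the threshold.
    keep = data.var(axis=0) > variance_threshold
    return data[:, keep]


# Build a small demo file: 10 rows, 4 columns, column 2 is constant.
rng = np.random.default_rng(0)
raw = rng.normal(size=(10, 4))
raw[:, 2] = 1.0

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.npy")
    np.save(path, raw)
    result = sketch_load_npy(path, step=2, exclude_cols=[0], variance_threshold=0.0)

print(result.shape)
```

In this demo, subsampling leaves 5 of the 10 rows, column 0 is excluded explicitly, and the constant column is removed by the variance filter, so two columns remain.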

load_text_file(input_file: str, step: int, header: bool, exclude_cols: List[int], variance_threshold: float) → Tensor

Load and preprocess data from a text file.

The function reads data from the specified text file, skips the header if present, and drops the columns listed in exclude_cols, if any. It then subsamples the data using the given step size, applies a variance threshold for feature selection, and returns the result as a torch.Tensor.

Parameters:
  • input_file (str) – The path to the input text file.

  • step (int) – Step size for subsampling the data.

  • header (bool) – A boolean indicating whether the file has a header.

  • exclude_cols (List[int]) – A list of column indices to exclude from the data.

  • variance_threshold (float) – Threshold for variance-based feature selection.

Returns:

Processed data tensor.

Return type:

torch.Tensor
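
Example

The order of operations described for load_text_file (skip header, drop columns, subsample, filter by variance) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name sketch_load_text is hypothetical, the input is assumed to be whitespace-delimited numeric text with a one-line header, the variance rule (strictly greater than the threshold) is assumed, and the result is left as a NumPy array rather than converted to torch.Tensor.

```python
import io

import numpy as np


def sketch_load_text(source, step, header, exclude_cols, variance_threshold):
    """Hypothetical stand-in mirroring the documented steps (not the library code)."""
    # Assumption: whitespace-delimited numbers, with a single header line if present.
    data = np.loadtxt(source, skiprows=1 if header else 0, ndmin=2)
    # Drop the excluded columns first, as the description states.
    if exclude_cols:
        data = np.delete(data, exclude_cols, axis=1)
    # Then subsample: keep every `step`-th row.
    data = data[::step]
    # Assumption: keep columns whose variance is strictly above the threshold.
    keep = data.var(axis=0) > variance_threshold
    return data[:, keep]


# An in-memory file stands in for a path on disk.
text = io.StringIO(
    "x y z\n"
    "0.0 5.0 1.0\n"
    "1.0 5.0 2.0\n"
    "2.0 5.0 4.0\n"
    "3.0 5.0 8.0\n"
)
result = sketch_load_text(
    text, step=2, header=True, exclude_cols=[0], variance_threshold=0.0
)
print(result)
```

Here column x is excluded, rows 0 and 2 survive the subsampling, and the constant column y is removed by the variance filter, leaving only the z values 1.0 and 4.0.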

load_torch_dataset(name: str, step: int, output: str) → Tuple[Dataset, Dataset]

Load and preprocess a torch.utils.data.Dataset, returning training and testing subsets.

The function loads the dataset specified by the name parameter, extracts its training and testing subsets, and preprocesses the training subset by saving labels and calculating means and variances.

Parameters:
  • name (str) – The name of the torch dataset to be loaded.

  • step (int) – The step size for subsampling the training dataset.

  • output (str) – The output file path for saving labels.

Returns:

A tuple containing the training and testing subsets.

Return type:

Tuple[Dataset, Dataset]

Note

  • The function uses the name parameter to load a torch dataset and extract training and testing subsets.

  • The training subset is subsampled using the step parameter.

  • Labels for the testing subset are saved to a file specified by the output parameter.

  • Means and variances for the training subset are calculated and saved to the “means_and_vars.txt” file.

  • The function returns a tuple containing the training and testing subsets.
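
Example

The subsampling and statistics steps listed in the note can be sketched as below. This is only an illustration: a NumPy array stands in for the training subset, the helper name sketch_subsample_and_stats is hypothetical, per-feature (column-wise) statistics are assumed, and the file layout (means on the first row, variances on the second) is an assumption, since the actual format of "means_and_vars.txt" is not documented here. Label saving is left out.

```python
import os
import tempfile

import numpy as np


def sketch_subsample_and_stats(train_x, step, stats_path):
    """Hypothetical stand-in for the subsampling and statistics steps."""
    # Keep every `step`-th training sample.
    subset = train_x[::step]
    # Assumption: statistics are computed per feature (column-wise).
    means = subset.mean(axis=0)
    variances = subset.var(axis=0)
    # Assumed layout: first row holds means, second row holds variances.
    np.savetxt(stats_path, np.vstack([means, variances]))
    return subset, means, variances


# Six samples with two features each stand in for a training set.
train_x = np.arange(12, dtype=float).reshape(6, 2)

with tempfile.TemporaryDirectory() as tmp:
    stats_path = os.path.join(tmp, "means_and_vars.txt")
    subset, means, variances = sketch_subsample_and_stats(
        train_x, step=3, stats_path=stats_path
    )
    saved = np.loadtxt(stats_path)

print(subset.shape, means, variances)
```

With a real torch dataset, the same subsampling could be expressed with torch.utils.data.Subset over range(0, len(dataset), step); whether the library does it that way is not stated in this page.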

Module contents