Helpers

Installation

pip install labml_helpers

Configurable Modules

class labml_helpers.device.DeviceConfigs[source]

This is a configurable module to get a single device to train a model on. It picks a CUDA device if one is available and falls back to the CPU otherwise.

It also has small advantages, such as showing the actual device name in the configurations view of the labml app.

Parameters
  • cuda_device (int) – The CUDA device number. Defaults to 0.

  • use_cuda (bool) – Whether to use CUDA devices. Defaults to True.

Here’s an example usage.
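
The sketch below is a minimal, illustrative usage (not taken from the original docs): a Configs class that extends DeviceConfigs and builds its model on the selected device. The model and option names are placeholders.

import torch.nn as nn

from labml.configs import option
from labml_helpers.device import DeviceConfigs


class Configs(DeviceConfigs):
    # `device` is provided by DeviceConfigs; `model` is our own config item
    model: nn.Module


@option(Configs.model)
def linear_model(c: Configs):
    # any torch.nn.Module works here; move it to the configured device
    return nn.Linear(10, 2).to(c.device)

You can also set cuda_device and use_cuda (or override them from the experiment configurations) to pick a specific GPU or force the CPU.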

class labml_helpers.seed.SeedConfigs(*, _primary: Optional[str] = None)[source]

This is a configurable module for setting the seeds. It will set seeds with torch.manual_seed and np.random.seed.

You need to call the set method to set the seeds (see the sketch after the parameter list below).

Parameters
  • seed (int) – Seed integer. Defaults to 5.
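
A minimal sketch based on the description above; the seed value is arbitrary. Typically you would mix SeedConfigs into your experiment's Configs class and call set() once during initialization.

from labml_helpers.seed import SeedConfigs

conf = SeedConfigs()
conf.seed = 42  # defaults to 5
conf.set()      # sets torch.manual_seed and np.random.seed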

class labml_helpers.optimizer.OptimizerConfigs[source]

This creates a configurable optimizer.

Parameters
  • learning_rate (float) – Learning rate of the optimizer. Defaults to 0.01.

  • momentum (float) – Momentum of the optimizer. Defaults to 0.5.

  • parameters – Model parameters to optimize.

  • d_model (int) – Embedding size of the model (for Noam optimizer).

  • betas (Tuple[float, float]) – Betas for Adam optimizer. Defaults to (0.9, 0.999).

  • eps (float) – Epsilon for Adam/RMSProp optimizers. Defaults to 1e-8.

  • step_factor (int) – Step factor for Noam optimizer. Defaults to 1024.

Here’s an example usage.
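
A sketch of the common pattern, assuming (as in labml's samples) that returning an OptimizerConfigs instance from an option makes it the configurable optimizer of the outer Configs class. The model and option names are illustrative.

import torch
import torch.nn as nn

from labml.configs import BaseConfigs, option
from labml_helpers.optimizer import OptimizerConfigs


class Configs(BaseConfigs):
    model: nn.Module
    optimizer: torch.optim.Adam


@option(Configs.model)
def simple_model(c: Configs):
    # illustrative model
    return nn.Linear(10, 2)


@option(Configs.optimizer)
def _optimizer(c: Configs):
    # expose the optimizer settings (learning rate, momentum, betas, ...)
    # as configurations of the experiment
    opt_conf = OptimizerConfigs()
    opt_conf.parameters = c.model.parameters()
    opt_conf.learning_rate = 2.5e-4
    return opt_conf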

There is also a more complete implementation (with more options) in labml_nn; we recommend using that.

class labml_helpers.training_loop.TrainingLoopConfigs(*, _primary: Optional[str] = None)[source]

This is a configurable training loop. You can extend this class in your own configurations whenever they involve a training loop.

>>> for step in conf.training_loop:
...     ...

Parameters
  • loop_count (int) – Total number of steps. Defaults to 10.

  • loop_step (int) – Number of steps to increment per iteration. Defaults to 1.

  • is_save_models (bool) – Whether to call labml.experiment.save_checkpoint() on each iteration. Defaults to False.

  • save_models_interval (int) – The interval (in steps) to save models. Defaults to 1.

  • log_new_line_interval (int) – The interval (in steps) to print a new line to the screen. Defaults to 1.

  • log_write_interval (int) – The interval (in steps) to call labml.tracker.save(). Defaults to 1.

  • is_loop_on_interrupt (bool) – Whether to handle keyboard interrupts and wait until an iteration is complete. Defaults to False.
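
A fuller, illustrative sketch of driving an experiment with the loop; the experiment name and tracked value are placeholders.

from labml import experiment, tracker
from labml_helpers.training_loop import TrainingLoopConfigs


class Configs(TrainingLoopConfigs):
    loop_count = 1_000
    log_new_line_interval = 100


def main():
    conf = Configs()
    experiment.create(name='training_loop_demo')
    experiment.configs(conf)
    with experiment.start():
        for step in conf.training_loop:
            # real training work goes here; the tracked value is a placeholder
            tracker.save(step, {'loss': 1.0 / (step + 1)})


if __name__ == '__main__':
    main()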

class labml_helpers.train_valid.TrainValidConfigs(*, _primary: Optional[str] = None)[source]

This is a configurable module that you can extend for experiments that involve training and validation datasets (i.e., most DL experiments). This is based on labml_helpers.training_loop.TrainingLoopConfigs.

Parameters
  • epochs (int) – Number of epochs to train on. Defaults to 10.

  • train_loader (torch.utils.data.DataLoader) – Training data loader.

  • valid_loader (torch.utils.data.DataLoader) – Validation data loader.

  • inner_iterations (int) – Number of times to switch between training and validation within an epoch. Defaults to 1.

You can override the init and step functions. There is also a sample function that you can override to generate samples every time it switches between training and validation.

Here’s an example usage.
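
A condensed, illustrative sketch of the shape of such a configuration. The method bodies are left out, and the model, optimizer and data loaders are assumed to be declared and provided by other config options (as with the dataset configs in the Datasets section).

from labml_helpers.train_valid import TrainValidConfigs


class Configs(TrainValidConfigs):
    def init(self):
        # one-time setup: tracker indicators, hooks, metric state modules, ...
        ...

    def step(self, batch, batch_idx):
        # called for both training and validation batches: run the forward
        # pass, compute the loss, and back-propagate / step the optimizer
        # when in training mode
        ...

    def sample(self):
        # optional: generate samples each time the loop switches between
        # training and validation
        ...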

class labml_helpers.train_valid.SimpleTrainValidConfigs(*, _primary: Optional[str] = None)[source]

This is a configurable module that works for many standard DL experiments. This is based on labml_helpers.train_valid.TrainValidConfigs.

Parameters
  • model – A PyTorch model.

  • optimizer – A PyTorch optimizer to update model.

  • device – The device to train the model on. This defaults to a configurable device - labml_helpers.device.DeviceConfigs.

  • loss_function – A function to calculate the loss. This should accept model_output, target as arguments.

  • update_batches (int) – Number of batches to accumulate before taking an optimizer step. Defaults to 1.

  • log_params_updates (int) – How often (number of batches) to track model parameters and gradients. Defaults to a large number; i.e. logs every epoch.

  • log_activations_batches (int) – How often to log model activations. Defaults to a large number; i.e. logs every epoch.

  • log_save_batches (int) – How often to call labml.tracker.save().
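
An illustrative sketch using the MNISTConfigs dataset helper described in the Datasets section below. The field names follow the parameter list above, and the run() call at the end assumes the training entry point provided by the training-loop configs.

import torch
import torch.nn as nn

from labml import experiment
from labml.configs import option
from labml_helpers.datasets.mnist import MNISTConfigs
from labml_helpers.train_valid import SimpleTrainValidConfigs


class Configs(MNISTConfigs, SimpleTrainValidConfigs):
    epochs = 5
    loss_function = nn.CrossEntropyLoss()


@option(Configs.model)
def mlp_model(c: Configs):
    # illustrative model, moved to the configurable device
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(c.device)


@option(Configs.optimizer)
def _optimizer(c: Configs):
    return torch.optim.Adam(c.model.parameters(), lr=1e-3)


def main():
    conf = Configs()
    experiment.create(name='mnist_simple')
    experiment.configs(conf)
    with experiment.start():
        conf.run()


if __name__ == '__main__':
    main()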

Datasets

class labml_helpers.datasets.mnist.MNISTConfigs(*, _primary: Optional[str] = None)[source]

Configurable MNIST data set.

Parameters
  • dataset_name (str) – name of the data set, MNIST

  • dataset_transforms (torchvision.transforms.Compose) – image transformations

  • train_dataset (torchvision.datasets.MNIST) – training dataset

  • valid_dataset (torchvision.datasets.MNIST) – validation dataset

  • train_loader (torch.utils.data.DataLoader) – training data loader

  • valid_loader (torch.utils.data.DataLoader) – validation data loader

  • train_batch_size (int) – training batch size

  • valid_batch_size (int) – validation batch size

  • train_loader_shuffle (bool) – whether to shuffle training data

  • valid_loader_shuffle (bool) – whether to shuffle validation data
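
A small, illustrative sketch: mix the dataset configs into your own Configs class and override the documented defaults as class attributes. CIFAR10Configs below works the same way.

from labml_helpers.datasets.mnist import MNISTConfigs


class Configs(MNISTConfigs):
    train_batch_size = 128
    valid_batch_size = 1000
    train_loader_shuffle = True
    valid_loader_shuffle = False

# train_dataset, valid_dataset, train_loader and valid_loader are then
# computed from these values when the configs are used in an experiment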

class labml_helpers.datasets.cifar10.CIFAR10Configs(*, _primary: Optional[str] = None)[source]

Configurable CIFAR 10 data set.

Parameters
  • dataset_name (str) – name of the data set, CIFAR10

  • dataset_transforms (torchvision.transforms.Compose) – image transformations

  • train_dataset (torchvision.datasets.CIFAR10) – training dataset

  • valid_dataset (torchvision.datasets.CIFAR10) – validation dataset

  • train_loader (torch.utils.data.DataLoader) – training data loader

  • valid_loader (torch.utils.data.DataLoader) – validation data loader

  • train_batch_size (int) – training batch size

  • valid_batch_size (int) – validation batch size

  • train_loader_shuffle (bool) – whether to shuffle training data

  • valid_loader_shuffle (bool) – whether to shuffle validation data

class labml_helpers.datasets.csv.CsvDataset(file_path: str, y_cols: List, x_cols: List, train: bool = True, transform: Callable = <function CsvDataset.<lambda>>, test_fraction: float = 0.0, nrows: Optional[int] = None)[source]
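
There is no description for this class here; the signature suggests usage along the following lines. The file name and column names are hypothetical.

from torch.utils.data import DataLoader

from labml_helpers.datasets.csv import CsvDataset

# hypothetical CSV with feature columns 'x1', 'x2' and a label column 'y'
train_ds = CsvDataset('data/points.csv',
                      y_cols=['y'], x_cols=['x1', 'x2'],
                      train=True, test_fraction=0.2)
valid_ds = CsvDataset('data/points.csv',
                      y_cols=['y'], x_cols=['x1', 'x2'],
                      train=False, test_fraction=0.2)

train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)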

Remote

class labml_helpers.datasets.remote.DatasetServer[source]

Remote dataset server

Here’s a sample usage of the server
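
An illustrative sketch that serves a torchvision MNIST dataset; the dataset and the registered name are placeholders.

from torchvision import datasets, transforms

from labml_helpers.datasets.remote import DatasetServer

# any torch Dataset can be served; MNIST is just an example
ds = datasets.MNIST('./data', train=True, download=True,
                    transform=transforms.ToTensor())

server = DatasetServer()
server.add_dataset('mnist_train', ds)
server.start(host='0.0.0.0', port=8000)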

add_dataset(name: str, dataset: Dataset)[source]

Add a dataset

Parameters
  • name (str) – name of the data set

  • dataset (Dataset) – dataset to be served

start(host: str = '0.0.0.0', port: int = 8000)[source]

Start the server

Parameters
  • host (str) – hostname of the server

  • port (int) – server port

class labml_helpers.datasets.remote.RemoteDataset(name: str, host: str = '0.0.0.0', port: int = 8000)[source]

Remote dataset

Parameters
  • name (str) – name of the dataset, as registered on the server with add_dataset

  • host (str) – hostname of the server

  • port (int) – server port

Here’s a sample usage.
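
An illustrative sketch of the client side, matching the server example above; it assumes RemoteDataset behaves like a regular torch Dataset, and the name must match the one registered with add_dataset.

from torch.utils.data import DataLoader

from labml_helpers.datasets.remote import RemoteDataset

ds = RemoteDataset('mnist_train', host='localhost', port=8000)
loader = DataLoader(ds, batch_size=32, shuffle=True)

for x, y in loader:
    ...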

Text Datasets

class labml_helpers.datasets.text.TextDataset(path: PurePath, tokenizer: Callable, train: str, valid: str, test: str, *, n_tokens: Optional[int] = None, stoi: Optional[Dict[str, int]] = None, itos: Optional[List[str]] = None)[source]
class labml_helpers.datasets.text.TextFileDataset(path: PurePath, tokenizer: Callable, *, url: Optional[str] = None, filter_subset: Optional[int] = None)[source]
class labml_helpers.datasets.text.SequentialDataLoader(*, text: str, dataset: TextDataset, batch_size: int, seq_len: int)[source]
class labml_helpers.datasets.text.SequentialUnBatchedDataset(*, text: str, dataset: TextDataset, seq_len: int, is_random_offset: bool = True)[source]
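
These classes have no descriptions here. A brief, illustrative character-level sketch follows: tokenizing with list splits the text into characters; the file name and URL are placeholders, and text.train (the raw training text) is an assumption based on the TextDataset constructor above.

from labml import lab
from labml_helpers.datasets.text import TextFileDataset, SequentialDataLoader

text = TextFileDataset(
    lab.get_data_path() / 'tiny_shakespeare.txt',
    tokenizer=list,
    url='https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt')

train_loader = SequentialDataLoader(text=text.train,
                                    dataset=text,
                                    batch_size=32,
                                    seq_len=512)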

Schedules

class labml_helpers.schedule.Schedule[source]
class labml_helpers.schedule.Flat(value)[source]
class labml_helpers.schedule.Dynamic(value)[source]
class labml_helpers.schedule.Piecewise(endpoints: List[Tuple[float, float]], outside_value: Optional[float] = None)[source]

Piecewise schedule

class labml_helpers.schedule.RelativePiecewise(relative_endpoits: List[Tuple[float, float]], total_steps: int)[source]
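
An illustrative sketch of the piecewise schedule, assuming (since it isn't documented here) that a Schedule is evaluated by calling it with the current step and that values are interpolated linearly between endpoints.

from labml_helpers.schedule import Piecewise

# 1.0 at step 0, decaying to 0.1 by step 10,000; outside_value is used
# for steps beyond the last endpoint
lr_schedule = Piecewise([(0, 1.0), (10_000, 0.1)], outside_value=0.1)

print(lr_schedule(5_000))   # 0.55 under linear interpolation
print(lr_schedule(20_000))  # 0.1 (outside_value)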

Metrics

class labml_helpers.metrics.StateModule[source]
class labml_helpers.metrics.Metric[source]
class labml_helpers.metrics.accuracy.Accuracy(ignore_index: int = -1)[source]
class labml_helpers.metrics.accuracy.BinaryAccuracy(ignore_index: int = -1)[source]
class labml_helpers.metrics.accuracy.AccuracyDirect(ignore_index: int = -1)[source]
class labml_helpers.metrics.collector.Collector(name: str)[source]
class labml_helpers.metrics.recall_precision.RecallPrecision[source]
class labml_helpers.metrics.simple_state.SimpleStateModule[source]
class labml_helpers.metrics.simple_state.SimpleState[source]

Utilities

class labml_helpers.module.Module[source]

Wraps torch.nn.Module to overload __call__ instead of forward for better type checking.

See the related PyTorch GitHub issue for clarification.
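
A minimal sketch: subclasses implement __call__ (rather than forward), so static type checkers see the argument and return types when the module is called. The model itself is illustrative.

import torch
import torch.nn as nn

from labml_helpers.module import Module


class MyModel(Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = MyModel()
y = model(torch.randn(3, 4))  # typed as torch.Tensor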