Helpers¶
Installation¶
pip install labml_helpers
Configurable Modules¶
- class labml_helpers.device.DeviceConfigs[source]¶
This is a configurable module to get a single device to train the model on. It picks a CUDA device if one is available and falls back to the CPU otherwise.
It has other small advantages, such as showing the actual device name in the configurations view of the labml app.
- Parameters
  - cuda_device (int) – The CUDA device number. Defaults to 0.
  - use_cuda (bool) – Whether to use CUDA devices. Defaults to True.
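For example, a minimal sketch of picking the training device through this module; the Configs class and MyModel are illustrative, not part of the library:

    import torch
    from labml.configs import BaseConfigs
    from labml_helpers.device import DeviceConfigs

    class Configs(BaseConfigs):
        # Computed by DeviceConfigs from use_cuda and cuda_device
        device: torch.device = DeviceConfigs()

    conf = Configs()
    model = MyModel().to(conf.device)  # MyModel is a hypothetical model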
- class labml_helpers.seed.SeedConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module for setting the seeds. It sets the seeds with torch.manual_seed and np.random.seed. You need to call the set method to set the seeds (example).
- Parameters
  - seed (int) – Seed integer. Defaults to 5.
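A minimal sketch of setting the seeds; overriding the attribute directly before calling set is an assumption about the configs API:

    from labml_helpers.seed import SeedConfigs

    seed_conf = SeedConfigs()
    seed_conf.seed = 42  # override the default of 5 (assumed to work before computation)
    seed_conf.set()      # seeds torch.manual_seed and np.random.seed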
- class labml_helpers.optimizer.OptimizerConfigs[source]¶
This creates a configurable optimizer.
- Parameters
  - learning_rate (float) – Learning rate of the optimizer. Defaults to 0.01.
  - momentum (float) – Momentum of the optimizer. Defaults to 0.5.
  - parameters – Model parameters to optimize.
  - d_model (int) – Embedding size of the model (for the Noam optimizer).
  - betas (Tuple[float, float]) – Betas for the Adam optimizer. Defaults to (0.9, 0.999).
  - eps (float) – Epsilon for Adam/RMSProp optimizers. Defaults to 1e-8.
  - step_factor (int) – Step factor for the Noam optimizer. Defaults to 1024.
There is also a better implementation, with more options, in labml_nn; we recommend using that.
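A sketch of the pattern used in labml_nn experiments: return an OptimizerConfigs from an option function and let the config system resolve it to an actual optimizer. Selecting the optimizer type by name (opt_conf.optimizer = 'Adam') is an assumption here:

    from labml.configs import option
    from labml_helpers.optimizer import OptimizerConfigs
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    class Configs(SimpleTrainValidConfigs):
        pass

    @option(Configs.optimizer)
    def _optimizer(c: Configs):
        opt_conf = OptimizerConfigs()
        opt_conf.parameters = c.model.parameters()
        opt_conf.optimizer = 'Adam'      # assumed: select the optimizer by name
        opt_conf.learning_rate = 2.5e-4
        return opt_conf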
- class labml_helpers.training_loop.TrainingLoopConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable training loop. You can extend this class for your configurations if they involve a training loop.
>>> for step in conf.training_loop:
>>>     ...
- Parameters
  - loop_count (int) – Total number of steps. Defaults to 10.
  - loop_step (int) – Number of steps to increment per iteration. Defaults to 1.
  - is_save_models (bool) – Whether to call labml.experiment.save_checkpoint() on each iteration. Defaults to False.
  - save_models_interval (int) – The interval (in steps) to save models. Defaults to 1.
  - log_new_line_interval (int) – The interval (in steps) to print a new line to the screen. Defaults to 1.
  - log_write_interval (int) – The interval (in steps) to call labml.tracker.save(). Defaults to 1.
  - is_loop_on_interrupt (bool) – Whether to handle keyboard interrupts and wait until an iteration is complete. Defaults to False.
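A minimal sketch of using the loop; train_step is a hypothetical function standing in for your training code:

    from labml import tracker
    from labml_helpers.training_loop import TrainingLoopConfigs

    class Configs(TrainingLoopConfigs):
        loop_count = 1_000        # total number of steps
        log_write_interval = 10   # call tracker.save() every 10 steps

    conf = Configs()
    for step in conf.training_loop:
        loss = train_step()        # hypothetical training step
        tracker.add('loss', loss)  # written out at the loop's intervals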
- class labml_helpers.train_valid.TrainValidConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module that you can extend for experiments that involve training and validation datasets (i.e. most DL experiments). It is based on labml_helpers.training_loop.TrainingLoopConfigs.
- Parameters
  - epochs (int) – Number of epochs to train on. Defaults to 10.
  - train_loader (torch.utils.data.DataLoader) – Training data loader.
  - valid_loader (torch.utils.data.DataLoader) – Validation data loader.
  - inner_iterations (int) – Number of times to switch between training and validation within an epoch. Defaults to 1.
You can override the init and step functions. There is also a sample function that you can override to generate samples every time the loop switches between training and validation; see the sketch below.
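A sketch of the override points; the step signature with BatchIndex follows the pattern used in labml_nn experiments and is an assumption here:

    import torch.nn as nn
    from labml_helpers.train_valid import TrainValidConfigs, BatchIndex

    class Configs(TrainValidConfigs):
        model: nn.Module

        def init(self):
            # one-time setup before the loop starts
            ...

        def step(self, batch: any, batch_idx: BatchIndex):
            # a single training or validation step on `batch`
            ...

        def sample(self):
            # generate samples when switching between training and validation
            ...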
- class labml_helpers.train_valid.SimpleTrainValidConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module that works for many standard DL experiments. It is based on labml_helpers.train_valid.TrainValidConfigs.
- Parameters
  - model – A PyTorch model.
  - optimizer – A PyTorch optimizer to update the model.
  - device – The device to train the model on. Defaults to a configurable device, labml_helpers.device.DeviceConfigs.
  - loss_function – A function to calculate the loss. It should accept model_output, target as arguments.
  - update_batches (int) – Number of batches to accumulate before taking an optimizer step. Defaults to 1.
  - log_params_updates (int) – How often (in batches) to track model parameters and gradients. Defaults to a large number; i.e., it logs every epoch.
  - log_activations_batches (int) – How often (in batches) to log model activations. Defaults to a large number; i.e., it logs every epoch.
  - log_save_batches (int) – How often (in batches) to call labml.tracker.save().
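A minimal sketch of running an experiment with this class; MyModel is hypothetical, and wiring the model and optimizer directly onto the configs object is an assumption about the API:

    import torch
    import torch.nn as nn
    from labml import experiment
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    class Configs(SimpleTrainValidConfigs):
        loss_function = nn.CrossEntropyLoss()  # accepts (model_output, target)

    conf = Configs()
    conf.model = MyModel()  # hypothetical PyTorch model
    conf.optimizer = torch.optim.Adam(conf.model.parameters())

    experiment.create(name='simple_example')
    experiment.configs(conf, {'epochs': 5})
    with experiment.start():
        conf.run()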
Datasets¶
- class labml_helpers.datasets.mnist.MNISTConfigs(*, _primary: Optional[str] = None)[source]¶
Configurable MNIST data set.
- Parameters
  - dataset_name (str) – name of the data set, MNIST
  - dataset_transforms (torchvision.transforms.Compose) – image transformations
  - train_dataset (torchvision.datasets.MNIST) – training dataset
  - valid_dataset (torchvision.datasets.MNIST) – validation dataset
  - train_loader (torch.utils.data.DataLoader) – training data loader
  - valid_loader (torch.utils.data.DataLoader) – validation data loader
  - train_batch_size (int) – training batch size
  - valid_batch_size (int) – validation batch size
  - train_loader_shuffle (bool) – whether to shuffle training data
  - valid_loader_shuffle (bool) – whether to shuffle validation data
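For example, the dataset configs can be combined with a training configs class so that train_loader and valid_loader are supplied automatically; the batch sizes here are illustrative:

    from labml_helpers.datasets.mnist import MNISTConfigs
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    # MNISTConfigs provides train_loader / valid_loader to the training loop
    class Configs(MNISTConfigs, SimpleTrainValidConfigs):
        train_batch_size = 64
        valid_batch_size = 1024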
- class labml_helpers.datasets.cifar10.CIFAR10Configs(*, _primary: Optional[str] = None)[source]¶
Configurable CIFAR 10 data set.
- Parameters
  - dataset_name (str) – name of the data set, CIFAR10
  - dataset_transforms (torchvision.transforms.Compose) – image transformations
  - train_dataset (torchvision.datasets.CIFAR10) – training dataset
  - valid_dataset (torchvision.datasets.CIFAR10) – validation dataset
  - train_loader (torch.utils.data.DataLoader) – training data loader
  - valid_loader (torch.utils.data.DataLoader) – validation data loader
  - train_batch_size (int) – training batch size
  - valid_batch_size (int) – validation batch size
  - train_loader_shuffle (bool) – whether to shuffle training data
  - valid_loader_shuffle (bool) – whether to shuffle validation data
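The transformations can be swapped by registering an option, as done in labml_nn's CIFAR-10 experiments; the normalization statistics below are the standard CIFAR-10 channel means and standard deviations:

    from torchvision import transforms
    from labml.configs import option
    from labml_helpers.datasets.cifar10 import CIFAR10Configs

    @option(CIFAR10Configs.dataset_transforms)
    def cifar10_transforms():
        # ToTensor followed by per-channel normalization
        return transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.4914, 0.4822, 0.4465),
                                 (0.2470, 0.2435, 0.2616)),
        ])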
- class labml_helpers.datasets.csv.CsvDataset(file_path: str, y_cols: ~typing.List, x_cols: ~typing.List, train: bool = True, transform: ~typing.Callable = <function CsvDataset.<lambda>>, test_fraction: float = 0.0, nrows: ~typing.Optional[int] = None)[source]¶
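A sketch based on the constructor signature above; the file and column names are illustrative:

    from labml_helpers.datasets.csv import CsvDataset

    # One CSV file split into train/test parts by test_fraction
    train_ds = CsvDataset('data.csv', y_cols=['label'], x_cols=['x1', 'x2'],
                          train=True, test_fraction=0.2)
    test_ds = CsvDataset('data.csv', y_cols=['label'], x_cols=['x1', 'x2'],
                         train=False, test_fraction=0.2)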
Remote¶
- class labml_helpers.datasets.remote.DatasetServer[source]¶
Remote dataset server
- class labml_helpers.datasets.remote.RemoteDataset(name: str, host: str = '0.0.0.0', port: int = 8000)[source]¶
Remote dataset
- Parameters
  - name (str) – name of the data set, as specified in labml_helpers.datasets.remote.DatasetServer
  - host (str) – hostname of the server
  - port (int) – port of the server
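A sketch based on the constructor signature; the dataset name, host, and port are illustrative, and the name is assumed to match one registered on the DatasetServer:

    from labml_helpers.datasets.remote import RemoteDataset

    dataset = RemoteDataset('mnist_train', host='192.168.0.4', port=8000)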
Text Datasets¶
- class labml_helpers.datasets.text.TextDataset(path: PurePath, tokenizer: Callable, train: str, valid: str, test: str, *, n_tokens: Optional[int] = None, stoi: Optional[Dict[str, int]] = None, itos: Optional[List[str]] = None)[source]¶
- class labml_helpers.datasets.text.TextFileDataset(path: PurePath, tokenizer: Callable, *, url: Optional[str] = None, filter_subset: Optional[int] = None)[source]¶
- class labml_helpers.datasets.text.SequentialDataLoader(*, text: str, dataset: TextDataset, batch_size: int, seq_len: int)[source]¶
- class labml_helpers.datasets.text.SequentialUnBatchedDataset(*, text: str, dataset: TextDataset, seq_len: int, is_random_offset: bool = True)[source]¶
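A sketch of loading a character-level dataset and batching it sequentially, following the usage in labml_nn; the local path is illustrative, and tokenizer=list splits text into characters:

    from pathlib import PurePath
    from labml_helpers.datasets.text import TextFileDataset, SequentialDataLoader

    dataset = TextFileDataset(
        PurePath('data/tiny_shakespeare.txt'),
        tokenizer=list,  # character-level tokenizer
        url='https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt')

    # Contiguous sequences from the training split
    train_loader = SequentialDataLoader(text=dataset.train, dataset=dataset,
                                        batch_size=32, seq_len=128)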