Helpers¶
Installation¶
pip install labml_helpers
Configurable Modules¶
- class labml_helpers.device.DeviceConfigs[source]¶
This is a configurable module to get a single device to train the model on. It picks a CUDA device if one is available and falls back to the CPU otherwise.
It has other small advantages, such as showing the actual device name in the configurations view of the labml app.
- Parameters
  - cuda_device (int) – The CUDA device number. Defaults to 0.
  - use_cuda (bool) – Whether to use CUDA devices. Defaults to True.
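For example, a minimal sketch of picking the training device through this module; the Configs class and MyModel are illustrative, not part of the library:

    import torch
    from labml.configs import BaseConfigs
    from labml_helpers.device import DeviceConfigs

    class Configs(BaseConfigs):
        # Computed by DeviceConfigs from use_cuda and cuda_device
        device: torch.device = DeviceConfigs()

    conf = Configs()
    model = MyModel().to(conf.device)  # MyModel is a hypothetical model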
- class labml_helpers.seed.SeedConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module for setting the seeds. It sets the seeds with torch.manual_seed and np.random.seed. You need to call the set method to set the seeds (example).
- Parameters
  - seed (int) – Seed integer. Defaults to 5.
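A minimal sketch of setting the seeds; overriding the attribute directly before calling set is an assumption about the configs API:

    from labml_helpers.seed import SeedConfigs

    seed_conf = SeedConfigs()
    seed_conf.seed = 42  # override the default of 5 (assumed to work before computation)
    seed_conf.set()      # seeds torch.manual_seed and np.random.seed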
- class labml_helpers.optimizer.OptimizerConfigs[source]¶
This creates a configurable optimizer.
- Parameters
  - learning_rate (float) – Learning rate of the optimizer. Defaults to 0.01.
  - momentum (float) – Momentum of the optimizer. Defaults to 0.5.
  - parameters – Model parameters to optimize.
  - d_model (int) – Embedding size of the model (for the Noam optimizer).
  - betas (Tuple[float, float]) – Betas for the Adam optimizer. Defaults to (0.9, 0.999).
  - eps (float) – Epsilon for Adam/RMSProp optimizers. Defaults to 1e-8.
  - step_factor (int) – Step factor for the Noam optimizer. Defaults to 1024.
There is also a better implementation, with more options, in labml_nn; we recommend using that.
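A sketch of the pattern used in labml_nn experiments: return an OptimizerConfigs from an option function and let the config system resolve it to an actual optimizer. Selecting the optimizer type by name (opt_conf.optimizer = 'Adam') is an assumption here:

    from labml.configs import option
    from labml_helpers.optimizer import OptimizerConfigs
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    class Configs(SimpleTrainValidConfigs):
        pass

    @option(Configs.optimizer)
    def _optimizer(c: Configs):
        opt_conf = OptimizerConfigs()
        opt_conf.parameters = c.model.parameters()
        opt_conf.optimizer = 'Adam'      # assumed: select the optimizer by name
        opt_conf.learning_rate = 2.5e-4
        return opt_conf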
- class labml_helpers.training_loop.TrainingLoopConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable training loop. You can extend this class for your configurations if they involve a training loop.
>>> for step in conf.training_loop:
>>>     ...
- Parameters
  - loop_count (int) – Total number of steps. Defaults to 10.
  - loop_step (int) – Number of steps to increment per iteration. Defaults to 1.
  - is_save_models (bool) – Whether to call labml.experiment.save_checkpoint() on each iteration. Defaults to False.
  - save_models_interval (int) – The interval (in steps) to save models. Defaults to 1.
  - log_new_line_interval (int) – The interval (in steps) to print a new line to the screen. Defaults to 1.
  - log_write_interval (int) – The interval (in steps) to call labml.tracker.save(). Defaults to 1.
  - is_loop_on_interrupt (bool) – Whether to handle keyboard interrupts and wait until an iteration is complete. Defaults to False.
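A minimal sketch of using the loop; train_step is a hypothetical function standing in for your training code:

    from labml import tracker
    from labml_helpers.training_loop import TrainingLoopConfigs

    class Configs(TrainingLoopConfigs):
        loop_count = 1_000        # total number of steps
        log_write_interval = 10   # call tracker.save() every 10 steps

    conf = Configs()
    for step in conf.training_loop:
        loss = train_step()        # hypothetical training step
        tracker.add('loss', loss)  # written out at the loop's intervals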
- class labml_helpers.train_valid.TrainValidConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module that you can extend for experiments that involve training and validation datasets (i.e. most DL experiments). It is based on labml_helpers.training_loop.TrainingLoopConfigs.
- Parameters
  - epochs (int) – Number of epochs to train on. Defaults to 10.
  - train_loader (torch.utils.data.DataLoader) – Training data loader.
  - valid_loader (torch.utils.data.DataLoader) – Validation data loader.
  - inner_iterations (int) – Number of times to switch between training and validation within an epoch. Defaults to 1.
You can override the init and step functions. There is also a sample function that you can override to generate samples every time the loop switches between training and validation; see the sketch below.
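A sketch of the override points; the step signature with BatchIndex follows the pattern used in labml_nn experiments and is an assumption here:

    import torch.nn as nn
    from labml_helpers.train_valid import TrainValidConfigs, BatchIndex

    class Configs(TrainValidConfigs):
        model: nn.Module

        def init(self):
            # one-time setup before the loop starts
            ...

        def step(self, batch: any, batch_idx: BatchIndex):
            # a single training or validation step on `batch`
            ...

        def sample(self):
            # generate samples when switching between training and validation
            ...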
- class labml_helpers.train_valid.SimpleTrainValidConfigs(*, _primary: Optional[str] = None)[source]¶
This is a configurable module that works for many standard DL experiments. It is based on labml_helpers.train_valid.TrainValidConfigs.
- Parameters
  - model – A PyTorch model.
  - optimizer – A PyTorch optimizer to update the model.
  - device – The device to train the model on. Defaults to a configurable device, labml_helpers.device.DeviceConfigs.
  - loss_function – A function to calculate the loss. It should accept model_output, target as arguments.
  - update_batches (int) – Number of batches to accumulate before taking an optimizer step. Defaults to 1.
  - log_params_updates (int) – How often (in batches) to track model parameters and gradients. Defaults to a large number; i.e., it logs every epoch.
  - log_activations_batches (int) – How often (in batches) to log model activations. Defaults to a large number; i.e., it logs every epoch.
  - log_save_batches (int) – How often (in batches) to call labml.tracker.save().
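A minimal sketch of running an experiment with this class; MyModel is hypothetical, and wiring the model and optimizer directly onto the configs object is an assumption about the API:

    import torch
    import torch.nn as nn
    from labml import experiment
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    class Configs(SimpleTrainValidConfigs):
        loss_function = nn.CrossEntropyLoss()  # accepts (model_output, target)

    conf = Configs()
    conf.model = MyModel()  # hypothetical PyTorch model
    conf.optimizer = torch.optim.Adam(conf.model.parameters())

    experiment.create(name='simple_example')
    experiment.configs(conf, {'epochs': 5})
    with experiment.start():
        conf.run()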
Datasets¶
- class labml_helpers.datasets.mnist.MNISTConfigs(*, _primary: Optional[str] = None)[source]¶
Configurable MNIST data set.
- Parameters
  - dataset_name (str) – name of the data set, MNIST
  - dataset_transforms (torchvision.transforms.Compose) – image transformations
  - train_dataset (torchvision.datasets.MNIST) – training dataset
  - valid_dataset (torchvision.datasets.MNIST) – validation dataset
  - train_loader (torch.utils.data.DataLoader) – training data loader
  - valid_loader (torch.utils.data.DataLoader) – validation data loader
  - train_batch_size (int) – training batch size
  - valid_batch_size (int) – validation batch size
  - train_loader_shuffle (bool) – whether to shuffle training data
  - valid_loader_shuffle (bool) – whether to shuffle validation data
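For example, the dataset configs can be combined with a training configs class so that train_loader and valid_loader are supplied automatically; the batch sizes here are illustrative:

    from labml_helpers.datasets.mnist import MNISTConfigs
    from labml_helpers.train_valid import SimpleTrainValidConfigs

    # MNISTConfigs provides train_loader / valid_loader to the training loop
    class Configs(MNISTConfigs, SimpleTrainValidConfigs):
        train_batch_size = 64
        valid_batch_size = 1024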
- class labml_helpers.datasets.cifar10.CIFAR10Configs(*, _primary: Optional[str] = None)[source]¶
Configurable CIFAR 10 data set.
- Parameters
  - dataset_name (str) – name of the data set, CIFAR10
  - dataset_transforms (torchvision.transforms.Compose) – image transformations
  - train_dataset (torchvision.datasets.CIFAR10) – training dataset
  - valid_dataset (torchvision.datasets.CIFAR10) – validation dataset
  - train_loader (torch.utils.data.DataLoader) – training data loader
  - valid_loader (torch.utils.data.DataLoader) – validation data loader
  - train_batch_size (int) – training batch size
  - valid_batch_size (int) – validation batch size
  - train_loader_shuffle (bool) – whether to shuffle training data
  - valid_loader_shuffle (bool) – whether to shuffle validation data
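The transformations can be swapped by registering an option, as done in labml_nn's CIFAR-10 experiments; the normalization statistics below are the standard CIFAR-10 channel means and standard deviations:

    from torchvision import transforms
    from labml.configs import option
    from labml_helpers.datasets.cifar10 import CIFAR10Configs

    @option(CIFAR10Configs.dataset_transforms)
    def cifar10_transforms():
        # ToTensor followed by per-channel normalization
        return transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.4914, 0.4822, 0.4465),
                                 (0.2470, 0.2435, 0.2616)),
        ])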
- class labml_helpers.datasets.csv.CsvDataset(file_path: str, y_cols: ~typing.List, x_cols: ~typing.List, train: bool = True, transform: ~typing.Callable = <function CsvDataset.<lambda>>, test_fraction: float = 0.0, nrows: ~typing.Optional[int] = None)[source]¶
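A sketch based on the constructor signature above; the file and column names are illustrative:

    from labml_helpers.datasets.csv import CsvDataset

    # One CSV file split into train/test parts by test_fraction
    train_ds = CsvDataset('data.csv', y_cols=['label'], x_cols=['x1', 'x2'],
                          train=True, test_fraction=0.2)
    test_ds = CsvDataset('data.csv', y_cols=['label'], x_cols=['x1', 'x2'],
                         train=False, test_fraction=0.2)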
Remote¶
- class labml_helpers.datasets.remote.DatasetServer[source]¶
Remote dataset server
- class labml_helpers.datasets.remote.RemoteDataset(name: str, host: str = '0.0.0.0', port: int = 8000)[source]¶
Remote dataset
- Parameters
  - name (str) – name of the data set, as specified in labml_helpers.datasets.remote.DatasetServer
  - host (str) – hostname of the server
  - port (int) – port of the server
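A sketch based on the constructor signature; the dataset name, host, and port are illustrative, and the name is assumed to match one registered on the DatasetServer:

    from labml_helpers.datasets.remote import RemoteDataset

    dataset = RemoteDataset('mnist_train', host='192.168.0.4', port=8000)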
Text Datasets¶
- class labml_helpers.datasets.text.TextDataset(path: PurePath, tokenizer: Callable, train: str, valid: str, test: str, *, n_tokens: Optional[int] = None, stoi: Optional[Dict[str, int]] = None, itos: Optional[List[str]] = None)[source]¶
- class labml_helpers.datasets.text.TextFileDataset(path: PurePath, tokenizer: Callable, *, url: Optional[str] = None, filter_subset: Optional[int] = None)[source]¶
- class labml_helpers.datasets.text.SequentialDataLoader(*, text: str, dataset: TextDataset, batch_size: int, seq_len: int)[source]¶
- class labml_helpers.datasets.text.SequentialUnBatchedDataset(*, text: str, dataset: TextDataset, seq_len: int, is_random_offset: bool = True)[source]¶
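A sketch of loading a character-level dataset and batching it sequentially, following the usage in labml_nn; the local path is illustrative, and tokenizer=list splits text into characters:

    from pathlib import PurePath
    from labml_helpers.datasets.text import TextFileDataset, SequentialDataLoader

    dataset = TextFileDataset(
        PurePath('data/tiny_shakespeare.txt'),
        tokenizer=list,  # character-level tokenizer
        url='https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt')

    # Contiguous sequences from the training split
    train_loader = SequentialDataLoader(text=dataset.train, dataset=dataset,
                                        batch_size=32, seq_len=128)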