Data Processor

class neural_pipeline.data_processor.data_processor.DataProcessor(model: torch.nn.modules.module.Module, device: torch.device = None)[source]

DataProcessor manages the model, data processing, and device selection.

Args:
model (Module): model that will be used to process data
device (torch.device): device to which data is passed for processing
load() → None[source]

Load model weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get current module

predict(data: torch.Tensor) → object[source]

Make a prediction from data

Parameters: data – data as a torch.Tensor or a dict with the key 'data'
Returns: processed output
Return type: the model output type
save_state() → None[source]

Save the optimizer state and the number of completed epochs

set_pick_model_input(pick_model_input: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set the callback that receives the output of the DataLoader and returns the model input.

Default mode:


lambda data: data['data']

Args:
pick_model_input (callable): callable that picks the model input. It must accept one parameter: the dataset output
Returns:
self object

Examples:

data_processor.set_pick_model_input(lambda data: data['data'])
data_processor.set_pick_model_input(lambda data: data[0])
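
As a minimal sketch of what such a callback does, the two callables from the examples above can be applied to a dict-style and a tuple-style dataset output. The sample values are hypothetical (real outputs would typically hold torch.Tensor values), and no neural_pipeline call is made here:

```python
# Hypothetical dataset outputs, used only for illustration.
dict_sample = {'data': [0.1, 0.2, 0.3], 'target': 1}
tuple_sample = ([0.1, 0.2, 0.3], 1)

# The same callables as in the examples above.
pick_from_dict = lambda data: data['data']
pick_from_tuple = lambda data: data[0]

# Each callable receives one dataset output and returns only the model input.
dict_input = pick_from_dict(dict_sample)
tuple_input = pick_from_tuple(tuple_sample)
print(dict_input == tuple_input)
```

The callback only has to match the layout that the dataset actually yields; the default suits datasets that return a dict with a 'data' key.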
class neural_pipeline.data_processor.data_processor.TrainDataProcessor(train_config: TrainConfig, device: torch.device = None)[source]

TrainDataProcessor does everything DataProcessor does and additionally runs the training process.

Parameters: train_config – train config
exception TDPException(msg)[source]
get_lr() → float[source]

Get learning rate from optimizer

get_state() → {}[source]

Get model and optimizer state dicts

Returns: dict with keys [weights, optimizer]
load() → None[source]

Load state of model, optimizer and TrainDataProcessor from checkpoint

predict(data, is_train=False) → torch.Tensor[source]

Make a prediction from data. If is_train is True, this operation computes gradients. If is_train is False, it runs with model.eval() and torch.no_grad().

Parameters:
  • data – data as a dict
  • is_train – whether the data processor should train on the data or only predict
Returns:

processed output

Return type:

model return type

process_batch(batch: {}, is_train: bool, metrics_processor: AbstractMetricsProcessor = None) → numpy.ndarray[source]

Process one batch of data

Parameters:
  • batch – dict containing the 'data' and 'target' keys. The value for each key must be an instance of torch.Tensor or dict
  • is_train – whether the batch is processed for training
  • metrics_processor – metrics processor used to collect metrics after the batch is processed
Returns:

array of losses with shape (N, …), where N is the batch size

save_state() → None[source]

Save the optimizer state and the number of completed epochs

set_data_preprocess(data_preprocess: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set the callback that receives the output of the DataLoader and returns preprocessed data. For example, it may be used to pass data to a device.

Default mode:


_pass_data_to_device()

Args:
data_preprocess (callable): callable that preprocesses the data. It must accept one parameter: the dataset output
Returns:
self object

Examples:

from neural_pipeline.utils import dict_recursive_bypass
data_processor.set_data_preprocess(lambda data: dict_recursive_bypass(data, lambda v: v.cuda()))
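
The example above relies on dict_recursive_bypass visiting every value of a possibly nested dict. A minimal sketch of that idea follows, assuming the helper applies the callable to each non-dict leaf and returns a dict of the same shape. recursive_apply is a hypothetical stand-in written for illustration, not the library's implementation, and plain numbers replace tensors so the sketch runs without torch:

```python
def recursive_apply(data, fn):
    """Hypothetical stand-in for dict_recursive_bypass (assumed behaviour)."""
    if isinstance(data, dict):
        # Recurse into nested dicts, keeping the same keys.
        return {key: recursive_apply(value, fn) for key, value in data.items()}
    # Leaf value: apply the callable (with tensors this could be v.cuda()).
    return fn(data)


# Hypothetical nested dataset output.
nested = {'data': {'image': 1, 'mask': 2}, 'target': 3}
moved = recursive_apply(nested, lambda v: v * 10)
print(moved)
```

A callable of this shape could then be passed to set_data_preprocess, since it takes the dataset output as its single parameter.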
set_pick_target(pick_target: callable) → neural_pipeline.data_processor.data_processor.DataProcessor[source]

Set the callback that receives the output of the DataLoader and returns the target.

Default mode:


lambda data: data['target']

Args:
pick_target (callable): callable that picks the target. It must accept one parameter: the dataset output
Returns:
self object

Examples:

data_processor.set_pick_target(lambda data: data['target'])
data_processor.set_pick_target(lambda data: data[1])
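
Because the setters are documented as returning the self object, calls can presumably be chained. The sketch below illustrates that pattern with SketchProcessor, a hypothetical stand-in class that only mimics the documented setter behaviour and defaults. It is not TrainDataProcessor, and its split method is an invented helper for the demonstration:

```python
class SketchProcessor:
    """Hypothetical stand-in that mimics only the documented setters."""

    def __init__(self):
        # Defaults as documented: pick the 'data' and 'target' keys.
        self._pick_model_input = lambda data: data['data']
        self._pick_target = lambda data: data['target']

    def set_pick_model_input(self, pick_model_input):
        self._pick_model_input = pick_model_input
        return self  # returning self is what makes chaining possible

    def set_pick_target(self, pick_target):
        self._pick_target = pick_target
        return self

    def split(self, batch):
        # Invented helper: apply both callbacks to one dataset output.
        return self._pick_model_input(batch), self._pick_target(batch)


# Configure a processor for a dataset that yields (input, target) tuples.
processor = (SketchProcessor()
             .set_pick_model_input(lambda data: data[0])
             .set_pick_target(lambda data: data[1]))
model_input, target = processor.split(([1.0, 2.0], 0))
print(model_input, target)
```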
update_lr(lr: float) → None[source]

Update the learning rate directly in the optimizer

Parameters: lr – target learning rate
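
PyTorch optimizers expose their settings through param_groups, a list of dicts that each hold an 'lr' entry. How TrainDataProcessor reads and writes the rate internally is not stated here, so the sketch below is an assumption built on that PyTorch convention. The two functions are illustrative stand-ins for get_lr and update_lr, and fake_optimizer is a hypothetical object with the same layout so the code runs without torch:

```python
from types import SimpleNamespace


def get_lr(optimizer):
    # Read the rate from the first parameter group (assumes at least one group).
    return optimizer.param_groups[0]['lr']


def update_lr(optimizer, lr):
    # Write the new rate into every parameter group.
    for group in optimizer.param_groups:
        group['lr'] = lr


# Hypothetical object that mirrors the param_groups layout of a torch optimizer.
fake_optimizer = SimpleNamespace(param_groups=[{'lr': 0.1}, {'lr': 0.1}])
update_lr(fake_optimizer, 0.01)
current_lr = get_lr(fake_optimizer)
print(current_lr)
```

Writing to every group keeps all parameters on the same rate; a model with per-group rates would need a different policy.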

Model

class neural_pipeline.data_processor.model.Model(base_model: torch.nn.modules.module.Module)[source]

Wrapper for torch.nn.Module. This class provides initialization, calling, and serialization for it.

Parameters: base_model – torch.nn.Module object
exception ModelException(msg)[source]
load_weights(weights_file: str = None) → None[source]

Load weights from checkpoint

model() → torch.nn.modules.module.Module[source]

Get internal torch.nn.Module object

Returns: internal torch.nn.Module object
save_weights(weights_file: str = None) → None[source]

Serialize weights to file

set_checkpoints_manager(manager: neural_pipeline.utils.fsm.CheckpointsManager) → neural_pipeline.data_processor.model.Model[source]

Set the checkpoints manager that will be used to identify the path for reading and writing the weights file

Parameters: manager – CheckpointsManager instance
Returns: self object
to_device(device: torch.device) → neural_pipeline.data_processor.model.Model[source]

Pass the model to the specified device