scimba_torch.optimizers.optimizers_data¶
A module to handle several optimizers.
Examples: Optimizers usage
```python
import copy
import math

import torch

from scimba_torch.optimizers.optimizers_data import OptimizerData
from scimba_torch.optimizers.scimba_optimizers import ScimbaMomentum

opt_1 = {
    "name": "adam",
    "optimizer_args": {"lr": 1e-3, "betas": (0.9, 0.999)},
}
opt_2 = {"class": ScimbaMomentum, "switch_at_epoch": 500}
opt_3 = {
    "name": "lbfgs",
    "switch_at_epoch_ratio": 0.7,
    "switch_at_plateau": [500, 20],
    "switch_at_plateau_ratio": 3.0,
}
optimizers = OptimizerData(opt_1, opt_2, opt_3)
print("optimizers: ", optimizers)


class SimpleNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(10, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc1(x)


net = SimpleNN()

opt_1 = {"name": "adam", "optimizer_args": {"lr": 1e-3, "betas": (0.9, 0.999)}}
opt_2 = {
    "name": "adam",
    "optimizer_args": {"lrTEST": 1e-3, "betasTEST": (0.9, 0.999)},
}  # wrong list of arguments
try:
    optimizers2 = OptimizerData(opt_1, opt_2)
    optimizers2.activate_first_optimizer(list(net.parameters()))
except TypeError as error:
    print(error)

input_tensor = torch.randn(10000, 10)  # 10000 samples, each of size 10
target_tensor = torch.sum(input_tensor, dim=1)[
    :, None
]  # target tensor with batch size 10000

loss = [torch.tensor(float("+inf"))]
loss_func = torch.nn.MSELoss()

opt = optimizers
opt.activate_first_optimizer(list(net.parameters()))


def closure():
    opt.zero_grad()
    output_tensor = net(input_tensor)
    loss[0] = loss_func(output_tensor, target_tensor)
    loss[0].backward(retain_graph=True)
    return loss[0].item()


init_loss = closure()

grads = opt.get_opt_gradients()
print("get_opt_gradients: ", grads)

loss_history = [init_loss]
best_loss = init_loss
best_net = copy.deepcopy(net.state_dict())

epochs = 1000
for epoch in range(epochs):
    opt.step(closure)

    if math.isinf(loss[0].item()) or math.isnan(loss[0].item()):
        loss[0] = torch.tensor(best_loss)
        net.load_state_dict(best_net)

    if loss[0].item() < best_loss:
        best_loss = loss[0].item()
        best_net = copy.deepcopy(net.state_dict())
        opt.update_best_optimizer()

    loss_history.append(loss[0].item())

    if epoch % 100 == 0:
        print("epoch: ", epoch, "loss: ", loss[0].item())

    if opt.test_activate_next_optimizer(
        loss_history, loss[0].item(), init_loss, epoch, epochs
    ):
        print("activate next opt! epoch = ", epoch)
        opt.activate_next_optimizer(list(net.parameters()))

    opt.test_and_activate_next_optimizer(
        list(net.parameters()),
        loss_history,
        loss[0].item(),
        init_loss,
        epoch,
        epochs,
    )

net.load_state_dict(best_net)
closure()
print("loss after training: ", loss[0].item())
print("net( torch.ones( 10 ) ) : ", net(torch.ones(10)))

grads = opt.get_opt_gradients()
print("get_opt_gradients: ", grads)
```
Classes
OptimizerData: A class to manage multiple optimizers and their activation criteria.
- class OptimizerData(*args, **kwargs)[source]¶
Bases: object

A class to manage multiple optimizers and their activation criteria.
- Parameters:
*args (dict[str, Any]) – Variable-length argument list of optimizer configurations.
Each input dictionary must have one of the forms:
{ "class": value (a subclass of AbstractScimbaOptimizer), keys: values }
{ "name": value (either "adam" or "lbfgs"), keys: values }
where the key/value pairs can be:
"optimizer_args": a dictionary of arguments for the optimizer,
"scheduler": a subclass of torch.optim.lr_scheduler.LRScheduler,
"scheduler_args": a dictionary of arguments for the scheduler,
"switch_at_epoch": a bool or an int, default False; if True, the default value 5000 is used,
"switch_at_epoch_ratio": a bool or a float, default 0.7; if True, the default value is used,
"switch_at_plateau": a bool or a tuple of two ints, default False; if True, the default (50, 10) is used,
"switch_at_plateau_ratio": a float r, default 500.0; the plateau tests are triggered only once current_loss < init_loss / r.
**kwargs – Arbitrary keyword arguments.
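One plausible reading of the switching criteria above can be sketched in plain Python. This is a simplified, hypothetical illustration (the function name and the exact plateau test are assumptions, not the library's actual implementation):

```python
from typing import List, Optional, Tuple


def should_switch(
    loss_history: List[float],
    loss_value: float,
    init_loss: float,
    epoch: int,
    epochs: int,
    switch_at_epoch: Optional[int] = None,
    switch_at_epoch_ratio: Optional[float] = None,
    switch_at_plateau: Optional[Tuple[int, int]] = None,
    switch_at_plateau_ratio: float = 500.0,
) -> bool:
    """Simplified sketch of the per-epoch switching tests (hypothetical)."""
    # Switch once a fixed epoch is reached.
    if switch_at_epoch is not None and epoch >= switch_at_epoch:
        return True
    # Switch once a fraction of the total number of epochs has elapsed.
    if switch_at_epoch_ratio is not None and epoch >= switch_at_epoch_ratio * epochs:
        return True
    # Plateau test: only armed once the loss has dropped below init_loss / ratio.
    if switch_at_plateau is not None and loss_value < init_loss / switch_at_plateau_ratio:
        window, patience = switch_at_plateau
        if len(loss_history) >= window + patience:
            recent_best = min(loss_history[-patience:])
            earlier_best = min(loss_history[-(window + patience):-patience])
            # No improvement over the recent window -> plateau reached.
            if recent_best >= earlier_best:
                return True
    return False
```

With `switch_at_epoch=500`, for example, the test fires from epoch 500 onward; with a flat loss history and a loss already below `init_loss / r`, the plateau branch fires.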
Examples
>>> from scimba_torch.optimizers.scimba_optimizers import ScimbaMomentum
>>> opt_1 = {
...     "name": "adam",
...     "optimizer_args": {"lr": 1e-3, "betas": (0.9, 0.999)},
... }
>>> opt_2 = {"class": ScimbaMomentum, "switch_at_epoch": 500}
>>> opt_3 = {
...     "name": "lbfgs",
...     "switch_at_epoch_ratio": 0.7,
...     "switch_at_plateau": [500, 20],
...     "switch_at_plateau_ratio": 3.0,
... }
>>> optimizers = OptimizerData(opt_1, opt_2, opt_3)
- activated_optimizer: list[AbstractScimbaOptimizer]¶
A list containing the current optimizer; empty if none.
- optimizers: list[dict[str, Any]]¶
List of optimizer configurations.
- next_optimizer: int¶
Index of the next optimizer to be activated.
- step(closure)[source]¶
Performs an optimization step using the currently activated optimizer.
- Parameters:
closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.
- Return type:
None
- set_lr(lr)[source]¶
Set learning rate of activated optimizer.
- Parameters:
lr (float) – The new learning rate.
- Return type:
None
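For torch-style optimizers, changing the learning rate usually means writing to every parameter group of the underlying optimizer; a minimal stand-in sketch of that pattern (the `_FakeOptimizer` class is purely illustrative, and whether `set_lr` works exactly this way internally is an assumption):

```python
class _FakeOptimizer:
    """Stand-in with torch-like param_groups, for illustration only."""

    def __init__(self, lr: float):
        self.param_groups = [{"lr": lr}, {"lr": lr}]


def set_lr(optimizer, lr: float) -> None:
    # Update the learning rate of every parameter group in place.
    for group in optimizer.param_groups:
        group["lr"] = lr


opt = _FakeOptimizer(lr=1e-3)
set_lr(opt, 1e-4)
```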
- test_activate_next_optimizer(loss_history, loss_value, init_loss, epoch, epochs)[source]¶
Tests whether the next optimizer should be activated based on the given criteria.
- Parameters:
loss_history (list[float]) – History of loss values.
loss_value (float) – Current loss value.
init_loss (float) – Initial loss value.
epoch (int) – Current epoch.
epochs (int) – Total number of epochs.
- Return type:
bool
- Returns:
True if the next optimizer should be activated, False otherwise.
- activate_next_optimizer(parameters, verbose=False)[source]¶
Activates the next optimizer in the list.
- Parameters:
parameters (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Parameters to be optimized.
verbose (bool) – Whether to print an activation message.
- Return type:
None
- activate_first_optimizer(parameters, verbose=False)[source]¶
Activates the first optimizer in the list.
- Parameters:
parameters (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Parameters to be optimized.
verbose (bool) – Whether to print an activation message.
- Return type:
None
- test_and_activate_next_optimizer(parameters, loss_history, loss_value, init_loss, epoch, epochs)[source]¶
Tests whether the next optimizer should be activated and, if so, activates it.
- Parameters:
parameters (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Parameters to be optimized.
loss_history (list[float]) – History of loss values.
loss_value (float) – Current loss value.
init_loss (float) – Initial loss value.
epoch (int) – Current epoch.
epochs (int) – Total number of epochs.
- Return type:
None
- get_opt_gradients()[source]¶
Gets the gradients of the currently activated optimizer.
- Return type:
Tensor
- Returns:
Flattened tensor of gradients.
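Flattening means concatenating the per-parameter gradients into one vector. A torch-free sketch of that idea with plain lists (the real method returns a torch.Tensor built from the activated optimizer's parameters):

```python
def flatten_gradients(grads_per_param):
    """Concatenate per-parameter gradient lists into one flat list."""
    flat = []
    for grads in grads_per_param:
        flat.extend(grads)
    return flat


# e.g. gradients of a 2x2 weight matrix followed by a 2-element bias
flat = flatten_gradients([[0.1, -0.2, 0.3, 0.0], [1.0, -1.0]])
```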
- update_best_optimizer()[source]¶
Updates the best state of the currently activated optimizer.
- Return type:
None
- dict_for_save()[source]¶
Returns a dictionary containing the best state of the current optimizer.
- Return type:
dict
- Returns:
A dictionary containing the best state of the optimizer.
- load_from_dict(parameters, checkpoint)[source]¶
Loads the optimizer and scheduler states from a checkpoint.
- Parameters:
parameters (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Parameters to be optimized.
checkpoint (dict) – Dictionary containing the optimizer and scheduler states.
- Raises:
ValueError – Raised when there is no active optimizer to load the checkpoint into.
- Return type:
None
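Together, dict_for_save and load_from_dict support checkpointing: snapshot the best optimizer state, keep training, and restore on demand. A hedged sketch of that round trip with a plain dict standing in for the optimizer state (the `_TinyOptimizer` class is purely illustrative; real code would pass the result of dict_for_save to torch.save and feed torch.load's output back to load_from_dict):

```python
import copy


class _TinyOptimizer:
    """Stand-in optimizer holding a mutable state dict, for illustration."""

    def __init__(self):
        self.state = {"step": 0, "lr": 1e-3}

    def dict_for_save(self) -> dict:
        # Snapshot the current (best) state.
        return copy.deepcopy(self.state)

    def load_from_dict(self, checkpoint: dict) -> None:
        if checkpoint is None:
            raise ValueError("no checkpoint to load")
        self.state = copy.deepcopy(checkpoint)


opt = _TinyOptimizer()
opt.state["step"] = 41
ckpt = opt.dict_for_save()   # save the best state
opt.state["step"] = 99       # training continues (or diverges)
opt.load_from_dict(ckpt)     # restore the saved state
```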