scimba_torch.optimizers.scimba_optimizers

A module defining Scimba optimizers.

Examples: Defining optimizers

import copy
import math

import torch

from scimba_torch.optimizers.scimba_optimizers import (
    ScimbaAdam,
    ScimbaLBFGS,
    ScimbaMomentum,
)

class SimpleNN(torch.nn.Module):

    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = torch.nn.Linear(10, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc1(x)

net = SimpleNN()

class DummyScheduler:
    pass

# Passing a scheduler class that is not an LRScheduler subclass raises ValueError
try:
    opt_test = ScimbaAdam(list(net.parameters()), scheduler=DummyScheduler)
except ValueError as error:
    print(error)

# Batch of 10000 samples, each of size 10
input_tensor = torch.randn(10000, 10)
# learn sum function
# Target tensor with batch size 10000
target_tensor = torch.sum(input_tensor, dim=1)[:, None]

# Track the running loss and the best model state seen so far
loss = [torch.tensor(float("+inf"))]
best_loss = float("+inf")
best_net = copy.deepcopy(net.state_dict())

opt = ScimbaAdam(list(net.parameters()))
# opt = ScimbaMomentum(list(net.parameters()))

loss_func = torch.nn.MSELoss()

def closure() -> float:
    opt.zero_grad()
    # Forward pass
    output_tensor = net(input_tensor)
    loss[0] = loss_func(output_tensor, target_tensor)
    loss[0].backward(retain_graph=True)
    return loss[0].item()

# perform one step
opt.optimizer_step(closure)

epochs = 1000
for epoch in range(epochs):
    opt.optimizer_step(closure)

    # If the loss diverged, restore the best known state
    if math.isinf(loss[0].item()) or math.isnan(loss[0].item()):
        loss[0] = torch.tensor(best_loss)
        net.load_state_dict(best_net)

    if loss[0].item() < best_loss:
        best_loss = loss[0].item()
        best_net = copy.deepcopy(net.state_dict())
        opt.update_best_optimizer()

    if epoch % 100 == 0:
        print("epoch: ", epoch, "loss: ", loss[0].item())
        print("lr: ", opt.param_groups[0]["lr"])

net.load_state_dict(best_net)
closure()
print("loss after training: ", loss[0].item())

print("\n")
# print("dict_for_save: ", opt.dict_for_save())
print("state_dict   : ", opt.state_dict())
print("\n")
optt = ScimbaAdam(list(net.parameters()))
# print("dict_for_save: ", optt.dict_for_save())
print("state_dict   : ", optt.state_dict())
print("\n")
optt.load(opt.dict_for_save())
# print("dict_for_save: ", optt.dict_for_save())
print("state_dict   : ", optt.state_dict())
print("\n")
# print( "==", optt.state_dict() == opt.state_dict())

opt2 = ScimbaLBFGS(list(net.parameters()))

def closure() -> float:  # rebind the closure to the LBFGS optimizer opt2
    opt2.zero_grad()
    # Forward pass
    output_tensor = net(input_tensor)
    loss[0] = loss_func(output_tensor, target_tensor)
    loss[0].backward(retain_graph=True)
    return loss[0].item()

epochs = 1000
for epoch in range(epochs):
    opt2.optimizer_step(closure)

    # If the loss diverged, restore the best known state
    if math.isinf(loss[0].item()) or math.isnan(loss[0].item()):
        loss[0] = torch.tensor(best_loss)
        net.load_state_dict(best_net)

    if loss[0].item() < best_loss:
        best_loss = loss[0].item()
        best_net = copy.deepcopy(net.state_dict())
        opt2.update_best_optimizer()

    if epoch % 100 == 0:
        print("epoch: ", epoch, "loss: ", loss[0].item())

net.load_state_dict(best_net)
closure()
print("loss after training: ", loss[0].item())

print("net( torch.ones( 10 ) ) : ", net(torch.ones(10)))

Classes

AbstractScimbaOptimizer(params[, ...]) – Abstract base class for Scimba optimizers with optional learning rate scheduler.

NoScheduler() – A placeholder class to indicate the absence of a scheduler.

ScimbaAdam(params[, optimizer_args, ...]) – Scimba wrapper for the Adam optimizer with optional learning rate scheduler.

ScimbaCustomOptomizer(params[, ...]) – An abstract class from which user-defined optimizers must inherit.

ScimbaLBFGS(params[, optimizer_args]) – Scimba wrapper for the LBFGS optimizer with optional learning rate scheduler.

ScimbaMomentum(params[, lr, momentum]) – Custom Momentum optimizer with scheduler.

ScimbaSGD(params[, optimizer_args, ...]) – Scimba wrapper for the SGD optimizer with optional learning rate scheduler.

ScimbaSSBFGS(params[, optimizer_args]) – Scimba wrapper for the SSBFGS optimizer.

ScimbaSSBroyden(params[, optimizer_args]) – Scimba wrapper for the SSBroyden optimizer.

class NoScheduler[source]

Bases: object

A placeholder class to indicate the absence of a scheduler.
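The sentinel can presumably also be passed explicitly to a concrete wrapper to opt out of its default scheduler; a minimal sketch, assuming NoScheduler is accepted wherever the scheduler argument is (it is the documented default of AbstractScimbaOptimizer):

import torch

from scimba_torch.optimizers.scimba_optimizers import NoScheduler, ScimbaAdam

net = torch.nn.Linear(10, 1)
# Assumption: passing the NoScheduler sentinel disables ScimbaAdam's
# default StepLR scheduler, so scheduler_exists should be False here.
opt = ScimbaAdam(list(net.parameters()), scheduler=NoScheduler)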

class AbstractScimbaOptimizer(params, optimizer_args={}, scheduler=<class 'scimba_torch.optimizers.scimba_optimizers.NoScheduler'>, scheduler_args={}, **kwargs)[source]

Bases: Optimizer, ABC

Abstract base class for Scimba optimizers with optional learning rate scheduler.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the optimizer. Defaults to {}.

  • scheduler (type) – Learning rate scheduler class. Defaults to NoScheduler.

  • scheduler_args (dict[str, Any]) – Additional arguments for the scheduler. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

Raises:

ValueError – if scheduler is not a subclass of torch.optim.lr_scheduler.LRScheduler.

scheduler_exists

Flag indicating if a scheduler is set.

scheduler

List containing the scheduler.

best_optimizer

Dictionary containing the best state of the optimizer.

best_scheduler

List containing the best state of the scheduler.
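A minimal construction sketch for a concrete subclass with an explicit scheduler; the step_size and gamma values are illustrative choices for StepLR's standard arguments, not defaults documented here:

import torch

from scimba_torch.optimizers.scimba_optimizers import ScimbaAdam

net = torch.nn.Linear(10, 1)
opt = ScimbaAdam(
    list(net.parameters()),
    optimizer_args={"lr": 1e-3},  # forwarded to torch.optim.Adam
    scheduler=torch.optim.lr_scheduler.StepLR,
    scheduler_args={"step_size": 100, "gamma": 0.5},  # forwarded to StepLR
)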

optimizer_step(closure)[source]

Performs an optimization step and updates the scheduler if it exists.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

abstract inner_step(closure)[source]

Abstract method for performing the inner optimization step.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

update_best_optimizer()[source]

Updates the best optimizer state.

Return type:

None

dict_for_save()[source]

Returns a dictionary containing the best optimizer and scheduler states.

Returns:

Dictionary containing the best optimizer and scheduler states.

Return type:

dict

load(checkpoint)[source]

Loads the optimizer and scheduler states from a checkpoint.

Parameters:

checkpoint (dict) – Dictionary containing the optimizer and scheduler states.

Return type:

None
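Together with dict_for_save, this gives a checkpoint round trip, as the module example above demonstrates; in condensed form:

checkpoint = opt.dict_for_save()      # best optimizer (and scheduler) states
new_opt = ScimbaAdam(list(net.parameters()))
new_opt.load(checkpoint)              # restore the saved states into new_opt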

class ScimbaAdam(params, optimizer_args={}, scheduler=<class 'torch.optim.lr_scheduler.StepLR'>, scheduler_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, Adam

Scimba wrapper for Adam optimizer with optional learning rate scheduler.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the Adam optimizer. Defaults to {}.

  • scheduler (type) – Learning rate scheduler class. Defaults to torch.optim.lr_scheduler.StepLR.

  • scheduler_args (dict[str, Any]) – Additional arguments for the scheduler. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

inner_step(closure)[source]

Performs the inner optimization step for ScimbaAdam.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

class ScimbaSGD(params, optimizer_args={}, scheduler=<class 'torch.optim.lr_scheduler.StepLR'>, scheduler_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, SGD

Scimba wrapper for SGD optimizer with optional learning rate scheduler.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the SGD optimizer. Defaults to {}.

  • scheduler (type) – Learning rate scheduler class. Defaults to torch.optim.lr_scheduler.StepLR.

  • scheduler_args (dict[str, Any]) – Additional arguments for the scheduler. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

inner_step(closure)[source]

Performs the inner optimization step for ScimbaSGD.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

class ScimbaLBFGS(params, optimizer_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, LBFGS

Scimba wrapper for LBFGS optimizer with optional learning rate scheduler.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the LBFGS optimizer. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

inner_step(closure)[source]

Performs the inner optimization step for ScimbaLBFGS.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

class ScimbaSSBFGS(params, optimizer_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, SSBroyden

Scimba wrapper for SSBFGS optimizer.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the SSBFGS optimizer. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

inner_step(closure)[source]

Performs the inner optimization step for ScimbaSSBFGS.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

class ScimbaSSBroyden(params, optimizer_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, SSBroyden

Scimba wrapper for SSBroyden optimizer.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • optimizer_args (dict[str, Any]) – Additional arguments for the SSBroyden optimizer. Defaults to {}.

  • **kwargs – Arbitrary keyword arguments.

inner_step(closure)[source]

Performs the inner optimization step for ScimbaSSBroyden.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None

class ScimbaCustomOptomizer(params, optimizer_args={}, scheduler=<class 'scimba_torch.optimizers.scimba_optimizers.NoScheduler'>, scheduler_args={}, **kwargs)[source]

Bases: AbstractScimbaOptimizer, ABC

An abstract class from which user-defined optimizers must inherit.

abstract step(closure)[source]

To be implemented in subclasses: applies one step of optimizer.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.
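A hedged sketch of a concrete subclass; the class name ScimbaGDescent is hypothetical, and the assumption that optimizer_args is forwarded as the torch.optim.Optimizer parameter-group defaults (so that "lr" appears in param_groups) mirrors what ScimbaMomentum's lr argument suggests, not a documented guarantee:

from typing import Callable, Optional

from scimba_torch.optimizers.scimba_optimizers import ScimbaCustomOptomizer

class ScimbaGDescent(ScimbaCustomOptomizer):
    # Hypothetical plain gradient-descent optimizer, for illustration only.

    def __init__(self, params, lr: float = 1e-3):
        # Assumption: optimizer_args ends up as the parameter-group
        # defaults, so that "lr" is available in self.param_groups.
        super().__init__(params, optimizer_args={"lr": lr})

    def step(self, closure: Optional[Callable[[], float]] = None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    # In-place update: p <- p - lr * grad
                    p.data.add_(p.grad, alpha=-group["lr"])
        return loss

    def inner_step(self, closure: Callable[[], float]) -> None:
        # Defensive: satisfies AbstractScimbaOptimizer's abstract
        # inner_step if the base class does not already route it to step.
        self.step(closure)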

class ScimbaMomentum(params, lr=0.001, momentum=0.0)[source]

Bases: ScimbaCustomOptomizer

Custom Momentum optimizer with scheduler.

Serves as an example of a custom optimizer inheriting from AbstractScimbaOptimizer.

Parameters:
  • params (Union[Iterable[Tensor], Iterable[dict[str, Any]], Iterable[tuple[str, Tensor]]]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – Learning rate.

  • momentum (float) – Momentum factor.

step(closure=None)[source]

Re-implements the step method.

Parameters:

closure (Optional[Callable[[], float]]) – A closure that reevaluates the model and returns the loss.

inner_step(closure)[source]

The inner step method.

Parameters:

closure (Callable[[], float]) – A closure that reevaluates the model and returns the loss.

Return type:

None
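Usage mirrors the other wrappers, as the commented-out line in the module example above suggests; a minimal sketch reusing that example's net and closure (the lr and momentum values are illustrative):

opt = ScimbaMomentum(list(net.parameters()), lr=0.01, momentum=0.9)
opt.optimizer_step(closure)  # same closure-based interface as ScimbaAdam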