neurai.grads package#
Submodules#
- class neurai.grads.autograd.BP(loss_f, acc_f=None, fun_has_aux=False)#
Bases:
GradeBase
Back Propagation (BP) algorithm. Uses the chain rule to compute the gradients of a feedforward neural network, along with the network's loss and accuracy.
- Parameters:
loss_f (Callable) – The loss function used to compute the loss of the network.
acc_f (Callable, optional) – The accuracy function used to evaluate the performance of the network. Default is None.
fun_has_aux (bool, optional) – Indicates whether the loss function returns a pair where the first element is the value to be differentiated and the second element is auxiliary data. Default is False.
- get_grads(model, param, batch_data, return_param=False, **kwargs)#
Computes the gradients of the network parameters using the Back Propagation algorithm.
- Parameters:
- Returns:
tuple – The gradients of the network parameters, the loss for the mini-batch, and the accuracy for the mini-batch if an accuracy function was specified during initialization; otherwise the third element of the tuple is -1.
- predict(model, param, batch_data, **kwargs)#
Computes the loss and accuracy of the network for a mini-batch.
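Examples
A minimal usage sketch; the mse_loss and accuracy helpers below, and the model, param, and batch_data placeholders, are illustrative assumptions rather than part of this API:
>>> import jax.numpy as jnp
>>> from neurai.grads.autograd import BP
>>> def mse_loss(output, target):
...   return jnp.mean((output - target) ** 2)
>>> def accuracy(output, target):
...   return jnp.mean(jnp.argmax(output, -1) == jnp.argmax(target, -1))
>>> bp = BP(loss_f=mse_loss, acc_f=accuracy)
>>> # model, param and batch_data stand in for a built neurai network, its
>>> # parameters, and one mini-batch; with them in scope:
>>> # grads, loss, acc = bp.get_grads(model, param, batch_data)
>>> # loss, acc = bp.predict(model, param, batch_data)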
- class neurai.grads.autograd.BiPC(loss_f, loss_f_b, acc_f=None, eta=0.1, n=20, eta_trace=None, mode=PCMode.SWEIGHT_BISTRICT, fun_has_aux=False)#
Bases:
PC
Bidirectional Predictive Coding (BiPC) algorithm.
- Parameters:
loss_f (Callable) – Loss function
loss_f_b (Callable) – Loss function for the reverse direction.
acc_f (Callable, optional) – Accuracy function, by default None
eta (float, optional) – Fine-tuning parameter, by default 0.1
n (int, optional) – Number of iterations, by default 20
eta_trace (EtaTrace, optional) – The state-correction method, by default None
mode (PCMode, optional) – The PC variant to use, by default PCMode.SWEIGHT_BISTRICT
fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
- get_grads(model, param, batch_data, **kwargs)#
Compute the gradients of the network parameters.
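Examples
A hedged construction sketch; the two loss helpers are assumptions, and their exact expected signatures depend on the network definition:
>>> import jax.numpy as jnp
>>> from neurai.grads.autograd import BiPC
>>> def forward_loss(output, target):
...   return jnp.mean((output - target) ** 2)
>>> def backward_loss(reconstruction, inputs):
...   return jnp.mean((reconstruction - inputs) ** 2)
>>> bipc = BiPC(loss_f=forward_loss, loss_f_b=backward_loss, eta=0.1, n=20)
>>> # grads = bipc.get_grads(model, param, batch_data)  # placeholders for a built network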
- class neurai.grads.autograd.EP(loss_f, acc_f=None, beta=1.0, dt=0.1, n_relax=20, tau=1, tol=0, mode=EPMode.TWO_PHASE, activation=None, fun_has_aux=False, **kwargs)#
Bases:
LocalBase
Equilibrium Propagation (EP) algorithm.
- Parameters:
loss_f (Callable[..., Any]) – Loss function
acc_f (Callable[..., Any], optional) – Accuracy function, by default None
beta (float, optional) – The influence parameter, by default 1.0
dt (float, optional) – The update step size of the nodes, by default 0.1
n_relax (int, optional) – Number of relaxation iterations, by default 20
tau (int, optional) – The update time constant of the nodes, by default 1
tol (int, optional) – The convergence threshold, by default 0
mode (EPMode, optional) – The EP variant to use, by default EPMode.TWO_PHASE
activation (callable, optional) – The activation function of the nodes, by default None
fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
- get_grads(model, param, batch_data, optim, opt_state, repeat=1)#
Compute gradients, loss, accuracy, and energy.
- Parameters:
- Return type:
- Returns:
tuple – The gradients, accuracy, loss, and energy.
- normal_fun(dtype=None)#
Return a new array of given shape and type, filled with zeros.
LAX-backend implementation of numpy.zeros().
- predict(model, param, batch_data)#
Computes the loss and accuracy of the network for a mini-batch.
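Examples
A hedged construction sketch; the loss helper is an assumption, and model, param, batch_data, optim, and opt_state stand in for a built network, its parameters, one mini-batch, and an optimizer with its state:
>>> import jax.numpy as jnp
>>> from neurai.grads.autograd import EP
>>> def mse_loss(output, target):
...   return jnp.mean((output - target) ** 2)
>>> # Two-phase Equilibrium Propagation with 20 relaxation steps and influence beta=1.0.
>>> ep = EP(loss_f=mse_loss, beta=1.0, n_relax=20)
>>> # With the placeholders in scope (ordering per the Returns entry above):
>>> # grads, acc, loss, energy = ep.get_grads(model, param, batch_data, optim, opt_state)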
- class neurai.grads.autograd.GradeBase(loss_f, acc_f=None, fun_has_aux=False)#
Bases:
object
The base class for automatic differentiation.
- Parameters:
loss_f (Callable) – Loss function.
acc_f (Callable, optional) – Accuracy evaluation function, by default None.
fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
- class neurai.grads.autograd.LocalBase(loss_f, acc_f=None, fun_has_aux=False)#
Bases:
GradeBase
The base class for local learning algorithms (e.g., PC and EP).
- Parameters:
loss_f (Callable) – Loss function.
acc_f (Callable, optional) – Accuracy evaluation function, by default None.
fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
- predict(model, param, batch_data)#
Computes the loss and accuracy of the network for a mini-batch.
- class neurai.grads.autograd.PC(loss_f, acc_f=None, eta=0.1, n=20, mode=PCMode.STRICT, fun_has_aux=False)#
Bases:
LocalBase
Predictive Coding (PC) algorithm.
- Parameters:
loss_f (Callable) – Loss function
acc_f (Callable, optional) – Accuracy function, by default None
eta (float, optional) – Fine-tuning parameter, by default 0.1
n (int, optional) – Number of iterations, by default 20
mode (PCMode, optional) – The PC variant to use, by default PCMode.STRICT
fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
- get_grads(model, param, batch_data)#
Compute the gradients of the model parameters.
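Examples
A hedged construction sketch; the loss helper and the model / param / batch_data placeholders are assumptions:
>>> import jax.numpy as jnp
>>> from neurai.grads.autograd import PC
>>> def mse_loss(output, target):
...   return jnp.mean((output - target) ** 2)
>>> pc = PC(loss_f=mse_loss, eta=0.1, n=20)  # strict PC, 20 inference iterations
>>> # grads = pc.get_grads(model, param, batch_data)  # placeholders for a built network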
- neurai.grads.autograd.grad(fun, argnums=0, has_aux=False, holomorphic=False, allow_int=False, reduce_axes=(), return_fun_value=False)#
Creates a function that evaluates the gradient of fun.
- Parameters:
fun (Callable) – Function to be differentiated. Its arguments at positions specified by argnums should be arrays, scalars, or standard Python containers. Argument arrays in the positions specified by argnums must be of inexact (i.e., floating-point or complex) type. It should return a scalar (which includes arrays with shape () but not arrays with shape (1,) etc.).
argnums (Union[int, Sequence[int]], optional) – Integer or sequence of integers. Specifies which positional argument(s) to differentiate with respect to, by default 0.
has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data, by default False.
holomorphic (bool, optional) – Indicates whether fun is promised to be holomorphic. If True, inputs and outputs must be complex, by default False.
allow_int (bool, optional) – Whether to allow differentiating with respect to integer valued inputs. The gradient of an integer input will have a trivial vector-space dtype (float0), by default False.
reduce_axes (Sequence[AxisName], optional) – Tuple of axis names. If an axis is listed here, and fun implicitly broadcasts a value over that axis, the backward pass will perform a psum of the corresponding gradient. Otherwise, the gradient will be per-example over named axes. For example, if 'batch' is a named batch axis, grad(f, reduce_axes=('batch',)) will create a function that computes the total gradient while grad(f) will create one that computes the per-example gradient. By default ().
return_fun_value (bool, optional) – Indicates whether the evaluated value of fun is also returned (as the first element of the result), so that users do not need to choose between jax.value_and_grad and jax.grad, by default False.
- Return type:
typing.Callable
- Returns:
If return_fun_value is True (cf. jax.value_and_grad): a function with the same arguments as fun that evaluates both fun and the gradient of fun and returns them as a pair (a two-element tuple). If argnums is an integer, the gradient has the same shape and type as the positional argument indicated by that integer. If argnums is a sequence of integers, the gradient is a tuple of values with the same shapes and types as the corresponding arguments. If has_aux is True, a tuple of ((value, auxiliary_data), gradient) is returned.
If return_fun_value is False (cf. jax.grad): a function with the same arguments as fun that evaluates the gradient of fun. If argnums is an integer, the gradient has the same shape and type as the positional argument indicated by that integer. If argnums is a tuple of integers, the gradient is a tuple of values with the same shapes and types as the corresponding arguments. If has_aux is True, a pair of (gradient, auxiliary_data) is returned.
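Examples
A short sketch of the return_fun_value switch; hedged, with the (value, gradient) ordering taken from the jax.value_and_grad behaviour described above:
>>> import jax.numpy as jnp
>>> from neurai.grads.autograd import grad
>>> def loss(w, x):
...   return jnp.sum((w * x) ** 2)
>>> w, x = jnp.array([1.0, 2.0]), jnp.array([3.0, 4.0])
>>> g = grad(loss)(w, x)                                # gradient only, like jax.grad
>>> value, g = grad(loss, return_fun_value=True)(w, x)  # also returns the loss value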
- class neurai.grads.surrogate_grad.Gaussian(alpha=1.0)#
Bases:
SurrogateFunctionBase
STBP h4: the fourth surrogate-gradient function described in the article https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full; a Gaussian function is used for backpropagation, and its primitive is the Gaussian cumulative distribution function (CDF).
The surrogate-gradient function is defined as:
\[\mathrm{STBP_{h4}}(u) = \frac{1}{\sqrt{2 \pi a}} e^{- \frac{(u - V_{th})^2}{2 a}}\]
The primitive function:
\[\mathrm{STBP_{h4-primitive}}(u) = \frac{\text{erfc}(-\alpha u)}{2}\]
where erfc is the complementary error function (computed with JAX).
- Parameters:
alpha – controlling the smoothness of the gradient during backpropagation.
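Examples
A usage sketch mirroring the Sigmoid example further below, assuming Gaussian instances are callable in the same way:
>>> import jax.numpy as jnp
>>> from neurai.grads.surrogate_grad import Gaussian
>>> surrogate = Gaussian(alpha=1.0)
>>> u = jnp.array([-0.5, 0.0, 0.5])  # membrane potentials (illustrative values)
>>> out = surrogate(u)               # forward output; the Gaussian surrogate gradient is used in backpropagation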
- class neurai.grads.surrogate_grad.MultiGaussian(lens=0.5, scale=6.0, hight=0.15)#
Bases:
SurrogateFunctionBase
The surrogate gradient defined in the 2021 paper "Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks" (https://www.nature.com/articles/s42256-021-00397-w).
- class neurai.grads.surrogate_grad.Polynomial(alpha=1.0, spiking=True)#
Bases:
SurrogateFunctionBase
STBP h2: a polynomial function is used as the surrogate during backpropagation; this is the second surrogate-gradient function described in the article https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full:
\[\mathrm{STBP_{h2}}(u) = \left( \frac{\sqrt{a}}{2} - \frac{a}{4} \vert u - V_{th} \vert \right) \text{sign} \left( \frac {2}{\sqrt{a}} - \vert u - V_{th} \vert \right)\]
The primitive function:
\[\begin{split}\begin{align*} \text{mask0} &= (u > \frac{2}{\sqrt{\alpha}}) \\ \text{mask1} &= (|\,u\,| \leq \frac{2}{\sqrt{\alpha}}) \\ \mathrm{STBP_{h2-primitive}}(u) &= \text{mask0} + \text{mask1} \cdot \left(-\frac{\alpha}{8} \cdot u^2 \cdot \text{sign}(u) + \frac{\sqrt{\alpha}}{2} \cdot u + 0.5\right) \end{align*}\end{split}\]
- Parameters:
alpha – controlling the smoothness of the gradient during backpropagation.
spiking – Whether to output spikes. If True, the Heaviside function is used in the forward pass; if False, the primitive of the surrogate-gradient function is used in the forward pass. In both cases the surrogate-gradient function is used during backpropagation.
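Examples
A short sketch of the spiking flag; hedged, with instances assumed callable as in the Sigmoid example further below:
>>> import jax.numpy as jnp
>>> from neurai.grads.surrogate_grad import Polynomial
>>> u = jnp.array([-0.3, 0.1, 0.6])
>>> spike_fn = Polynomial(alpha=1.0, spiking=True)    # forward: Heaviside spikes
>>> smooth_fn = Polynomial(alpha=1.0, spiking=False)  # forward: the primitive (smooth) values
>>> spikes, values = spike_fn(u), smooth_fn(u)        # both use the same surrogate gradient in backpropagation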
- class neurai.grads.surrogate_grad.Rectangular(alpha=1.0, spiking=True)#
Bases:
SurrogateFunctionBase
STBP h1: a rectangular function is used as the surrogate during backpropagation; this is the first surrogate-gradient function described in the article https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full:
\[\mathrm{STBP_{h1}}(u - V_{th}) = \frac{1}{a} \text{sign} \left( \vert u - V_{th} \vert < \frac{a}{2} \right)\]
The primitive function:
\[\begin{split}\mathrm{STBP_{h1-primitive}}(u - V_{th}) = \begin{align*} \begin{cases} \frac{1}{\alpha} (u - V_{th}) + \frac{\alpha}{2} & \text{if } |u - V_{th}| < \frac{\alpha}{2} \\ 0 & \text{if } \text{sign}(u - V_{th}) = -1 \\ \frac{\alpha}{2} & \text{otherwise} \end{cases} \end{align*}\end{split}\]
- Parameters:
alpha (float) – The parameter a determines the steepness of the surrogate-gradient curve, i.e. the peak width.
spiking – Whether to output spikes. If True, the Heaviside function is used in the forward pass; if False, the primitive of the surrogate-gradient function is used in the forward pass. In both cases the surrogate-gradient function is used during backpropagation.
- class neurai.grads.surrogate_grad.Sigmoid(alpha=4.0, spiking=True)#
Bases:
SurrogateFunctionBase
Sigmoid function implementation based on SurrogateFunctionBase. STBP h3: a sigmoid-shaped function is used as the surrogate during backpropagation; this is the third surrogate-gradient function described in the article https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full:
\[\mathrm{STBP_{h3}}(u) = \frac{1}{a} \frac{e^{ \frac{V_{th} - u}{a}}}{\left(1+ e^{\frac{V_{th} - u}{a}} \right)^2}\]
The primitive function:
\[\mathrm{STBP_{h3-primitive}}(u) = \frac{1}{1 + e^{-\alpha u}}\]
- Parameters:
alpha – controlling the smoothness of the gradient during backpropagation.
spiking – Whether to output spikes. If True, the Heaviside function is used in the forward pass; if False, the primitive of the surrogate-gradient function is used in the forward pass. In both cases the surrogate-gradient function is used during backpropagation.
Examples
>>> import numpy as np
>>> sigmoid = Sigmoid(alpha=4.0, spiking=True)
>>> x = np.array([1, 2, 3])
>>> y = sigmoid(x)
- class neurai.grads.surrogate_grad.SingleExponential(grad_width=0.5, grad_scale=1.0)#
Bases:
object
Surrogate gradient as defined in Shrestha and Orchard, 2018.
- class neurai.grads.surrogate_grad.SlayerPdf(scale=1, tau=10.0, spiking=True)#
Bases:
SurrogateFunctionBase
The SLAYER surrogate uses the probability density function (PDF). The surrogate-gradient function is defined as:
\[\rho(t) = \alpha e^{-\beta |u(t) - \theta|}\]
- class neurai.grads.surrogate_grad.SuperSpike(alpha=1.0)#
Bases:
SurrogateFunctionBase
The SuperSpike algorithm (Zenke and Ganguli, 2018) is used for backpropagation. The surrogate-gradient function is defined as:
\[\sigma(U_i) = (1 + \beta | U_i - \theta |)^{-2}\]
- Parameters:
alpha – controlling the smoothness of the gradient during backpropagation.
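Examples
A sketch of using the surrogate under jax.grad; hedged, with SuperSpike instances assumed callable like the Sigmoid example above:
>>> import jax
>>> import jax.numpy as jnp
>>> from neurai.grads.surrogate_grad import SuperSpike
>>> surrogate = SuperSpike(alpha=1.0)
>>> def mean_rate(u):
...   # spikes in the forward pass; the SuperSpike surrogate replaces the
...   # undefined Heaviside derivative in the backward pass
...   return jnp.mean(surrogate(u))
>>> g = jax.grad(mean_rate)(jnp.array([0.2, 0.8, 1.5]))  # finite, smooth gradients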