neurai.grads package#

Submodules#

class neurai.grads.autograd.BP(loss_f, acc_f=None, fun_has_aux=False)#

Bases: GradeBase

Back-propagation algorithm. Uses the chain rule to compute the gradients of a feedforward neural network, along with the network's loss and accuracy.

Parameters:
  • loss_f (Callable) – The loss function used to compute the loss of the network.

  • acc_f (Callable, optional) – The accuracy function used to evaluate the performance of the network. Default is None.

  • fun_has_aux (bool, optional) – Indicates whether the loss function returns a pair where the first element is the value to be differentiated and the second element is auxiliary data. Default is False.

get_grads(model, param, batch_data, return_param=False, **kwargs)#

Computes the gradients of the network parameters using the Back Propagation algorithm.

Parameters:
  • model (Callable) – The neural network model.

  • param (list) – The current values of the network parameters.

  • batch_data (tuple) – The input and output data for a mini-batch.

  • return_param (bool, optional) – Whether to also return the network parameters. Default is False.

Returns:

tuple – The gradients of the network parameters, the loss for the mini-batch, and the accuracy for the mini-batch (if an accuracy function was specified during initialization; otherwise the third element of the tuple is -1).

predict(model, param, batch_data, **kwargs)#

Computes the loss and accuracy of the network for a mini-batch.

Parameters:
  • model (Callable) – The neural network model.

  • param (dict) – The current values of the network parameters.

  • batch_data (tuple) – The input and output data for a mini-batch.

Returns:

tuple – The loss and accuracy for the mini-batch.
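
A minimal usage sketch based on the signatures documented above; the loss function, accuracy function, model, parameters, and mini-batch below are hypothetical placeholders rather than names defined by neurai:

>>> bp = BP(loss_f=cross_entropy, acc_f=accuracy)              # placeholder callables
>>> grads, loss, acc = bp.get_grads(model, param, batch_data)  # third element is -1 when acc_f is None
>>> loss, acc = bp.predict(model, param, batch_data)           # loss/accuracy only, no gradients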

class neurai.grads.autograd.BiPC(loss_f, loss_f_b, acc_f=None, eta=0.1, n=20, eta_trace=None, mode=PCMode.SWEIGHT_BISTRICT, fun_has_aux=False)#

Bases: PC

Bidirectional Predictive Coding (BiPC) algorithm.

Parameters:
  • loss_f (Callable) – Loss function

  • loss_f_b (Callable) – Loss function for the reverse (backward) pass.

  • acc_f (Callable, optional) – Accuracy function, by default None

  • eta (float, optional) – Fine-tuning parameter, by default 0.1

  • n (int, optional) – Number of iterations, by default 20

  • eta_trace (EtaTrace, optional) – The state-correction method, by default None

  • mode (PCMode, optional) – The PC mode, by default PCMode.SWEIGHT_BISTRICT.

  • fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.

get_grads(model, param, batch_data, **kwargs)#

Compute the gradients of the network parameters.

Parameters:
  • model (object) – The instance object of model.

  • param (dict) – The param of model.

  • batch_data (tuple) – The input and output data for a mini-batch.

Return type:

dict

Returns:

dict – The gradients of the parameters, together with the loss and accuracy values.
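
A construction-and-call sketch based on the signature above; the forward and reverse loss functions, accuracy function, model, parameters, and mini-batch are hypothetical placeholders, not neurai names:

>>> bipc = BiPC(loss_f=forward_loss, loss_f_b=reverse_loss, acc_f=accuracy,
...             eta=0.1, n=20, mode=PCMode.SWEIGHT_BISTRICT)
>>> result = bipc.get_grads(model, param, batch_data)  # dict holding the gradients, loss, and accuracy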

class neurai.grads.autograd.EP(loss_f, acc_f=None, beta=1.0, dt=0.1, n_relax=20, tau=1, tol=0, mode=EPMode.TWO_PHASE, activation=None, fun_has_aux=False, **kwargs)#

Bases: LocalBase

Equilibrium Propagation (EP) algorithm.

Parameters:
  • loss_f (Callable[..., Any]) – Loss function

  • acc_f (Callable[..., Any], optional) – Accuracy Function, by default None

  • beta (float, optional) – The influence parameter, by default 1.0

  • dt (float, optional) – The node-update step size, by default 0.1

  • n_relax (int, optional) – Number of relaxation iterations, by default 20

  • tau (int, optional) – The update hyperparameter of nodes, by default 1

  • tol (int, optional) – The convergence tolerance, by default 0

  • mode (EPMode, optional) – The method of EP, by default EPMode.TWO_PHASE

  • activation (Callable, optional) – The activation function of the nodes, by default None

  • fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.

get_grads(model, param, batch_data, optim, opt_state, repeat=1)#

Compute gradients, loss, accuracy, and energy.

Parameters:
  • model (object) – The instance object of model.

  • param (dict) – The param of model.

  • batch_data (tuple) – The input and output data for a mini-batch.

  • optim – The optimizer instance.

  • opt_state – The optimizer state.

  • repeat (int) – The number of repetitions of the EP method.

Return type:

tuple

Returns:

tuple – The gradients, accuracy, loss, and energy.
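
A call-shape sketch following the signature and Returns description above; the loss and accuracy functions, model, parameters, optimizer instance, and optimizer state are hypothetical placeholders:

>>> ep = EP(loss_f=squared_error, acc_f=accuracy, beta=1.0, dt=0.1, n_relax=20)
>>> grads, acc, loss, energy = ep.get_grads(model, param, batch_data, optim, opt_state, repeat=1)  # unpacking order assumed from the Returns description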

normal_fun(dtype=None)#

Return a new array of given shape and type, filled with zeros.

LAX-backend implementation of numpy.zeros().

Original docstring below.

Parameters:
  • shape (int or tuple of ints) – Shape of the new array, e.g., (2, 3) or 2.

  • dtype (data-type, optional) – The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.

Return type:

Array

Returns:

out (ndarray) – Array of zeros with the given shape, dtype, and order.

predict(model, param, batch_data)#

Computes the loss and accuracy of the network for a mini-batch.

Parameters:
  • model (Callable) – The neural network model.

  • param (dict) – The param of model.

  • batch_data (tuple) – The input and output data for a mini-batch.

Return type:

tuple

Returns:

tuple – The loss and accuracy for the mini-batch.

class neurai.grads.autograd.GradeBase(loss_f, acc_f=None, fun_has_aux=False)#

Bases: object

The base class for automatic differentiation.

Parameters:
  • loss_f (Callable) – Loss function.

  • acc_f (Callable, optional) – Accuracy evaluation function. Default is None.

  • fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default is False.

class neurai.grads.autograd.LocalBase(loss_f, acc_f=None, fun_has_aux=False)#

Bases: GradeBase

The base class for automatic differentiation.

Parameters:
  • loss_f (Callable) – Loss function.

  • acc_f (Callable, optional) – Accuracy evaluation function. Default is None.

  • fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default is False.

predict(model, param, batch_data)#

Computes the loss and accuracy of the network for a mini-batch.

Parameters:
  • model (Callable) – The neural network model.

  • param (dict) – The parameters of the model.

  • batch_data (tuple) – The input and output data for a mini-batch.

Return type:

tuple

Returns:

tuple – The loss and accuracy for the mini-batch.

class neurai.grads.autograd.PC(loss_f, acc_f=None, eta=0.1, n=20, mode=PCMode.STRICT, fun_has_aux=False)#

Bases: LocalBase

Predictive Coding (PC) algorithm.

Parameters:
  • loss_f (Callable) – Loss function

  • acc_f (Callable, optional) – Accuracy function, by default None

  • eta (float, optional) – Fine-tuning parameter, by default 0.1

  • n (int, optional) – Number of iterations, by default 20

  • mode (PCMode, optional) – The PC mode, by default PCMode.STRICT.

  • fun_has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.

get_grads(model, param, batch_data)#

Compute the gradients of the model parameters.

Parameters:
  • model (object) – The instance object of the model.

  • param (dict) – Parameters of the model.

  • batch_data (tuple) – The input and output data for a mini-batch.

Return type:

tuple

Returns:

tuple – The gradients of the parameters, together with the loss and accuracy values.
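
A minimal sketch mirroring the signature above; the loss function, accuracy function, model, parameters, and mini-batch are hypothetical placeholders:

>>> pc = PC(loss_f=mse, acc_f=accuracy, eta=0.1, n=20, mode=PCMode.STRICT)
>>> grads, loss, acc = pc.get_grads(model, param, batch_data)  # unpacking order assumed from the docstring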

neurai.grads.autograd.grad(fun, argnums=0, has_aux=False, holomorphic=False, allow_int=False, reduce_axes=(), return_fun_value=False)#

Creates a function that evaluates the gradient of fun.

Parameters:
  • fun (Callable) – Function to be differentiated. Its arguments at positions specified by argnums should be arrays, scalars, or standard Python containers. Argument arrays in the positions specified by argnums must be of inexact (i.e., floating-point or complex) type. It should return a scalar (which includes arrays with shape () but not arrays with shape (1,) etc.)

  • argnums (Union[int, Sequence[int]], optional) – integer or sequence of integers. Specifies which positional argument(s) to differentiate with respect to, by default 0

  • has_aux (bool, optional) – Indicates whether fun returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data, by default False

  • holomorphic (bool, optional) – Indicates whether fun is promised to be holomorphic. If True, inputs and outputs must be complex, by default False

  • allow_int (bool, optional) – Whether to allow differentiating with respect to integer valued inputs. The gradient of an integer input will have a trivial vector-space dtype (float0), by default False

  • reduce_axes (Sequence[AxisName], optional) – tuple of axis names. If an axis is listed here, and fun implicitly broadcasts a value over that axis, the backward pass will perform a psum of the corresponding gradient. Otherwise, the gradient will be per-example over named axes. For example, if 'batch' is a named batch axis, grad(f, reduce_axes=('batch',)) will create a function that computes the total gradient while grad(f) will create one that computes the per-example gradient., by default ()

  • return_fun_value (bool, optional) – Indicates whether the value of fun is also returned, as the first element of the result. This parameter is provided so that users do not have to choose between jax.value_and_grad and jax.grad, by default False

Return type:

Callable

Returns:

typing.Callable – If return_fun_value is True (mirroring jax.value_and_grad): a function with the same arguments as fun that evaluates both fun and the gradient of fun and returns them as a pair (a two-element tuple). If argnums is an integer, the gradient has the same shape and type as the positional argument indicated by that integer; if argnums is a sequence of integers, the gradient is a tuple of values with the same shapes and types as the corresponding arguments; if has_aux is True, a tuple of ((value, auxiliary_data), gradient) is returned. If return_fun_value is False (mirroring jax.grad): a function with the same arguments as fun that evaluates only the gradient of fun, with the same shape rules for argnums; if has_aux is True, a pair of (gradient, auxiliary_data) is returned.
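
Since this function mirrors jax.grad and jax.value_and_grad, a minimal sketch of the documented semantics can be written directly against JAX; the return_fun_value=True behaviour described above corresponds to the jax.value_and_grad call shown last:

>>> import jax
>>> import jax.numpy as jnp
>>> def loss(w, x):                      # scalar-valued function of two arguments
...   return jnp.sum((w * x - 1.0) ** 2)
>>> w = jnp.array([0.5, 2.0])
>>> x = jnp.array([1.0, 3.0])
>>> gw = jax.grad(loss, argnums=0)(w, x)           # gradient w.r.t. the first argument
>>> gw, gx = jax.grad(loss, argnums=(0, 1))(w, x)  # tuple of gradients w.r.t. both arguments
>>> val, gw2 = jax.value_and_grad(loss)(w, x)      # (value, gradient) pair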

class neurai.grads.surrogate_grad.Gaussian(alpha=1.0)#

Bases: SurrogateFunctionBase

STBP h4: the Gaussian function below, the fourth surrogate-gradient function described in https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full, is used for backpropagation; its primitive is the Gaussian cumulative distribution function (CDF).

The surrogate-gradient function is defined as:

\[\mathrm{STBP_{h4}}(u) = \frac{1}{\sqrt{2 \pi a}} e^{- \frac{(u - V_{th})^2}{2 a}}\]

The primitive function :

\[\mathrm{STBP_{h4-primitive}(u)} = \frac{{\text{erfc}(-\alpha u)}}{2}\]

where erfc is the complementary error function (computed with JAX).

Parameters:

alpha – Controls the smoothness of the gradient during backpropagation.
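
To illustrate the technique (a sketch of the general surrogate-gradient idea, not neurai's implementation), a Heaviside spike function whose backward pass uses the Gaussian surrogate above can be written with jax.custom_vjp, assuming V_th = 0 for simplicity:

>>> import jax
>>> import jax.numpy as jnp
>>> alpha = 1.0  # smoothness parameter, matching the class default
>>> @jax.custom_vjp
... def spike(u):
...   return (u > 0.0).astype(u.dtype)   # forward pass: Heaviside step
>>> def spike_fwd(u):
...   return spike(u), u                 # save u for the backward pass
>>> def spike_bwd(u, g):
...   # backward pass: Gaussian surrogate (1 / sqrt(2*pi*alpha)) * exp(-u^2 / (2*alpha))
...   return (g * jnp.exp(-u ** 2 / (2.0 * alpha)) / jnp.sqrt(2.0 * jnp.pi * alpha),)
>>> spike.defvjp(spike_fwd, spike_bwd)
>>> u = jnp.array([-0.5, 0.1, 1.2])
>>> grads = jax.grad(lambda v: spike(v).sum())(u)  # gradients flow through the surrogate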

class neurai.grads.surrogate_grad.MultiGaussian(lens=0.5, scale=6.0, hight=0.15)#

Bases: SurrogateFunctionBase

The surrogate gradient is defined in the 2021 paper "Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks" (https://www.nature.com/articles/s42256-021-00397-w).

class neurai.grads.surrogate_grad.Polynomial(alpha=1.0, spiking=True)#

Bases: SurrogateFunctionBase

STBP h2: the polynomial function below, the second surrogate-gradient function described in https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full, is used for backpropagation:

\[\mathrm{STBP_{h2}}(u) = \left( \frac{\sqrt{a}}{2} - \frac{a}{4} \vert u - V_{th} \vert \right) \text{sign} \left( \frac {2}{\sqrt{a}} - \vert u - V_{th} \vert \right)\]

The primitive function :

\[\begin{split}\begin{align*} \text{mask0} &= (u > \frac{2}{\sqrt{\alpha}}) \\ \text{mask1} &= (|\,u\,| \leq \frac{2}{\sqrt{\alpha}}) \\ \mathrm{STBP_{h2-primitive}(u)} &= \text{mask0} + \text{mask1} \cdot \left(-\frac{\alpha}{8} \cdot u^2 \cdot \text{sign}(u) + \frac{\sqrt{\alpha}}{2} \cdot u + 0.5\right) \end{align*}\end{split}\]
Parameters:
  • alpha – Controls the smoothness of the gradient during backpropagation.

  • spiking – Whether to output spikes. If True, the Heaviside function is used during the forward pass and the surrogate-gradient function during backpropagation. If False, the primitive function corresponding to the surrogate gradient is used during the forward pass, and the surrogate-gradient function is still used during backpropagation.

class neurai.grads.surrogate_grad.Rectangular(alpha=1.0, spiking=True)#

Bases: SurrogateFunctionBase

STBP h1: the rectangular function below, the first surrogate-gradient function described in https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full, is used for backpropagation:

\[\mathrm{STBP_{h1}}(u - V_{th}) = \frac{1}{a} \text{sign} \left( \vert u - V_{th} \vert < \frac{a}{2} \right)\]

The primitive function :

\[\begin{split}\mathrm{STBP_{h1-primitive}}(u - V_{th}) = \begin{cases} \frac{1}{\alpha} (u - V_{th}) + \frac{\alpha}{2} & \text{if } |u - V_{th}| < \frac{\alpha}{2} \\ 0 & \text{if } \operatorname{sign}(u - V_{th}) = -1 \\ \frac{\alpha}{2} & \text{otherwise} \end{cases}\end{split}\]
Parameters:
  • alpha (float) – Determines the steepness of the surrogate-gradient curve, i.e. the width of the peak.

  • spiking – Whether to output spikes. If True, the Heaviside function is used during the forward pass and the surrogate-gradient function during backpropagation. If False, the primitive function corresponding to the surrogate gradient is used during the forward pass, and the surrogate-gradient function is still used during backpropagation.

class neurai.grads.surrogate_grad.Sigmoid(alpha=4.0, spiking=True)#

Bases: SurrogateFunctionBase

Sigmoid function implementation using SurrogateFunctionBase. STBP h3: the sigmoid-based function below, the third surrogate-gradient function described in https://www.frontiersin.org/articles/10.3389/fnins.2018.00331/full, is used for backpropagation:

\[\mathrm{STBP_{h3}}(u) = \frac{1}{a} \frac{e^{ \frac{V_{th} - u}{a}}}{\left(1+ e^{\frac{V_{th} - u}{a}} \right)^2}\]

The primitive function :

\[\mathrm{STBP_{h3-primitive}(u)} = \frac{1}{1 + e^{-\alpha u}}\]
Parameters:
  • alpha – Controls the smoothness of the gradient during backpropagation.

  • spiking – Whether to output spikes. If True, the Heaviside function is used during the forward pass and the surrogate-gradient function during backpropagation. If False, the primitive function corresponding to the surrogate gradient is used during the forward pass, and the surrogate-gradient function is still used during backpropagation.

Examples

>>> import numpy as np
>>> sigmoid = Sigmoid(alpha=4.0, spiking=True)
>>> x = np.array([1, 2, 3])
>>> y = sigmoid(x)

class neurai.grads.surrogate_grad.SingleExponential(grad_width=0.5, grad_scale=1.0)#

Bases: object

Surrogate gradient as defined in Shrestha and Orchard, 2018 (SLAYER).

Parameters:

class neurai.grads.surrogate_grad.SlayerPdf(scale=1, tau=10.0, spiking=True)#

Bases: SurrogateFunctionBase

The SLAYER surrogate uses the probability density function (PDF). The surrogate-gradient function is defined as:

\[\rho(t) = \alpha e^{-\beta |u(t) - \theta|}\]
Parameters:
  • scale (float) – spike function derivative scale factor. This is calculated as alpha divided by tau.

  • tau (float) – spike function derivative time constant.

  • spiking (bool) – A flag indicating whether to use the spiking or the primitive function.

class neurai.grads.surrogate_grad.SuperSpike(alpha=1.0)#

Bases: SurrogateFunctionBase

The SuperSpike surrogate gradient (Zenke and Ganguli, 2018) is used for backpropagation. The surrogate-gradient function is defined as:

\[\sigma(U_i) = \left(1 + \beta \, | U_i - \theta |\right)^{-2}\]
Parameters:

alpha – Controls the smoothness of the gradient during backpropagation.