`larq.quantizers`

A quantizer defines how a full-precision input is transformed into a quantized output, as well as the pseudo-gradient method used for the backward pass.

Quantizers can either be used through the quantizer arguments supported by Larq layers, such as `input_quantizer` and `kernel_quantizer`; or they can be used like activations, i.e. either through an `Activation` layer, or through the `activation` argument supported by all forward layers:

```
import tensorflow as tf
import larq as lq
...
x = lq.layers.QuantDense(64, activation=None)(x)
x = lq.layers.QuantDense(64, input_quantizer="ste_sign")(x)
```

is equivalent to:

```
x = lq.layers.QuantDense(64)(x)
x = tf.keras.layers.Activation("ste_sign")(x)
x = lq.layers.QuantDense(64)(x)
```

as well as:

```
x = lq.layers.QuantDense(64, activation="ste_sign")(x)
x = lq.layers.QuantDense(64)(x)
```

We highly recommend the first of these formulations: with the other two, intermediate layers (such as batch normalization or average pooling) and shortcut connections may result in non-binary input to the convolutions.

Quantizers can either be referenced by string or called directly. The following usages are equivalent:

```
lq.layers.QuantDense(64, kernel_quantizer="ste_sign")
```

```
lq.layers.QuantDense(64, kernel_quantizer=lq.quantizers.SteSign(clip_value=1.0))
```

### Quantizer

```
larq.quantizers.Quantizer(
    trainable=True, name=None, dtype=None, dynamic=False, **kwargs
)
```

Common base class for defining quantizers.

**Attributes**

**precision**: An integer defining the precision of the output. This value is used by `lq.models.summary()` for improved logging.

### NoOp

```
larq.quantizers.NoOp(precision, **kwargs)
```

Instantiates a serializable no-op quantizer.

\[ q(x) = x \]

Warning

This quantizer will not change the input variable. It is only intended to mark variables with a desired precision that will be recognized by optimizers like `Bop`, and to add training metrics to track variable changes.

Example

```
layer = lq.layers.QuantDense(
    16, kernel_quantizer=lq.quantizers.NoOp(precision=1),
)
layer.build((32,))
assert layer.kernel.precision == 1
```

**Arguments**

**precision** `int`: Set the desired precision of the variable. This can be used to tag variables with a precision to be recognized by optimizers like `Bop` or logged by `lq.models.summary()`.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.

### SteSign

```
larq.quantizers.SteSign(clip_value=1.0, **kwargs)
```

Instantiates a serializable binary quantizer.

\[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]

The gradient is estimated using the Straight-Through Estimator (essentially the binarization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq \texttt{clip_value} \\ 0 & \left|x\right| > \texttt{clip_value} \end{cases}\]
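The forward pass and its straight-through pseudo-gradient can be sketched in plain Python. This is an illustrative scalar re-implementation, not the larq code, which operates on TensorFlow tensors:

```python
def ste_sign(x):
    # Forward: binarize to -1 or +1 (q(0) = +1 by convention).
    return -1.0 if x < 0 else 1.0

def ste_sign_grad(x, clip_value=1.0):
    # Backward: clipped identity -- pass the gradient through unchanged
    # inside [-clip_value, clip_value], zero it outside.
    if clip_value is None:
        return 1.0
    return 1.0 if abs(x) <= clip_value else 0.0
```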

**Arguments**

**clip_value** `float`: Threshold for clipping gradients. If `None`, gradients are not clipped.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.


### ApproxSign

```
larq.quantizers.ApproxSign(*args, metrics=None, **kwargs)
```

Instantiates a serializable binary quantizer. \[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]

The gradient is estimated using the ApproxSign method. \[\frac{\partial q(x)}{\partial x} = \begin{cases} (2 - 2 \left|x\right|) & \left|x\right| \leq 1 \\ 0 & \left|x\right| > 1 \end{cases} \]
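As a scalar sketch (illustrative only, not the larq implementation), the ApproxSign pseudo-gradient is:

```python
def approx_sign_grad(x):
    # Piecewise-linear surrogate: 2 - 2|x| inside [-1, 1], zero outside.
    return 2.0 - 2.0 * abs(x) if abs(x) <= 1.0 else 0.0
```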

**Arguments**

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.


### SteHeaviside

```
larq.quantizers.SteHeaviside(clip_value=1.0, **kwargs)
```

Instantiates a binarization quantizer with output values 0 and 1. \[ q(x) = \begin{cases} +1 & x > 0 \\ 0 & x \leq 0 \end{cases} \]

The gradient is estimated using the Straight-Through Estimator (essentially the binarization is replaced by a clipped identity on the backward pass).

\[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq 1 \\ 0 & \left|x\right| > 1 \end{cases}\]
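A scalar sketch of the forward pass (illustrative only, not the larq implementation):

```python
def ste_heaviside(x):
    # Forward: 1 for positive inputs, 0 otherwise. Note q(0) = 0 here,
    # unlike SteSign, where q(0) = +1.
    return 1.0 if x > 0 else 0.0
```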

**Arguments**

**clip_value** `float`: Threshold for clipping gradients. If `None`, gradients are not clipped.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.

**Returns**

AND Binarization function

### SwishSign

```
larq.quantizers.SwishSign(beta=5.0, **kwargs)
```

Sign binarization function.

\[ q(x) = \begin{cases} -1 & x < 0 \\ 1 & x \geq 0 \end{cases} \]

The gradient is estimated using the SignSwish method.

\[ \frac{\partial q_{\beta}(x)}{\partial x} = \frac{\beta\left\{2-\beta x \tanh \left(\frac{\beta x}{2}\right)\right\}}{1+\cosh (\beta x)} \]
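The surrogate gradient can be checked numerically with a scalar sketch (illustrative, not the larq implementation); note that at x = 0 it evaluates to exactly beta:

```python
import math

def swish_sign_grad(x, beta=5.0):
    # SignSwish surrogate gradient from the formula above.
    return beta * (2.0 - beta * x * math.tanh(beta * x / 2.0)) / (1.0 + math.cosh(beta * x))
```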

**Arguments**

**beta** `float`: Larger values result in a closer approximation to the derivative of the sign.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.

**Returns**

SwishSign quantization function


### MagnitudeAwareSign

```
larq.quantizers.MagnitudeAwareSign(clip_value=1.0, **kwargs)
```

Instantiates a serializable magnitude-aware sign quantizer for Bi-Real Net.

A scaled sign function computed according to Section 3.3 in Zechun Liu et al.
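The scale factor is not spelled out above; as a sketch, assuming the Bi-Real Net scale is the mean absolute value of the weights (in the actual implementation it is computed per output channel), magnitude-aware binarization looks like:

```python
def magnitude_aware_sign(weights):
    # Scale the sign of each weight by the mean absolute value of the
    # weight group -- a flat list here, for illustration only.
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale * (-1.0 if w < 0 else 1.0) for w in weights]
```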

**Arguments**

**clip_value** `float`: Threshold for clipping gradients. If `None`, gradients are not clipped.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.


### SteTern

```
larq.quantizers.SteTern(
    threshold_value=0.05, ternary_weight_networks=False, clip_value=1.0, **kwargs
)
```

Instantiates a serializable ternarization quantizer.

\[ q(x) = \begin{cases} +1 & x > \Delta \\ 0 & |x| < \Delta \\ -1 & x < - \Delta \end{cases} \]

where \(\Delta\) is defined as the threshold and can be passed as an argument, or can be calculated as per the Ternary Weight Networks original paper, such that

\[ \Delta = \frac{0.7}{n} \sum_{i=1}^{n} |W_i| \] where we assume that \(W_i\) is generated from a normal distribution.
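Both the threshold calculation and the forward ternarization can be sketched in plain Python (illustrative, not the larq implementation):

```python
def twn_threshold(weights):
    # Ternary Weight Networks threshold: 0.7 times the mean absolute weight.
    return 0.7 * sum(abs(w) for w in weights) / len(weights)

def ste_tern(x, threshold):
    # Forward ternarization with threshold Delta.
    if x > threshold:
        return 1.0
    if x < -threshold:
        return -1.0
    return 0.0
```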

The gradient is estimated using the Straight-Through Estimator (essentially the ternarization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & \left|x\right| \leq \texttt{clip_value} \\ 0 & \left|x\right| > \texttt{clip_value} \end{cases}\]

**Arguments**

**threshold_value** `float`: The value for the threshold, \(\Delta\).

**ternary_weight_networks** `bool`: Whether to use the Ternary Weight Networks threshold calculation.

**clip_value** `float`: Threshold for clipping gradients. If `None`, gradients are not clipped.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.


### DoReFa

```
larq.quantizers.DoReFa(k_bit=2, mode="activations", **kwargs)
```

Instantiates a serializable k-bit quantizer as described in the DoReFa paper.

\[ q(x) = \begin{cases} 0 & x < \frac{1}{2n} \\ \frac{i}{n} & \frac{2i-1}{2n} < x < \frac{2i+1}{2n} \text{ for } i \in \{1, \ldots, n-1\}\\ 1 & \frac{2n-1}{2n} < x \end{cases} \]

where \(n = 2^{\text{k_bit}} - 1\). The number of bits, k_bit, needs to be passed as an argument. The gradient is estimated using the Straight-Through Estimator (essentially the binarization is replaced by a clipped identity on the backward pass). \[\frac{\partial q(x)}{\partial x} = \begin{cases} 1 & 0 \leq x \leq 1 \\ 0 & \text{else} \end{cases}\]
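A scalar sketch of the activation path (illustrative, not the larq implementation): clip to [0, 1] and round to the nearest of the n = 2^k_bit − 1 levels:

```python
def quantize_k(x, k_bit=2):
    # Uniform quantization of x to n = 2**k_bit - 1 levels,
    # after clipping to [0, 1].
    n = 2 ** k_bit - 1
    x = min(max(x, 0.0), 1.0)
    return round(n * x) / n
```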

The behavior for quantizing weights differs from the quantization of activations: instead of limiting the input operands (in this case, the weights) with a hard limiter, a hyperbolic tangent is applied to achieve a softer limiting whose gradient is itself continuously differentiable.

\[ w_{lim}(w) = \tanh(w) \]

Furthermore, the weights of each layer are normalized such that the weight with the largest magnitude is mapped to the largest or smallest (depending on its sign) quantizable value. That way, the full quantizable numeric range is utilized.

\[ w_{norm}(w) = \frac{w}{\max(|w|)} \]

These formulas can be found in Section 2.3 of the paper. Note that the paper quantizes weights on the numeric range [-1, 1], while activations are quantized on the numeric range [0, 1]. This implementation uses the same ranges as the paper.

The activation quantizer implements the function \(\text{quantize}_k\) from the paper with the correct numeric range of [0, 1]. The weight quantization mode adds pre- and post-processing for numeric range adaptation, soft limiting, and normalization. The full quantization function, including the adaptation of numeric ranges, is

\[ q(w) = 2 \, \text{quantize}_{k}\!\left(\frac{w_{norm}\left(w_{lim}\left(w\right)\right)}{2} + \frac{1}{2}\right) - 1 \]
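Putting the weight path together as a scalar sketch (illustrative, not the larq implementation): soft-limit with tanh, normalize by the largest magnitude, shift to [0, 1], quantize, and shift back to [-1, 1]:

```python
import math

def dorefa_weights(weights, k_bit=2):
    def quantize_k(x):
        # Uniform quantization on [0, 1] to n = 2**k_bit - 1 levels.
        n = 2 ** k_bit - 1
        return round(n * x) / n

    limited = [math.tanh(w) for w in weights]        # soft limiting
    max_abs = max(abs(w) for w in limited)
    normed = [w / max_abs for w in limited]          # normalize to [-1, 1]
    # Shift to [0, 1], quantize, shift back to [-1, 1].
    return [2.0 * quantize_k(w / 2.0 + 0.5) - 1.0 for w in normed]
```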

Warning

The weight mode works for weights on the range [-1, 1], which matches the default setting of `constraints.weight_clip`. Do not use this quantizer with a constraint `clip_value` different from the default.

`mode == "activations"`

`mode == "weights"`

**Arguments**

**k_bit** `int`: Number of bits for the quantization.

**mode** `str`: `"activations"` for clipping inputs to the [0, 1] range, or `"weights"` for soft-clipping and normalizing weights to the [-1, 1] range before applying quantization.

**metrics**: An array of metrics to add to the layer. If `None`, the metrics set in `larq.context.metrics_scope` are used. Currently only the `flip_ratio` metric is available.

**Returns**

Quantization function

**Raises**

`ValueError` for a bad value of `mode`.
