# Larq Compute Engine Operators

Larq Compute Engine extends the built-in TensorFlow Lite operators with operators optimized for running binarized neural networks.

## Data Format

Larq Compute Engine adheres to the NHWC data format used in TensorFlow Lite for activation tensors. In the following, a tensor of format BitTensor<int32, n> denotes a TensorFlow Lite tensor with data type int32 containing binary values that are bitpacked along dimension n and, when the size of that dimension is not a multiple of 32, padded with zeros. Mathematically, a 0-valued bit represents a real value of 1.0, while a 1-valued bit is interpreted as -1.0.

Options for the operators are stored using FlexBuffers in the .tflite model files generated by the Converter and passed as opaque values to the custom operators.
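As an illustration, the packing scheme can be sketched in plain Python. The `bitpack_channels` helper below is hypothetical, not the LCE implementation, and the exact bit order (channel i in bit i mod 32 of word i // 32) is an assumption made for illustration:

```python
def bitpack_channels(values):
    """Pack a list of +1.0/-1.0 channel values into int32 words.

    Assumed bit order: bit i of word i // 32 is 1 when channel i is
    negative (-1.0) and 0 when it is non-negative (+1.0). Channels
    missing from the last word are zero-padded, which reads back as +1.0.
    """
    n_words = (len(values) + 31) // 32
    words = [0] * n_words
    for i, v in enumerate(values):
        if v < 0:
            words[i // 32] |= 1 << (i % 32)
    # reinterpret each unsigned 32-bit word as a signed int32,
    # matching the int32 storage type in the .tflite file
    return [w - (1 << 32) if w >= (1 << 31) else w for w in words]

# Three channels pack into a single word; only channel 1 is negative:
print(bitpack_channels([1.0, -1.0, 1.0]))  # [2]
# 32 negative channels fill a whole word (int32 value -1):
print(bitpack_channels([-1.0] * 32))       # [-1]
```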

## List of Operations

### LceBconv2d

2D binarized convolution layer.

The operation computes $$\hat{y}_{n,\mathrm{out}} = \sum_{i = 0}^{I - 1} w_{\mathrm{out}, i} \star \mathrm{bsign}(x_{n,i})$$ with $$n \in [0, N)$$ and $$\mathrm{out} \in [0, O)$$, where $$\star$$ is the 2D cross-correlation operator, $$N$$ is the batch size, $$I$$ and $$O$$ denote the number of input and output channels, and $$\mathrm{bsign}$$[^1] is the binary sign function.

The final output with type Tensor<float32|int8> is then calculated as $$y_{n,\mathrm{out}} = \beta_\mathrm{out} + \gamma_\mathrm{out} \, \sigma\left(\hat{y}_{n,\mathrm{out}}\right)\text{.}$$ If the output type is BitTensor<int32, 3>, the final transformation is instead simplified to $$y_{n,\mathrm{out}} = \begin{cases} -1.0 & \hat{y}_{n,\mathrm{out}} > \tau_\mathrm{out} \\ \hphantom{-}1.0 & \hat{y}_{n,\mathrm{out}} \leq \tau_\mathrm{out}\text{.} \end{cases}$$
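The two output paths above can be sketched as a toy single-position reference in plain Python. This is not the optimized kernel, and the helper names are hypothetical; it assumes the filter fully overlaps an already-extracted input patch, so one call produces one accumulator value:

```python
def bsign(v):
    """Binary sign function: +1.0 for v >= 0, -1.0 for v < 0."""
    return 1.0 if v >= 0 else -1.0

def bconv_accumulator(patch, filt):
    """One accumulator value: the binary filter fully overlaps the
    (already extracted) input patch, both flattened in HWI order."""
    return sum(w * bsign(x) for w, x in zip(filt, patch))

def bconv_output(acc, gamma, beta, sigma=lambda a: a):
    """Float/int8 output path: fused activation, then scale and bias."""
    return beta + gamma * sigma(acc)

def bconv_output_bitpacked(acc, tau):
    """Bitpacked output path: a single threshold comparison."""
    return -1.0 if acc > tau else 1.0

# A 2x2 single-channel patch against a +1/-1 filter of the same size:
acc = bconv_accumulator([0.3, -1.2, 0.5, -0.1], [1.0, 1.0, -1.0, 1.0])
print(acc)                             # -2.0
print(bconv_output(acc, 2.0, 0.5))     # -3.5
print(bconv_output_bitpacked(acc, 0))  # 1.0
```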

**Inputs**

- BitTensor<int32, 3>: 4D input tensor $$x$$
- BitTensor<int32, 3>: 4D bitpacked binary filter tensor $$w$$ in OHWI format
- Tensor<float32> | null: 1D post-activation multiplier $$\gamma$$. This operand is null if an output threshold is set.
- Tensor<float32> | null: 1D post-activation bias $$\beta$$. This operand is null if an output threshold is set.
- Tensor<int32> | null: 1D output threshold $$\tau$$. This operand defines the binary output threshold when experimental_enable_bitpacked_activations is enabled and the output is BitTensor<int32, 3>.

**Outputs**

- Tensor<float32|int8> | BitTensor<int32, 3>: Output tensor $$y$$, the result of the 2D convolution

**Options**

- channels_in int32: Number of input channels of the incoming activations. This is necessary since the input channels cannot be inferred from the shapes of the weights and activations if both are bitpacked.
- dilation_height_factor int32: Vertical dilation rate of the filter window
- dilation_width_factor int32: Horizontal dilation rate of the filter window
- fused_activation_function ActivationFunctionType: $$\sigma$$, one of NONE, RELU, RELU_N1_TO_1 or RELU6
- padding Padding: One of SAME or VALID
- stride_height int32: Vertical stride of the filter window
- stride_width int32: Horizontal stride of the filter window
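The need for the channels_in option can be seen with a quick calculation: many different true channel counts map to the same bitpacked channel dimension, so the packed shape alone is ambiguous.

```python
def packed_channels(channels: int) -> int:
    """Number of int32 words needed to bitpack `channels` binary channels."""
    return (channels + 31) // 32

# 33 through 64 true channels all bitpack into a channel dimension of 2,
# so the true count must travel separately as the channels_in option.
print([packed_channels(c) for c in (32, 33, 64, 65)])  # [1, 2, 2, 3]
```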

### LceBMaxPool2d

Max pooling operation on spatial input data.

**Inputs**

- BitTensor<int32, 3>: 4D input tensor

**Outputs**

- BitTensor<int32, 3>: A tensor where each entry is the maximum of the input values in the corresponding window.

**Options**

- padding Padding: One of SAME or VALID
- stride_width int32: Horizontal stride of the sliding window
- stride_height int32: Vertical stride of the sliding window
- filter_width int32: Horizontal size of the sliding window
- filter_height int32: Vertical size of the sliding window
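Given the bit convention from the Data Format section (a 0 bit encodes +1.0, a 1 bit encodes -1.0), the maximum over a window of binary values reduces to a bitwise AND of the bitpacked words: the result is -1.0 only when every value in the window is -1.0, i.e. only when the corresponding bit is set in every word. A minimal sketch of this equivalence (hypothetical helper, not the LCE kernel):

```python
def binary_maxpool_window(packed_words):
    """Max over one pooling window of bitpacked binary values.

    With 0 encoding +1.0 and 1 encoding -1.0, the maximum is -1.0 only
    when every value in the window is -1.0, i.e. only when every word
    has the bit set -- which is exactly a bitwise AND.
    """
    result = 0xFFFFFFFF
    for word in packed_words:
        result &= word & 0xFFFFFFFF  # mask handles negative int32 words
    return result

# Channel 0 (bit 0) is -1.0 in every entry; channel 1 (bit 1) is +1.0
# in the last entry, so only bit 0 survives the AND.
print(bin(binary_maxpool_window([0b11, 0b11, 0b01])))  # 0b1
```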

### LceQuantize

Binary quantize operation.

**Inputs**

- Tensor<float32|int8>: Input tensor

**Outputs**

- BitTensor<int32, -1>: Binarized tensor, bitpacked along the last dimension

### LceDequantize

Binary dequantize operation.

**Inputs**

- BitTensor<int32, -1>: Binarized input tensor, bitpacked along the last dimension

**Outputs**

- Tensor<float32|int8>: Output tensor
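Under the bit convention from the Data Format section, dequantization simply reads each bit back out as +1.0 or -1.0, discarding any zero padding beyond the true channel count. A minimal sketch (hypothetical helper; the bit order, channel i in bit i mod 32 of word i // 32, is an assumption made for illustration):

```python
def lce_dequantize(words, channels):
    """Unpack bitpacked int32 words into a list of -1.0/+1.0 floats,
    dropping the zero padding beyond the true channel count."""
    return [-1.0 if (words[i // 32] >> (i % 32)) & 1 else 1.0
            for i in range(channels)]

# The word 0b0110 decodes channels 1 and 2 as -1.0 and channels 0 and 3
# (a 0 bit each) as +1.0:
print(lce_dequantize([0b0110], channels=4))  # [1.0, -1.0, -1.0, 1.0]
```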