Larq Compute Engine Operators

Larq Compute Engine extends the builtin TensorFlow Lite operators with optimized operators for running binarized neural networks.

Data Format

Larq Compute Engine adheres to the NHWC data format used in TensorFlow Lite for activation tensors. In the following, a tensor of format BitTensor<int32, n> denotes a TensorFlow Lite tensor with data type int32 containing binary values that are bitpacked along dimension n and potentially padded with zeros. Mathematically, a 0-valued bit represents a real value of 1.0, while a 1-valued bit is interpreted as -1.0.
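The bitpacking described above can be illustrated in a few lines of NumPy. The helper below is a sketch under the stated convention (bit 0 encodes +1.0, bit 1 encodes -1.0, zero padding to a multiple of 32); the exact bit and byte order used by the LCE kernels is an implementation detail, and the little-endian int32 view is an assumption about the host.

```python
import numpy as np

def bitpack_channels(x):
    """Bitpack a float tensor of +/-1.0 values along the last (channel) axis.

    Illustrative sketch, not the LCE implementation: each group of 32
    channel values becomes one int32 word, zero-padded at the end.
    Convention from the docs: a 0 bit encodes +1.0, a 1 bit encodes -1.0.
    """
    bits = (x < 0).astype(np.uint8)          # -1.0 -> 1, +1.0 -> 0
    pad = (-x.shape[-1]) % 32                # pad channels to a multiple of 32
    bits = np.pad(bits, [(0, 0)] * (x.ndim - 1) + [(0, pad)])
    packed_bytes = np.packbits(bits, axis=-1, bitorder="little")
    # Reinterpret groups of 4 bytes as int32 (assumes a little-endian host).
    return packed_bytes.view(np.int32)
```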

Options for the operators are stored using FlexBuffers in the .tflite model files generated by the Converter and passed as opaque values to the custom operators.

List of Operations

LceBconv2d

2D binarized convolution layer.

The operation will compute \[ \hat{y}_{n,\mathrm{out}} = \sum_{i = 0}^{I - 1} w_{\mathrm{out}, i} \star \mathrm{bsign}(x_{n,i}) \] with \(n \in [0, N)\) and \(\mathrm{out} \in [0, O)\), where \(\star\) is the 2D cross-correlation operator, \(N\) is the batch size, \(I\) and \(O\) denote the number of input and output channels, and \(\mathrm{bsign}\)[^1] is the binary sign function.

The final output with type Tensor<float32|int8> is then calculated as \[ y_{n,\mathrm{out}} = \beta_\mathrm{out} + \gamma_\mathrm{out} \, \sigma\left(\hat{y}_{n,\mathrm{out}}\right)\text{.} \] If the output type is BitTensor<int32, 3> the final transformation is simplified to \[ y_{n,\mathrm{out}} = \begin{cases} -1.0 & \hat{y}_{n,\mathrm{out}} > \tau_\mathrm{out} \\ \hphantom{-}1.0 & \hat{y}_{n,\mathrm{out}} \leq \tau_\mathrm{out}\text{.} \end{cases} \]
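The two output transformations above can be written directly in NumPy. This is a sketch of the formulas only, not the fused LCE kernel; the helper names are hypothetical, and `sigma` defaults to the identity (i.e. fused_activation_function = NONE).

```python
import numpy as np

def post_transform(y_hat, gamma, beta, sigma=lambda a: a):
    """Float/int8 output path: y = beta + gamma * sigma(y_hat)."""
    return beta + gamma * sigma(y_hat)

def post_threshold(y_hat, tau):
    """Bitpacked output path: -1.0 where y_hat > tau, else +1.0."""
    return np.where(y_hat > tau, -1.0, 1.0)
```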

Inputs

  • BitTensor<int32, 3>: 4D input tensor \(x\)
  • BitTensor<int32, 3>: 4D bitpacked binary filter tensor \(w\) in OHWI format
  • Tensor<float32> | null: 1D post activation multiplier \(\gamma\). This operand will be null if an output threshold is set.
  • Tensor<float32> | null: 1D post activation bias \(\beta\). This operand will be null if an output threshold is set.
  • Tensor<int32> | null: 1D output threshold \(\tau\). This operand defines the binary output threshold if experimental_enable_bitpacked_activations is enabled and the output is BitTensor<int32, 3>.

Outputs

  • Tensor<float32|int8> | BitTensor<int32, 3>: Result \(y\) of the 2D convolution of the input tensor

Options

  • channels_in int32: Number of input channels of the incoming activations. This is necessary since input channels cannot be inferred from the shape of weights and activations if both are bitpacked.
  • dilation_height_factor int32: Vertical dilation rate of the filter window
  • dilation_width_factor int32: Horizontal dilation rate of the filter window
  • fused_activation_function ActivationFunctionType: \(\sigma\), one of NONE, RELU, RELU_N1_TO_1 or RELU6
  • padding Padding: One of SAME or VALID
  • stride_height int32: Vertical stride of the filter window
  • stride_width int32: Horizontal stride of the filter window
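For intuition, the accumulator \(\hat{y}\) over bitpacked operands reduces to the well-known XOR-popcount trick: matching bits contribute +1 and differing bits contribute -1, so a binary dot product equals channels_in - 2 * popcount(a XOR b). The helper below is a hypothetical reference sketch, not LCE's optimized kernel, and assumes both operands carry identical padding bits.

```python
import numpy as np

def binary_dot(packed_a, packed_b, channels_in):
    """Dot product of two bitpacked vectors of +/-1 values.

    Sketch of the XOR-popcount trick used for binarized layers:
        dot = channels_in - 2 * popcount(a XOR b)
    Equal padding bits in both operands XOR to zero, so padding does
    not affect the popcount.
    """
    x = np.bitwise_xor(packed_a, packed_b)
    popcount = np.unpackbits(x.view(np.uint8)).sum()
    return channels_in - 2 * int(popcount)
```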

LceBMaxPool2d

Max pooling operation on spatial input data.

Inputs

  • BitTensor<int32, 3>: 4D input tensor

Outputs

  • BitTensor<int32, 3>: A tensor where each entry is the maximum of the input values in the corresponding window.

Options

  • padding Padding: One of SAME or VALID
  • stride_width int32: Horizontal stride of the sliding window
  • stride_height int32: Vertical stride of the sliding window
  • filter_width int32: Horizontal size of the sliding window
  • filter_height int32: Vertical size of the sliding window
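On bitpacked data this pooling reduces to a bitwise AND: since a 0 bit encodes +1.0, the maximum over a window is +1.0 unless every element in the window is -1.0. A minimal sketch for a single window (hypothetical helper, not the LCE kernel):

```python
import numpy as np

def binary_maxpool_window(packed_window):
    """Max of +/-1 values over one pooling window, on bitpacked words.

    With bit 0 encoding +1.0 and bit 1 encoding -1.0, the per-channel
    maximum is -1.0 only if the bit is set in every word of the window,
    which is exactly a bitwise AND reduction.
    """
    return np.bitwise_and.reduce(packed_window, axis=0)
```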

LceQuantize

Binary quantize operation.

Inputs

  • Tensor<float32|int8>: Input tensor

Outputs

  • BitTensor<int32, -1>: Binarized tensor, bitpacked along the last dimension

LceDequantize

Binary dequantize operation.

Inputs

  • BitTensor<int32, -1>: Binarized input tensor, bitpacked along the last dimension

Outputs

  • Tensor<float32|int8>: Output tensor
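Dequantization under the packing convention above can be sketched as follows; the helper name is hypothetical, `channels` drops the zero padding, and the little-endian byte order of the int32 view is an assumption about the host.

```python
import numpy as np

def bit_unpack(packed, channels):
    """Recover +/-1.0 floats from int32 bitpacked words along the last axis.

    Inverse of the packing convention in the docs (0 bit -> +1.0,
    1 bit -> -1.0); `channels` slices away the zero padding.
    """
    bits = np.unpackbits(packed.view(np.uint8), axis=-1, bitorder="little")
    bits = bits[..., :channels]
    return np.where(bits == 1, -1.0, 1.0)
```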