# Larq Compute Engine Operators

Larq Compute Engine extends the built-in TensorFlow Lite operators with optimized operators for running binarized neural networks.
## Data Format
Larq Compute Engine adheres to the `NHWC` data format used in TensorFlow Lite for activation tensors. In the following, a tensor of format `BitTensor<int32, n>` represents a TensorFlow Lite tensor with data type `int32` containing binary values which are bitpacked across the channel dimension `n` and potentially padded with zeros. Mathematically, a `0`-valued bit represents a real value of `1.0`, while `1` is interpreted as `-1.0`.
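This convention can be illustrated with a small numpy sketch (an illustrative helper, not LCE's actual packing code): map `+1.0` to a `0` bit and `-1.0` to a `1` bit, pad the channel dimension with zero bits (which read back as `+1.0`), and pack groups of 32 bits into `int32` words.

```python
import numpy as np

def bitpack_channels(x):
    """Pack the last (channel) dimension of a tensor of +/-1.0 values
    into int32 words: a 0 bit encodes +1.0, a 1 bit encodes -1.0.
    (Illustrative sketch only, not LCE's actual implementation.)"""
    c = x.shape[-1]
    padded = 32 * ((c + 31) // 32)
    # Pad the channel dimension with +1.0 values, i.e. with zero bits.
    pad = np.ones(x.shape[:-1] + (padded - c,), dtype=x.dtype)
    x = np.concatenate([x, pad], axis=-1)
    bits = (x < 0).astype(np.int64)                  # -1.0 -> 1, +1.0 -> 0
    bits = bits.reshape(x.shape[:-1] + (-1, 32))     # groups of 32 bits
    weights = 1 << np.arange(32, dtype=np.int64)     # bit i -> 2**i
    words = (bits * weights).sum(axis=-1)
    return words.astype(np.uint32).astype(np.int32)  # wrap into int32 words
```

For example, packing the four channel values `[1.0, -1.0, -1.0, 1.0]` sets bits 1 and 2 of a single word, yielding the value 6.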
Options for the operators are stored using FlexBuffers in the `.tflite` model files generated by the Converter and passed as opaque values to the custom operators.
## List of Operations
### LceBconv2d
2D binarized convolution layer.
The operation will compute \[ \hat{y}_{n,\mathrm{out}} = \sum_{i = 0}^{I - 1} w_{\mathrm{out}, i} \star \mathrm{bsign}(x_{n,i}) \] with \(n \in [0, N)\) and \(\mathrm{out} \in [0, O)\), where \(\star\) is the 2D cross-correlation operator, \(N\) is the batch size, \(I\) and \(O\) denote the number of input and output channels, and \(\mathrm{bsign}\)[^1] is the binary sign function.
The final output with type `Tensor<float32|int8>` is then calculated as \[ y_{n,\mathrm{out}} = \beta_\mathrm{out} + \gamma_\mathrm{out} \, \sigma\left(\hat{y}_{n,\mathrm{out}}\right)\text{.} \] If the output type is `BitTensor<int32, 3>`, the final transformation is simplified to \[ y_{n,\mathrm{out}} = \begin{cases} -1.0 & \hat{y}_{n,\mathrm{out}} > \tau_\mathrm{out} \\ \hphantom{-}1.0 & \hat{y}_{n,\mathrm{out}} \leq \tau_\mathrm{out}\text{.} \end{cases} \]
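As a sanity check of these equations, here is a plain-numpy reference sketch (hypothetical helper name; restricted to stride 1, `VALID` padding, and float output — not LCE's optimized kernel):

```python
import numpy as np

def bconv2d_reference(x, w, gamma, beta, sigma=lambda a: a):
    """Direct transcription of the equations above (illustrative only).
    x: NHWC float input; w: OHWI filter of +/-1.0 values;
    gamma/beta: per-output-channel vectors; sigma: fused activation.
    Stride 1, VALID padding, float output."""
    n, h, width, _ = x.shape
    o, kh, kw, _ = w.shape
    xb = np.where(x < 0, -1.0, 1.0)   # bsign(x); bsign(0) = +1.0
    oh, ow = h - kh + 1, width - kw + 1
    y = np.empty((n, oh, ow, o))
    for b in range(n):
        for i in range(oh):
            for j in range(ow):
                patch = xb[b, i:i + kh, j:j + kw, :]   # (kh, kw, I)
                acc = (w * patch).sum(axis=(1, 2, 3))  # accumulator, (O,)
                y[b, i, j] = beta + gamma * sigma(acc)
    return y
```

For instance, a 2×2 all-ones filter over an all-ones 3×3 input accumulates to 4 per position, so with \(\gamma = 2\) and \(\beta = 1\) every output entry is \(1 + 2 \cdot 4 = 9\).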
Inputs

- `BitTensor<int32, 3>`: 4D input tensor \(x\)
- `BitTensor<int32, 3>`: 4D bitpacked binary filter tensor \(w\) in `OHWI` format
- `Tensor<float32> | null`: 1D post activation multiplier \(\gamma\). This operand will be `null` if an output threshold is set.
- `Tensor<float32> | null`: 1D post activation bias \(\beta\). This operand will be `null` if an output threshold is set.
- `Tensor<int32> | null`: 1D output threshold \(\tau\). This operand defines the binary output threshold if `experimental_enable_bitpacked_activations` is enabled and the output is `BitTensor<int32, 3>`.
Outputs

- `Tensor<float32|int8> | BitTensor<int32, 3>`: Result \(y\) of the 2D convolution of the input tensor
Options

- `channels_in` `int32`: Number of input channels of the incoming activations. This is necessary since the number of input channels cannot be inferred from the shapes of the weights and activations if both are bitpacked.
- `dilation_height_factor` `int32`: Vertical dilation rate of the filter window
- `dilation_width_factor` `int32`: Horizontal dilation rate of the filter window
- `fused_activation_function` `ActivationFunctionType`: \(\sigma\), one of `NONE`, `RELU`, `RELU_N1_TO_1` or `RELU6`
- `padding` `Padding`: One of `SAME` or `VALID`
- `stride_height` `int32`: Vertical stride of the filter window
- `stride_width` `int32`: Horizontal stride of the filter window
### LceBMaxPool2d
Max pooling operation on spatial input data.
Inputs

- `BitTensor<int32, 3>`: 4D input tensor

Outputs

- `BitTensor<int32, 3>`: A tensor where each entry is the maximum of the input values in the corresponding window.
Options

- `padding` `Padding`: One of `SAME` or `VALID`
- `stride_width` `int32`: Horizontal stride of the sliding window
- `stride_height` `int32`: Vertical stride of the sliding window
- `filter_width` `int32`: Horizontal size of the sliding window
- `filter_height` `int32`: Vertical size of the sliding window
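A consequence of the bitpacking convention is worth noting here: since a `0` bit encodes `+1.0` and \(\max(+1, -1) = +1\), the maximum over a window of bitpacked values is simply the bitwise AND of the packed words. A minimal numpy sketch (hypothetical helper, `VALID` padding only — not LCE's actual kernel):

```python
import numpy as np

def bmaxpool2d(x, filter_h, filter_w, stride_h, stride_w):
    """Max pooling on bitpacked int32 data (VALID padding). Because a
    0 bit encodes +1.0 and a 1 bit encodes -1.0, the maximum over a
    window is the bitwise AND of its packed words: the result bit is
    0 (i.e. +1.0) whenever any input bit in the window is 0."""
    n, h, w, c = x.shape
    oh = (h - filter_h) // stride_h + 1
    ow = (w - filter_w) // stride_w + 1
    y = np.empty((n, oh, ow, c), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            win = x[:, i * stride_h:i * stride_h + filter_h,
                       j * stride_w:j * stride_w + filter_w, :]
            # AND-reduce over the spatial window dimensions.
            y[:, i, j, :] = np.bitwise_and.reduce(win, axis=(1, 2))
    return y
```

This is why the binary max pool can run directly on bitpacked data without unpacking.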
### LceQuantize
Binary quantize operation.
Inputs

- `Tensor<float32|int8>`: Input tensor

Outputs

- `BitTensor<int32, -1>`: Binarized tensor, bitpacked along the last dimension
### LceDequantize
Binary dequantize operation.
Inputs

- `BitTensor<int32, -1>`: Binarized input tensor, bitpacked along the last dimension

Outputs

- `Tensor<float32|int8>`: Output tensor
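The unpacking direction can be sketched the same way (hypothetical helper, not LCE's implementation; the `channels` argument is needed because the zero padding bits must be dropped to recover the original channel dimension):

```python
import numpy as np

def bitunpack_channels(packed, channels):
    """Inverse of the bitpacking convention: expand each int32 word
    along the last dimension into float values, mapping a 0 bit to
    +1.0 and a 1 bit to -1.0, then drop the padding bits.
    (Illustrative sketch only.)"""
    words = packed.astype(np.uint32)[..., None]      # (..., n, 1)
    shifts = np.arange(32, dtype=np.uint32)
    bits = (words >> shifts) & 1                     # (..., n, 32)
    values = np.where(bits == 1, -1.0, 1.0)
    # Flatten the word groups back into one channel axis and trim padding.
    return values.reshape(packed.shape[:-1] + (-1,))[..., :channels]
```

Unpacking the word 6 with `channels=4` recovers `[1.0, -1.0, -1.0, 1.0]`, the inverse of the packing example in the Data Format section's convention.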