
Layers : Broadcast

yaitaissa edited this page Aug 30, 2024 · 1 revision

Broadcast layers perform an element-wise operation between tensors.

The layer supports multidirectional broadcasting of its inputs.

Two tensors $T_1$ and $T_2$, of sizes $(N,C_1,H_1,W_1)$ and $(N,C_2,H_2,W_2)$ respectively, are said to be broadcastable to the same shape if and only if, for each dimension, both sizes are equal or at least one of them is equal to 1.
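The broadcastability rule above can be sketched as a small check on shapes. This is a minimal illustration in plain Python, not the layer's actual implementation; the function name `are_broadcastable` is hypothetical, and the sketch assumes both shapes have the same number of dimensions.

```python
def are_broadcastable(shape_a, shape_b):
    """Return True if two same-rank shapes can be broadcast together.

    Applies the rule stated above: on every dimension the sizes must be
    equal, or at least one of them must be 1.
    """
    if len(shape_a) != len(shape_b):
        # Assumption of this sketch: both shapes have the same rank.
        raise ValueError("this sketch only handles shapes of equal rank")
    return all(a == b or a == 1 or b == 1 for a, b in zip(shape_a, shape_b))


print(are_broadcastable((1, 3, 8, 8), (1, 3, 1, 1)))  # True
print(are_broadcastable((1, 3, 8, 8), (1, 4, 8, 8)))  # False
```

In the second call the channel sizes 3 and 4 differ and neither is 1, so the shapes are not broadcastable.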

How the layer works

Inputs

  • T: list of tensors. The tensor $T_i=T[i]$ has size $(N,C_{T,i},H_{T,i},W_{T,i})$

Outputs

  • Y: tensor of size $(N,C_Y,H_Y,W_Y)$

where $(N,C_Y,H_Y,W_Y) = (N,\max_i C_{T,i},\max_i H_{T,i},\max_i W_{T,i})$
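The output shape rule can be illustrated with a short sketch: take the maximum size on each dimension, then confirm that every input either matches it or has size 1 there. The helper name `broadcast_shape` is hypothetical and the code is only an illustration under the assumption that all shapes are 4-tuples $(N,C,H,W)$.

```python
def broadcast_shape(shapes):
    """Compute the broadcast output shape for a list of equal-rank shapes."""
    # Per-dimension maximum, as in the formula above.
    out = tuple(max(dims) for dims in zip(*shapes))
    # Each input size must equal the output size or be 1.
    for shape in shapes:
        for size, out_size in zip(shape, out):
            if size != out_size and size != 1:
                raise ValueError(f"shape {shape} is not broadcastable to {out}")
    return out


print(broadcast_shape([(1, 3, 8, 8), (1, 3, 1, 1), (1, 1, 8, 1)]))  # (1, 3, 8, 8)
```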

Computing the output values

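Let $f()$ be the operation applied to the tensors.

Let $m$ denote the number of inputs.

  • For each index $(z_Y,x_Y, y_Y)$ of the output tensor:
  • Compute the output value $Y(z_Y,x_Y, y_Y) = f(T_1(z_1,x_1, y_1), ..., T_m(z_m,x_m, y_m))$, where on each dimension the index into $T_i$ equals the corresponding output index, or is the single valid index when $T_i$ has size 1 on that dimension.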
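The two steps above can be sketched in plain Python for a single batch element. This is an illustration only: tensors are represented as nested lists indexed `t[z][x][y]`, the names `shape_of`, `broadcast_apply` and `pick` are hypothetical, and the inputs are assumed to be broadcastable already (the sketch does not validate them).

```python
def shape_of(t):
    """Shape of a 3-level nested list indexed as t[z][x][y]."""
    return (len(t), len(t[0]), len(t[0][0]))


def broadcast_apply(f, tensors):
    """Apply f element-wise to nested-list tensors, broadcasting size-1 dims."""
    shapes = [shape_of(t) for t in tensors]
    out_shape = tuple(max(dims) for dims in zip(*shapes))

    def pick(index, size):
        # A dimension of size 1 is reused for every output index.
        return 0 if size == 1 else index

    out = []
    for z in range(out_shape[0]):
        plane = []
        for x in range(out_shape[1]):
            row = []
            for y in range(out_shape[2]):
                # Gather one value per input, then combine them with f.
                values = [
                    t[pick(z, s[0])][pick(x, s[1])][pick(y, s[2])]
                    for t, s in zip(tensors, shapes)
                ]
                row.append(f(values))
            plane.append(row)
        out.append(plane)
    return out


a = [[[1, 2], [3, 4]]]   # shape (1, 2, 2)
b = [[[10]], [[20]]]     # shape (2, 1, 1)
print(broadcast_apply(sum, [a, b]))
# [[[11, 12], [13, 14]], [[21, 22], [23, 24]]]
```

Here `f` receives the gathered values as a list, so any function of a sequence of reals (such as the built-in `sum`, `max` or `min`) can be passed in.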

Broadcast operation

This section describes the $f()$ function for each operation.

Let $(x_1, ..., x_m)$ be an ordered tuple of real numbers.

Add

$f(x_1, ..., x_m) = \sum_{i=1}^m x_i$

Multiply

$f(x_1, ..., x_m) = \prod_{i=1}^m x_i$

Subtract

$f(x_1, ..., x_m) = x_1 - (\sum_{i=2}^m x_i)$

Divide

$f(x_1, ..., x_m) = x_1 / (\prod_{i=2}^m x_i)$

Average

$f(x_1, ..., x_m) = \frac{1}{m} \sum_{i=1}^m x_i$

Maximum

$f(x_1, ..., x_m) = \max(x_1, ..., x_m)$

Minimum

$f(x_1, ..., x_m) = \min(x_1, ..., x_m)$
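As a rough illustration, the seven operations above can be written as plain Python functions of the ordered values $(x_1, ..., x_m)$. The dictionary name `BROADCAST_OPS` is hypothetical and this is a sketch of the formulas, not the layer's actual code. Note that Subtract and Divide depend on the order of the inputs, and that Divide raises `ZeroDivisionError` in this sketch if the product of $x_2, ..., x_m$ is zero.

```python
import math

# Each function takes the ordered values (x_1, ..., x_m) as a sequence.
BROADCAST_OPS = {
    "Add": lambda xs: sum(xs),
    "Multiply": lambda xs: math.prod(xs),
    "Subtract": lambda xs: xs[0] - sum(xs[1:]),
    "Divide": lambda xs: xs[0] / math.prod(xs[1:]),
    "Average": lambda xs: sum(xs) / len(xs),
    "Maximum": lambda xs: max(xs),
    "Minimum": lambda xs: min(xs),
}

values = [8.0, 2.0, 4.0]
for name, op in BROADCAST_OPS.items():
    print(name, op(values))
```

With these three values, Add gives 14.0, Multiply 64.0, Subtract 2.0, Divide 1.0, Maximum 8.0 and Minimum 2.0.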
