
Layers : Reduce

yaitaissa edited this page Aug 30, 2024 · 2 revisions

Reduce layers apply the same operation to the tensor elements along one or more axes.

How the layer works

Inputs

  • T tensor of size $(N,C_T,H_T,W_T)$

Attributes

  • (optional) axes (list[int]) : the axes along which the operation is applied. If not given, the default is to reduce over all the axes if noop_with_empty_axes is 0, and to not reduce at all otherwise.
  • noop_with_empty_axes (int) (default is 0) : defines the behavior when axes is empty: reduce over all axes if set to 0, do not reduce if set to 1.
  • keepdims (int) (default is 1) : if set to 1, the reduced dimensions are kept (they will be of size 1).
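The way these three attributes interact can be sketched with NumPy. This is only an illustration, not the layer's actual implementation: `reduce_sum` is a hypothetical helper, Sum is used as the example operation, and "all axes" is assumed to exclude the batch axis 0, in line with the $(N,1,1,1)$ output size given for the "Over all the dimensions" case on this page.

```python
import numpy as np

def reduce_sum(t, axes=None, keepdims=1, noop_with_empty_axes=0):
    """Hypothetical helper: sum-reduce `t` following the attribute rules."""
    if not axes:  # axes is missing or empty
        if noop_with_empty_axes:
            return t  # no reduction at all
        # Assumption: "all axes" leaves the batch axis 0 untouched.
        axes = range(1, t.ndim)
    return np.sum(t, axis=tuple(axes), keepdims=bool(keepdims))

t = np.ones((2, 3, 4, 5), dtype=np.float32)
print(reduce_sum(t, axes=[1, 2]).shape)              # (2, 1, 1, 5)
print(reduce_sum(t, axes=[1, 2], keepdims=0).shape)  # (2, 5)
print(reduce_sum(t).shape)                           # (2, 1, 1, 1)
print(reduce_sum(t, noop_with_empty_axes=1).shape)   # (2, 3, 4, 5)
```

With keepdims set to 0 the reduced axes disappear from the shape instead of being kept with size 1.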

Outputs

  • Y tensor of size $(N,C_Y,H_Y,W_Y)$

Computing the output values

Let $f()$ be the operation to apply to the tensor.

In this section, keepdims is assumed to be set to 1 to simplify the notation.

  • Isolate a set $S_i$ of elements to reduce
  • Compute the output value by applying $f()$ to $S_i$
  • Store that value in the output in place of the set $S_i$
  • Repeat for each set
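The steps above can be sketched as an explicit loop. This is a minimal reference version with keepdims = 1, written for readability rather than speed; `reduce_generic` is a hypothetical name and NumPy is only used to hold the tensors.

```python
import itertools
import numpy as np

def reduce_generic(t, axes, f):
    """Hypothetical reference loop: apply f to each set S_i (keepdims = 1)."""
    out_shape = [1 if a in axes else n for a, n in enumerate(t.shape)]
    y = np.empty(out_shape, dtype=t.dtype)
    # One iteration per output element, i.e. one per set S_i.
    for idx in itertools.product(*(range(n) for n in out_shape)):
        # Isolate S_i: full range on the reduced axes, fixed index elsewhere.
        sel = tuple(slice(None) if a in axes else i for a, i in enumerate(idx))
        s_i = t[sel].ravel()
        # Compute the output value and store it in place of the set.
        y[idx] = f(s_i)
    return y

t = np.arange(24, dtype=np.float64).reshape(1, 2, 3, 4)
y = reduce_generic(t, axes=(1, 2), f=sum)
print(y.shape)  # (1, 1, 1, 4)
```

Any function that maps a sequence of values to a single value can be passed as `f`.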

Over all the dimensions:

The output size is $(N,C_Y,H_Y,W_Y) = (N,1,1,1)$

Only one set: $S = \set{T(z,x,y), \forall (z,x,y) \in [[0,C_T[[ \times [[0,H_T[[ \times [[0,W_T[[}$

Over two dimensions:

Let's assume that the axes to reduce are the axes 1 and 2.

The output size is $(N,C_Y,H_Y,W_Y) = (N,1,1,W_T)$

There are $W_T$ sets: $\forall i \in [[0,W_T[[,\ S_i = \set{T(z,x,i), \forall (z,x) \in [[0,C_T[[ \times [[0,H_T[[}$

Over one dimension:

Let's assume that the axis to reduce is axis 1.

The output size is $(N,C_Y,H_Y,W_Y) = (N,1,H_T,W_T)$

There are $H_T \times W_T$ sets: $\forall (j, i) \in [[0,H_T[[ \times [[0,W_T[[,\ S_{j,i} = \set{T(z,j,i), \forall z \in [[0,C_T[[}$
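The output sizes of the three cases above can be checked with a short NumPy sketch. The tensor sizes are arbitrary illustration values, and Sum stands in for $f()$.

```python
import numpy as np

N, C, H, W = 2, 3, 4, 5  # arbitrary sizes chosen for the illustration
t = np.random.default_rng(0).random((N, C, H, W))

all_dims = t.sum(axis=(1, 2, 3), keepdims=True)  # over all the dimensions
two_dims = t.sum(axis=(1, 2), keepdims=True)     # over axes 1 and 2
one_dim = t.sum(axis=1, keepdims=True)           # over axis 1 only

print(all_dims.shape)  # (2, 1, 1, 1)
print(two_dims.shape)  # (2, 1, 1, 5)
print(one_dim.shape)   # (2, 1, 4, 5)
```

Each output element corresponds to exactly one set, so for every batch element there are 1, $W_T$ and $H_T \times W_T$ sets respectively.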

Reducing operation

This section describes the function $f()$ used to reduce the tensor.

Let $(x_1, ..., x_m)$ be a set of real numbers.

Sum

$f(x_1, ..., x_m) = \sum_{i=1}^m x_i$

Prod

$f(x_1, ..., x_m) = \prod_{i=1}^m x_i$

Mean

$f(x_1, ..., x_m) = \frac{1}{m} \sum_{i=1}^m x_i$

Max

$f(x_1, ..., x_m) = \max(x_1, ..., x_m)$

Min

$f(x_1, ..., x_m) = \min(x_1, ..., x_m)$
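As an illustration, the five operations map directly onto NumPy functions. The dictionary `REDUCE_OPS` is a hypothetical name introduced for this sketch, not part of the layer's API.

```python
import numpy as np

# Hypothetical lookup table: operation name -> NumPy function for f().
REDUCE_OPS = {
    "Sum": np.sum,
    "Prod": np.prod,
    "Mean": np.mean,
    "Max": np.max,
    "Min": np.min,
}

x = np.array([1.0, 2.0, 4.0])  # the set (x_1, ..., x_m) with m = 3
results = {name: float(op(x)) for name, op in REDUCE_OPS.items()}
print(results)  # Sum 7.0, Prod 8.0, Mean 7/3, Max 4.0, Min 1.0
```

Each of these functions also accepts `axis` and `keepdims` arguments, so they can reduce a full tensor along the chosen axes in one call.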
