Layers : Resize
Resize layers change the size of the input tensor by interpolating the input's elements to compute the output's elements.
- T: tensor of size $(N,C_T,H_T,W_T)$
- List indicating the output size. Either:
  - scales (list[float]): list of real factors by which to multiply T's dimensions
  - sizes (list[int]): list of integers representing the output tensor's dimensions
- (optional) roi (list[float]): 1D tensor given as [start1, ..., startN, end1, ..., endN] (used in one case specified later). These values are normalized in tensor T's coordinates.
- antialias (int) (default is 0): if set to 1, use an antialiasing filter when downsampling with linear or cubic mode (not implemented yet)
- axes (list[int]): specifies a subset of axes on which to apply the layer. By default, all axes are assumed.
- coordinate_transformation_mode (string): specifies which coordinate transformation function to use (see Coordinate transformation modes section)
- cubic_coeff_a (float) (default is -0.75): constant used in cubic mode (see Cubic sub-section)
- exclude_outside (int) (default is 0): if set to 1, the points outside the tensor have a weight of 0, and the other weights are renormalized so that they sum to 1 (not implemented yet)
- extrapolation_value (float) (default is 0.0): if the tf_crop_and_resize coordinate transformation mode is used and $x_{original}$ falls outside the input tensor (negative, or greater than the maximal index), this constant is used instead.
- keep_aspect_ratio_policy (string) (default is stretch): used when the input is sizes; this attribute describes whether to keep the aspect ratio of the input tensor (not implemented yet)
- mode (string) (default is nearest): interpolation mode to use (see Interpolation functions section)
- nearest_mode (string) (default is round_prefer_floor): rounding mode to use in nearest mode (see Nearest sub-section)
- Y: tensor of size $(N,C_Y,H_Y,W_Y)$ (output)
If scales is an input,
Otherwise, sizes must be in the inputs, then
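The two cases above can be sketched as follows. This is a minimal illustration, not the layer's actual code: the helper name `resized_shape` is hypothetical, and rounding the scaled dimension down to an integer is an assumption (the document only states that dimensions are multiplied by the factors).

```python
import math


def resized_shape(input_shape, scales=None, sizes=None):
    """Return the output shape from either `scales` or `sizes` (exactly one is required)."""
    if (scales is None) == (sizes is None):
        raise ValueError("provide exactly one of scales or sizes")
    if sizes is not None:
        if len(sizes) != len(input_shape):
            raise ValueError("sizes must have one entry per dimension")
        # With sizes, the output dimensions are given directly.
        return tuple(int(s) for s in sizes)
    if len(scales) != len(input_shape):
        raise ValueError("scales must have one entry per dimension")
    # Assumption: each dimension is multiplied by its factor, then rounded down.
    return tuple(math.floor(d * s) for d, s in zip(input_shape, scales))
```

For instance, `resized_shape((1, 3, 4, 4), scales=[1, 1, 2, 2])` gives `(1, 3, 8, 8)` under these assumptions.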
The output is computed element-wise. For each pair of indices $(x_{resized},y_{resized})$ of the output tensor:
- Calculate the coordinates in the input tensor: $(x_{original},y_{original}) = (t(x_{resized}),t(y_{resized}))$
- Calculate the value using the interpolation function g: $Y(x_{resized},y_{resized}) = {\bf g} (x_{original},y_{original})$
With ${\bf t}$ the coordinate transformation function and ${\bf g}$ the interpolation function.
The layer operates on the last two dimensions of the input tensor. For the rest of this documentation, the first two dimensions are ignored and the 4D tensors are treated as 2D tensors.
Coordinate transformation modes
The coordinate transformation function ${\bf t}$ is described along $x$, with the following notation:
- $x_{resized}$: the coordinate along $x$ in the output tensor
- $x_{original}$: the coordinate along $x$ in the input tensor
- $lengthOriginal$: the size of the input tensor along $x$: $H_T$
- $lengthResized$: the size of the output tensor along $x$: $H_Y$
- $scale = lengthResized / lengthOriginal$
- $outputWidth$: the target size along $x$ (can be real if calculated with $scale$): $H_T \cdot scales[2]$
- $outputWidthInt$: the effective integer size along $x$
The coordinate transformation functions are the following:
- half_pixel:
- half_pixel_symmetric:
Let be:
Then:
- pytorch_half_pixel:
- align_corners:
- asymmetric:
- tf_crop_and_resize (input roi used in this case):
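As an illustration of the modes listed above, here is a sketch of the transformation along one axis. It assumes the layer follows the ONNX Resize operator definitions, whose attribute and mode names it shares; the function name `to_original` and its parameters `roi_start` and `roi_end` are hypothetical.

```python
def to_original(x_resized, length_original, length_resized,
                mode="half_pixel", scale=None, roi_start=0.0, roi_end=1.0):
    """Map an output coordinate to an input coordinate along one axis."""
    if scale is None:
        scale = length_resized / length_original
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "half_pixel_symmetric":
        output_width = scale * length_original      # outputWidth, may be real
        adjustment = length_resized / output_width  # outputWidthInt / outputWidth
        center = length_original / 2
        offset = center * (1 - adjustment)
        return offset + (x_resized + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "align_corners":
        # Assumption: a single output element maps to coordinate 0.
        if length_resized == 1:
            return 0.0
        return x_resized * (length_original - 1) / (length_resized - 1)
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "tf_crop_and_resize":
        # roi_start and roi_end are normalized in the input tensor's coordinates.
        if length_resized > 1:
            return (roi_start * (length_original - 1)
                    + x_resized * (roi_end - roi_start) * (length_original - 1)
                    / (length_resized - 1))
        return 0.5 * (roi_start + roi_end) * (length_original - 1)
    raise ValueError(f"unknown coordinate transformation mode: {mode}")
```

With an integer scale, `half_pixel_symmetric` reduces to `half_pixel`, since `outputWidth` and `outputWidthInt` are then equal.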
Example:
Let be T of size Y of size scales roi
- half_pixel:
- half_pixel_symmetric:
- pytorch_half_pixel:
- align_corners:
- asymmetric:
- tf_crop_and_resize (input roi used in this case):
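Interpolation functions
Nearest
This interpolation mode simply rounds the coordinates $(x_{original}, y_{original})$ and reads the value of T at those rounded coordinates.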
Function g:
Let be
Rounding functions:
- floor:
- ceil:
- round_prefer_floor:
- round_prefer_ceil:
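The four rounding modes can be sketched as below. The tie-breaking behaviour is inferred from the mode names (ties at .5 go down for round_prefer_floor and up for round_prefer_ceil); the helper names and the clamping of the rounded index to the valid range are assumptions made so the example runs on its own.

```python
import math


def round_coordinate(x, nearest_mode="round_prefer_floor"):
    """Round a real coordinate to an integer index according to nearest_mode."""
    if nearest_mode == "floor":
        return math.floor(x)
    if nearest_mode == "ceil":
        return math.ceil(x)
    if nearest_mode == "round_prefer_floor":
        return math.ceil(x - 0.5)   # a tie such as 1.5 goes down to 1
    if nearest_mode == "round_prefer_ceil":
        return math.floor(x + 0.5)  # a tie such as 1.5 goes up to 2
    raise ValueError(f"unknown nearest_mode: {nearest_mode}")


def nearest_1d(T, x_original, nearest_mode="round_prefer_floor"):
    """Read a 1D tensor T at the rounded coordinate."""
    idx = round_coordinate(x_original, nearest_mode)
    # Assumption: indices falling outside T are clamped to the border.
    idx = min(max(idx, 0), len(T) - 1)
    return T[idx]
```

The 2D case applies the same rounding independently to $x_{original}$ and $y_{original}$.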
Example:
Let be
- floor:
- ceil:
- round_prefer_floor:
- round_prefer_ceil:
Linear
This interpolation mode performs a linear interpolation of the 2 (1D) or 4 (2D) points closest to $(x_{original}, y_{original})$.
Function g:
The function described here is the bilinear interpolation function. If the input is 1D, the linear interpolation function used is similar to the one presented below, with
To simplify the writing, we note in this section
Let be:
Then:
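A minimal sketch of bilinear interpolation on a 2D tensor follows. It uses the standard bilinear formula; the function name `bilinear`, the nested-list representation of T (indexed as `T[x][y]`, with $x$ along the height), and the clamping of coordinates to the tensor's borders are assumptions for illustration.

```python
import math


def bilinear(T, x, y):
    """Interpolate T (list of rows, indexed T[x][y]) at real coordinates (x, y)."""
    h, w = len(T), len(T[0])
    # Assumption: coordinates outside the tensor are clamped to its border.
    x = min(max(x, 0.0), h - 1)
    y = min(max(y, 0.0), w - 1)
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, h - 1), min(y0 + 1, w - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along y on the two neighbouring rows, then along x.
    row0 = T[x0][y0] * (1 - dy) + T[x0][y1] * dy
    row1 = T[x1][y0] * (1 - dy) + T[x1][y1] * dy
    return row0 * (1 - dx) + row1 * dx
```

At integer coordinates the function returns the tensor's own value; halfway between four points it returns their average.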
Cubic
This interpolation mode performs a cubic interpolation of the 4 (1D) or 16 (2D) points closest to $(x_{original}, y_{original})$.
In this section, the input tensor T is assumed to be 1D.
To simplify the writing, we note in this section
Let be:
Then:
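The 1D case can be sketched with the cubic convolution kernel parameterized by cubic_coeff_a, as used by the ONNX Resize operator. Whether this layer uses exactly these weights is an assumption, as are the helper names `cubic_weights` and `cubic_1d` and the clamping of out-of-range neighbour indices.

```python
import math


def cubic_weights(s, a=-0.75):
    """Weights of the 4 neighbours at offsets -1, 0, 1, 2 for a fractional part s in [0, 1)."""
    return [
        ((a * (s + 1) - 5 * a) * (s + 1) + 8 * a) * (s + 1) - 4 * a,
        ((a + 2) * s - (a + 3)) * s * s + 1,
        ((a + 2) * (1 - s) - (a + 3)) * (1 - s) * (1 - s) + 1,
        ((a * (2 - s) - 5 * a) * (2 - s) + 8 * a) * (2 - s) - 4 * a,
    ]


def cubic_1d(T, x, a=-0.75):
    """Cubic interpolation of the 1D tensor T at the real coordinate x."""
    x0 = math.floor(x)
    weights = cubic_weights(x - x0, a)
    last = len(T) - 1
    total = 0.0
    for offset, weight in zip((-1, 0, 1, 2), weights):
        # Assumption: neighbour indices outside T are clamped to the border.
        idx = min(max(x0 + offset, 0), last)
        total += weight * T[idx]
    return total
```

The four weights always sum to 1, and at an integer coordinate they reduce to `[0, 1, 0, 0]`, so the tensor's own value is returned.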
In this section, the input tensor T is assumed to be 2D.
To simplify the writing, we note in this section
Let be:
Then:
${\bf g}(x_{original}, y_{original}) = p(y_{original}, b_{-1}, b_0, b_1, b_2)$
For some values of $x_{original}$ (resp. $y_{original}$),
Example:
Let be
Then
The value is thus clipped before use:
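A possible reading of this clipping, sketched under the assumption that it is the neighbour indices (at offsets -1, 0, 1, 2 around the floored coordinate) that are clamped to the valid range of the tensor. The helper names `clip_index` and `neighbour_indices` are hypothetical.

```python
def clip_index(i, length):
    """Clamp an index to the valid range [0, length - 1]."""
    return min(max(i, 0), length - 1)


def neighbour_indices(x_floor, length):
    """Indices of the 4 points used by cubic interpolation, clipped to the tensor."""
    return [clip_index(x_floor + k, length) for k in (-1, 0, 1, 2)]
```

Near the borders the same border element is then used several times, for example `neighbour_indices(0, 4)` yields `[0, 0, 1, 2]`.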