Encoding ETC1S and XUASTC LDR Texture Video

Intro

Texture Video support works with any LDR/HDR Basis Universal codec format, including XUASTC LDR, but currently only ETC1S has temporal optimizations. Texture Video files are essentially huge texture arrays to the encoder/transcoder. Texture Video files support mipmapping.

The current compressor and encoder design (originally designed with common texture use cases in mind) loads all frames into memory at once, so there are practical limits to the maximum video size in a single .basis/.ktx2 file. Texture Video isn't a big development focus for us, but we do ensure it works.
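Because every frame is held in memory at once, it can help to estimate the source footprint before starting a long encode. The sketch below is only back-of-the-envelope arithmetic: it assumes frames are held as uncompressed 8-bit RGBA (4 bytes per pixel), which is our assumption for illustration and not a measurement of the encoder. The function name is hypothetical, and real usage will be higher once mip levels and the encoder's working data are included.

```python
def estimate_source_bytes(width, height, num_frames, bytes_per_pixel=4):
    """Rough lower bound on memory needed to hold all source frames at once.

    Assumes uncompressed frames at `bytes_per_pixel` (4 = 8-bit RGBA). This is
    an illustrative assumption, not a description of the encoder's internals.
    """
    return width * height * bytes_per_pixel * num_frames


if __name__ == "__main__":
    for frames in (200, 1000, 3000):
        gib = estimate_source_bytes(1280, 720, frames) / 2**30
        print(f"{frames:5d} frames of 1280x720: about {gib:.2f} GiB")
```

If the estimate is close to your machine's RAM, splitting the clip across several files or resampling the frames is probably the safer route.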

XUASTC LDR is an I-frame-only codec, but at 12x12 it can achieve surprisingly low bitrates (~0.25 bpp).

See also the basic videotest WebGL example. Note that this sample doesn't support all ASTC block sizes yet, but it does support ASTC 4x4, as well as transcoding any ASTC block size to the other LDR texture formats. (We'll be fixing this soon.)

ETC1S Texture Video

ETC1S texture video support was a stretch goal of ours. Videos are significantly more challenging than textures, and supporting them helped us create a better looking system overall, as well as helping us gain experience with video. ETC1S texture video has noticeable block artifacts, but the tradeoff is fast transcode times.

The current system supports only I-frames and P-frames with skip blocks; however, it does use global endpoint/selector codebooks across all frames in the texture video sequence. Currently, the first frame is always an I-frame and all subsequent frames are P-frames. This limitation is imposed by the API, not by the file format itself.

Mipmapping and alpha channels are also supported in ETC1S texture video mode. Internally, texture video files are treated as 2D texture arrays with an extra layer of compression: skip blocks on P-frames, and I-frames with no skip blocks. The global selector/endpoint codebooks are applied to all video frames.
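As a rough mental model of skip blocks, the toy sketch below treats a frame as a flat list of opaque block values. A P-frame stores `None` wherever a block can be reused from the previous frame, and the decoder copies those blocks forward. This is an illustration only: the function names are hypothetical, and the exact-equality test used here is a simplification, since the real encoder's skip decision is not described on this page.

```python
def encode_p_frame(prev_blocks, cur_blocks):
    """Toy P-frame: unchanged blocks become None (a skip), others are kept."""
    return [None if cur == prev else cur
            for prev, cur in zip(prev_blocks, cur_blocks)]


def decode_p_frame(prev_blocks, p_frame):
    """Rebuild a frame: skipped blocks are copied from the previous frame."""
    return [prev if blk is None else blk
            for prev, blk in zip(prev_blocks, p_frame)]


if __name__ == "__main__":
    i_frame = [10, 11, 12, 13]       # I-frame: every block is coded
    next_frame = [10, 99, 12, 13]    # only the second block changed
    p_frame = encode_p_frame(i_frame, next_frame)
    print("P-frame:", p_frame)
    print("Decoded:", decode_p_frame(i_frame, p_frame))
```

The more static the content, the more blocks collapse to skips, which is where the bitrate savings come from.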

Texture video stresses the encoder beyond its typical use, so some extra configuration is typically necessary. For close to the maximum ETC1S quality achievable with the current format and encoder (completely ignoring encoding speed!), use:

-comp_level 5 -max_endpoints 16128 -max_selectors 16128 -no_selector_rdo -no_endpoint_rdo

Level 5 is extremely slow, so unless you have a very powerful machine, levels 1-4 are recommended. "-no_selector_rdo -no_endpoint_rdo" are optional: using them hurts rate-distortion performance, but increases quality. An alternative is to use -selector_rdo_thresh X and -endpoint_rdo_thresh X, with X in the range [1,2] (higher=lower quality/better compression - see the tool's help text).

To compress small video sequences, first use a tool like ffmpeg or VirtualDub to extract the video frames to individual .PNG files:

ffmpeg -i input.mp4 pic%04d.png

Then, to compress the first 200 frames to a .basis file (.KTX2 works too):

basisu -basis -comp_level 2 -tex_type video -multifile_printf "pic%04u.png" -multifile_num 200 -multifile_first 1 -max_selectors 16128 -max_endpoints 16128 -endpoint_rdo_thresh 1.05 -selector_rdo_thresh 1.05
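Before launching a long encode, it can be worth checking that every frame the command above expects is actually on disk. The helper below is a minimal sketch with hypothetical function names. It assumes Python's `%` formatting expands `pic%04u.png` the same way the tool's printf-style pattern does for non-negative frame numbers, and it only assembles the flags already shown in the command above.

```python
import os
import shlex


def missing_frames(pattern, first, count, directory="."):
    """List expected frame filenames that are not present in `directory`."""
    names = [pattern % i for i in range(first, first + count)]
    return [n for n in names
            if not os.path.exists(os.path.join(directory, n))]


def basisu_video_args(pattern, first, count, comp_level=2):
    """Assemble the ETC1S video command line shown above as an argument list."""
    return [
        "basisu", "-basis", "-comp_level", str(comp_level),
        "-tex_type", "video",
        "-multifile_printf", pattern,
        "-multifile_num", str(count), "-multifile_first", str(first),
        "-max_selectors", "16128", "-max_endpoints", "16128",
        "-endpoint_rdo_thresh", "1.05", "-selector_rdo_thresh", "1.05",
    ]


if __name__ == "__main__":
    pattern = "pic%04u.png"
    gaps = missing_frames(pattern, first=1, count=200)
    if gaps:
        print(f"{len(gaps)} frames missing, starting with {gaps[0]}")
    else:
        # Print the command rather than running it, so it can be reviewed first.
        print(shlex.join(basisu_video_args(pattern, 1, 200)))
```

The argument list can be passed to `subprocess.run` directly once you are happy with it.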

For ETC1S video encoding, the more cores and memory your machine has, the better. BasisU is intended for smaller videos of a few dozen seconds or so. On a powerful enough machine you should be able to encode up to a few thousand 720p frames using a single set of codebooks. The webgl_videotest directory contains a very simple (in progress) video viewer.

For texture video, use -comp_level 2 or 3. The default is 1, which isn't quite good enough for texture video. Higher -comp_level values result in reduced ETC1S artifacts.

The .basis file will contain multiple ETC1S image frames (or slices) in a large 2D texture array, all using the same global codebooks. You can retrieve the frames using the transcoder's image API. The system now supports conditional replenishment (CR, or "skip blocks"). CR can reduce the bitrate of some videos (highly dependent on how dynamic the content is) by over 50%. In texture video mode, the images must be requested from the transcoder in sequence from first to last, and random access is only allowed to I-frames.
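The in-sequence requirement has a practical consequence for players: seeking backwards (or looping) means restarting from an I-frame and transcoding forward again. The sketch below shows one way a player might work out which frames to request. It is a hypothetical helper, not part of the transcoder API, and it uses zero-based frame indices as its own convention. With the current encoder the only I-frame is the first frame, which is the default here.

```python
def frames_to_transcode(last_decoded, target, i_frames=(0,)):
    """Return the frame indices to transcode, in order, to display `target`.

    last_decoded: index of the most recently transcoded frame, or None.
    i_frames: indices of I-frames (only frame 0 with the current encoder).
    """
    if last_decoded == target:
        return []  # already have this frame
    # Nearest I-frame at or before the target is the earliest safe restart.
    start = max(i for i in i_frames if i <= target)
    # Moving forward from a frame past that I-frame: just continue in order.
    if last_decoded is not None and start <= last_decoded < target:
        start = last_decoded + 1
    return list(range(start, target + 1))


if __name__ == "__main__":
    print(frames_to_transcode(None, 3))  # cold start
    print(frames_to_transcode(3, 5))     # normal forward playback
    print(frames_to_transcode(5, 2))     # seeking backwards restarts at frame 0
```

For long clips this makes backward seeks expensive, so a simple player may prefer to only play forward and loop.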

Be sure to experiment with increasing the endpoint RDO threshold (-endpoint_rdo_thresh X). This setting controls how aggressively the compressor's backend merges nearby blocks so that they share the same block endpoint codebook vectors, for better coding efficiency. X defaults to a modest 1.5, which means the backend is allowed to increase the overall color distance by 1.5x while searching for merge candidates. The higher this setting, the better the compression, with the tradeoff of more block artifacts. Settings up to ~2.25 can work well, and make the codec stronger. "-endpoint_rdo_thresh 1.75" is a good setting on many textures.
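One way to read the threshold is as a tolerance on how much worse a shared endpoint is allowed to be than the block's own best endpoint. The toy below illustrates that reading only. It is not the encoder's actual merge logic, and the function name and error values are made up for the example.

```python
def can_merge(best_error, candidate_error, thresh=1.5):
    """Toy acceptance test: allow reuse if the candidate's color error stays
    within `thresh` times the block's best achievable error.

    This is an interpretation of the threshold for illustration, not the
    encoder's real implementation.
    """
    return candidate_error <= best_error * thresh


if __name__ == "__main__":
    best = 100.0
    candidates = [105.0, 140.0, 160.0, 210.0]  # hypothetical error values
    for thresh in (1.05, 1.5, 1.75, 2.25):
        accepted = [c for c in candidates if can_merge(best, c, thresh)]
        print(f"thresh {thresh}: {len(accepted)} of {len(candidates)} candidates accepted")
```

Under this reading, raising the threshold lets more blocks share endpoints (fewer bits), at the cost of larger per-block error (more visible artifacts).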

For video, -comp_level 1 should give decent results on most clips. For less banding, level 2 can make a big difference. This is still an active area of development, and quality and encoding performance will improve over time.

For more info on controlling the ETC1S encoder's quality vs. encoding speed tradeoff, see ETC1S Compression Effort Levels.

XUASTC LDR Texture Video

XUASTC LDR looks substantially better for video than ETC1S, but it's an I-frame-only codec (i.e. there are no temporal optimizations at all, not even skip blocks), and transcoding is noticeably slower. XUASTC LDR 10x10-12x12 is capable of surprisingly low bitrates (0.25-0.4 bpp) on texture video content.
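To put those bpp figures in perspective, a quick conversion to bitrate and file size can help. The sketch below is plain arithmetic with hypothetical function names. The 1280x720 resolution and 30 fps frame rate are example values we picked for illustration, and file headers and other container overhead are ignored.

```python
def video_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Average bitrate in megabits per second at a given bits-per-pixel."""
    return width * height * bits_per_pixel * fps / 1_000_000


def file_size_mib(width, height, num_frames, bits_per_pixel):
    """Approximate compressed size in MiB, ignoring file headers."""
    return width * height * bits_per_pixel * num_frames / 8 / 2**20


if __name__ == "__main__":
    for bpp in (0.25, 0.4):
        mbps = video_bitrate_mbps(1280, 720, 30, bpp)
        size = file_size_mib(1280, 720, 900, bpp)  # 30 seconds at 30 fps
        print(f"{bpp} bpp: about {mbps:.1f} Mbit/s, {size:.0f} MiB for 900 frames")
```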

This example command creates a .basis texture video file of 2864 frames (numbered from 1-2864), using XUASTC LDR 12x12 Zstd at DCT quality level 60 and effort 9 (highest practical effort), with debug output enabled and each frame resampled to 50%. The resulting .basis file can be played back using our videotest WebGL example (assuming it's not too large).

basisu -basis -tex_type video -multifile_printf "pic%04u.png" -multifile_num 2864 -multifile_first 1 -xuastc_ldr_12x12 -quality 60 -effort 9 -debug -resample_factor .5

The Zstd profile is recommended. The hybrid and arithmetic profiles are likely too slow at usable resolutions. XUASTC LDR transcodes most rapidly to ASTC. If it has to transcode to BC7, that's additional overhead, and be aware that by default the transcoder will deblock when going to BC7 and other LDR formats (even more overhead). For ASTC usage at large block sizes, GPU shader deblocking is highly recommended and likely essential. Considering the complexity and overhead of targeting the other GPU texture formats, XUASTC LDR Texture Video is likely only really useful on ASTC devices. (Which is probably fine - there are billions of them.)

The larger the block size, the faster the transcoding and the lower the overall bitrate. The larger ASTC block sizes (8x6 and beyond) are probably the most useful for Texture Video.
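For a feel of how block size changes the numbers on the GPU side, recall that every ASTC block occupies 128 bits regardless of its footprint, so larger blocks mean fewer blocks per frame and fewer bits per pixel in GPU memory. The sketch below computes both. Note that it describes the transcoded ASTC data, not the compressed XUASTC LDR file (which is smaller still), and the function names and the 960x540 frame size are hypothetical.

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits, whatever its footprint


def astc_bpp(block_w, block_h):
    """Bits per pixel of transcoded ASTC data for a given block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def astc_frame_bytes(width, height, block_w, block_h):
    """GPU memory for one ASTC frame; partial edge blocks count as whole."""
    blocks_x = -(-width // block_w)   # ceiling division
    blocks_y = -(-height // block_h)
    return blocks_x * blocks_y * ASTC_BLOCK_BITS // 8


if __name__ == "__main__":
    for w, h in ((4, 4), (8, 6), (10, 10), (12, 12)):
        size = astc_frame_bytes(960, 540, w, h)
        print(f"{w}x{h}: {astc_bpp(w, h):.2f} bpp, {size} bytes per 960x540 frame")
```

Fewer blocks per frame is also a plausible reason why larger block sizes transcode faster, though we haven't measured that here.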

The .ktx2 file format supports texture video too, but we haven't made a playback sample for it yet.
