Commit da01e58

Document 3-band waveforms
1 parent 80803b2 commit da01e58

1 file changed: +69 -7 lines changed

doc/modules/ROOT/pages/anlz.adoc

Lines changed: 69 additions & 7 deletions
@@ -32,7 +32,7 @@ names used in the byte field diagrams match the IDs assigned to them
 in the
 https://github.com/Deep-Symmetry/crate-digger/blob/master/src/main/kaitai/rekordbox_anlz.ksy[Kaitai
 Struct specification], unless that is too long to fit, in which case a
-subscripted abbreviation is used, and the text will mention the actual
+subscript abbreviation is used, and the text will mention the actual
 struct field name.
 
 The file itself starts with the four-character code `PMAI` that
@@ -55,7 +55,7 @@ specifies the length of the whole file in bytes:
 ----
 
 The header seems to usually be `1c` bytes long, though we do not yet
-know the purpose of any of the header values that come after
+know the purpose of the header values that come after
 __len_file__. After the header, the file consists of a series of
 tagged sections, each with their own four-character code identifying
 the section type, followed by a header and the section content. This
@@ -93,7 +93,7 @@ next tag.
 
 There is not much value to __len_header__. If you study the structure
 of each type of tagged section, you can see some sense of where the
-“header-like stuff ” ends, and “content-like stuff” begins, and this
+“header-like stuff” ends, and “content-like stuff” begins, and this
 seems to line up with the value of __len_header__. But because there
 are important values in each tag’s header, and those always start
 immediately after __len_tag__, it is simply easier to ignore the value
@@ -499,7 +499,7 @@ rendition of the track waveform, which scrolls along while the track
 plays, giving a detailed glimpse of the neighborhood of the current
 playback position. Since this is potentially much larger than other
 analysis elements, and is not supported by older players, it is stored
-in the extended analyis file (with extension `.EXT`). It is identified
+in the extended analysis file (with extension `.EXT`). It is identified
 by the four-character code `PWV3` and has the structure shown below.
 __len_header__ is `18`.
 
@@ -534,7 +534,7 @@ This kind of section holds a fixed-width color preview of the track
 waveform, displayed above the touch strip on nexus 2 players,
 providing a birds-eye view of the current playback position, and
 supporting direct needle jump to specific track sections. It is also
-used in rekordbox itself. This is stored in the extended analyis file
+used in rekordbox itself. This is stored in the extended analysis file
 (with extension `.EXT`). It is identified by the four-character code
 `PWV4` and has the structure shown below. __len_header__ is `18`.
 
@@ -570,7 +570,7 @@ This kind of section holds a variable-width and much larger color
 rendition of the track waveform, introduced with the nexus 2 line (and
 also used in rekordbox), which scrolls along while the track plays,
 giving a detailed glimpse of the neighborhood of the current playback
-position. This is stored in the extended analyis file (with extension
+position. This is stored in the extended analysis file (with extension
 `.EXT`). It is identified by the four-character code `PWV5` and has
 the structure shown below. __len_header__ is `18`.
 
@@ -618,6 +618,68 @@ used. This is shown below:
 (draw-related-boxes [0 0])
 ----
 
+[[three-band-preview]]
+=== Waveform 3-Band Preview Tag
+
+This kind of section holds a fixed-width three-band preview of the track
+waveform, first displayed at the bottom of the display on CDJ-3000 players,
+providing a birds-eye view of the current playback position, and
+supporting direct needle jump to specific track sections. It is also
+used in rekordbox itself. This is stored in a second extended analysis file
+(with extension `.2EX`). It is identified by the four-character code
+`PWV6` and has the structure shown below. __len_header__ is `14`.
+
+.Waveform 3-Band Preview tag.
+[bytefield]
+----
+include::example$tag_shared.edn[]
+(draw-tag-header "PWV6")
+(draw-box (text "len_entry_bytes" :math) [:bg-yellow {:span 4}])
+(draw-box (text "len_entries" :math) [:bg-yellow {:span 4}])
+(draw-gap (text "entries" :math))
+(draw-bottom)
+----
+
+__len_entry_bytes__ identifies how many bytes each waveform preview entry takes up; for this kind of tag it always has the value 3.
+__len_entries__ specifies how many entries are present in the tag.
+The three-band waveform preview data begins at byte{nbsp}``14`` and is 3,600 (decimal) bytes long, representing 1,200 columns of waveform preview information.
+
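As a rough illustration of the layout described above, the following Python sketch splits a three-band preview tag into its per-column entries. It assumes the tag header fields are big-endian four-byte integers, with __len_header__ and __len_tag__ directly after the four-character code as in the other tags on this page. The function name `parse_three_band_preview` and the returned list-of-tuples shape are made up for this example and are not part of any existing library.

```python
import struct


def parse_three_band_preview(tag: bytes):
    """Return a list of (mid, high, low) height triples from a PWV6 tag.

    Assumption: all header fields are big-endian, 4 bytes each, in the
    order fourcc, len_header, len_tag, len_entry_bytes, len_entries.
    """
    fourcc, len_header, len_tag, len_entry_bytes, len_entries = struct.unpack_from(
        ">4sIIII", tag, 0
    )
    if fourcc != b"PWV6":
        raise ValueError("not a three-band preview tag")
    if len_entry_bytes != 3:
        raise ValueError("expected three bytes per entry")

    start = 0x14  # entries begin right after len_entries
    end = start + len_entries * len_entry_bytes
    if end > len(tag):
        raise ValueError("tag is truncated")

    data = tag[start:end]
    # Each entry is three one-byte heights: mid-range, high, low.
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
```

For a real preview tag this should yield 1,200 triples, one per column.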
+The three-band waveform preview entries are one-byte height values representing the mid-range, high, and low frequencies, in that order.
+There is some scaling involved, and they seem to be drawn stacked on top of each other, with the lows in dark blue, the mid-range in amber, and the highs in white.
+
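The stacking described above could be sketched as follows. This is only an illustration: the heights are used unscaled (the actual scaling is not documented here), and the bottom-to-top order of low, mid-range, high is an assumption, since the text says only that the bands are stacked. The function name `stack_preview_column` is hypothetical.

```python
def stack_preview_column(mid: int, high: int, low: int):
    """Return (color, bottom, top) spans for one preview column.

    Arguments are in the order the bytes appear in an entry.
    Assumption: lows sit at the bottom, then mid-range, then highs.
    No scaling is applied to the raw heights.
    """
    segments = []
    bottom = 0
    for color, height in (("dark blue", low), ("amber", mid), ("white", high)):
        if height > 0:
            segments.append((color, bottom, bottom + height))
        bottom += height  # the next band starts where this one ends
    return segments
```

A renderer would then fill each span with its color, after applying whatever scaling turns out to match the player display.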
+[[three-band-detail]]
+=== Waveform 3-Band Detail Tag
+
+This kind of section holds a variable-width and much larger three-band rendition of the track waveform, introduced with the CDJ-3000 (and also used in rekordbox), which scrolls along while the track plays, giving a detailed glimpse of the neighborhood of the current playback position.
+This is stored in the second extended analysis file (with extension `.2EX`).
+It is identified by the four-character code `PWV7` and has the structure shown below. __len_header__ is `18`.
+
+.Waveform 3-Band Detail tag.
+[bytefield]
+----
+include::example$tag_shared.edn[]
+(draw-tag-header "PWV7")
+(draw-box (text "len_entry_bytes" :math) [:bg-yellow {:span 4}])
+(draw-box (text "len_entries" :math) [:bg-yellow {:span 4}])
+(draw-box (text "unknown" :math) [:bg-yellow {:span 4}])
+(draw-gap (text "entries" :math))
+(draw-bottom)
+----
+
+__len_entry_bytes__ identifies how many bytes each waveform detail entry takes up; for this kind of tag it always has the value 3.
+__len_entries__ specifies how many entries are present in the tag.
+Each entry represents one <<djl-analysis:ROOT:track_metadata.adoc#frames,half-frame>> of audio data, and there are 75 frames per second, so for each second of track audio there are 150 waveform detail entries.
+The purpose of the header bytes{nbsp}``14``-`17` is unknown; they may always have the value `00960000`.
+The three-band waveform detail entries begin at byte{nbsp}``18``.
+
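To make the offsets and the 150-entries-per-second arithmetic concrete, here is a minimal Python sketch. It assumes big-endian four-byte header fields and does not check the four-character code. The names `parse_detail_entries`, `entry_index_for_time`, and `track_seconds` are invented for this example.

```python
import struct

ENTRIES_PER_SECOND = 150  # 75 frames per second, one entry per half-frame


def parse_detail_entries(tag: bytes):
    """Return (mid, high, low) triples from a three-band detail tag.

    Assumption: len_entry_bytes, len_entries and the unknown field are
    big-endian 4-byte integers at offsets 0x0c, 0x10 and 0x14.
    """
    len_entry_bytes, len_entries, _unknown = struct.unpack_from(">III", tag, 0x0C)
    if len_entry_bytes != 3:
        raise ValueError("expected three bytes per entry")
    start = 0x18  # entries follow the unknown header bytes 14-17
    end = start + len_entries * len_entry_bytes
    if end > len(tag):
        raise ValueError("tag is truncated")
    data = tag[start:end]
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]


def entry_index_for_time(seconds: float) -> int:
    """Index of the detail entry covering the given playback time."""
    return int(seconds * ENTRIES_PER_SECOND)


def track_seconds(len_entries: int) -> float:
    """Approximate track length implied by the number of entries."""
    return len_entries / ENTRIES_PER_SECOND
```

For example, a tag with 45,000 entries would correspond to roughly 300 seconds of audio.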
+Three-band detail entries have the same structure as preview entries: one-byte height values representing the mid-range, high, and low frequencies, in that order.
+There is a different kind of scaling involved in drawing these, and it seems to be non-linear.
+We have not yet found an approach that matches what we see in rekordbox.
+The colors for low, mid-range, and high frequencies are the same as in the preview, but they are drawn on the same axis rather than being stacked.
+The area where low and mid-range frequencies overlap is drawn in brown, and their pure colors are seen where there is no overlap.
+The white high frequency is drawn last so it obscures any low or mid-range information underneath it.
+In fact, rekordbox seems to do some light blending, but it is so dim that we have not bothered to reproduce it so far.
+
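The drawing order described above can be sketched as a function that works out which color is visible at each distance from the axis. This is an illustration only: it uses the raw heights without the (still unknown) non-linear scaling, it ignores the light blending, and the name `overlay_detail_column` is hypothetical.

```python
def overlay_detail_column(mid: int, high: int, low: int):
    """Return visible (color, start, end) spans, measured from the axis.

    Low and mid-range share the axis: where both are present the span is
    brown, and the taller one shows its pure color above the overlap.
    White (high) is painted last and hides anything beneath it.
    """
    overlap = min(low, mid)
    tallest = max(low, mid)
    taller_color = "dark blue" if low > mid else "amber"
    underneath = [("brown", 0, overlap), (taller_color, overlap, tallest)]

    visible = []
    if high > 0:
        visible.append(("white", 0, high))
    for color, start, end in underneath:
        start = max(start, high)  # clip away the part covered by white
        if end > start:
            visible.append((color, start, end))
    return visible
```

With `mid=5`, `high=2`, `low=8`, this gives white from 0 to 2, brown from 2 to 5, and dark blue from 5 to 8.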
 [[song-structure-tag]]
 === Song Structure Tag
 
@@ -670,7 +732,7 @@ on its analysis of the audio.
 The value 1 is a “high” mood where the phrase types consist of
 “Intro”, “Up”, “Down”, “Chorus”, and “Outro”. Other values in each
 phrase entry cause the intro, chorus, and outro phrases to have their
-labels subdivided into styes “1” or “2” (for example, “Intro 1”), and
+labels subdivided into styles “1” or “2” (for example, “Intro 1”), and
 “up” is subdivided into style “Up 1”, “Up 2”, or “Up 3”. See the
 <<phrase-labels,table below>> for an expanded version of this
 description.
