Skip to content

Latest commit

 

History

History
547 lines (438 loc) · 23.8 KB

File metadata and controls

547 lines (438 loc) · 23.8 KB

THETA Metadata Spec

Photo Metadata

RICOH THETA JPG file and DNG file have vendor specified metadata stored under MakerNote IFD.

  • Tiff Header
  • 0th IFD (Primary IFD)
  • 1st IFD (Thumbnail IFD) (*1)
  • Photo Sphere XMP Metadata (*2)

*1 : DNG file does not have this IFD in some mode.
*2 : Stored in Equirectangular JPG file only. Refer to https://developers.google.com/streetview/spherical-metadata for detail.

Important

THETA X firmware v2.71.1 has a bug where the values of GPano:PosePitch/RollDegrees are NOT zero even when Zenith Correction is enabled. This bug has been fixed in THETA X firmware v2.80.1.

MakerNote IFD

MakerNote IFD is always started with 8 byte header 0x52_69_63_6F_68_00_00_00 which means "Ricoh\0\0\0" as ASCII format. The next 2 bytes indicate the number of entries.

Name Type Count A1 X Z1 V SC2 SC S m15 THETA Description
0x4001 LONG 1 Offset to RICOH THETA IFD

RICOH THETA IFD

The first 2 bytes indicate the number of entries.

Name Type Count A1 X Z1 V SC2 SC S m15 THETA Description
0x0001 SHORT 1 Image Type
1 Equirectangular
2 DualFisheye
3 DualFisheye (upright)
4 SingleFisheye
5 SingleFisheye (Front-lens)
6 SingleFisheye (Rear-lens)
0x0002 SHORT 1 -- -- Option: HDR Rendering
0 OFF, 1 ON
0x000E SHORT 1 Option: Handheld HDR Rendering
0 OFF, 1 ON
0x0006 SHORT 1 Option: Noise Reduction
0 OFF, 1 ON
0x0008 SHORT 1 Option: DR Compensation
0 OFF, 1 ON
0x0010 SHORT 1 Filter: Face mode
0 OFF, 1 ON
0x0011 SHORT 1 Filter: Night View mode
0 OFF, 1 ON
0x0012 SHORT 1 Filter: Lens-by-Lens Exposure mode
0 OFF, 1 ON
0x0007 SHORT 1 Zenith Correction
0 Not applied, 1 Applied
0x0109 SRATIONAL 3 Zenith Direction (=Rotation Vector) (ax,ay,az)
0x010D SHORT 1 Water Housing Setting
0 Not used, 1 Used underwater, 2 Used on-land
0x1011 SHORT 1 -- -- Microphone Select
0 Internal, 1 External (4ch), 2 External (1ch)
0x1013 SHORT 1 Audio Gain Setting
0 Normal, 1 Low, 2 Mute
0x1001 SHORT 1 Temperature of image sensor (front-lens) *1
0x1003 SHORT 1 Temperature of image sensor (rear-lens) *1
0x1101 SHORT 1 Temperature of IMU *1
0x1201 SLONG 1 Temperature of main board *1
0x1202 SLONG 1 Temperature of battery *1
65535 is set when the battery is not inserted

*1 : The values is 10 times of actual value. e.g. 250 means 25.0 degree C.


Video Metadata

RICOH THETA MP4 file have two types of metadata which stores IMU sensor data, GNSS sensor data, and related timestamp information.
Basic structure of RICOH THETA MP4 files is following, and vendor specified metadata is stored under udta box and in CaMM track.

  • ftyp
  • mdat
  • moov
    • mvhd
    • trak (video)
      • tkhd
      • mdia
        • ....
      • uuid ffcc8263-f855-4a93-8814-587a02521fdd *1
    • trak (audio)
      • ....
    • trak (CaMM)
      • ....
    • meta
      • ....
    • udta
      • ....
  • uuid 28F311E2-B791-4F6F-94E2-4F5DEACB3C01

*1 : Spherical Video RFC v1.0. Refer to https://github.com/google/spatial-media/blob/master/docs/spherical-video-rfc.md for detail. This box is stored only when video is stitched as equirectangular format.


Axis Definition

The side where the shutter button is located is defined as Rear, for all THETA model.

axis


UDTA Box

udta box has following child-boxes.

Name A1 X Z1 V SC2 SC S m15 THETA Description
RTHU Thumbnail
RMKN Same structure as Photo Metadata
RDT1-8 Details Not Disclosed
RDT9 Details Not Disclosed
RDTA Accelerometer
RDTB Gyroscope
RDTC Ambient magnetic field
RDTD Details Not Disclosed
RDTG Timestamp for each video frame
RDG2
*3
Exposure control for each video frame
RDTH Quarternion for each video frame
RDTI Details Not Disclosed
RDTL
*1

*1
GNSS location
RDL2
*1*2
GNSS location + accuracy
@mod Model name
e.g. RICOH THETA X
@swr Software version
e.g. RICOH THETA X Ver 2.21.0
@day Date/Time with timezone
e.g. 2023-04-01T09:00+09:00
@xyz GNSS location
e.g. +35.7101+139.8108+0032
@mak Manufacturer
RICOH
manu Same as @mak
modl Same as @mod

*1 : Serial number started with AA12 or YR12 (shipped to China) will not store this type data in any case.
*2 : RICOH THETA X firmware v2.61.0 and later.
*3 : RICOH THETA Z1 firmware v3.60.3 and later.

RDT* Atom

Data field Format Bytes Description
Size uint32 4
Type uint32 4 e.g. RDTA
Number of entries uint32 4
Sampling rate uint16 2 [Hz]
Sample size uint16 2 e.g. 0x18
Endian uint16 2 Little endian : 0x0123
Big endian : 0x3210
Reserve uint16[3] 6 0x00..00 *1
Data Table Variable Data Table is stored as array of Data Packet.

*1 :
RICOH THETA X firmware v2.40.0 and later puts the value of "timestamp" of 1st data packet as [msec] for RDTL.
RICOH THETA X firmware v2.61.0 and later puts the value of "timestamp" of 1st data packet as [msec] for RDL2.
RICOH THETA X firmware v2.40.0 and later puts the value of "timestamp" of 1st data packet as [msec] for RDTL.
RICOH THETA X firmware v2.61.0 and later puts the value of "timestamp" of 1st data packet as [msec] for RDL2.

Data Packet Format

RDTA

RDTA stores Accelerometer sensor data.

Data Format Bytes Description
x float 4 [m/s^2]
y float 4 [m/s^2]
z float 4 [m/s^2]
reserve float 4
timestamp uint64 8 [nsec]

RDTB

RDTB stores Gyroscope sensor data.

Data Format Bytes Description
x float 4 [rad/s]
y float 4 [rad/s]
z float 4 [rad/s]
reserve float 4
timestamp uint64 8 [nsec]

RDTC

RDTC stores Ambient magnetic field sensor data.

Data Format Bytes Description
x float 4 [μT]
y float 4 [μT]
z float 4 [μT]
reserve float 4
timestamp uint64 8 [nsec]

RDTG

RDTG stores timestamp information for each video frame.
This timestamp indicates the center of exposure considering with exposure time and rolling shutter skew.

Data Format Bytes Description
timestamp uint64 8 [nsec]

RDG2

RDG2 stores timestamp information and exposure control information for each video frame.
This timestamp indicates the center of exposure considering with exposure time and rolling shutter skew.

Data Format Bytes Description
timestamp uint64 8 [nsec]
exposure program uint8 1 1 : Manual
2: Automatic
3: Aperture Priority
4: Shutter Priority
5: ISO Sensitivity Priority
(reserved) uint8 1
Bv value int16[2] 4 numerator/denominator, Brightness Value
Tv value int16[2] 4 numerator/denominator, Shutter Speed
Av value int16[2] 4 numerator/denominator, Aperture Value
Sv value int16[2] 4 numerator/denominator, ISO Sensitivity

RDTL

RDTL stores GNSS location sensor data.

Data Format Bytes Description
timestamp double 8 GPS time epoch [sec]
latitude double 8 [deg]
longitude double 8 [deg]
altitude double 8 [m]

RDL2

RDL2 stores GNSS location + accuracy sensor data.

Data Format Bytes Description
timestamp double 8 GPS time epoch [sec]
gps_fix_type int16 2 0 (no fix), 2 (2D fix), 3 (3D fix)
latitude double 8 [deg]
longitude double 8 [deg]
altitude float 4 [m]
horizontal_accuracy float 4 [m]
vertical_accuracy float 4 [m]
velocity_east float 4 [m/s]
velocity_north float 4 [m/s]
velocity_up float 4 [m/s] , always 0.0f
speed_accuracy float 4 [m/s]

Sample How to Find Data Packet from RDTA Box

00 02 62 B4 52 44 54 41 71 19 00 00 C8 00 18 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~ ~~~~~
|           |           |           |     └ SampleSize = 0x0018 (24)
|           |           |           └ SamplingRate = 0x00C8 (200)
|           |           └ Entries = 0x00001971 (6513)
|           └ Type = "RDTA"
└ Size = 0x000262B4 (156340)

23 01 00 00 00 00 00 00 76 FD 9C BC 31 7C 6B BC
~~~~~ ~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|     |                 |           └ y[0] = -0.014
|     └ Reserve         └ x[0] = -0.019
└ Endian = 0x2301 (Little Endian)

D1 23 1E 41 00 00 00 00 80 64 B0 09 05 0A 00 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
|           |           └ timestamp[0] = 11016753669248
|           └ reserve
└ z[0] = +9.883

82 DC 57 3D 76 FD 1C 3D 5D 61 1B 41 00 00 00 00
~~~~~~~~~~~ ....
└ x[1] = +0.052  


CaMM Track

CaMM is Camera Motion Metadata defined by Google. Refer to spec for detail.
RICOH THETA MP4 file also have CaMM track in following modes.

  • RICOH360 THETA A1
    • 8K (7680x3840px) 10fps, 5fps, 2fps
    • 5.7K (5760x2880px) 10fps, 5fps, 2fps
  • RICOH THETA X
    • 8K (7680x3840px) 10fps, 5fps, 2fps
    • 5.7K (5760x2880px) 10fps, 5fps, 2fps
    • 2.7K (2752x2752px) 10fps, 5fps, 2fps
  • RICOH THETA Z1
    • 3.6K (3648x3648px) 2fps, 1fps
    • 2.7K (2688x2688px) 2fps, 1fps

Supported Data Type

Type A1 X Z1 Description
0 angle_axis
1 pixel_exposure_time
and rolling_shutter_skew_time
2 gyro
3 Accelerometer
4 position
5
*1

*1
latitude,
longitude,
and altitude
6
*1*2
5 plus
time_gps_epoch,
gps_fix_type,
and some *_accuracy data
7 magnetic_field

*1 : Serial number started with AA12 or YR12 (shipped to China) will not store this type data in any case.
*2 : RICOH THETA X firmware v2.61.0 and later.
*2 : RICOH THETA X firmware v2.61.0 and later.

Track Structure

You can parse CaMM track to each data packet with using stts, stsz, and co64 atoms' information.

Handler Reference Atom

Data field Format Bytes Description
Size uint32 4 0x3C
Type uint32 4 hdlr
Version uint32 4 0
Pre-defined uint32 4 mhlr
Handler Type uint32 4 camm
Reserved uint32[3] 12 ----
Name String 28 CameraMetadataMotionHandler + 0x00

Time-to-Sample Atom

Data field Format Bytes Description
Size uint32 4
Type uint32 4 stts
Version uint32 1 0
Flags ---- 3 0x000000
Number of entries uint32 4 The total number of recorded sensor data.
e.g. N = 10sec * (acc 200Hz + gyro 200Hz + gnss 1Hz) = 4010
Time-to-sample table uint32[2]*N Variable Sample count 4 Byte +
Sample Duration 4 Byte

Also refer to Time-to-sample atom ('stts') in QuickTime File Format provided by Apple.

Sample-to-Chunk Atom

Data field Format Bytes Description
Size uint32 4
Type uint32 4 stsc
Version uint32 1 0
Flags ---- 3 0x000000
Number of entries uint32 4 N = 1
Sample-to-chunk table uint16[3]*N Variable First chunk 4 Byte +
Sample per chunk 4 Byte
Sample description ID 4 Byte

Also refer to Sample-to-chunk atom ('stsc') in QuickTime File Format provided by Apple.

Sample-to-Size Atom

Data field Format Bytes Description
Size uint32 4
Type uint32 4 stsz
Version uint32 1 0
Flags ---- 3 0x000000
Sample size uint32 4 0x10 or 0x00 (*1)
Number of entries uint32 4 The total number of recorded sensor data.
e.g. N = 10sec * (acc 200Hz + gyro 200Hz + gnss 1Hz) = 4010
Sample size table uint16*N Variable Sample size 4 Byte (*1)

(*1) If only gyro and Accelerometer are stored in CaMM track, Sample size will be 0x10 and Sample size table will be 0x00. If GNSS location are also stored in CaMM strack, Sample size will be 0x00 and all data packet size information will be stored in Sample size table as array.
Because, data packet size of gyto and Accelerometer is 0x10 (16Byte), GNSS location is 0x1C (28Byte).

Also refer to Sample-to-chunk atom ('stsz') in QuickTime File Format provided by Apple.

Chunk Offset Atom

Data field Format Bytes Description
Size uint32 4
Type uint32 4 co64
Version uint32 1 0
Flags ---- 3 0x000000
Number of entries uint32 4 The total number of recorded sensor data.
e.g. N = 10sec * (acc 200Hz + gyro 200Hz + gnss 1Hz) = 4010
Chunk offset table uint64*N Variable Chunk offset 8 Byte

Also refer to Chunk-offset-atom ('stco') in QuickTime File Format provided by Apple.

Sample How to Find Data Packet from CaMM Track

STEP 1. Parse stts, stsz, and co64 box to get each information.

stts

00 05 B7 38 73 74 74 73 00 00 00 00 00 00 B6 E5
~~~~~~~~~~~ ~~~~~~~~~~~ ~~ ~~~~~~~~ ~~~~~~~~~~~
|           |           |  |        └ Number of entries = 0x0000B6E5 (46821)
|           |           |  └ Flag = 0x000000
|           |           └ Version = 0x00
|           └ Type = "stts"
└ Size = 0x0005B738 (374584)

00 00 00 01 00 00 00 00 00 00 00 01 00 4C 97 57
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|           |           |           └ duration[1] = 0x004C9757 (5.019[msec])
|           |           └ count[1] = 1
|           └ duration[0] = 0x00000000 (0.000[msec])
└ count[0] = 1

00 00 00 01 00 00 00 00 00 00 00 01 00 4C 97 57
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|           |           |           └ duration[3] = 0x004C9757 (5.019[msec])
|           |           └ count[3] = 1
|           └ duration[2] = 0x00000000 (0.000[msec])
└ count[2] = 1

....

00 00 00 01 00 4C 93 A3 00 00 00 01 00 00 00 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|           |           |           └ duration[m+1] = 0x00000000 (0.000[msec])
|           |           └ count[m+1] = 1
|           └ duration[m] = 0x004C93A3 (5.018[msec])
└ count[m] = 1

stsz

00 02 DB AC 73 74 73 7A 00 00 00 00 00 00 00 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~ ~~~~~~~~ ~~~~~~~~~~~
|           |           |  |        └ Sample size = 0x000000
|           |           |  └ Flag = 0x000000
|           |           └ Version = 0x00
|           └ Type = "stsz"
└ Size = 0x0002DBA8 (187304)

00 00 B6 E5 00 00 00 10 00 00 00 10 00 00 00 10
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|           |           |           └ size[2] = 0x10(16)
|           |           └ size[1] = 0x10(16)
|           └ size[0] = 0x10(16)
└ Number of entries = 0x0000B6E5 (46821)

....

00 00 00 10 00 00 00 10 00 00 00 1C 00 00 00 10
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
|           |           |           └ size[m+1] = 0x10(16)
|           |           └ size[m] = 0x1C(24)
|           └ size[m-1] = 0x10(16)
└ size[m-2] = 0x10(16)

co64

00 05 B7 40 63 6F 36 34 00 00 00 00 00 00 B6 E5
~~~~~~~~~~~ ~~~~~~~~~~~ ~~ ~~~~~~~~ ~~~~~~~~~~~
|           |           |  |        └ Number of entries = 0x0000B6E5 (46821)
|           |           |  └ Flag = 0x000000
|           |           └ Version = 0x00
|           └ Type = "co64"
└ Size = 0x0005B738 (374584)

00 00 00 00 00 01 00 28 00 00 00 00 00 01 00 38
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
└ offset[0]             └ offset[1]
....

00 00 00 00 00 01 31 28 00 00 00 00 00 01 31 44
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
└ offset[m]             └ offset[m+1]

....

STEP 2. Read the data packet with offset (by co64) and size (by stsz).

ADDRESS    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
----------------------------------------------------------
0x00010020 96 96 96 96 96 96 97 07 00 00 02 00 01 EF A6 BE 
                                   ~~~~~~~~~~~~~~~~~~~~~~~
                                   └ data_packet[0]
0x00010030 03 72 AD 3D 00 77 CA BD 00 00 03 00 A4 4A EB BE 
           ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
                                   └ data_packet[1]
0x00010040 A6 B2 4A BF 9E DA 26 41 00 00 02 00 1C EE D6 BE 
           ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
                                   └ data_packet[2]
0x00010050 6B FE 83 3D BB 3D D0 BE 00 00 03 00 2D 6D 7E BE 
           ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~
                                   └ data_packet[3]
0x00010060 7A 00 37 BF 3E 8D 22 41 00 00 02 00 0F 1C F7 BE 
           ~~~~~~~~~~~~~~~~~~~~~~~ ....
....
0x00013120 0A 52 D5 BD FA 3B 2A 41 00 00 05 00 02 50 B2 C2
                                   ~~~~~~~~~~~~~~~~~~~~~~~
                                   └ data_packet[m] 
0x00013130 1A C8 41 40 0E 08 87 F1 2D 71 61 40 CD CC CC CC 
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
0x00013140 CC 0C 42 40 00 00 02 00 EC 00 41 3E 1A 2C 8F 3E
           ~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                       └ data_packet[m+1]

STEP 3. Parse the data packet based on CaMM spec.

data_packet[0] 00 00 02 00 01 EF A6 BE 03 72 AD 3D 00 77 CA BD
               ~~~~~ ~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
               |     |     └ x=-0.326  └ y=+0.084  └ z=-0.098
               |     └ type = 2 (gyro)      
               └ reserve

data_packet[1] 00 00 03 00 A4 4A EB BE A6 B2 4A BF 9E DA 26 41
               ~~~~~ ~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~
               |     |     └ x=-0.459  └ y=-0.791  └ z=+10.428
               |     └ type = 3 (acc)      
               └ reserve

data_packet[m] 00 00 05 00 02 50 B2 C2 1A C8 41 40 0E 08 87 F1 
               ~~~~~ ~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~
               |     |     └ lat = +35.563316      └ lng = +139.536858
               |     └ type = 5 (gps)      
               └ reserve

               2D 71 61 40 CD CC CC CC CC 0C 42 40
               ~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~ ....
                           └ alt = +36.1

Parse Result

# Type Value Unit Duration Unit
0 gyro ( -0.326, +0.084, -0.098 ) [rad/s] 0.000 [msec]
1 acc ( -0.459, -0.791, +10.428 ) [m/s^2] 5.019 [msec]
2 gyro ( -0.000, +0.064, -0.406 ) [rad/s] 0.000 [msec]
3 acc ( -0.248, -0.714, +10.159 ) [m/s^2] 5.019 [msec]
.... .... .... .... .... ....
m-2 gyro ( +0.211, +0.250, +0.351 ) [rad/s] 0.000 [msec]
m-1 acc ( +0.178, -0.104, +10.639 ) [m/s^2] 0.100 [msec]
m gps ( +35.563316, +139.536858, +36.1 ) [deg],[m] 5.018 [msec]
m+1 gyro ( +0.188, +0.279, +0.282 ) [rad/s] 0.000 [msec]
.... .... .... .... ....