efpl-columbia/TableOrientedBinaries.jl

Table Oriented Binary (TOB) Files in Julia

Table Oriented Binary (TOB) files are used by Campbell Scientific data loggers. The format is at least partially documented by Campbell Scientific, but the company only provides proprietary tools to read and convert the data.

This Julia package currently provides limited support for reading data in the TOB3 format as it is written by the CR1000X data logger. It should, however, be relatively straightforward to add support for missing variants of the data format.

Since the TOB3 format is not fully documented, it is best not to trust this implementation too much. For relatively safe usage, compare the output of this code to the output of Campbell Scientific’s proprietary tools for one sample file per time series, and also for any output that contains fewer records than expected.

Usage

The package exports a single type TOB, which can be used to load data from a TOB3 file:

julia> tob = TOB("test/sample1.dat");

julia> tob[1]
(Record_Number = 0x00050a67, Timestamp = 0x186b0679e4b22400, CSAT_Ux = 0.17911212f0, CSAT_Uy = 0.6709892f0, CSAT_Uz = -0.19307604f0, CSAT_Temp = 23.379425f0, CSAT_Diag_Word = 0.0f0)

The resulting struct acts as an array of named tuples, each of which represents one record. The field names and types match the columns as they are defined in the TOB3 file, but each record has two additional values: Record_Number (UInt32) and Timestamp (UInt64). The timestamp is converted into nanoseconds since the Unix epoch, which allows any valid value within the next ~500 years to be represented without loss of precision. Use Dates.unix2datetime to convert it to a Julia DateTime, but note that DateTime only supports millisecond resolution, so the conversion may lose precision.

julia> using Dates

julia> unix2datetime(tob[1].Timestamp * 1e-9)
2025-10-03T16:00:00.016
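If the sub-millisecond part of the timestamp matters, one option is to split the value into a DateTime and a nanosecond remainder instead of going through a floating-point number. The following is a minimal sketch; split_timestamp is a hypothetical helper and not part of the package.

```julia
using Dates

# Hypothetical helper (not provided by the package): split nanoseconds since
# the Unix epoch into a millisecond-resolution DateTime and the leftover
# nanoseconds, using integer arithmetic only.
function split_timestamp(ns::Integer)
    ms, rest = divrem(ns, 1_000_000)   # whole milliseconds and remainder in ns
    return DateTime(1970, 1, 1) + Millisecond(Int64(ms)), Nanosecond(Int64(rest))
end

# Timestamp of the first record in the sample output shown above.
dt, rest = split_timestamp(0x186b0679e4b22400)
# dt == DateTime(2025, 10, 3, 16, 0, 0, 16) and rest == Nanosecond(0)
```

Keeping the remainder as a separate Nanosecond value avoids the rounding that a Float64 number of seconds would introduce.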

The data is memory-mapped and only loaded from the file once the respective indices are accessed.

File Format

TOB3 files consist of an ASCII header followed by binary data records. Those records are grouped into “frames” that have a small header and footer. A high-level description of the format can be found in the LoggerNet Manual, but it omits many details. The most complete resource appears to be an open-source command line utility called camp2ascii, written by Mathias Bavay at SLF in Davos, Switzerland. In particular, the archived usage page explains that the logger first writes the full file with invalid frames and then overwrites the frames with valid ones as the data is collected. It is therefore expected that some of the frames within a file are invalid.

To determine whether a frame is valid, there are two UInt16 values written at the end. The second one is a “validation marker” that should exactly match a number that is provided in the header. If not, we can ignore the whole frame. The first one appears to play a role when a frame has some data but is not complete, which seems to be referred to as a “minor frame”, as opposed to a “major frame” with complete data. For a complete frame, the value likely should be zero, but that remains to be confirmed.
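The marker check described above could be sketched as follows. This is an illustration only: frame_footer and has_valid_marker are hypothetical names, the values are assumed to be stored little-endian, and the example frame and marker are made up.

```julia
# Hypothetical helper: return the two UInt16 values stored in the last four
# bytes of a frame. Assumption: both values are little-endian.
function frame_footer(frame::AbstractVector{UInt8})
    n = length(frame)
    first_value = UInt16(frame[n-3]) | (UInt16(frame[n-2]) << 8)
    marker      = UInt16(frame[n-1]) | (UInt16(frame[n]) << 8)
    return first_value, marker
end

# A frame is only considered further if its marker matches the header value.
has_valid_marker(frame, validation::Integer) = frame_footer(frame)[2] == validation

# Made-up 16-byte frame whose footer holds 0x0000 followed by the marker 0x1234.
frame = UInt8[zeros(UInt8, 12); 0x00; 0x00; 0x34; 0x12]
has_valid_marker(frame, 0x1234)   # true
```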

In a partially complete frame, the first value appears to contain a number of bits that are used as flags, followed by an offset number. It is not exactly clear whether the first four or five bits are flags and what each one means. It appears that some of the flags are set for an incomplete frame. The remaining 11–12 bits are an “offset” value. This likely contains the number of bytes that are “available” within the frame, i.e. that should be skipped at the end of the records data, or, equivalently, that we need to seek backwards to get to the end of the valid data. This also means that frames are likely kept to at most a few kilobytes, as a 12-bit offset can point back at most 4 KiB.
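To make the bit layout concrete, here is a small sketch that splits the first footer value into flags and offset. It assumes that the flags occupy the most significant bits and the offset the least significant bits, which is not confirmed; split_footer and its nflags keyword are hypothetical, and the input value is made up.

```julia
# Hypothetical helper. Assumption: the flags are the `nflags` most significant
# bits of the value and the offset is stored in the remaining low bits.
function split_footer(value::UInt16; nflags = 4)
    shift = 16 - nflags
    mask = (UInt16(1) << shift) - UInt16(1)   # 0x0fff for 4 flags, 0x07ff for 5
    return (flags = UInt8(value >> shift), offset = Int(value & mask))
end

split_footer(0x9123)               # (flags = 0x09, offset = 291)
split_footer(0x9123; nflags = 5)   # (flags = 0x12, offset = 291)
```

Making the number of flag bits a parameter keeps both readings of the format (four or five flags) easy to compare against sample data.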

It also appears that after the valid records, there is another “inner” footer with the same two UInt16 values. The first value should now have only the first flag bit set, and its offset value appears to consist of the number of bytes that were already written in the frame (including the 16 header & footer bytes). The two offsets of the inner and outer footer should therefore add up to the frame size minus 16 bytes for the header and footer. The second value of the inner footer is the same validation marker as for the outer footer.
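The relation between the two offsets can be written as a one-line consistency check. The function name offsets_consistent is hypothetical and the numbers in the example are made up; only the relation itself is taken from the description above.

```julia
# Hypothetical check: the inner and outer offsets should add up to the frame
# size minus the 16 bytes taken up by the frame header and footer.
offsets_consistent(inner::Integer, outer::Integer, frame_size::Integer) =
    inner + outer == frame_size - 16

# Made-up example for a 1024-byte frame: 400 bytes written, 608 bytes available.
offsets_consistent(400, 608, 1024)   # true, since 400 + 608 == 1008
```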

The meaning of the flag bits is not properly documented. The camp2ascii code interprets bit 1 as “minor frame”, bit 2 as “empty frame”, and bit 3 as “card removed after writing”, while bit 4 is not interpreted and just named flag_f (note, however, that the bitmasks in that code are wrong, so all flags end up being ignored). Another code repo includes comments with the names “minor”, “empty”, “x/reserved/unused”, and “file mark” for the bit flags. In our sample data, the first flag is set for both the outer and inner footer of an incomplete frame, and the last flag is also set in the outer footer of that frame, with no flags being used otherwise.

The code currently interprets “no flags set” in the outer footer as a complete frame and otherwise tries to read the frame as a minor frame. The inner footer is treated as valid if only the first flag is set and invalid otherwise.
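The interpretation described above can be summarized in a few lines. This is a sketch and not the package's actual code: the names are hypothetical, and it assumes four flag bits packed into a UInt8 with the first flag as the most significant of the four.

```julia
# Assumption: four flag bits, with the "first" flag being the most significant.
const FIRST_FLAG = 0x08

# Outer footer: no flags set means a complete frame, anything else is read
# as a minor (partially filled) frame.
outer_kind(flags::UInt8) = flags == 0x00 ? :complete : :minor

# Inner footer: valid only if the first flag is set and no other flag is.
inner_valid(flags::UInt8) = flags == FIRST_FLAG

outer_kind(0x00)    # :complete
outer_kind(0x09)    # :minor
inner_valid(0x08)   # true
```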

Development & Debugging

If the TOB constructor cannot load a file, it may contain a currently unsupported data type. Use the following shell command to prepare a sample that should contain enough data to implement and test the missing functionality, replacing FILE.dat with the actual file name:

$ head -c 5k FILE.dat > sample.dat

To include one or several frames from the end of the file, you need to determine the number of bytes in the header and the number of bytes per frame. The frame size is the third value on the second header line, and the header size is the byte count of the six header lines. For example, here we use the header size of 512B and the frame size of 1024B to write a sample file with the first two and the last two frames:

$ head -n 2 FILE.dat
"TOB3","49674","CR1000X","49674","CR1000X.8.2.1","CPU:Hi_Fi_1v1.0.85.CR1X","31394","2025-10-04 06:00:00"
"NR01","1000 MSEC","1024","3948","21231","Sec100Usec","           0","           0","4224989552"

$ head -n 6 FILE.dat | wc -c
512

$ head -c 2560 FILE.dat > sample.dat

$ tail -c 2048 FILE.dat >> sample.dat
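The same sample can also be produced from Julia, which may be convenient on systems without head and tail. The following is a minimal sketch; write_sample is a hypothetical helper that is not part of the package.

```julia
# Hypothetical helper: copy the header plus the first `nframes` frames and the
# last `nframes` frames of `src` into `dst`. Returns the number of bytes written.
function write_sample(src, dst; header_size, frame_size, nframes = 2)
    head_len = header_size + nframes * frame_size
    tail_len = nframes * frame_size
    open(src, "r") do input
        head = read(input, head_len)             # header and leading frames
        seek(input, filesize(src) - tail_len)    # jump to the trailing frames
        tail = read(input, tail_len)
        open(dst, "w") do output
            write(output, head) + write(output, tail)
        end
    end
end
```

With the values from the example above, calling write_sample("FILE.dat", "sample.dat"; header_size = 512, frame_size = 1024) writes 2560 + 2048 = 4608 bytes.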

To inspect TOB3 data in Julia with only the most basic parsing, we can use the following line, where path is the name of the file:

julia> header, data = open(io -> ([readline(io) for _ in 1:6], read(io)), path, "r")
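Header lines read this way are plain strings of quoted, comma-separated values. As a sketch of how the frame size could be extracted from the second line, the snippet below uses the header line shown earlier; header_fields is a hypothetical helper and assumes that no field contains a comma.

```julia
# Hypothetical helper: split a header line into fields and drop the quotes.
# Assumption: fields never contain commas themselves.
header_fields(line) = [strip(field, '"') for field in split(line, ',')]

# Second header line of the example file shown above.
line2 = "\"NR01\",\"1000 MSEC\",\"1024\",\"3948\",\"21231\",\"Sec100Usec\",\"           0\",\"           0\",\"4224989552\""

# The frame size is the third value on the second header line.
frame_size = parse(Int, header_fields(line2)[3])   # 1024
```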
