-
Notifications
You must be signed in to change notification settings - Fork 1
Does it work and how does it work
TRACE uses UNIX mmap to allow multiple processes to access a shared
memory area.
After the file is created, the first portion of the memory is mapped
read-only to protect
all processes from memory buffer corruption which may cause a program to
access beyond the
limits of the memory buffer.
Atomic access (across threads and processes) is part of the C/C standard
since C/C++11.
TRACE uses this for when a particular TRACE call get it’s “slot” in the
memory buffer.
TLS, also part of the standard, is used to store the “Trace ID”
associated with each TRACE.
Also, for the “slow-destination message limiting” feature, the current
timespan start, state, and
count are in TLS.
ariadic Macros are used to manage possible variable number of
arguments.
There is a limit to the number of arguments supported. This is arbitrary
and there is
no increased runtime overhead for supporting more.
There are macros to separate the format from the rest of the optional
arguments.
There is a macro which counts the number of arguments.
An important aspect to memory tracing is that no (real) system call
occur. Besides significant overhead,
the possibility of interrupt program flow can happen when a system call
occurs. Modern UNIX OSes use
virtual system call for getting time and process/thread ID information.
Delayed formatting of integers and floats has been seen to be a
significant feature.
There is an architecture-specific component to this. Knowledge of the
specific architects
ABI is used to implement this feature. Basically, after a certain number
of integer and floating-point arguments are passed in registers, the
remaining arguments go in a contiguous block of
stack memory. The macro system is used to inject unused floating-point
parameters into the trace
function calls forcing the user-supplied parameters to be written to the
contiguous stack memory.
The macro system is also used to count parameters. This allows for the
important case where
the number of arguments is 0 and therefore no formatting need be done.
This is important for when
the message is likely to include embedded format specifiers*.
* For example, when using the TLOG macro to include string variables which may include ‘%’; the LOG macro can be given an optional parameter to turn off delayed formatting to avoid this issue.
This is used to provide “strings helper” macro for sprintf-ing strings
into the message format in C.
Efficiency aside, this is not needed for C as strings can easily to
included in the message using
basic std::string append/concatenate functionality.
The TRACE kernel module can create a virtual file that can be memory
mapped by user space
processes. The Kernel (starting somewhere in the 2.6 kernels) allow for
“os hook” functions
to be registers — i.e. a hook for process/thread scheduling, a hook for
system calls and a hook
for hw/sw interrupts can be registered which do TRACEs.
setup TRACE # set TRACE; specify a version if "current" not defined
rm -f /tmp/trace_buffer_$USER # clear any existing buffer (and config (buffer will automatically be created)
export TRACE_MSGMAX=0 TRACE_LVLM=0x3f # enable circular memory and make sure trace levels 0-5 for memory are enabled
treset; tcntl -x2 test-threads >|messages.out; : -x2 causes a TRACE "name" to be assigned to each thread;\
cat messages.out | wc -l;\
tshow | wc -l;\
tlvls
The last four lines (one continued command) should produce the following:
18
44
mode: M=1 S=1
TID NAME maskM maskS maskT
--- ---------- ------------------ ------------------ ------------------
0 TEST_THR_1 0x000000000000003f 0x0000000000000007 0x0000000000000000
1 TEST_THR_3 0x000000000000003f 0x0000000000000007 0x0000000000000000
2 TEST_THR_0 0x000000000000003f 0x0000000000000007 0x0000000000000000
3 TEST_THR_2 0x000000000000003f 0x0000000000000007 0x0000000000000000
125 TRACE 0x000000000000003f 0x0000000000000007 0x0000000000000000
126 _TRACE_ 0x000000000000003f 0x0000000000000007 0x0000000000000000
There should be 18 lines in the messages.out file and 42+2 lines in the
memory buffer.
18 = 4 threads * (2 loops * 2 TRACEs each) + 2 TRACEs from the main
program. 4*2*2 + 2
44 = 4 threads * (2 loops * 5 TRACEs each) + 2 TRACEs from the main
program + 2 tshow header lines. 4*2*5 + 2 +2
There are fewer lines in the memory buffer than in the output file
because there are fewer bits set
in the maskM column than in the maskS column.
More TRACE lines (messages) can be generated by increasing the thread and loop counts:
$ treset; tcntl -x2 test-threads 80 -l100 >|messages.out;\
> cat messages.out | wc -l;\
> tshow | wc -l
16004
40004
40,002 (plus 2 tshow header) lines were delivered to the default 50,000 line circular buffer.
3 * 1,600 and 3 * 40,002 lines will occur:
$ treset;\
> ( tcntl -x2 test-threads 80 -l100 &\
> tcntl -x2 test-threads 80 -l100 &\
> tcntl -x2 test-threads 80 -l100 ) >|messages.out;\
> cat messages.out | wc -l;\
> tshow | wc -l;\
> tinfo | grep used
48012
50002
writeIdxCount =0x0001d4c6 entries used: 50000
The hex value 0x0001d4c6 = 120006 (from “tinfo”), which is 3 *
40,002,
indicates that the circular memory buffer wrapped around a couple of
times.