Debugging tips

Cython debugging for TileDB-Py

Debugging on Linux

Cython provides a cython-aware gdb frontend, cygdb:
- https://cython.readthedocs.io/en/latest/src/userguide/debugging.html
However, gdb/cygdb are only practically useful on Linux, because gdb does not work well on newer versions of macOS.
To check which libtiledb (and therefore which version) is loaded in the running Python process:
- in Python, run import os; os.getpid() to get the pid
- in a shell, run pmap <pid> | grep libtiledb
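
The same check can be done from inside the process. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names mapped_paths and loaded_libraries are illustrative and not part of TileDB-Py. It lists the distinct files mapped into a process whose path contains a given substring.

```python
import os


def mapped_paths(maps_text, needle):
    """Return distinct file paths from /proc/<pid>/maps text that contain needle."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (pathname is optional).
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5] and fields[5] not in paths:
            paths.append(fields[5])
    return paths


def loaded_libraries(needle, pid=None):
    """Read /proc/<pid>/maps (default: this process) and filter it by needle."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/maps") as f:
        return mapped_paths(f.read(), needle)
```

For example, after import tiledb, calling loaded_libraries("libtiledb") should list the path of each libtiledb shared library mapped into the interpreter.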

Debugging on macOS

It is reasonably practical to single-step debug small sections of the Cython-generated C++ code. Some familiarity with the CPython object model is very helpful here.
The Cython option Cython.Compiler.Options.emit_code_comments controls whether Cython emits a copy of the source code into the output C++ file; this is on by default and should be left enabled for debugging. Each line of C++ code will be preceded by a commented-out version of the source Cython code.
Each context block in the generated C++ will have the corresponding line number in the original Cython code. So, start from a Cython line number, find that block, and set a breakpoint at the line below the context comment in the generated libtiledb.cpp.
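Finding the right block by hand can be tedious in a large generated file. As a rough sketch, the helper below scans the generated C++ for context comments and reports the line just below each match. It assumes each context comment begins with a line of the form /* "path/to/file.pyx":LINENO and ends with a line containing */; the function name breakpoint_lines is hypothetical.

```python
import re

# Assumed shape of the first line of a Cython context comment.
CONTEXT_RE = re.compile(r'^\s*/\* "(?P<src>[^"]+)":(?P<line>\d+)\s*$')


def breakpoint_lines(cpp_lines, pyx_name, pyx_lineno):
    """Return 1-based C++ line numbers that directly follow a context
    comment referring to pyx_name:pyx_lineno."""
    hits = []
    i, n = 0, len(cpp_lines)
    while i < n:
        m = CONTEXT_RE.match(cpp_lines[i])
        if m and m.group("src").endswith(pyx_name) and int(m.group("line")) == pyx_lineno:
            # Walk forward to the line that closes the comment.
            j = i
            while j < n and "*/" not in cpp_lines[j]:
                j += 1
            if j + 1 < n:
                hits.append(j + 2)  # index j+1 is the next line; +1 again for 1-based
            i = j + 1
        else:
            i += 1
    return hits
```

A typical call would read libtiledb.cpp with open(...).read().splitlines() and pass the result together with "libtiledb.pyx" and the Cython line number of interest; each returned number is a candidate LINENO for a breakpoint.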
In order to see all of the python code corresponding to C++ code while single-stepping, it is recommended to increase the lldb code-listing verbosity:
```
(lldb) settings set stop-line-count-before 8
```
Start the Python interpreter under lldb, either interactively or with a script (potentially with arguments), then run a command that invokes the targeted section of Cython/C++ code. Assuming LINENO is the line in libtiledb.cpp found as described above:
```
$ lldb -- python -i MYSCRIPT.py
(lldb) b libtiledb.cpp:LINENO
(lldb) run
>>> import tiledb
>>> [run command to trigger breakpoint, then step, view values, etc.]
```
- To print Cython PyObject* variables in the debugger, install the following LLDB script: https://github.com/malor/cpython-lldb
- Then, within a libtiledb.cpp frame:
  - individual PyObject* variables should pretty-print with p, for example: p __pyx_v_uri
  - the LLDB command frame variable will show known variables in the frame

- Ideally, the Cython code will have primitive types, which can be printed with the usual lldb p(rint) command. However, to print the contents of a PyObject* inside the debugger, see the following discussion; these commands may be called in the debugger:
  - ~~https://stackoverflow.com/questions/5356773/python-get-string-representation-of-pyobject~~

To check which libtiledb (and therefore which version) is loaded in the running Python process:
- in Python, run import os; os.getpid() to get the pid
- in a shell, run vmmap -p <pid> | grep libtiledb

Misc debugging

Given a memory address, ADDR, ctypes may be used to read value(s) from that address:

>>> import ctypes
>>> p = ctypes.cast(ADDR, ctypes.POINTER(ctypes.c_uint64))
>>> p[0], p[1]  # equivalent to *p and *(p+1) in C, and so on
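
To try this safely, the sketch below reads back a buffer that the same process owns. The buffer and its values are made up for illustration; reading an arbitrary or stale address this way can crash the interpreter.

```python
import ctypes

# A small array of uint64 values owned by this process (illustrative data).
buf = (ctypes.c_uint64 * 3)(10, 20, 30)

# ADDR would normally come from a debugger or a log; here we take it from buf.
addr = ctypes.addressof(buf)

# Reinterpret the integer address as a pointer to uint64 and index it.
p = ctypes.cast(addr, ctypes.POINTER(ctypes.c_uint64))
values = [p[i] for i in range(3)]
print(values)
```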

Defining the following stand-in for self allows most tests from test_libtiledb.py to be copy-pasted into the REPL and run directly:

>>> import os, tiledb, numpy as np
>>> self = lambda: None; self.path = lambda x: os.path.join("/tmp", x)
>>> [paste non-indented test block, and run]
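
An alternative sketch of the same idea uses types.SimpleNamespace and a fresh temporary directory instead of writing into /tmp directly. This is only a stand-in: it provides self.path and nothing else, so pasted tests that rely on other methods of the test class will still need adjusting.

```python
import os
import tempfile
from types import SimpleNamespace

# Fresh scratch directory so repeated runs do not collide with old arrays.
tmpdir = tempfile.mkdtemp()

# Minimal replacement for the test-case instance: only .path(name) is provided.
self = SimpleNamespace(path=lambda name: os.path.join(tmpdir, name))

print(self.path("my_array"))
```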

Debugging on macOS with gdb (note: does not currently work):

Install gdb from Homebrew.
Follow the signing instructions to give gdb sufficient access:
- https://sourceware.org/gdb/wiki/PermissionsDarwin
[TBD: so far unsuccessful. As of 2019/3 it is unclear (link 1) (link 2) whether any gdb version supports macOS 10.14.]

Modular compilation

TileDB-Py's setup.py supports a command line argument, --modular, which enables a modular build. By default, code in separate .pyx files is sourced into the main libtiledb.pyx file using the Cython include statement. When setup.py is run with --modular, the Cython compile-time constant TILEDBPY_MODULAR is set to True, and all files listed in MODULAR_SOURCES within setup.py are built as separate Cython modules (initially the only modular file is np2buf.pyx). When TILEDBPY_MODULAR is set, import is used to make the necessary function definitions available in libtiledb.pyx. The goal of this mechanism is to reduce compilation time by limiting the size of the pyx file. For more details and a usage example, see the following commits:

Analyzing reference count problems

Given a function (in pure python) which creates a DenseArray:

def foo():
  arr = tiledb.DenseArray(...)
  import pdb; pdb.set_trace()

Entering pdb at this point, we can print out the array:

(Pdb) p arr
<tiledb.libtiledb.DenseArray object at 0x000000123456789>

Copy the address!

Now set a breakpoint (or repeat pdb.set_trace()) at a location where we expect no references to arr to remain, for example somewhere after the function returns. At that point we can check the refcount and referrers as follows:

(Pdb) import ctypes, gc, sys
(Pdb) o = ctypes.cast(0x000000123456789, ctypes.py_object)
(Pdb) o
py_object(<tiledb.libtiledb.DenseArray object at 0x000000123456789>)
(Pdb) sys.getrefcount(o.value)
?
(Pdb) gc.get_referrers(o.value)
[...]

Note that ctypes.cast(<addr>, ctypes.py_object) does not increase the refcount of the target object, which can be verified by assigning a second variable to an identical ctypes.cast call and checking the refcount again.
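
The same steps can be rehearsed on a throwaway object outside of pdb. This sketch assumes CPython, where id() returns the object's address; the Dummy class and the holder list are illustrative only. Keep in mind that sys.getrefcount reports a count that includes the temporary reference created by the call itself.

```python
import ctypes
import gc
import sys


class Dummy:
    """Stand-in for the object under investigation."""


obj = Dummy()
holder = [obj]   # an extra referrer we expect gc to report
addr = id(obj)   # CPython-specific: id() is the memory address

# Recover the object from its address, as done in pdb with the copied address.
o = ctypes.cast(addr, ctypes.py_object)
recovered = o.value

count = sys.getrefcount(obj)       # includes the call's own temporary reference
referrers = gc.get_referrers(obj)  # containers that currently point at obj

print(recovered is obj, count, len(referrers))
```

This only works while the object is still alive; casting the address of an object that has already been freed is undefined behaviour and may crash the interpreter.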

Running against libtiledb with address sanitizer

TileDB-Py can be run against a libtiledb compiled with address sanitizer (ASAN) by using the --enable-sanitizer=address TileDB bootstrap option and then preloading the ASAN library before running TileDB-Py:

export LD_PRELOAD=/usr/lib64/libasan.so.4.0.0

(path above is for CentOS 7 / AL2; paths will vary based on Linux distribution)
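
When TileDB-Py is launched from another Python process, for example a test runner, the preload can be applied to the child's environment only. A minimal sketch, assuming the libasan path shown above; the helper name asan_env is hypothetical:

```python
import os


def asan_env(libasan_path, base_env=None):
    """Return a copy of the environment with libasan_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep any existing preloads, but make sure the ASAN runtime comes first.
    env["LD_PRELOAD"] = f"{libasan_path}:{existing}" if existing else libasan_path
    return env


# Example (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "-c", "import tiledb"],
#                  env=asan_env("/usr/lib64/libasan.so.4.0.0"))
```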

Building with address sanitizer

TileDB-Py may be built with address sanitizer support using the following exports before running setup.py:

export LFLAGS="-fsanitize=address"
export CXXFLAGS="-fsanitize=address -g -fno-omit-frame-pointer"
