Debugging, Profiling and Benchmarking DVC
You can add the -vv flag to any command to increase the verbosity of DVC's logs. For example:
$ dvc metrics diff -vv
2021-03-11 09:10:34,136 TRACE: Namespace(cprofile=False, cprofile_dump=None, pdb=False, instrument=False, instrument_open=False, quiet=0, verbose=2, version=None, cd='.', cmd='diff', a_rev=None, b_rev=None, targets=None, recursive=False, all=False, show_json=False, show_md=False, no_path=False, precision=None, func=<class 'dvc.command.metrics.CmdMetricsDiff'>)
2021-03-11 09:10:34,503 DEBUG: Check for update is enabled.
...
If you are using DVC's Python API or dvc.api, you can do the following to increase verbosity. Do this after all of your imports are done, because importing anything from dvc later might change the verbosity again.
import logging
logger = logging.getLogger("dvc")
logger.setLevel(5)  # 5 is lower than logging.DEBUG (10)
Any dvc command can be run with the --pdb flag, which drops you into the debugger on any exception.
By default, it tries to use ipdb as the debugger and falls back to pdb if ipdb is not available.
$ dvc metrics diff --pdb
$ dvc stage list data/dvc.yaml --pdb
> /home/user/dvc/dvc/dvcfile.py(133)_load()
132 is_ignored = self.repo.fs.exists(self.path, use_dvcignore=False)
--> 133 raise StageFileDoesNotExistError(self.path, dvc_ignored=is_ignored)
134
ipdb>
Similarly, if you are using the Python API, you can use the dvc._debug.debug context manager to achieve this.
from dvc._debug import debug
with debug():
    pass  # your code here
You could alternatively use pdb or breakpoint() in your code itself.
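For the "use pdb in your code itself" route, the sketch below shows a small post-mortem context manager built only on the standard library. It is not DVC's implementation: the name post_mortem_on_error and its post_mortem parameter are hypothetical, introduced here so the debugger hook can be swapped out.

```python
import pdb
import sys
from contextlib import contextmanager


@contextmanager
def post_mortem_on_error(post_mortem=pdb.post_mortem):
    """Open a debugger at the point of failure, then re-raise.

    `post_mortem` is a hypothetical hook: any callable that accepts a
    traceback object. It defaults to the standard library's pdb.post_mortem.
    """
    try:
        yield
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled.
        post_mortem(sys.exc_info()[2])
        raise
```

Wrapping a block in `with post_mortem_on_error():` opens pdb on the frame that raised. Inserting `breakpoint()` stops at a fixed line instead.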
With the --show-stack/--ss flag, you can inspect what dvc is doing at any moment by pressing Ctrl + T on macOS or Ctrl + \ on Linux. This prints the stack of the main thread, which can be useful for debugging when dvc freezes or hangs. It is not available on Windows.
You can use the dvc._debug.show_stack() context manager in the Python API if you want the same behavior.
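The general technique of printing the stack on a keypress can be sketched with the standard signal and traceback modules. This is an illustration under assumptions, not DVC's code: install_stack_dumper is a hypothetical helper, and it relies on Ctrl + \ sending SIGQUIT on Linux terminals, so it works on Unix only.

```python
import signal
import sys
import traceback


def install_stack_dumper(signum=signal.SIGQUIT, stream=None):
    """Print the interrupted stack whenever `signum` arrives (Unix only).

    Must be called from the main thread, because Python only allows
    signal handlers to be installed there.
    """

    def handler(received_signum, frame):
        out = stream if stream is not None else sys.stderr
        out.write(f"--- stack at signal {received_signum} ---\n")
        # `frame` is the frame that was executing when the signal arrived.
        traceback.print_stack(frame, file=out)

    # Return the previous handler so the caller can restore it later.
    return signal.signal(signum, handler)
```

After calling `install_stack_dumper()` at the start of a long-running script, pressing Ctrl + \ prints where the main thread currently is instead of killing the process.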
If you are having performance issues, we might ask you for profiling data. DVC supports two kinds of profiling data: deterministic (with cProfile) and statistical (with pyinstrument).
cProfile traces every Python call, which makes the command run somewhat slower than it would without profiling. Most of the time, though, that is enough to trace where the performance issues are, so it is what we will usually ask for. One disadvantage of cProfile data is the lack of full-stack records (why those functions are being called). Those can be gathered with another profiler, pyinstrument, which is a sampling profiler and has much lower overhead.
The --cprofile-dump <filename> flag generates cProfile data for the given command and writes it to the specified file.
You can send the file to us by email, issue, or chat so we can trace performance issues. For example:
$ dvc push --cprofile-dump dump.prof
Similarly, if you are using the Python API, you can use dvc._debug.profile to generate the cProfile data.
from dvc._debug import profile
with profile("dump.prof"):  # dumps profiling output to the file
    pass  # your code here
with profile():  # dumps profiling output to the terminal
    pass  # your code here
Alternatively, you can use cProfile.Profile to do the same.
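A minimal sketch of the cProfile.Profile route, using only the standard library. The helper name profile_to_file is hypothetical, and this is not how dvc._debug.profile is implemented.

```python
import cProfile


def profile_to_file(func, path):
    """Run `func()` under cProfile and dump the stats to `path`."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        return func()
    finally:
        # Stop collecting and write the stats even if func() raised.
        profiler.disable()
        profiler.dump_stats(path)
```

The resulting file is in the standard cProfile dump format, so it can be opened with the same tools as a --cprofile-dump file.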
snakeviz or tuna can be used to visualize the data.
Alternatively, pstats can also be used to analyze the data.
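As a rough illustration of the pstats route, the sketch below loads profiling data and returns a report sorted by cumulative time. The helper name top_functions is hypothetical. pstats.Stats accepts either a dump filename (such as dump.prof) or a cProfile.Profile object.

```python
import cProfile
import io
import pstats


def top_functions(profile_source, limit=10):
    """Return a text report of the `limit` most expensive functions.

    `profile_source` is a path to a cProfile dump or a cProfile.Profile.
    """
    out = io.StringIO()
    stats = pstats.Stats(profile_source, stream=out)
    # strip_dirs() shortens file paths; "cumulative" sorts by time spent
    # in a function including everything it calls.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return out.getvalue()
```

For a dump produced on the command line, `print(top_functions("dump.prof"))` shows the functions with the highest cumulative time first.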
You need to install pyinstrument first, as DVC does not come with it pre-installed.
After it's installed, you can add the --instrument-open flag to any dvc command to profile it. For example:
$ dvc status --instrument-open
This opens a webpage with the performance results. If you want to print them to the console instead, use the --instrument flag.
Similarly, when using the Python API, you can use dvc._debug.instrument to achieve this.
from dvc._debug import instrument
with instrument(html_output=True):  # opens a webpage
    pass  # your code here
with instrument():  # prints to the terminal
    pass  # your code here
Profiling with yappi requires yappi to be installed. It generates callgrind output, so you may also want to install kcachegrind (qcachegrind on macOS/Windows).
After that, you can add the --yappi flag to any DVC command. It generates a callgrind file named in the form callgrind.dvc-XXX.out, which can be viewed with kcachegrind/qcachegrind or any other compatible visualization tool.
Also consider the --yappi-separate-threads flag, which generates one callgrind file per thread and makes debugging multithreaded code much easier.
Similarly, you can use dvc._debug.yappi_profile from Python code when profiling a small part of the API.
Also, if you are new to kcachegrind/qcachegrind, please check this small guide to become familiar with its user interface.
Profiling with viztracer requires viztracer to be installed.
After that, you can add the --viztracer flag to any DVC command, which generates an output file named in the form viztracer.dvc-XXX.json. You can use the --viztracer-depth flag to customize the maximum stack depth, or --viztracer-async to visualize async tasks as separate "threads":
$ dvc status --viztracer --viztracer-depth 8 --viztracer-async
Similarly, you can use dvc._debug.viztracer_profile from Python code when profiling a small part of the API.
The results can be visualized with vizviewer, which is installed alongside viztracer:
$ vizviewer viztracer.dvc-20220406_091208.json
Profiling with filprofiler requires filprofiler to be installed. After it's installed, you can create a script:
# status.py
from dvc.repo import Repo
Repo().status()
And run it:
$ fil-profile run status.py
On Python 3.7 and above, there is a built-in import-time profiler: the -X importtime option. For example:
$ python -X importtime -m dvc --help
You can use tuna to visualize this as well. Example:
$ python -X importtime -m dvc --help 2> startup.log
$ tuna startup.log
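If tuna is not at hand, a log like startup.log can also be summarized with a few lines of Python. The sketch below assumes the usual -X importtime line layout, `import time: <self us> | <cumulative us> | <module>`. The helper name slowest_imports is hypothetical.

```python
def slowest_imports(log_text, limit=5):
    """Return (cumulative_us, self_us, module) tuples, slowest first."""
    prefix = "import time:"
    rows = []
    for line in log_text.splitlines():
        if not line.startswith(prefix):
            continue  # not an importtime record (e.g. regular stderr output)
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us = int(parts[0])
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # the header row has column titles instead of numbers
        rows.append((cumulative_us, self_us, parts[2].strip()))
    rows.sort(reverse=True)
    return rows[:limit]
```

Calling it on the file contents, for example `slowest_imports(open("startup.log").read())`, lists the modules with the largest cumulative import time.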
You can use tools like hyperfine to run benchmarks, as they perform multiple runs and provide statistical analysis.
$ hyperfine "dvc --help" --warmup 3
DVC's performance can be heavily influenced by disk caches, so it's recommended to use warmup runs.
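The same warmup idea applies when timing a Python callable directly. The sketch below is a simplified stand-in for what a tool like hyperfine does, not a replacement for it. The benchmark helper and its parameters are hypothetical.

```python
import statistics
import time


def benchmark(func, runs=10, warmup=3):
    """Time `func` over `runs` calls after `warmup` untimed calls.

    Returns (mean_seconds, stdev_seconds); `runs` must be at least 2
    for the standard deviation to be defined.
    """
    for _ in range(warmup):
        func()  # warm caches; these calls are not measured

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)
```

The untimed warmup calls keep a cold disk cache on the first invocation from skewing the reported mean.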
debug, profile and instrument can also be used as decorators during development (in addition to being context managers). Don't forget to call them as functions, though:
@debug() # ✅
@debug # ❎
For example:
@instrument()
def collect_repo(self, onerror: Callable[[str, Exception], None] = None):
    ...  # body omitted
If you want to mix them all, you can also use debugtools.
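Why the decorator form needs the call parentheses can be seen with a standard-library context manager. The sketch assumes the dvc._debug helpers behave like functions decorated with contextlib.contextmanager, which the source does not confirm. The names traced, work, plain and misused are hypothetical.

```python
import types
from contextlib import contextmanager

events = []


@contextmanager
def traced(label="block"):
    events.append(f"enter {label}")
    try:
        yield
    finally:
        events.append(f"exit {label}")


@traced()  # called: returns a context manager that also works as a decorator
def work():
    events.append("work")
    return 42


def plain():
    return "plain"


# This is what a bare `@traced` would do: the function is passed in as
# `label`, and the name ends up bound to a context manager object.
misused = traced(plain)
```

Here `work` is still an ordinary function whose body runs inside the context manager, while `misused` is no longer a function at all.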