
Debugging


Debugging a Mismatch

So you've found that validation-test.py or plaidbench-validation-test.py reports a mismatch between two of your runs. How do we approach debugging this mismatch?

The first thing to verify is whether the runs should be mismatch-free at all. Mismatch checking only works if the costs between the runs are comparable; mismatches are expected if you change the cost function, e.g. from PERP to SLIL.

Minimize the test case

If you discovered the mismatch from compiling all of SPEC CPU2006, choose a single benchmark that has a mismatch and run only that. Ideally, choose a smaller benchmark rather than a larger one.

To determine which benchmark a block comes from, the spills.dat file produced by runspec-wrapper-optsched.py lists the function names, making it possible to search that file for the given block's function (drop the :## suffix; block names have the form functionName:blockNumber, e.g. foo:23).

Here's a script which does that search for you:

findblock.py

#!/usr/bin/env python3
# Usage: findblock.py spills.dat foo:23 [more blocks...]
import sys
import argparse

parser = argparse.ArgumentParser(description='Search spills.dat to find the benchmark for a block')
parser.add_argument('spills', help='The spills.dat file to search in. - for stdin')
parser.add_argument('blocks', help='The blocks to search for. This may include the `:##` part, or it may just be the mangled function name', nargs='*')

result = parser.parse_args()

if result.spills == '-':
    file = sys.stdin.read()
else:
    with open(result.spills, 'r') as f:
        file = f.read()

# Drop the `:##` block-number suffix; we search by function name only.
fns = (block.split(':')[0] for block in result.blocks)

# Find each function name, then walk backwards to the nearest `name:`
# header line preceding it, which names the benchmark.
fn_locs = [file.find(fn) for fn in fns]
fn_benchmarks = [file.rfind(':', 0, fnindex) for fnindex in fn_locs]
fn_benchmark_spans = [(file.rfind('\n', 0, e), e) for e in fn_benchmarks]
fn_benchmarks = [file[b + 1:e] for (b, e) in fn_benchmark_spans]

print('\n'.join(fn_benchmarks))

Visualize the DDG

It can be very helpful to visualize the DDG for a small mismatch case, presuming one exists.

Dumping a DDG

DataDepGraph has a WriteToFile function which can be used to dump the DDG to a file. Inside sched_region.cpp's FindOptimalSchedule function, add a check like if (dataDepGraph_->GetDagID() == std::string{"MyBlockOfConcern:32"}) and write the DDG to a file with that function when it matches. Then, re-run the single benchmark.

You may want to make a hotfuncs.ini file that lists only the function containing the mismatched block, which speeds up the run that dumps the DDG:

hotfuncs.ini

MyBlockOfConcern YES

Rough code for dumping the DDG:

if (dataDepGraph_->GetDagID() == std::string{"MyBlockOfConcern:123"}) {
  Logger::Info("Writing DDG."); // Helps verify that we actually wrote the DDG
  auto f = fopen("/home/<me>/path/to/the.ddg", "w");
  dataDepGraph_->WriteToFile(f, RES_SUCCESS, 1, 0);
  fclose(f);
}

Visualizing the DDG

Convert the DDG to a graphical format. Graphviz's dot works well for this. The following script converts the WriteToFile format to a .dot file:

ddg2dot.py

#!/usr/bin/env python3
# Usage: ddg2dot.py the.ddg -o the.dot [--filter-weights 0]
import argparse
import sys
import re

parser = argparse.ArgumentParser(description='Convert data_dep WriteToFile format to a .dot file')
parser.add_argument('input', help='The WriteToFile format file to convert. Input a single hyphen (-) to read from stdin')
parser.add_argument('-o', '--output', help='The destination to write to. Defaults to stdout')
parser.add_argument('--filter-weights', nargs='*', default=[], help='Filter out edge weights with the given values')

args = parser.parse_args()

if args.input == '-':
    infile = sys.stdin
else:
    infile = open(args.input, 'r')

filtered_weights = set(int(x) for x in args.filter_weights)

text = infile.read()
infile.close()

NODE_RE = re.compile(r'node (?P<number>\d+) "(?P<name>.*?)"(\s*"(?P<other_name>.*?)")?')
EDGE_RE = re.compile(r'dep (?P<from>\d+) (?P<to>\d+) "(?P<type>.*?)" (?P<weight>\d+)')

result = ['digraph G {\n']

# Emit one dot node per DDG node. The artificial boundary nodes are
# renamed to `entry`/`exit` for readability.
for match in NODE_RE.finditer(text):
    num = match['number']
    name = match['name']
    if name == 'artificial':
        name = 'entry' if match['other_name'] == '__optsched_entry' else 'exit'

    result.append(f'    n{num} [label="{name}:n{num}"];\n')

result.append('\n')

# Emit one dot edge per dependence. Data edges are labeled with only the
# weight; other dependence types include the type name. Weights listed in
# --filter-weights are omitted from the labels.
for match in EDGE_RE.finditer(text):
    from_ = match['from']
    to = match['to']
    type_ = match['type']
    weight = match['weight']

    weight_label = '' if int(weight) in filtered_weights else ':' + weight
    label = weight_label if type_ == 'data' else f'{type_}{weight_label}'
    label_section = f'label="{label}"'

    attr_section = f' [{label_section}]' if label else ''
    result.append(f'    n{from_} -> n{to}{attr_section};\n')

result.append('}\n')

output = sys.stdout
if args.output:
    output = open(args.output, 'w')

print(''.join(result), file=output)
if output is not sys.stdout:
    output.close()

Running dot on the output (e.g. dot -Tpng the.dot -o the.png) produces the graph visualization.

Visualization of Graph Transformations

A few small tricks make it easier to see what a Graph Transformation changed. In short, dump the DDG before and after the transformations, then proceed as above for each file. However, Graphviz will likely lay out the two DAGs differently.

To make the layouts identical, so that the only visible difference is the added edges, copy the added edges into the "before" .dot file and mark them with style=invis; they then influence the layout without being drawn. For example:

Before transformations:

digraph G {
    n0 [label="LEA64r:n0"];
    n1 [label="LEA64r:n1"];
    n2 [label="COPY:n2"];
    n3 [label="entry:n3"];
    n4 [label="exit:n4"];

    n0 -> n4 [label="other"];
    n1 -> n4 [label="other"];
    n2 -> n4 [label="other"];
    n3 -> n2 [label="other"];
    n3 -> n1 [label="other"];
    n3 -> n0 [label="other"];
}

After transformations:

digraph G {
    n0 [label="LEA64r:n0"];
    n1 [label="LEA64r:n1"];
    n2 [label="COPY:n2"];
    n3 [label="entry:n3"];
    n4 [label="exit:n4"];

    n0 -> n4 [label="other"];
    n0 -> n2 [label="other"];
    n0 -> n1 [label="other"];
    n1 -> n4 [label="other"];
    n1 -> n2 [label="other"];
    n2 -> n4 [label="other"];
    n3 -> n2 [label="other"];
    n3 -> n1 [label="other"];
    n3 -> n0 [label="other"];
}

Altered before to remove layout differences:

digraph G {
    n0 [label="LEA64r:n0"];
    n1 [label="LEA64r:n1"];
    n2 [label="COPY:n2"];
    n3 [label="entry:n3"];
    n4 [label="exit:n4"];

    n0 -> n4 [label="other"];
    n0 -> n2 [label="other" style=invis];
    n0 -> n1 [label="other" style=invis];
    n1 -> n4 [label="other"];
    n1 -> n2 [label="other" style=invis];
    n2 -> n4 [label="other"];
    n3 -> n2 [label="other"];
    n3 -> n1 [label="other"];
    n3 -> n0 [label="other"];
}
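Copying the added edges by hand gets tedious for larger DAGs. Here's a minimal sketch that automates it (a hypothetical helper, not part of OptSched; it assumes one edge per line, as emitted by ddg2dot.py above):

invis-added-edges.py

#!/usr/bin/env python3
# Hypothetical helper: hide the edges added by a graph transformation
# so the "before" graph lays out like the "after" graph.
# Usage: invis-added-edges.py before.dot after.dot > before-aligned.dot
import argparse
import re

parser = argparse.ArgumentParser(description='Mark edges added by a transformation with style=invis')
parser.add_argument('before', help='.dot file dumped before the transformations')
parser.add_argument('after', help='.dot file dumped after the transformations')
args = parser.parse_args()

# Assumes one edge per line, as ddg2dot.py emits.
EDGE_RE = re.compile(r'n\d+ -> n\d+')

with open(args.before) as f:
    before_edges = {m[0] for m in map(EDGE_RE.search, f) if m}

with open(args.after) as f:
    for line in f:
        line = line.rstrip('\n')
        m = EDGE_RE.search(line)
        if m and m[0] not in before_edges:
            # Added by the transformation: keep the edge so the layout
            # matches, but don't draw it.
            line = line.rstrip().rstrip(';')
            if line.endswith(']'):
                line = line[:-1] + ' style=invis];'
            else:
                line = line + ' [style=invis];'
        print(line)

This prints the "after" graph with the added edges hidden, which is equivalent to the altered "before" file above (transformations only add edges, so the "before" edges are a subset of the "after" edges).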

Dump Use/Defs

It can be very useful to have Use/Def information for the block, as well as some other debug info. To obtain this, build LLVM and OptSched in the Debug configuration, or pass -DLLVM_ENABLE_ASSERTIONS=ON when configuring CMake in release mode. This enables the -debug and -debug-only=optsched-ddg-wrapper command-line options, which dump this info.

To add these flags to the SPEC benchmarks, change your .cfg file to pass the flags to the compiler.
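For example, here's a hypothetical fragment (assuming a SPEC CPU2006-style .cfg with clang as the compiler, where LLVM cl::opt flags such as -debug-only are forwarded with -mllvm; the variable names may differ in your config):

COPTIMIZE   = -O3 -mllvm -debug-only=optsched-ddg-wrapper
CXXOPTIMIZE = -O3 -mllvm -debug-only=optsched-ddg-wrapper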
