[hist] No clear TH1K use case? #19761

@guitargeek

There is this TH1K class that has no tests and only one tutorial. It caught my attention because there is a JIRA issue about it.

I tried it out to get a feeling for it, comparing it with a TH1F when only filling one bin:

const int n = 5; // five bins over [-2.5, 2.5], matching the output below
TH1F hf{"h1", "title", n, -2.5, 2.5};
TH1K hk{"h2", "title", n, -2.5, 2.5};

std::vector<double> vals{-0.1, +0.1};

for (double x : vals) {
    hk.Fill(x);
    hf.Fill(x);
}

TAxis *xaxis = hk.GetXaxis();
for (int i = 1; i <= n; ++i) {
    double xmin = xaxis->GetBinLowEdge(i);
    double xmax = xaxis->GetBinUpEdge(i);
    std::cout << "bin " << i
        << " [" << xmin << ", " << xmax << "] : "
        << hf.GetBinContent(i) << " : "
        << hk.GetBinContent(i) << std::endl;
}

std::cout << "TH1F sum : " << hf.GetSumOfWeights()
          << " (sumw2 = " << hf.GetSumw2()->GetSum() << ")" << std::endl;

std::cout << "TH1K sum : " << hk.GetSumOfWeights()
          << " (sumw2 = " << hk.GetSumw2()->GetSum() << ")" << std::endl;

The result is:

bin 1 [-2.5, -1.5] :   0.0 :   0.175439
bin 2 [-1.5, -0.5] :   0.0 :   0.606061
bin 3 [-0.5,  0.5] :   2.0 :   6.66667
bin 4 [ 0.5,  1.5] :   0.0 :   0.606061
bin 5 [ 1.5,  2.5] :   0.0 :   0.175439

TH1F sum : 2 (sumw2 = 0)
TH1K sum : 8.22966 (sumw2 = 0)

What is going on? The neighboring bins have non-zero content, which makes sense because the entries are smeared using the k-nearest-neighbours method. But if the content is smeared, the central bin should hold strictly less than the 2 entries of the original histogram, not 6.67. I would also expect the normalization to be preserved, yet the TH1K sums to 8.23 instead of 2.
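For intuition, here is a minimal standalone sketch (no ROOT needed) of a k-nearest-neighbour density estimate evaluated at a bin centre. The formula, the stopping rule (use at least min(3, N-1) neighbours and keep going until the search radius exceeds one bin width), and the helper name `knnBinContent` are assumptions made for illustration, not something confirmed here against the TH1K source. The point is that the estimate scales like `width / distance`, so two entries sitting 0.1 away from a bin centre can yield a content far above 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not a ROOT API): k-nearest-neighbour estimate of the
// bin content at bin centre x, for a bin of the given width.
double knnBinContent(std::vector<double> sample, double x, double width)
{
    assert(width > 0.0);
    const int n = static_cast<int>(sample.size());
    if (n == 0) return 0.0;
    std::sort(sample.begin(), sample.end());

    // jl: index of the last entry <= x (or -1); jr: first entry to its right.
    int jl = static_cast<int>(
        std::upper_bound(sample.begin(), sample.end(), x) - sample.begin()) - 1;
    int jr = jl + 1;

    const int kmin = std::min(3, n - 1); // assumed default neighbour count
    double dist = 0.0;                   // distance to the last neighbour used
    int k = 0;                           // number of neighbours used so far

    // Absorb the nearest remaining neighbour until at least kmin are used
    // and the search radius exceeds one bin width (or the sample runs out).
    while (k < kmin || dist <= width) {
        if (jl < 0 && jr >= n) break;
        const double dl = (jl >= 0) ? std::abs(sample[jl] - x) : 1e20;
        const double dr = (jr < n) ? std::abs(sample[jr] - x) : 1e20;
        if (dl < dr) { dist = dl; --jl; } else { dist = dr; ++jr; }
        ++k;
    }
    // Assumed estimate: N * (k/2) / (N+1) * width / distance.
    // Note: dist is 0 if every entry sits exactly at x; not handled here.
    return n * 0.5 * k / (n + 1.0) * width / dist;
}
```

Calling `knnBinContent({-0.1, 0.1}, c, 1.0)` for the bin centres c = -2, -1, 0, 1, 2 gives about 0.175, 0.606, 6.667, 0.606 and 0.175, which matches the TH1K column above. Under these assumptions nothing ties the sum of the bin contents to the number of entries.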

I ran ROOT versions back to 6.26, and they all have this behavior.

With this blown-up bin content, does TH1K even give the right fit results? I think it will underestimate the statistical uncertainties: the statistics are artificially inflated, and no custom sum of squared weights is stored to correct back to the original statistical power.
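To put a rough number on this concern, here is a minimal sketch that assumes a fit treats each bin content as a Poisson count and assigns it an error of sqrt(content). That assumption and the helper name `poissonRelativeError` are for illustration only; what error TH1K actually reports per bin has not been checked here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative uncertainty sqrt(c)/c = 1/sqrt(c) that a
// Poisson-style error model would assign to a bin with content c.
double poissonRelativeError(double content)
{
    assert(content > 0.0);
    return 1.0 / std::sqrt(content);
}
```

Under that assumption, the central bin would get a relative uncertainty of about 0.39 from the TH1K content of 6.66667, versus about 0.71 from the 2 entries that were actually filled, so the error would be understated by a factor of roughly 1.8.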

Given these flaws, and the fact that nobody has complained about them, I presume we should deprecate and remove this statistically unsafe TH1K class (unless I am missing something?).
