
"ground truth" discussion #24

@freeman-lab

There's been lots of discussion of the "ground truth" labels currently used for NeuroFinder, so we wanted to consolidate that discussion in one place, and get feedback on some new ideas for moving forward.

current concerns

The labels used now reflect a mix of approaches, including activity-independent nuclear labeling, hand labeling using the raw data, hand labeling using various summary statistics, and hand curation of semi-automated methods.

All have advantages and disadvantages, but the inconsistency has been a source of confusion for both algorithm developers and those trying to interpret the results (see for example #15 and #16). A particular concern is that variability in performance across algorithms may reflect not only differences among the algorithms themselves but also differences in how ground truth was defined for each dataset.

moving forward

Ideally, we should have a ground truth definition that (1) can be arrived at by following a clearly specified procedure, (2) would yield similar answers if multiple people followed that procedure, and (3) is applied consistently to all training and testing datasets.

Here's one proposal:

  1. Provide each of several independent labelers (at least 3-5) with a mean image and a local correlation image (a sketch of computing these summaries appears after this list)
  2. Also provide the labelers with several examples of what individual neurons look like
  3. Have them label all datasets and aggregate the results via some consensus procedure (one possible aggregation is sketched below)
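
For item 1, here is a minimal numpy sketch of one common way to compute the two summary images, assuming the movie is stored as a (time, height, width) array; the function name and the commented loading step are illustrative, not part of any existing NeuroFinder tooling:

```python
import numpy as np

def local_correlation_image(movie, neighborhood=1):
    """Correlate each pixel's time series with those of its neighbors.

    movie        : ndarray with shape (time, height, width)
    neighborhood : radius of the square neighborhood (1 -> 8 neighbors)
    """
    # z-score each pixel's time series so correlation reduces to a mean product
    movie = movie.astype(np.float64)
    movie -= movie.mean(axis=0)
    std = movie.std(axis=0)
    std[std == 0] = np.inf          # flat pixels get correlation 0
    movie /= std

    h, w = movie.shape[1:]
    corr = np.zeros((h, w))
    count = 0
    for dy in range(-neighborhood, neighborhood + 1):
        for dx in range(-neighborhood, neighborhood + 1):
            if dy == 0 and dx == 0:
                continue
            # note: np.roll wraps around the image borders, so edge pixels
            # are only approximate in this sketch
            shifted = np.roll(np.roll(movie, dy, axis=1), dx, axis=2)
            corr += (movie * shifted).mean(axis=0)
            count += 1
    return corr / count

# summaries shown to each labeler
# movie = load_movie(...)             # hypothetical loader
# mean_image = movie.mean(axis=0)
# corr_image = local_correlation_image(movie)
```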

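For item 3, one simple consensus procedure (by no means the only option) is a pixel-wise majority vote across labelers, followed by splitting the agreed-upon pixels into connected components; the function and argument names below are illustrative only:

```python
import numpy as np
from scipy import ndimage

def consensus_rois(label_masks, min_votes=None):
    """Combine binary masks from several labelers by pixel-wise majority vote.

    label_masks : list of boolean (height, width) arrays, one per labeler,
                  with True wherever that labeler marked any neuron
    min_votes   : number of labelers that must agree (default: strict majority)
    """
    votes = np.sum(label_masks, axis=0)
    if min_votes is None:
        min_votes = len(label_masks) // 2 + 1
    consensus = votes >= min_votes
    # split the consensus mask into individual ROIs via connected components
    labeled, n = ndimage.label(consensus)
    return [labeled == i for i in range(1, n + 1)]
```
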
What do people think of this idea? Or other ideas?

cc @marius10p @agiovann @epnev @Selmaan @aaronkerlin @sofroniewn @svoboda314 @boazmohar @syncrostone
