Wish list
Jonathan A Rees edited this page Apr 6, 2020 · 6 revisions
Please read README first
- Documentation of method
- Try out more taxonomic sources e.g. CoL, MSW 2/3/4
- Translation to RCC-5
- Method
- Handle 'unclassified' ('incertae sedis') containers and child rank inconsistencies
- Compute RCC-5 `>` vs. the usual `=` when B-TNU has no exact A-TNU MRCA match
- Eliminate NCBI 'containers' without losing implicit disjointness claims (e.g. via ranks or via 'mutexes')
- Quantified uncertainty (confidence in RCC-5 judgments)...
- Match even when the genus differs (if closely related, e.g. a sibling also matches in the same way, and unambiguous)
- Match even when there is a gender error (-us vs. -a)
- Enlist genetic and occurrence data, and perhaps other data, in understanding TNUs and/or assigning data records to TNUs
- Display
- Fix display of synonyms; maybe 3 columns: accepted TNU in A hierarchy, a name that's in both A and B that enables the match (it's either A or an A synonym, and also either B or a B synonym), accepted TNU in B hierarchy
- Stop eliding repeated names with `=`; maybe put name comparison status (`=` vs. blank) in a separate column
- Show B nesting level somehow? Show A nesting better (indentation??)? Deal with ranks better? (NCBI ranks are not always consistent, sometimes not given at all)
- Group A-TNU children according to their B-TNU parent, when the implied A-group is split into multiple B-groups
- Maybe show all consistent RCC-5 options, as opposed to just suggesting one? Hmm...
- Nice HTML output similar to Avibase checklist comparator
- Graphviz? (that might be better done by separate downstream tools)
- Questions
- Is there a sensible way to distinguish 'wholly new' from, say, splitting? Or to suggest to the user that either is a possibility for 'no match'? (in the case of GenBank, we could look at the sequence records maybe, as a heuristic??)
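
Several items above refer to RCC-5 judgments between an A-TNU and a B-TNU. As a reminder of what the five relations mean, here is a minimal Python sketch that classifies two TNUs when each is modeled as a non-empty set of members. This is only an illustration: the function name `rcc5`, the set-based model, and the symbols `<`, `><` and `!` are assumptions made for this example (only `=` and `>` appear on this page), and it is not a description of how the tool actually computes its judgments, which has to work without knowing the full membership of each TNU.

```python
def rcc5(a, b):
    """Return the RCC-5 relation of set a to set b as a short symbol.

    Hypothetical helper for illustration only. Both arguments are
    non-empty sets standing in for the members of two TNUs.
    """
    a, b = frozenset(a), frozenset(b)
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    if a == b:
        return "="    # same members
    if a < b:
        return "<"    # a is a proper part of b
    if a > b:
        return ">"    # b is a proper part of a
    if a & b:
        return "><"   # partial overlap: some shared, some not
    return "!"        # disjoint: no shared members


# Made-up member labels, not real taxa.
a_group = {"m1", "m2", "m3"}
b_group = {"m1", "m2"}
print(rcc5(a_group, b_group))  # prints ">"
```

In this model, the 'Compute RCC-5 `>`' item corresponds to the case where the A-TNU holds everything in the B-TNU plus something more, and the 'show all consistent RCC-5 options' item corresponds to not knowing the sets well enough to pick a single branch.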