Syntax validation and differences between uom-systems/ucum and ucum-essence.xml #173

@JohnTimm

I am looking for a JSR-385 based library to parse and validate UCUM units in our FHIR server implementation (http://github.com/ibm/fhir), with the potential for supporting unit conversion in the future. I wrote a unit test that parses every code listed at http://hl7.org/fhir/valueset-ucum-common.html and found a number of issues:

  1. handling of annotations in numerator or denominator (e.g. code: %/100{WBC}, display: percent / 100 WBC)
    Encountered " <ANNOTATION> "{WBC} "" at line 1, column 6.|
    For this one there are also problems with this syntax: /{oif} where there is only an annotation in the denominator (or the numerator)

  2. handling of annotations that contain spaces (e.g. code: %{Negative Control}, display: percent Negative Control)
    Lexical error at line 1, column 11. Encountered: " " (32), after : "{NEGATIVE"

  3. missing symbols (e.g. [iU] (or [IU]) for international units, bit_s, Bd, etc.)

Here are the numbers from the codes listed in that valueset:

Total: 1364
Success: 1117
Error: 247
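For anyone who wants to reproduce these counts, here is a minimal sketch of the kind of tally loop I mean. The class name `UcumParseTally` and its `Result` type are made up for illustration, and the parser is passed in as a plain `Consumer<String>` so the sketch does not assume any particular library call. It also assumes the parser signals failure with a `RuntimeException`; if the lexer throws an `Error` subclass instead, the catch would need to be widened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of any UCUM library.
final class UcumParseTally {

    /** Outcome of running a parser over a list of codes. */
    static final class Result {
        final int total;
        final List<String> failures;

        Result(int total, List<String> failures) {
            this.total = total;
            this.failures = failures;
        }

        int successes() {
            return total - failures.size();
        }
    }

    /**
     * Feeds every code to the supplied parser and records the ones that throw.
     * Assumption: a parse failure surfaces as a RuntimeException.
     */
    static Result tally(List<String> codes, Consumer<String> parser) {
        List<String> failures = new ArrayList<>();
        for (String code : codes) {
            try {
                parser.accept(code);
            } catch (RuntimeException e) {
                failures.add(code);
            }
        }
        return new Result(codes.size(), failures);
    }
}
```

In practice the `parser` argument would be a lambda wrapping whatever parse call the UCUM format implementation exposes; `failures` then holds the codes to list in a report.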

It looks like part of the problem is how strict the UCUM format parser is. In the short term, I can look at "fixing up" some of the codes before passing them to the parser (e.g. remove spaces from annotations). The thing that concerns me the most, however, is how many symbols are missing, especially when you compare what's in ucum-essence.xml with the resource files that the format parser uses.
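The "fixing up" step for annotations could look roughly like the sketch below. `AnnotationFixup` and `stripAnnotationSpaces` are hypothetical names; the only thing it relies on is that UCUM annotations are delimited by curly braces and do not nest. It rewrites the text inside the braces, so the original code should be kept around for display.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper, not part of any UCUM library.
final class AnnotationFixup {

    // One annotation: an opening brace, any run of non-brace characters, a closing brace.
    private static final Pattern ANNOTATION = Pattern.compile("\\{[^{}]*\\}");

    /** Removes space characters inside {...} annotations and leaves the rest of the code untouched. */
    static String stripAnnotationSpaces(String code) {
        Matcher m = ANNOTATION.matcher(code);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String cleaned = m.group().replace(" ", "");
            // quoteReplacement guards against '$' or '\' appearing in the annotation text.
            m.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this, `stripAnnotationSpaces("%{Negative Control}")` yields `%{NegativeControl}`, while a code without annotations such as `mg/dL` passes through unchanged. It does nothing for the missing symbols.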

Is there a way to configure UCUMFormatParser to use ucum-essence.xml as a starting point for its symbol map? I looked into using Eclipse UOMo, but it is less active than this project and does not appear to be up to date on its JSR 385 compliance. Please advise.
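To make the comparison concrete, here is a rough sketch that pulls the case-sensitive unit codes out of ucum-essence.xml using only the JDK's DOM parser, so they can be diffed against the parser's symbol map. It does not configure UCUMFormatParser in any way. The class name `UcumEssenceCodes` is made up, and the element names (`base-unit`, `unit`) and the `Code` attribute are my assumptions about the file's layout.

```java
import java.io.InputStream;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any UCUM library.
final class UcumEssenceCodes {

    /**
     * Collects the case-sensitive codes of all base units and units.
     * Assumption: each <base-unit> and <unit> element carries a "Code" attribute.
     */
    static Set<String> readCodes(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
            Set<String> codes = new LinkedHashSet<>();
            for (String tag : new String[] {"base-unit", "unit"}) {
                NodeList nodes = doc.getElementsByTagName(tag);
                for (int i = 0; i < nodes.getLength(); i++) {
                    String code = ((Element) nodes.item(i)).getAttribute("Code");
                    if (!code.isEmpty()) {
                        codes.add(code);
                    }
                }
            }
            return codes;
        } catch (Exception e) {
            // Wrap checked parser and I/O exceptions so callers need no throws clause.
            throw new IllegalStateException("Could not read UCUM definitions", e);
        }
    }
}
```

Removing the symbols the format parser already knows from the returned set would give the list of symbols that are defined in ucum-essence.xml but missing from the parser.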

Here's a list of the 247 codes that generated exceptions:

%/100{WBC}
%{Negative Control}
/[arb'U]
/[HPF]
/[iU]
/[LPF]
/[HPF]
/[LPF]
/10*10
/10*12
/10*12{rbc}
/10*6
/10*9
/100{cells}
/100{neutrophils}
/100{spermatozoa}
/100{WBC}
/100{WBCs}
/cm[H2O]
[APL'U]
[APL'U]/mL
[arb'U]
[arb'U]/L
[arb'U]/mL
[AU]
[BAU]
[beth'U]
[beth'U]
[CFU]
[CFU]/L
[CFU]/mL
[Ch]
[drp]
[drp]/[HPF]
[drp]/h
[drp]/min
[drp]/mL
[drp]/s
[GPL'U]
[iU]
[IU]/(2.h)
[IU]/(24.h)
[IU]/10*9{RBCs}
[IU]/d
[IU]/dL
[IU]/g
[IU]/g{Hb}
[iU]/g{Hgb}
[IU]/h
[IU]/kg
[IU]/kg/d
[IU]/L
[IU]/min
[IU]/mL
[MPL'U]
[tb'U]
[todd'U]
[todd'U]
{# of calculi}
{# of donor informative markers}
{# of fetuses}
{# of informative markers}
{2 or 3 times}/d
{3 times}/d
{4 times}/d
{5 times}/d
{cells}/[HPF]
{clock time}
U{G}
{P2Y12 Reaction Units}
10*12/L
10*3
10*3.{RBC}
10*3.U
10*3/L
10*3/mL
10*3/uL
10*3{Copies}/mL
10*-3{Polarization'U}
10*5
10*6
10*6.[iU]
10*6.eq/mL
10*6.U
10*6/{Specimen}
10*6/kg
10*6/L
10*6/mL
10*6/mm3
10*6/uL
10*-6{Immunofluorescence'U}
10*8
10*9/L
10*9/mL
10*9/uL
cm[H2O]
cm[H2O]/(s.m)
cm[H2O]/L/s
cm[Hg]
dB
eq
eq/L
eq/mL
eq/mmol
eq/umol
GBq
[iU]
k[IU]/L
k[IU]/mL
kPa
m[iU]
m[IU]/L
m[IU]/mL
meq
meq/(12.h)
meq/(2.h)
meq/(24.h)
meq/(8.h)
meq/(8.h.kg)
meq/(kg.d)
meq/{Specimen}
meq/d
meq/dL
meq/g
meq/g{Cre}
meq/h
meq/kg
meq/kg/h
meq/kg/min
meq/L
meq/m2
meq/min
meq/mL
mg/d/(173.10*-2.m2)
mL/cm[H2O]
mL/min/(173.10*-2.m2)
mm[H2O]
mm[Hg]
mosm
mosm/kg
mosm/L
mPa
ng/10*6
osm/kg
osm/L
U/10*10{cells}
U/10*12
U/10*6
U/10*9
u[IU]
u[IU]/L
u[IU]/mL
ueq
ueq/L
ueq/mL
10*4/uL
[bdsk'U]
cm[H2O]/s/m
{CPM}/10*3{cell}
U/10*10
U/(10.g){feces}
U{25Cel}/L
U{37Cel}/L
U/10*12{RBCs}
{Globules}/[HPF]
g/(8.h){shift}
g/kg/(8.h){shift}
[HPF]
[GPL'U]/mL
[MPL'U]/mL
[in_i'H2O]
[IU]
[IU]/L{37Cel}
[IU]/mg{creat}
[ka'U]
[LPF]
[mclg'U]
meq/g{creat}
meq/{specimen}
meq/{total_volume}
10*6.[CFU]/L
10*6.[IU]
10*6/(24.h)
mPa.s
ng/10*6{RBCs}
nmol/min/10*6{cells}
{#}/[HPF]
{#}/[LPF]
osm
/10*4{RBCs}
/[IU]
/10*3
/10*3.{RBCs}
/10*12{RBCs}
10*3{copies}/mL
10*3{RBCs}
%[slope]
/100{Spermatozoa}
[Amb'a'1'U]
[CCID_50]
[D'ag'U]
[diop]
[dye'U]
[FFU]
[hnsf'U]
[hp_C]
[hp_M]
[hp_Q]
[hp_X]
[in_i'Hg]
[iU]/dL
[iU]/g
[iU]/kg
[iU]/L
[iU]/mL
[knk'U]
[Lf]
[mesh_i]
[MET]
[p'diop]
[PFU]
[PNU]
[S]
[smgy'U]
[smoot]
[TCID_50]
[USP'U]
10*
10^
a_g
a_j
a_t
b
B
B[kW]
B[mV]
B[SPL]
B[uV]
B[V]
B[W]
Bd
bit_s
k[iU]/mL
m[H2O]
m[Hg]
R
REM

Needs #59
