[eml-dev] Revisiting the <measurementScale> categories

inigo isangil at lternet.edu
Fri Sep 19 08:31:21 PDT 2008



For the record, too: the <measurementScale> categories -- nominal, 
ordinal, ratio, interval and datetime -- are not the best way to 
classify the recorded measurables in a data table.  The categories are 
not free of controversy, and the guidelines and other attempts to 
explain how variable types should be assigned have not prevented the 
categories from being used inconsistently.

Example: a "date" has sometimes been documented as "datetime", but also 
as "nominal or ordinal" and even as "interval or ratio" - a wonderful 
spread. 

This kind of classification ambiguity is normal to a certain extent, but 
something can be done to improve both the efficiency of data 
documentation and post-processing and community agreement on practices.  

A good number of LTER sites have in practice simplified these categories 
further.  Essentially, there are 'dates', 'quantifiable measurables' and 
'all the rest' (all the rest includes free text such as comments, 
identifiers, pairs of codes and code definitions, nominals and ordinals).  
This practice has been adopted to remove some of the ambiguities that the 
original categories present, and for clarity of use.  A few people may 
feel that the differences between those categories are crystal clear - no 
doubt - but I have not found many of them.  
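
As a rough illustration of the simplification described above, the 
five categories could be collapsed to three with a lookup like the one 
below.  This is only a sketch: the group names ("date", "quantity", 
"other") and the function simplify() are made up for this example, and 
placing interval and ratio under 'quantifiable measurables' is my 
reading of the practice, not something any site has specified.

```python
# Hypothetical lookup from the five <measurementScale> categories to
# three simplified groups. The group names are assumptions for this sketch.
SIMPLIFIED_GROUP = {
    "nominal": "other",      # 'all the rest'
    "ordinal": "other",      # 'all the rest'
    "interval": "quantity",  # 'quantifiable measurables' (assumed)
    "ratio": "quantity",     # 'quantifiable measurables' (assumed)
    "datetime": "date",      # 'dates'
}


def simplify(scale: str) -> str:
    """Return the simplified group for a measurementScale category name."""
    return SIMPLIFIED_GROUP[scale.strip().lower()]
```

With this lookup, simplify("ordinal") and simplify("nominal") both land 
in "other", so the nominal-versus-ordinal question no longer has to be 
settled at documentation time.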

If exploring different categories (identifiers, codes, quantifiable 
measures, text, flags, dates...) for EML is something that no one on 
eml-dev is even willing to consider, perhaps reducing the number from 
5 to 3 would be a humbler goal for the sake of efficiency.  Sure, someone 
may be tempted to divide an interval type by a ratio type, with 
undesirable results as a consequence, but I think that risk already 
exists now (because of how the categories are used in practice). 
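
To make the interval/ratio risk concrete, here is a small sketch using 
temperature, which is interval-scaled in degrees Celsius and 
ratio-scaled in kelvin.  The two readings are invented for illustration 
and the helper celsius_to_kelvin() is just a name chosen for this 
example.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale Celsius value to ratio-scale kelvin."""
    return celsius + 273.15


# Two made-up temperature readings in degrees Celsius (interval scale).
warm_c, cool_c = 20.0, 10.0

# Dividing interval values gives a number, but it has no physical meaning:
# 20 degrees C is not "twice as warm" as 10 degrees C.
naive_ratio = warm_c / cool_c

# On the ratio scale (kelvin) the quotient is meaningful, and much smaller.
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)
```

Here naive_ratio is 2.0 while true_ratio is roughly 1.035, which is the 
kind of undesirable result that lumping interval and ratio together 
makes easier to produce.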

cheers, inigo


