[eml-dev] Revisiting the <measurementScale> categories
inigo
isangil at lternet.edu
Fri Sep 19 08:31:21 PDT 2008
For the record too. The <measurementScale> categories -- nominal,
ordinal, ratio, interval and datetime -- are not the best way to
classify the variables recorded in a data table. The categories are
not free of controversy, and the guidelines and other attempts to explain
how the variable types should be told apart and used have not led to
consistent use in practice.
Example: a "date" has sometimes been documented as "datetime", but also
as "nominal or ordinal" and even as "interval or ratio" - a wonderful
spread.
This type of classification ambiguity is normal to a certain extent, but
something can be done to improve both the efficiency of data
documentation and post-processing and community agreement on practices.
A good number of LTER sites have in practice simplified these categories
further. Essentially, there are 'dates', 'quantifiable measurables' and
'all the rest' (all the rest includes free text such as comments,
identifiers, code and code-definition pairs, nominals and ordinals).
This practice has been adopted to remove part of the ambiguity that the
original categories present, and for clarity of use. A few people may
feel that the differences between those categories are crystal clear - no
doubt - but I have not found many of them.
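To make the simplification concrete, here is a rough sketch of how the five
categories could be folded into the three groups described above. The group
labels and the simplify_scale helper are made up for illustration; they are
not part of EML or of any site's tooling.

```python
# Hypothetical mapping from the five <measurementScale> categories to the
# three simplified groups. The group names are illustrative assumptions.
SIMPLIFIED_GROUP = {
    "datetime": "date",
    "interval": "quantifiable",
    "ratio": "quantifiable",
    "nominal": "other",
    "ordinal": "other",
}


def simplify_scale(scale: str) -> str:
    """Return the simplified group for a measurementScale category name."""
    # Normalize case and whitespace so "Ratio " and "ratio" are treated alike.
    key = scale.strip().lower()
    if key not in SIMPLIFIED_GROUP:
        raise ValueError(f"unknown measurementScale category: {scale!r}")
    return SIMPLIFIED_GROUP[key]
```

Free text, identifiers and code lists would all end up in the "other" group
under this sketch, since they are documented as nominal or ordinal today.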
If exploring different categories (identifiers, codes, quantifiable
measures, text, flags, dates...) for EML is something that no one on
eml-dev is even willing to consider, perhaps reducing the number from 5 to
3 would be a humbler goal for the sake of efficiency. Sure, someone may be
tempted to divide an interval type by a ratio type with undesirable
results as a consequence, but I think that risk already exists today
(because of how the categories are used in practice).
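As one reading of that risk, here is a small sketch of why dividing
interval-scale values is questionable: the result depends on an arbitrary
zero point, while for ratio-scale values it does not. Temperature in Celsius
stands in for the interval type and mass for the ratio type; the function
names are only illustrative.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift a Celsius temperature to the Kelvin scale (true zero)."""
    return celsius + 273.15


def kg_to_grams(kg: float) -> float:
    """Rescale a mass from kilograms to grams."""
    return kg * 1000.0


# Interval scale: the quotient changes when the zero point moves.
celsius_quotient = 20.0 / 10.0
kelvin_quotient = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

# Ratio scale: the quotient survives a change of unit.
kg_quotient = 20.0 / 10.0
gram_quotient = kg_to_grams(20.0) / kg_to_grams(10.0)

print(celsius_quotient, round(kelvin_quotient, 3))  # quotients disagree
print(kg_quotient, gram_quotient)                   # quotients agree
```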
cheers, inigo