[eml-dev] [Bug 2512] - require text content in elements to be non-empty
bugzilla-daemon at ecoinformatics.org
Sat Nov 8 12:56:55 PST 2008
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2512
------- Comment #3 from mob at icess.ucsb.edu 2008-11-08 12:56 -------
We need to look at the effect on instance documents of switching all xs:string
to NonEmptyStringType. This type-switch will probably have a bigger effect on
the ability of authors to migrate their documents than the changes to the
document structure itself. Structure changes will be handled by the XSL
stylesheet, but retyping all strings means that content could now be required
where none previously existed.
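As a rough way to gauge that effect before validating, instance documents
could be screened for elements whose text is empty or whitespace-only. The
sketch below is only a single-line heuristic; the helper name
find_empty_elements and the sample lines are invented for illustration and are
not part of EML or of any existing tool.

```shell
# Hypothetical helper: report lines holding an element with empty or
# whitespace-only text, e.g. <a></a>, <a>   </a> or <a/>.
# Single-line heuristic only; it is no substitute for schema validation.
find_empty_elements() {
  grep -n -e '<\([A-Za-z][A-Za-z0-9:]*\)[^>]*>[[:space:]]*</\1>' \
          -e '<[A-Za-z][A-Za-z0-9:]*[[:space:]]*/>'
}

# Invented sample lines, not taken from a real EML document.
sample='<title>Soil moisture 2001</title>
<keyword></keyword>
<description>   </description>
<mediumName/>'

printf '%s\n' "$sample" | find_empty_elements
```

Lines reported this way are only candidates: an empty element breaks
validation only if its declaration is actually retyped.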
To start, I considered just the anonymous simple type elements that are
required by EML and are declared with type="xs:string". It seemed reasonable
that if an element is optional, its content can also be optional. In all, there
are 81 of these, which are generally easy to retype with a statement like:
sed -e '/<xs:element name/{
/minOccurs="0"/!s/xs:string/res:NonEmptyStringType/
}
'
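To illustrate what that statement does, here is a small sketch that wraps it
in a shell function and runs it over a two-line fragment. The function name
retype_required and the fragment are made up for this example; the real
schema declarations are longer.

```shell
# Hypothetical wrapper around the sed statement: on element-declaration
# lines that do not carry minOccurs="0", swap xs:string for the new type.
retype_required() {
  sed -e '/<xs:element name/{
/minOccurs="0"/!s/xs:string/res:NonEmptyStringType/
}'
}

# Invented fragment: one required and one optional element.
sample='<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string" minOccurs="0"/>'

printf '%s\n' "$sample" | retype_required
```

Only the first line should come back retyped. Note that the test is per line,
so a declaration whose minOccurs attribute sits on a following line would be
retyped as if it were required.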
There are other elements which could be examined and retyped manually, or which
would be caught by a general s/xs:string/res:NonEmptyStringType/. For example,
see <keyword> (eml-resource.xsd): it is a complexType/simpleContent, so the
reference to xs:string occurs below the element declaration. Other elements
(and many attributes) use xs:restriction base="xs:string" as the start of an
enumeration list, but changing these to base="NonEmptyStringType" seems
superfluous.
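One possible way to find the cases left for manual review is to list the
xs:string references that do not sit on an element-declaration line, while
skipping the xs:restriction bases. This is a sketch only; the helper name
list_manual_candidates and the sample fragment are invented here, and the
fragment is a simplified stand-in for the real <keyword> declaration.

```shell
# Hypothetical helper: list xs:string references that the element-line sed
# cannot reach, skipping xs:restriction bases (enumeration lists).
list_manual_candidates() {
  grep -n 'xs:string' | grep -v -e '<xs:element name' -e '<xs:restriction'
}

# Invented, simplified fragment in the style of a simpleContent element.
sample='<xs:element name="keyword">
<xs:complexType><xs:simpleContent>
<xs:extension base="xs:string"/>
</xs:simpleContent></xs:complexType>
</xs:element>
<xs:restriction base="xs:string">'

printf '%s\n' "$sample" | list_manual_candidates
# prints: 3:<xs:extension base="xs:string"/>
```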
So to start, only one schema file, "eml-resource.xsd", has been checked into
CVS, so that others can try out the effect of NonEmptyStringType while
its scope is small. In particular, I was thinking about Morpho. Seven element
declarations in this file that were formerly xs:string are now
NonEmptyStringType; see the list below. I think that the Morpho wizards deal
with only title, references and keyword, although all of them are available in
the tree editor. My local copy has all 81 (anonymous, simple) element
declarations retyped (in 17 schema docs), plus the 5 anonymous attributes. I am
testing a variety of EML201 documents from the LTER metacat against this schema
as I convert them -- basically while I work on the XSL stylesheet.
title
distribution/connectionDefinition/parameterDefinition/name
distribution/connectionDefinition/parameterDefinition/description
distribution/connection/parameter/name
distribution/connection/parameter/description
distribution/offline/mediumName
references (multiple paths)
keyword (a named type)