[eml-dev] EML 2.0.2 changes to text leaf nodes
Margaret O'Brien
mob at icess.ucsb.edu
Thu Mar 20 17:15:43 PDT 2008
First, one comment on Chris's original question about drawbacks to mixed
content instead of xs:string: I recall one possible drawback in
searches. If someone searches on the string "uptake rates for Alnus
tenuifolia", the search will fail if <title> is mixed content
(txt:TextType) rather than an xs:string, because the embedded tags
interrupt the text. One solution is an optional <complexTitle>
which is txt:TextType, in addition to <title>. The search on the simple
<title> returns the correct doc, and the title template needs to look
for the complexTitle first. Apologies for not remembering who to credit
for this - I think it was a verbal exchange.
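A minimal Python sketch of the search problem and the proposed fallback, assuming a search that does plain substring matching on serialized content. The <complexTitle> element is only the proposal described above, not part of EML, and the helper names (serialized, flattened_text, display_title) are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical record: <complexTitle> is the proposed element, not part of EML.
DATASET = """
<dataset>
  <title>Acetylene reduction and 15N2 uptake rates for Alnus tenuifolia and Alnus crispa in six different successional habitats</title>
  <complexTitle>Acetylene reduction and 15N2 uptake rates for
    <emphasis>Alnus tenuifolia</emphasis> and
    <emphasis>Alnus crispa</emphasis>
    in six different successional habitats</complexTitle>
</dataset>
"""

QUERY = "uptake rates for Alnus tenuifolia"


def serialized(elem: ET.Element) -> str:
    """Return the element as raw XML text, embedded tags included."""
    return ET.tostring(elem, encoding="unicode")


def flattened_text(elem: ET.Element) -> str:
    """Join all text nodes under elem and collapse runs of whitespace."""
    return " ".join("".join(elem.itertext()).split())


def display_title(dataset: ET.Element) -> Optional[ET.Element]:
    """Prefer the rich <complexTitle>; fall back to the plain <title>."""
    complex_title = dataset.find("complexTitle")
    if complex_title is not None:
        return complex_title
    return dataset.find("title")


if __name__ == "__main__":
    root = ET.fromstring(DATASET)
    rich = root.find("complexTitle")
    print("plain title matches:   ", QUERY in root.findtext("title"))
    print("raw mixed content:     ", QUERY in serialized(rich))
    print("flattened mixed content:", QUERY in flattened_text(rich))
    print("element used for display:", display_title(root).tag)
```

The substring search succeeds on the plain <title> and fails on the raw mixed content, where the <emphasis> tag sits between "for" and "Alnus". A search engine that flattens text nodes before indexing would not need the duplicate element, so this sketch only applies where searches see the serialized markup.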
And more generally: Titles frequently include scientific notation, units
and species binomials, and require formatting to convey all their
meaning and not look flattened. However, I think that txt:TextType
should be applied on a case-by-case basis, not to all xs:strings, or as
a workaround for shortcomings. <name>s and <description>s are probably
adequate as strings. If an eml feature could benefit from richer xml,
then those needs should be addressed individually.
Cheers -
margaret
Christopher Jones wrote:
> Hi all,
>
> I strongly agree that content and presentation, ideally, should be
> kept separate by allowing stylesheets to handle the latter. I'm
> struggling a bit with what constitutes 'content'. A structural tag
> such as <title> lends 'meaning' to the contained text, at least in
> English. A <b> tag in HTML seems much more presentational - it
> doesn't add meaning, merely emphasis. However, when formatting
> conventions in scientific domains lend 'meaning' to text, like
> italicizing species binomials, it seems that we need to provide the
> facility for this, lest we lose semantic information.
>
> I agree with Wade that we walk a fine line here between expressing
> semantics and presenting. Cluttered EML docs could abound. Is the
> preservation of 'meaning' worth the trade-off?
>
> On Mar 20, 2008, at 3:06:43 PM, Wade Sheldon wrote:
>
>> I think your casual example makes this point very well - what real
>> use is preserving <emphasis> markup in a data set title? That's what
>> XSL is for. If this is a legacy issue for some metadata providers,
>> then I think they should be encouraged (or helped) to offload
>> embedded display markup when porting to EML.
>>
>
> True, my example was a bit simple. A better example would be the
> species binomial case:
>
> <title>
> Acetylene reduction and 15N2 uptake rates for
> <emphasis>Alnus tenuifolia</emphasis> and
> <emphasis>Alnus crispa</emphasis>
> in six different successional habitats
> </title>
>
> where the stylesheet renders emphasis tags inside title tags in
> italics. This certainly is a presentation issue, but one that imparts
> meaning based on known conventions. Notice how the 15N2 also seems to
> lose meaning in this title without appropriate formatting.
>
> Perhaps there is another way to deal with this, though? It seems too
> big of a job to try to infer meaning from straight xs:string word
> combinations (such as Alnus tenuifolia) and then present it correctly
> with the right markup for presentation.
>
> On Mar 20, 2008, at 3:22:06 PM, inigo wrote:
>
>> Margaret O'Brien and I, with the help of Mark Servilla and, to some
>> extent, J. Brunt and Corinna Gries, worked on this minor fix. In it,
>> we addressed the bug that Chris is talking about, yet the workaround
>> that Chris is proposing does not fix the fact that there are
>> DocBook 4.* Schema tags present in the documentation module of EML
>> that are not declared in the text-module of EML. Examples are <url>
>> and <citetitle>. By redefining the types, we address these errors
>> partially, yet some stringent XML editors (XML Spy 2007, 2008) will
>> flag these undeclared tags as critical errors. This makes the schema
>> look rather unprofessional.
>>
>
> On Mar 20, 2008, at 3:39:10 PM, James Brunt wrote:
>
>> Also, I'm in agreement with Inigo that making the schema "clean"
>> should be a priority in this bug-fix release.
>>
>
> Fair enough. Consistent and complete support for either DocBook 4.x
> or DocBook 5.x throughout the EML schemas (in the eml-text module and
> the documentation tags in every module) seems like a good goal, and
> one that isn't particularly onerous. Likewise, an audit of the
> documentation tags is in order to ensure completeness.
>
> Questions -
>
> Have the EML-2.0.2 proposed fixes stated in the "Community opinion on
> minor revision of EML" post been implemented in a branch in the
> Ecoinformatics EML repository? If so, are they tagged?
>
> Besides bug #s 2054 and 2073, have the other 11 bullets in this email
> post been entered into the ecoinfo bugzilla?
>
> Cheers,
> Chris
> _________________________________________________________________
> christopher jones cjones at msi.ucsb.edu (805) 680-5946
> marine science institute university of california, santa barbara
> _________________________________________________________________