[eml-dev] EML 2.0.2 changes to text leaf nodes

Margaret O'Brien mob at icess.ucsb.edu
Thu Mar 20 17:15:43 PDT 2008


First, one comment on Chris's original question about the drawbacks of 
mixed content compared with xs:string: I recall one possible drawback in 
searches. If someone searches on the string "uptake rates for Alnus 
tenuifolia", the search can fail if <title> is mixed-content text instead 
of a plain string, because the embedded tags break up the phrase. One 
solution is an optional <complexTitle> of type txt:TextType, in addition 
to <title>. A search on the simple <title> returns the correct doc, and 
the title template looks for the complexTitle first when displaying. 
Apologies for not remembering whom to credit for this - I think it was a 
verbal exchange.
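
To make the search drawback concrete, here is a minimal sketch in Python 
(standard library only). It is an illustration under assumptions, not an 
EML tool: <complexTitle> exists only as the proposal above, and the 
record, the naive substring search, and the display_element helper are all 
hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: <complexTitle> is only the element proposed above and
# is not part of any released EML schema.
RECORD = """
<dataset>
  <title>Acetylene reduction and 15N2 uptake rates for Alnus tenuifolia and Alnus crispa in six different successional habitats</title>
  <complexTitle>Acetylene reduction and 15N2 uptake rates for
    <emphasis>Alnus tenuifolia</emphasis> and
    <emphasis>Alnus crispa</emphasis>
    in six different successional habitats</complexTitle>
</dataset>
"""

QUERY = "uptake rates for Alnus tenuifolia"

dataset = ET.fromstring(RECORD)
plain = dataset.find("title")
rich = dataset.find("complexTitle")

# A naive substring search over the serialized mixed content misses the phrase,
# because the <emphasis> tag sits between "for" and "Alnus".
rich_markup = ET.tostring(rich, encoding="unicode")
hit_in_markup = QUERY in rich_markup

# The same search over the plain string title succeeds.
hit_in_plain = QUERY in plain.text


def display_element(record):
    """Prefer the optional <complexTitle>; fall back to the plain <title>."""
    candidate = record.find("complexTitle")
    return candidate if candidate is not None else record.find("title")


chosen_tag = display_element(dataset).tag
print(hit_in_markup, hit_in_plain, chosen_tag)
```

With this record the search misses the marked-up text and finds the plain 
<title>, while the display helper picks <complexTitle>. A real search 
engine that strips tags before indexing would behave differently, so this 
only shows the failure mode for literal string matching.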

And more generally: Titles frequently include scientific notation, units 
and species binomials, and require formatting to convey all their 
meaning and not look flattened. However, I think that txt:TextType 
should be applied on a case-by-case basis, not to all xs:strings, or as 
a workaround for shortcomings. <name>s and <description>s are probably 
adequate as strings. If an EML feature could benefit from richer XML, 
then those needs should be addressed individually.

Cheers -
margaret


Christopher Jones wrote:
> Hi all,
>
> I strongly agree that content and presentation, ideally, should be  
> kept separate by allowing stylesheets to handle the latter.  I'm  
> struggling a bit with what constitutes 'content'.  A structural tag  
> such as <title> lends 'meaning' to the contained text, at least in  
> English.  A <b> tag in HTML seems much more presentational - it  
> doesn't add meaning, merely emphasis.  However, when formatting  
> conventions in scientific domains lend 'meaning' to text, like  
> italicizing species binomials, it seems that we need to provide the  
> facility for this, lest we lose semantic information.
>
> I agree with Wade that we walk a fine line here between expressing  
> semantics and presenting.  Cluttered EML docs could abound.  Is the  
> preservation of 'meaning' worth the trade-off?
>
> On Mar 20, 2008, at 3:06:43 PM, Wade Sheldon wrote:
>   
>> I think your casual example makes this point very well - what real  
>> use is preserving <emphasis> markup in a data set title? That's what  
>> XSL is for. If this is a legacy issue for some metadata providers,  
>> then I think they should be encouraged (or helped) to offload  
>> embedded display markup when porting to EML.
>>     
>
> True, my example was a bit simple.  A better example would be the  
> species binomial case:
>
> <title>
>    Acetylene reduction and 15N2 uptake rates for
>    <emphasis>Alnus tenuifolia</emphasis> and
>    <emphasis>Alnus crispa</emphasis>
>    in six different successional habitats
> </title>
>
> where the stylesheet renders emphasis tags nested inside title tags in  
> italics.  This certainly is a presentation issue, but one that imparts  
> meaning based on known conventions.  Notice how the 15N2 also seems to  
> lose meaning in this title without appropriate formatting.
>
> Perhaps there is another way to deal with this, though?  It seems too  
> big of a job to try to infer meaning from straight xs:string word  
> combinations (such as Alnus tenuifolia) and then present it correctly  
> with the right markup for presentation.
>
> On Mar 20, 2008, at 3:22:06 PM, inigo wrote:
>   
>> Margaret O'Brien and I, with the help of Mark Servilla and, to some
>> extent, J. Brunt and Corinna Gries, worked on this minor fix. In it,
>> we addressed the bug that Chris is talking about, yet the workaround
>> that Chris is proposing does not fix the fact that there are
>> DocBook 4.* schema tags present in the documentation module of EML
>> that are not declared in the text-module of EML. Examples are <url>
>> and <citetitle>. By redefining the types, we address these errors
>> partially, yet some stringent XML editors (XML Spy 2007, 2008) will
>> flag these undeclared tags as critical errors. This makes the schema
>> look rather unprofessional.
>>     
>
> On Mar 20, 2008, at 3:39:10 PM, James Brunt wrote:
>   
>> Also, I'm in agreement with Inigo that making the schema "clean"  
>> should be a priority in this bug-fix release.
>>     
>
> Fair enough.  Consistent and complete support for either DocBook 4.x  
> or DocBook 5.x throughout the EML schemas (in the eml-text module and  
> the documentation tags in every module) seems like a good goal, and  
> one that isn't particularly onerous.  Likewise, an audit of the  
> documentation tags is in order to ensure completeness.
>
> Questions -
>
> Have the EML-2.0.2 proposed fixes stated in the "Community opinion on  
> minor revision of EML" post been implemented in a branch in the  
> Ecoinformatics EML repository? If so, are they tagged?
>
> Besides bug #s 2054 and 2073, have the other 11 bullets in this email  
> post been entered into the ecoinfo bugzilla?
>
> Cheers,
> Chris
> _________________________________________________________________
> christopher jones       cjones at msi.ucsb.edu      (805) 680-5946
> marine science institute  university of california, santa barbara
> _________________________________________________________________

