[eml-dev] EML 2.0.2 changes to text leaf nodes

Wade Sheldon sheldon at uga.edu
Fri Mar 21 06:03:18 PDT 2008


Well, after reading the other replies to Chris' question, I also have to admit that the content/presentation line is fuzzier than I first thought. Presentational elements such as emphasis, superscript, subscript and perhaps others may be necessary to preserve meaning in archived metadata documents. I also agree with Inigo that structural components of DocBook (section, para, itemizedlist) are useful for representing rich content shoe-horned into a single EML element, where there is a granularity mismatch between metadata models. Perhaps the best way to minimize unnecessary presentational elements in EML is through education, peer review, and best practices.

I humbly withdraw my objection to broader use of txt:TextType (and echo Inigo's call to shore up the DocBook element references in eml:text to prevent schema validation errors).

-Wade


Christopher Jones wrote:
> Hi all,
> 
> I strongly agree that content and presentation, ideally, should be  
> kept separate by allowing stylesheets to handle the latter.  I'm  
> struggling a bit with what constitutes 'content'.  A structural tag  
> such as <title> lends 'meaning' to the contained text, at least in  
> English.  A <b> tag in HTML seems much more presentational - it  
> doesn't add meaning, merely emphasis.  However, when formatting  
> conventions in scientific domains lend 'meaning' to text, like  
> italicizing species binomials, it seems that we need to provide the  
> facility for this, lest we lose semantic information.
> 
> I agree with Wade that we walk a fine line here between expressing  
> semantics and presenting.  Cluttered EML docs could abound.  Is the  
> preservation of 'meaning' worth the trade-off?
> 
> On Mar 20, 2008, at 3:06:43 PM, Wade Sheldon wrote:
>> I think your casual example makes this point very well - what real  
>> use is preserving <emphasis> markup in a data set title? That's what  
>> XSL is for. If this is a legacy issue for some metadata providers,  
>> then I think they should be encouraged (or helped) to offload  
>> embedded display markup when porting to EML.
> 
> True, my example was a bit simple.  A better example would be the  
> species binomial case:
> 
> <title>
>    Acetylene reduction and 15N2 uptake rates for
>    <emphasis>Alnus tenuifolia</emphasis> and
>    <emphasis>Alnus crispa</emphasis>
>    in six different successional habitats
> </title>
> 
> where the stylesheet renders emphasis tags nested within title tags in  
> italics.  This certainly is a presentation issue, but one that imparts  
> meaning based on known conventions.  Notice how the 15N2 also seems to  
> lose meaning in this title without appropriate formatting.
> 
> Perhaps there is another way to deal with this, though?  It seems too  
> big of a job to try to infer meaning from straight xs:string word  
> combinations (such as Alnus tenuifolia) and then present it correctly  
> with the right markup for presentation.
> 
> On Mar 20, 2008, at 3:22:06 PM, inigo wrote:
>> Margaret O'Brien and I, with the help of Mark Servilla and, to some
>> extent, J. Brunt and Corinna Gries, worked on this minor fix. In it,
>> we addressed the bug that Chris is talking about, yet the workaround
>> that Chris is proposing does not fix the fact that there are
>> DocBook 4.* Schema tags present in the documentation module of EML
>> that are not declared in the text-module of EML. Examples are <url>
>> and <citetitle>. By redefining the types, we address these errors
>> partially, yet some stringent XML editors (XML Spy 2007, 2008) will
>> flag these undeclared tags as critical errors. This makes the schema
>> rather unprofessional.
> 
> On Mar 20, 2008, at 3:39:10 PM, James Brunt wrote:
>> Also, I'm in agreement with Inigo that making the schema "clean"  
>> should be a priority in this bug-fix release.
> 
> Fair enough.  Consistent and complete support for either DocBook 4.x  
> or DocBook 5.x throughout the EML schemas (in the eml-text module and  
> the documentation tags in every module) seems like a good goal, and  
> one that isn't particularly onerous.  Likewise, an audit of the  
> documentation tags is in order to ensure completeness.
> 
> Questions -
> 
> Have the EML-2.0.2 proposed fixes stated in the "Community opinion on  
> minor revision of EML" post been implemented in a branch in the  
> Ecoinformatics EML repository? If so, are they tagged?
> 
> Besides bug #s 2054 and 2073, have the other 11 bullets in this email  
> post been entered into the ecoinfo bugzilla?
> 
> Cheers,
> Chris
> _________________________________________________________________
> christopher jones       cjones at msi.ucsb.edu      (805) 680-5946
> marine science institute  university of california, santa barbara
> _________________________________________________________________

-- 
______________________________________________________________________________

Wade M. Sheldon
GCE-LTER Information Manager/SIMO Database Administrator
School of Marine Programs
University of Georgia
Athens, GA 30602-3636
Email: sheldon at uga.edu
WWW: http://gce-lter.marsci.uga.edu/public/app/personnel_bios.asp?id=wsheldon