[eml-dev] [Bug 585] - internationalization needed in EML
David Blankman
dblankman1 at gmail.com
Mon Dec 8 11:59:27 PST 2008
As I think back upon the discussions in China and my discussions with Matt
at ISEI, I come back to my initial thought: multiple language versions of
EML documents are probably better handled by creating a separate EML
document for each language used. EML is already complex; I see no reason
to make it more complex.
In the ILTER situation we are asking ILTER member networks to provide a
core of EML in English, on the understanding that more complete metadata may
be in another language. In this case, should there be an EML module
(eml-ilter or eml-language, analogous to eml-access) that specifies the
identifier of the "main" EML document and the language of that document?
This module might also include an element to record a brief statement about
the amount of data in that foreign language. I am not sure what else might
be appropriate for this module. I know that Matt was thinking that some
modifications to Metacat replication might also be needed.
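
A minimal sketch of what such a link record could carry, and how a harvester
might read it. The element names (languageLink, mainDocument, mainLanguage,
coverageNote) and the identifier value are hypothetical placeholders, not part
of any EML schema; Python's standard xml.etree.ElementTree is used purely for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: none of these element names exist in EML today.
# It would travel with the English "core" document and point at the fuller
# document written in another language.
FRAGMENT = """
<languageLink>
  <mainDocument>example-main.1.1</mainDocument>
  <mainLanguage>es</mainLanguage>
  <coverageNote>Methods and attribute descriptions appear only in the main document.</coverageNote>
</languageLink>
"""


def read_language_link(xml_text):
    """Return the three fields of the hypothetical link record as a dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "main_document": root.findtext("mainDocument"),
        "main_language": root.findtext("mainLanguage"),
        "coverage_note": root.findtext("coverageNote"),
    }


link = read_language_link(FRAGMENT)
print(link["main_document"], link["main_language"])
```

Keeping the link in its own small record, rather than repeating elements
inside the document, would leave the cardinality of the existing EML
elements untouched.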
David
On Mon, Dec 8, 2008 at 1:34 PM, Matt Jones <jones at nceas.ucsb.edu> wrote:
> David and I discussed (briefly) some of these issues at ISEI. And we also
> discussed them at the ILTER meeting in China. The 'language' tag in
> eml-resource defines the language of the resource, which in the case of
> eml-dataset resources means the language of the data. Interestingly, we
> don't really have a language tag per se for the EML document content itself,
> except that all XML documents can use the built-in "xml:lang" attribute,
> which is optional for all XML elements (
> http://www.w3.org/TR/REC-xml/#sec-lang-tag). This allows one to set the
> language for each and every element in an XML document, such as:
>
> <title xml:lang="en">North American Forests</title>
> <title xml:lang="es">Bosques de Norte Americano</title>
>
> Two problems we would need to address with this approach come immediately to
> mind:
>
> 1) Many elements in EML are not repeatable, and therefore it is not
> possible to have one copy of the element in English and another in a
> different language. So cardinality would have to be updated throughout the
> EML schemas, which would make some aspects of validation more confusing.
> 2) For those elements that are already repeatable or are made repeatable
> through a revision, there is no mechanism to indicate that the two element
> nodes are meant to have the same semantic meaning in different languages,
> as opposed to two semantically different elements that happen to also differ
> in their language.
>
> This second issue is the one that would require more structural changes to
> EML. For example, one might sometimes want to have more than one title
> (which is why title is currently repeatable), but other times want to have
> one title in two different languages. Either way, EML's current structures
> don't allow these subtleties to be specified.
>
> Matt
>
>
> On Fri, Dec 5, 2008 at 12:54 PM, inigo san gil <isangil at lternet.edu> wrote:
>
>>
>> Metadata folks:
>>
>> I think this opens (perhaps re-opens) an interesting discussion.
>>
>> EML's resource (main module) offers us a <language> element that,
>> as I understand it, serves to specify the language used for the document.
>> The cardinality is set to <= 1, so it is optional, and if used, only one
>> language.
>>
>> However, we understood from Kristin Vanderbilt and David Blankman
>> that at a recent ILTER meeting, there was an agreement to provide
>> referential-level EML for all metadata in English (and perhaps
>> richer EML in their native languages).
>> The option David proposes, providing content in two languages,
>> one being English, does not play well with the EML schema as is.
>> There are options in the interim, while we think about whether 'we' tweak
>> the EML schema. Some solutions go in the direction of "duplicating" the
>> original EML record: take what is in the native language, and either
>> have it translated into some minimal-compliance-level EML (ouch) or
>> run it through a translation web service and laugh (or rather cry) at the
>> results.
>>
>> There are of course many other approaches to this problem; Mark
>> Servilla mentioned some in the hallways of the LTER Network Office.
>>
>> The thing is that part of the international community in ecology has
>> expressed formal interest/commitment in using EML to document their
>> metadata. The ILTER group quickly recognized the Babelian challenge
>> ahead (see Blankman's ISEI-6 presentation & future paper), and
>> David, Akiko Ogawa and others took on helping the ILTER provide
>> basic EML in English (remember, ILTER committed to using English
>> -chinglish and spanglish- as the lingua franca for referential-level EML,
>> i.e. EML level 1: title, creator, abstract, contact at least).
>>
>> Cheers,
>> Inigo
>>
>>
>>
>> bugzilla-daemon at ecoinformatics.org wrote:
>>
>>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=585
>>>
>>>
>>>
>>>
>>>
>>> ------- Comment #2 from mob at icess.ucsb.edu 2008-12-05 09:31 -------
>>> This comment from an email from David Blankman:
>>> As EML is becoming an international standard, we need to start thinking
>>> about ways to make EML more intelligent about multiple languages. While
>>> EML allows multiple titles, there is currently no way to indicate that
>>> multiple titles are equivalent. For example, if I have:
>>> <title> North American Forests </title> AND
>>> <title> Bosques de Norte Americano</title>
>>>
>>> EML currently has no way to indicate that these are the same title, just
>>> in a different language.
>>>
>>> Matt and I were talking about this at the ISEI-Cancun meeting, but I
>>> thought that it would be a good idea to get this discussion started
>>> within eml-dev and the ILTER group as well.
>>> _______________________________________________
>>> Eml-dev mailing list
>>> Eml-dev at ecoinformatics.org
>>> http://mercury.nceas.ucsb.edu/ecoinformatics/mailman/listinfo/eml-dev
>>>
>>>
>>
>>
>
>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Matthew B. Jones
> Director of Informatics Research and Development
> National Center for Ecological Analysis and Synthesis (NCEAS)
> UC Santa Barbara
> jones at nceas.ucsb.edu Ph: 1-907-523-1960
> http://www.nceas.ucsb.edu/ecoinfo
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
--
Nature is trying very hard to make us succeed, but nature does not depend on
us. We are not the only experiment.
- R. Buckminster Fuller
If I am not for myself, then who will be for me? If I am for myself alone,
then who am I? If not now, when?
- Rabbi Hillel