[eml-dev] Question about EML-based file access in Kepler
Wade Sheldon
sheldon at uga.edu
Tue Mar 18 14:33:34 PDT 2008
Jing,
That's not encouraging. I assume that means a Kepler user who finds a private GCE data set in Metacat (i.e. that has a dataTable distribution url) will see attribute ports for the data set, but they will get an error when executing a workflow that references those attributes because the data are not retrievable via the information url. Correct?
If that turns out to be the case, I'll probably continue to omit the distribution urls in EML destined for Metacat for our pre-release data to prevent that frustrating scenario.
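
For reference, here is a minimal sketch of the check a client could make before retrieving a distribution url, assuming it reads the function attribute at all (per Jing's reply, Kepler apparently does not). The sample document, the example.org urls, and the helper names classify_distribution_urls and downloadable_urls are hypothetical and for illustration only; this is not Kepler or Metacat code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EML fragment; real documents carry far more metadata.
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1" packageId="example.1.1" system="example">
  <dataset>
    <dataTable>
      <entityName>public_table</entityName>
      <physical>
        <distribution>
          <online>
            <url function="download">http://example.org/data/public_table.txt</url>
          </online>
        </distribution>
      </physical>
    </dataTable>
    <dataTable>
      <entityName>private_table</entityName>
      <physical>
        <distribution>
          <online>
            <url function="information">http://example.org/request_form</url>
          </online>
        </distribution>
      </physical>
    </dataTable>
  </dataset>
</eml:eml>"""


def classify_distribution_urls(eml_text):
    """Return (entity name, url, function) for each dataTable distribution url."""
    root = ET.fromstring(eml_text)
    results = []
    # Elements below the eml:eml root are unqualified, so plain tag names match.
    for table in root.iter("dataTable"):
        name = table.findtext("entityName", default="")
        for url in table.findall("physical/distribution/online/url"):
            # Assumption: a url with no function attribute is treated as "download",
            # which is my understanding of the schema default.
            function = url.get("function", "download")
            results.append((name, (url.text or "").strip(), function))
    return results


def downloadable_urls(eml_text):
    """Keep only the urls that are flagged as direct data downloads."""
    return [
        url
        for _, url, function in classify_distribution_urls(eml_text)
        if function == "download"
    ]


if __name__ == "__main__":
    for name, url, function in classify_distribution_urls(SAMPLE_EML):
        print(name, function, url)
```

A client that filtered this way would skip the "information" url for the private table instead of trying to parse a web page as data.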
Thanks.
-Wade
Jing Tao wrote:
> Hi, Wade:
>
> Based on my knowledge, I don't think Kepler distinguishes between the
> "information" and "download" attribute values. It will grab the content
> of the given url.
>
> Hope this is helpful.
>
> Jing
>
> Jing Tao
> National Center for Ecological
> Analysis and Synthesis (NCEAS)
> 735 State St. Suite 204
> Santa Barbara, CA 93101
>
> On Tue, 18 Mar 2008, Wade Sheldon wrote:
>
>> Hi Matt,
>>
>> I'm in the process of rolling out a new GCE website so I've been
>> reviewing and updating web application code for xml/xhtml
>> compatibility, etc. As part of this process I'm also making some minor
>> changes to the GCE EML implementation, including how data access urls
>> are encoded for data sets that aren't yet publicly downloadable. I
>> just wanted to run these changes by you to check for potential impact
>> on Kepler users accessing our docs via Metacat.
>>
>> In our original implementation I omitted the
>> dataTable/physical/distribution node entirely for unreleased data
>> sets, but as a consequence users viewing an outdated metadata document
>> would not easily be able to find the data object after it becomes
>> publicly accessible. This is particularly an issue for the EcoTrends
>> project, because we're providing pre-release data and EML for the
>> static web page and book they are producing, and the legacy metadata
>> will be retained and potentially accessed in the future (i.e. outside
>> of Metacat).
>>
>> In the new implementation, I will still include direct pass-through
>> links to data objects in EML in Metacat for public data sets, but I
>> will now include urls for private datasets as well. These private data
>> urls will point to a web page that will either allow the user to
>> register and download the data after it is public, or will inform them
>> of the private status and allow them to fill out a form to request the
>> data in advance of the release date. In order to distinguish between
>> these different endpoints I am explicitly setting the
>> distribution/online/url function attribute to "download" or
>> "information" as appropriate for data or a web page.
>>
>> My question for you is how does Kepler handle dataTable distribution
>> urls in EML with the function="information" attribute? Because I
>> differentially generate EML for Metacat I could revert to the old
>> practice to prevent problems, but I'd prefer to use the same approach
>> for both GCE-centric and KNB-centric metadata to prevent confusion.
>>
>> Here's a link to an example document with the new implementation for a
>> private data set:
>> http://gce-nas.marsci.uga.edu/public/app/send_eml.asp?detail=full&missing=NaN&delimiter=tab&metacat=yes&accession=INV-GCEM-0705c2
>>
>>
>> Thanks in advance for any input.
>>
>> -Wade Sheldon
>>
>>
>> --
>> ______________________________________________________________________________
>>
>>
>> Wade M. Sheldon
>> GCE-LTER Information Manager/SIMO Database Administrator
>> School of Marine Programs
>> University of Georgia
>> Athens, GA 30602-3636
>> Email: sheldon at uga.edu
>> WWW:
>> http://gce-lter.marsci.uga.edu/public/app/personnel_bios.asp?id=wsheldon
>>
--
______________________________________________________________________________
Wade M. Sheldon
GCE-LTER Information Manager/SIMO Database Administrator
School of Marine Programs
University of Georgia
Athens, GA 30602-3636
Email: sheldon at uga.edu
WWW: http://gce-lter.marsci.uga.edu/public/app/personnel_bios.asp?id=wsheldon