From bugzilla-daemon at ecoinformatics.org Tue Jan 14 11:23:19 2003
From: bugzilla-daemon at ecoinformatics.org (bugzilla-daemon@ecoinformatics.org)
Date: Tue, 14 Jan 2003 11:23:19 -0800 (PST)
Subject: [Bug 948] New: - develop pipeline/step library for MARINE QA tests
Message-ID: <20030114192319.9DCDDB79D9@ecoinfo.nceas.ucsb.edu>

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=948

           Summary: develop pipeline/step library for MARINE QA tests
           Product: Monarch
           Version: 1.0
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: dsparse - general bugs
        AssignedTo: rwilliams at nceas.ucsb.edu
        ReportedBy: jones at nceas.ucsb.edu
         QAContact: jones at nceas.ucsb.edu
                CC: monarch-dev at ecoinformatics.org


We need a small but compelling library of pipelines and steps that can be used
for QA analysis in MARINE. This test library should exercise the features of
monarch, including cross-plugin execution. Some examples:

1) Run proc univariate in SAS and display output
2) For all numeric vars in data, do pairwise scatterplot for outlier
   detection, format on one page as matrix of plots
3) User-configurable scatter plot, histogram, bar chart (with grouping)

More are desirable. Will need to think about this some.

From bugzilla-daemon at ecoinformatics.org Wed Jan 15 15:04:10 2003
From: bugzilla-daemon at ecoinformatics.org (bugzilla-daemon@ecoinformatics.org)
Date: Wed, 15 Jan 2003 15:04:10 -0800 (PST)
Subject: [Bug 948] - develop pipeline/step library for MARINE QA tests
Message-ID: <20030115230410.1C3F3B79D9@ecoinfo.nceas.ucsb.edu>

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=948

berkley at nceas.ucsb.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|2.0.0Alpha1                 |2.0.0Alpha2

From jones at nceas.ucsb.edu Wed Jan 15 18:13:06 2003
From: jones at nceas.ucsb.edu (Matt Jones)
Date: Wed, 15 Jan 2003 17:13:06 -0900
Subject: ports as separate nodes
Message-ID: <3E261532.4050303@nceas.ucsb.edu>

Chad and Rich,

I think I figured out why the diva folks suggest ports should be modeled
as separate nodes in a Composite node. It's because, without them, you
can't distinguish where the edges attach. Thus, you need the ports to be
nodes unto themselves so that they can accept links from other ports
attached to other steps. So, it looks like this:

   ---------------          --------------
   |Step1--Port1 |  ----->  |Port3--Step2|
   |             |          |      /     |
   |             |          | Port4      |
   ---------------          --------------

So, here you can see that there are two composite entities. The first
has one port for output, and the second has two ports for input. Port1
is connected to Port3. If we didn't have the ports as separate nodes,
then all of the connections would generically go from Step1 to Step2, so
there would be no way to tell which input param the output param from
Step1 maps to.

So, I think I agree that we need to represent our Steps as composite
nodes. Comments?

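As a rough illustration of the port arrangement described above, here is a minimal Python sketch. It is not Monarch or diva code: the class names `Step`, `Port`, and `Link`, their fields, and the `connect` helper are all assumptions made for this example. What it tries to show is that once ports are nodes of their own, a link only needs a tail port and a head port to say which output param feeds which input param.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """A node in its own right, owned by one step (hypothetical model)."""
    step: str       # name of the composite step that contains this port
    name: str       # e.g. "Port1"
    direction: str  # "in" or "out"


@dataclass
class Step:
    """A composite node: the step itself plus the ports it contains."""
    name: str
    ports: dict = field(default_factory=dict)

    def add_port(self, name, direction):
        port = Port(self.name, name, direction)
        self.ports[name] = port
        return port


@dataclass(frozen=True)
class Link:
    """An edge that only needs a tail port and a head port."""
    tail: Port
    head: Port


def connect(tail, head):
    # Only an output port may feed an input port.
    if tail.direction != "out" or head.direction != "in":
        raise ValueError("links must run from an output port to an input port")
    return Link(tail, head)


# The arrangement from the diagram: Step1 has one output port,
# Step2 has two input ports, and Port1 is connected to Port3.
step1 = Step("Step1")
step2 = Step("Step2")
port1 = step1.add_port("Port1", "out")
port3 = step2.add_port("Port3", "in")
port4 = step2.add_port("Port4", "in")

link = connect(port1, port3)
print(f"{link.tail.step}.{link.tail.name} -> {link.head.step}.{link.head.name}")
```

In this sketch each port records its owning step, so the link itself carries no extra data beyond its two ends. Whether that is the right trade-off for Monarch is exactly what the thread is discussing.
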
Matt

From rwilliams at nceas.ucsb.edu Thu Jan 16 10:06:50 2003
From: rwilliams at nceas.ucsb.edu (Rich Williams)
Date: Thu, 16 Jan 2003 10:06:50 -0800
Subject: ports as separate nodes
In-Reply-To: <3E261532.4050303@nceas.ucsb.edu>
Message-ID:

Hey Matt and Chad -

When I first read the email, I thought that made a lot of sense. On further
reflection, I don't see the arrangement with the ports as being a whole lot
different. Yes, you can unambiguously specify the link between ports with a
simple line (a data item that contains only the head and tail of the link).
The problem is that the link between step and port on one end and the link
between port and step on the other have exactly the same problem that the old
link between the two steps had. That is, you still need some extra data
associated with the link to specify which data item it is linked to. So it
seems like you trade one link that needs both ends specified as to which data
item it attaches to, for three links and two nodes, where two of the links
require the same kind of specification information at one end.

I can see using the concept of ports as data IO nodes (steps) where the data
that each port reads comes from a different source. For example, a port would
read a dataset, or a port would be a scalar input variable, or a port could be
a data file. The port is then in charge of whatever operations have to be
performed to get the data ready for the step to be able to access it. Then the
pipeline would be a composite node, consisting of data input and output ports
and one or more analysis steps.

This is a lot like the input and output steps that we currently create for
each StepGroup. When we turn the pipeline into a series of StepGroups, we
create data load/unload steps at the beginning and end of each StepGroup -
these would be the ports for that StepGroup.

Something like that anyway...

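The StepGroup arrangement described above could be sketched roughly like this. It is a Python illustration only: the `StepGroup` fields, the `with_io_ports` function, and the `load:`/`unload:` step names are hypothetical and are not taken from the Monarch source. It shows load and unload steps being placed at the beginning and end of a group, where they play the role of that group's ports.

```python
from dataclasses import dataclass


@dataclass
class StepGroup:
    """A run of analysis steps plus the data it reads and writes (hypothetical)."""
    steps: list
    inputs: list    # data items the group reads
    outputs: list   # data items the group produces


def with_io_ports(group):
    """Return the step list with load/unload steps acting as ports."""
    loads = [f"load:{item}" for item in group.inputs]
    unloads = [f"unload:{item}" for item in group.outputs]
    return loads + list(group.steps) + unloads


group = StepGroup(
    steps=["univariate"],
    inputs=["dataset"],
    outputs=["report"],
)
print(with_io_ports(group))
```

Here each load or unload step would own whatever work is needed to get one data item ready, which is the job the message assigns to a port.
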
Rich

From jones at nceas.ucsb.edu Thu Jan 16 16:04:25 2003
From: jones at nceas.ucsb.edu (Matt Jones)
Date: Thu, 16 Jan 2003 15:04:25 -0900
Subject: Monarch is alive!
In-Reply-To: <1042760908.23097.54.camel@trestles>
References: <1042760908.23097.54.camel@trestles>
Message-ID: <3E274889.7020409@nceas.ucsb.edu>

Nice job, Chad! This is truly a watershed event! I gave it a spin, and
everything seemed to be working.

Matt

Chad Berkley wrote:
> Hey y'all,
>
> So, the new Monarch actually does something now. All the code is in
> cvs. To see it work go to
>
> http://trestles.nceas.ucsb.edu:8080/monarch/servlet/monarchui?action=start.
>
> You have to use your full ldap user id (i.e.
> uid=berkley,o=NCEAS,dc=ecoinformatics,dc=org) to log in.
>
> To have a sure-thing successful run (hopefully), choose the dataset
> "test package - berkley" and choose the "Graph Sum" pipeline.
>
> I'll try to put some other datasets and pipelines in there tomorrow.
>
> chad

--
*******************************************************************
Matt Jones                                    jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************

From berkley at nceas.ucsb.edu Tue Jan 21 13:18:58 2003
From: berkley at nceas.ucsb.edu (Chad Berkley)
Date: 21 Jan 2003 13:18:58 -0800
Subject: MONARCH_RELEASE_1_0_0Alpha1
Message-ID: <1043183938.22872.19.camel@trestles>

Hello,

Monarch version 1.0.0Alpha1 has been made available under the tag
MONARCH_RELEASE_1_0_0Alpha1 in the ecoinformatics.org CVS server. This
is a minor building block release. It will not be made available for
download on the web site, but should serve as a jumping-off point for
further Monarch development.

Basic data analysis and pipelining architecture, as well as a simple
HTML interface, are included in this release. Monarch can be installed
using Jakarta Ant (jakarta.apache.org/ant) and the included build.xml
script. It requires Tomcat 4.0 or a compatible servlet engine and any
applicable back-end analytical engines (such as SAS or MatLab). See
knb.ecoinformatics.org/software/monarch for more information or reply to
this message with specific questions.

Please let me know if you have any problems retrieving this tagged
release from CVS or if you have any questions about Monarch in general.

thanks,
Chad

--
-----------------------
Chad Berkley
National Center for Ecological Analysis and Synthesis (NCEAS)
berkley at nceas.ucsb.edu
-----------------------

From bugzilla-daemon at ecoinformatics.org Wed Jan 22 08:33:52 2003
From: bugzilla-daemon at ecoinformatics.org (bugzilla-daemon@ecoinformatics.org)
Date: Wed, 22 Jan 2003 08:33:52 -0800 (PST)
Subject: [Bug 948] - develop pipeline/step library for MARINE QA tests
Message-ID: <20030122163352.321FCB79D9@ecoinfo.nceas.ucsb.edu>

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=948

rwilliams at nceas.ucsb.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE

------- Additional Comments From rwilliams at nceas.ucsb.edu 2003-01-22 08:33 -------
duplicate of 947

*** This bug has been marked as a duplicate of 947 ***

From bugzilla-daemon at ecoinformatics.org Fri Jan 31 11:20:32 2003
From: bugzilla-daemon at ecoinformatics.org (bugzilla-daemon@ecoinformatics.org)
Date: Fri, 31 Jan 2003 11:20:32 -0800 (PST)
Subject: [Bug 979] New: - pipelines that mix engines can't parse physical file
Message-ID: <20030131192032.A570AB7533@ecoinfo.nceas.ucsb.edu>

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=979

           Summary: pipelines that mix engines can't parse physical file
           Product: Monarch
           Version: 1.0
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P1
         Component: monarch - general bugs
        AssignedTo: berkley at nceas.ucsb.edu
        ReportedBy: jones at nceas.ucsb.edu
         QAContact: jones at nceas.ucsb.edu
                CC: monarch-dev at ecoinformatics.org


When a pipeline contains steps that use different AEPlugins, the output from
one plugin needs to be parsed as the input to the next. The current plugins
and TableEntity class do not output sufficient metadata (i.e., an EML
description) for those outputs. In particular, metadata describing the
physical format of the file is missing.
These mixed pipelines will fail until this is resolved.
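
To make the failure concrete, here is a small Python sketch of a hand-off between two plugins. It is not Monarch code and does not use a real EML parser: `PhysicalFormat`, `StepOutput`, and `parse_output` are hypothetical names, and the two format fields are assumptions loosely modeled on the kind of information a physical-format description would carry. It shows that the receiving side can only parse the file when that description travels with the output.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhysicalFormat:
    """Hypothetical description of how an output file is laid out."""
    field_delimiter: str = ","
    num_header_lines: int = 0


@dataclass
class StepOutput:
    """What one plugin hands to the next: the data plus its description."""
    text: str
    physical: Optional[PhysicalFormat] = None  # None models the missing metadata


def parse_output(output):
    """Parse a plugin's output into rows, using its physical description."""
    if output.physical is None:
        raise ValueError("no physical format metadata; cannot parse output")
    reader = csv.reader(
        io.StringIO(output.text),
        delimiter=output.physical.field_delimiter,
    )
    rows = list(reader)
    # Skip the header lines that the description says are present.
    return rows[output.physical.num_header_lines:]


described = StepOutput(
    text="site\tcount\nA\t3\nB\t5\n",
    physical=PhysicalFormat(field_delimiter="\t", num_header_lines=1),
)
print(parse_output(described))
```

Calling `parse_output` on a `StepOutput` with no `physical` description raises an error, which stands in for the mixed-engine pipeline failure reported in this bug.
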