From jones at nceas.ucsb.edu Fri Dec 13 11:05:52 2002
From: jones at nceas.ucsb.edu (Matt Jones)
Date: Fri, 13 Dec 2002 10:05:52 -0900
Subject: Step/Pipeline abstraction
References:
Message-ID: <3DFA2F90.3060701@nceas.ucsb.edu>

Rich,

Looks good. Definitely a job for XML Schema rather than DTDs. But
technically we don't even need a DTD -- DTDs and schemas both simply
define the XML structure for out-of-band validation, and are not used
during processing. So I think doubly implementing them in DTDs *as if*
they had inherited from an AbstractStep is a good first approximation,
unless you want to launch into schemas now (it's actually very easy).

Your model looks good, and I agree it solves the problem and provides for
extensibility. Nice. The only thing I didn't fully grok is the setup for
AbstractStep inside of "execute" in Pipeline. I think the old mechanism
handles linear pipelines, but not trees, and I'm not clear on how the new
model fixes that. Here's a graph (forgive my poor ASCII art :):

   A -> B
         \
          \
           > E
          /
         /
   C -> D

so our pipeline needs to say:

   1.1 execute A (bind to data)
   1.2 execute B (bind to A output)
   2.1 execute C (bind to data)
   2.2 execute D (bind to C output)
   3.1 execute E (bind to B and D output)

The first two blocks could theoretically execute in parallel. As I said,
this doesn't work in the current DTD. We should fix it now if we're taking
the time to redesign the XML structures.

We should also look again at MoML [1] before we really commit to a design
for this. The major feature of MoML, and of Ptolemy in general, that seems
applicable to Monarch is the "Director" concept. It drives execution of
their models, but I don't really understand how it works. I can see how it
controls the time-course of the model, but I don't see precisely how the
pipeline is affected by that. Given your knowledge of modeling, it would be
useful if you could help explain what they're trying to do here.

So, no, I don't have solutions. Looking forward to working this out with
you.
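The execution order for the graph above can be sketched in Python. This is only an illustration under stated assumptions, not Monarch code: the `depends_on` dictionary is a hypothetical encoding of the ASCII graph, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for whatever scheduler the pipeline would actually use. Each batch holds steps whose inputs are all available, so steps within a batch could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding of the graph: each step maps to the steps whose
# output it binds to. A and C bind directly to data, so they depend on nothing.
depends_on = {
    "A": [],
    "B": ["A"],
    "C": [],
    "D": ["C"],
    "E": ["B", "D"],
}


def execution_batches(graph):
    """Return lists of steps; every step in a list is ready at the same time."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(depends_on)
print(batches)  # [['A', 'C'], ['B', 'D'], ['E']]
```

A and C come out first, then B and D, then E, so the A -> B and C -> D branches never wait on each other.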
Matt

[1] http://ptolemy.eecs.berkeley.edu/publications/papers/00/moml/

Rich Williams wrote:
> Hey Matt -
>
> Here's what I wrote on steps and pipelines. Let me know what you think.
>
> Right now, the inputs to a pipeline are the inputs to the first step, and
> the outputs are the outputs from the last step. As we discussed earlier in
> the week, this is obviously not general enough. I think it's fairly easy to
> come up with an appropriate abstraction scheme, and it also allows a
> pipeline to be called as a step within some other pipeline. I'm not an XML
> pro, so this is written in "pseudocode", but its meaning should be obvious.
> Could XML schema give us the kind of hierarchy I'm showing here? If we want
> to do it quick and dirty, we could just extend the dtds for now.
>
> AbstractStep
> {
>   identifier    // these first three fields are currently in both
>   name          // the step and pipeline dtds (but with different names)
>   description
>   inputs
>     (various fields, as currently specified)
>   outputs
>     (various fields, as currently specified)
> }
>
> ScriptStep extends AbstractStep
> {
>   code (possibly more than one)
>     language
>     rawCode
> }
>
> Pipeline extends AbstractStep
> {
>   execute (possibly more than one)
>     AbstractStep
>     StepMap
> }
>
> We would need to extend the StepMap so that for each item in the map, it
> included information about which AbstractStep the mapping referred to.
> Right now, the StepMap implicitly refers to the previous Step in the
> StepGroup.
>
> In the future, I could imagine us coming up with other types of steps we
> want to run, and so we could extend the AbstractStep to do this.

--
*******************************************************************
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
Interested in ecological informatics?
http://www.ecoinformatics.org
*******************************************************************

From rwilliams at nceas.ucsb.edu Mon Dec 16 10:36:43 2002
From: rwilliams at nceas.ucsb.edu (Rich Williams)
Date: Mon, 16 Dec 2002 10:36:43 -0800
Subject: Step/Pipeline abstraction
In-Reply-To: <3DFA2F90.3060701@nceas.ucsb.edu>
Message-ID:

Hey Matt -

Here is the sum and plot example reworked using the "pipeline is a step"
model and eliminating data pass-through. This allows tree structures such
as the one in your example to be expressed, even though the steps are
executed linearly as in the old pipe design. In the example below, the main
things to note are:
1) The pipeline now has input and output elements.
2) The "map" elements have an added attribute, called "fromstep". Its
   values are integers: 0 or greater is a step number, and -1 is a special
   value that refers to the enclosing pipeline, in which case the pipeline
   input is mapped to the step input.
3) The pipeline has an "outputmap" element that tells where its output
   comes from.
4) The pipeline and step share "name" and "description" elements.

I haven't put the changes into the DTDs as I'm not DTD syntax savvy. The
changes would be:

To pipeline.dtd, add input, output and outputmap elements; change the map
element to have a fromstep attribute.
To step.dtd, change stepName and stepDescription to name and description.

****************************************************************************

dummypipe.1.1

pipeline.1.1
Graph Sum
The graph of the sum of x and y versus y

plot_title
string
Title of the plot
Test plot

sumdata
table
Data table containing the data to sum

  x
  decimal
  numeric data to be summed

  y
  decimal
  numeric data to be summed

none
image/gif
the image produced from the plot

dummysum.1.1

sum.1.1
sum
Sum two attributes from an entity

sumdata
table
Data table containing the data to sum

  x
  decimal
  numeric data to be summed

  y
  decimal
  numeric data to be summed

sumdata
table
Data table containing the summed data

  z
  decimal
  numeric data sum

dummyplot.1.1

scatterplot.1.1
scatterplot
A two dimensional scatterplot

plot_title
string
Title of the plot
Default plot title

plotdata
table
Data table containing the data to plot

  x_axis
  decimal
  numeric data to be plotted on abscissa

  y_axis
  decimal
  numeric data to be plotted on ordinate
  meter

scatterplot.gif
image/gif
the image produced from the plot

From jones at nceas.ucsb.edu Tue Dec 17 00:14:40 2002
From: jones at nceas.ucsb.edu (Matt Jones)
Date: Mon, 16 Dec 2002 23:14:40 -0900
Subject: Step/Pipeline abstraction
References:
Message-ID: <3DFEDCF0.5030400@nceas.ucsb.edu>

Rich,

Didn't really get a chance to look at this before leaving. My brief
glance: do we really want to depend on the order of steps to determine
execution order? Seems to me like we should number the steps with an id
attribute, then use those attributes to draw the links in the graph. Same
effect, but possibly more info, especially if we want to execute steps out
of order depending on dependencies.

Let's chat in January -- looking forward to seeing where you're at at that
point.

Matt

Rich Williams wrote:
> Hey Matt -
>
> Here is the sum and plot example reworked using the "pipeline is a step"
> model and eliminating data pass-through.
> This allows tree structures such
> as the one in your example to be expressed, even though the steps are
> executed linearly as in the old pipe design. In the example below, the main
> things to note are:
> 1) The pipeline now has input and output elements
> 2) The "map" elements have an added attribute, called "fromstep". These are
> integers, if 0 or greater they are the step number. -1 is a special value
> used to refer to the enclosing pipeline, and in this case the pipeline input
> is mapped to the step input.
> 3) The pipeline has an "outputmap" element that tells where its output comes
> from.
> 4) The pipeline and step share "name" and "description" elements.
>
> I haven't put the changes into the DTDs as I'm not DTD syntax savvy. The
> changes would be:
>
> To pipeline.dtd, add input, output and outputmap elements; change the map
> element to have a fromstep attribute.
> To step.dtd, change stepName and stepDescription to name and description
>
> ****************************************************************************
>
> dummypipe.1.1
>
> pipeline.1.1
> Graph Sum
> The graph of the sum of x and y versus y
>
> plot_title
> string
> Title of the plot
> Test plot
>
> sumdata
> table
> Data table containing the data to sum
>
> x
> decimal
> numeric data to be summed
>
> y
> decimal
> numeric data to be summed
>
> none
> image/gif
> the image produced from the plot
>
> dummysum.1.1
>
> sum.1.1
> sum
> Sum two attributes from an entity
>
> sumdata
> table
> Data table containing the data to sum
>
> x
> decimal
> numeric data to be summed
>
> y
> decimal
> numeric data to be summed
>
> sumdata
> table
> Data table containing the summed data
>
> z
> decimal
> numeric data sum
>
> DATA sumdata;
> z = x + y;
> RUN;
>
> dummyplot.1.1
>
> scatterplot.1.1
> scatterplot
> A two dimensional scatterplot
>
> plot_title
> string
> Title of the plot
> Default plot title
>
> plotdata
> table
> Data table containing the data to plot
>
> x_axis
> decimal
> numeric data to be plotted on abscissa
>
> y_axis
> decimal
> numeric data to be plotted on ordinate
> meter
>
> scatterplot.gif
> image/gif
> the image produced from the plot
>
> goptions gsfname=out gsfmode=replace device=gif;
>
> filename out "scatterplot.gif";
> proc gplot;
> plot y_axis*x_axis;
> run; quit;
>

From rwilliams at nceas.ucsb.edu Tue Dec 17 08:53:19 2002
From: rwilliams at nceas.ucsb.edu (Rich Williams)
Date: Tue, 17 Dec 2002 08:53:19 -0800
Subject: Step/Pipeline abstraction
In-Reply-To: <3DFEDCF0.5030400@nceas.ucsb.edu>
Message-ID:

Yep, obviously a better way to go. Thanks.

Rich

-----Original Message-----
From: monarch-dev-admin at ecoinformatics.org
[mailto:monarch-dev-admin at ecoinformatics.org]On Behalf Of Matt Jones
Sent: Tuesday, December 17, 2002 12:15 AM
To: monarch-dev
Subject: Re: Step/Pipeline abstraction

Rich,

Didn't really get a chance to look at this before leaving. My brief
glance: do we really want to depend on the order of steps to determine
execution order? Seems to me like we should number the steps with an id
attribute, then use those attributes to draw the links in the graph. Same
effect, but possibly more info, especially if we want to execute steps out
of order depending on dependencies.

Let's chat in January -- looking forward to seeing where you're at at that
point.
Matt

_______________________________________________
monarch-dev mailing list
monarch-dev at ecoinformatics.org
http://www.ecoinformatics.org/mailman/listinfo/monarch-dev
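Rich's Dec 16 "fromstep" proposal can be sketched roughly as follows. This is a minimal Python sketch under assumptions, not the Monarch implementation: only the "fromstep" attribute, the -1 convention for the enclosing pipeline, and the "outputmap" idea come from the thread. The `Map`, `Step` and `run_pipeline` names, the `image` output name, and the stand-in `plot_step` (which returns points rather than a GIF) are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple

PIPELINE = -1  # from the thread: a fromstep of -1 means the enclosing pipeline


class Map(NamedTuple):
    fromstep: int  # index of the producing step, or PIPELINE
    source: str    # name of the output (or pipeline input) to read
    target: str    # name of the input (or pipeline output) to fill


class Step(NamedTuple):
    name: str
    maps: List[Map]
    run: Callable[[Dict[str, object]], Dict[str, object]]


def run_pipeline(steps, outputmap, pipeline_inputs):
    """Run steps in document order, binding each input through its map."""
    results = []  # results[i] holds the outputs of step i
    for step in steps:
        bound = {}
        for m in step.maps:
            origin = pipeline_inputs if m.fromstep == PIPELINE else results[m.fromstep]
            bound[m.target] = origin[m.source]
        results.append(step.run(bound))
    # The pipeline's own outputs come from the steps named in the outputmap.
    return {m.target: results[m.fromstep][m.source] for m in outputmap}


def sum_step(inputs):
    # Mirrors the SAS step in the example: z = x + y for every row.
    return {"sumdata": [dict(row, z=row["x"] + row["y"]) for row in inputs["sumdata"]]}


def plot_step(inputs):
    # Hypothetical stand-in for the scatterplot: sum (z) versus y.
    points = [(row["y"], row["z"]) for row in inputs["plotdata"]]
    return {"image": {"title": inputs["plot_title"], "points": points}}


steps = [
    Step("sum", [Map(PIPELINE, "sumdata", "sumdata")], sum_step),
    Step("scatterplot",
         [Map(PIPELINE, "plot_title", "plot_title"), Map(0, "sumdata", "plotdata")],
         plot_step),
]
outputmap = [Map(1, "image", "image")]
result = run_pipeline(
    steps,
    outputmap,
    {"plot_title": "Test plot",
     "sumdata": [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]},
)
print(result)
```

Because each map names its source step explicitly, the plot step can take its title from the pipeline and its data from step 0 without any data pass-through. Steps still run in document order here; replacing the integer positions with step ids, as Matt suggests, would let the order be derived from the links instead.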