A workflow is the automation of an application, in whole or in part, in which data or tasks are passed from one stage to another for processing according to a set of procedural rules. Scientific workflows support large data flows and need to execute in dynamic environments where resources are not known a priori and where the workflow may need to adapt to changes. A workflow-based application consists of several stages, and the output of one stage becomes the input of the next. The number of tasks, the computational complexity, and the amount of data can vary from stage to stage.
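As a minimal sketch of this staged structure (not CometCloud's actual API; all class and method names here are illustrative), each stage can be viewed as a function whose output feeds the next stage:

```java
import java.util.List;
import java.util.function.Function;

public class StagedWorkflow<T> {
    private final List<Function<T, T>> stages;

    public StagedWorkflow(List<Function<T, T>> stages) {
        this.stages = stages;
    }

    /** Run the stages in order, feeding each stage's result into the next. */
    public T run(T input) {
        T data = input;
        for (Function<T, T> stage : stages) {
            data = stage.apply(data);   // output of this stage becomes the input of the next
        }
        return data;
    }

    public static void main(String[] args) {
        StagedWorkflow<List<Integer>> wf = new StagedWorkflow<List<Integer>>(List.of(
            xs -> xs.stream().map(x -> x * 2).toList(),       // stage 1: transform
            xs -> xs.stream().filter(x -> x > 2).toList()     // stage 2: filter
        ));
        System.out.println(wf.run(List.of(1, 2, 3)));         // prints [4, 6]
    }
}
```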
CometCloud supports the workflow paradigm in its programming layer, and the master/worker model is used within each stage. The master generates tasks for a stage and workers consume them. The results gathered from the workers become the input data for the next stage, which in turn affects the amount and complexity of the workload. See Figure 1.
[Figure 1: Master/worker execution within each workflow stage]
[Figure 2: Overview of the CometCloud workflow engine]
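The master/worker pattern within a single stage, as described above, can be sketched as follows. This is a minimal sketch assuming a blocking queue as a stand-in for the Comet coordination space; the class and method names are illustrative, not CometCloud's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class StageMasterWorker {
    public static void main(String[] args) throws Exception {
        BlockingQueue<Integer> taskSpace = new LinkedBlockingQueue<>();  // stand-in for the Comet space
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();

        // Master: generate the tasks for this stage and insert them into the space.
        List<Integer> stageInput = List.of(1, 2, 3, 4, 5);
        taskSpace.addAll(stageInput);

        // Workers: consume tasks from the space and publish their results.
        ExecutorService workers = Executors.newFixedThreadPool(3);
        for (int i = 0; i < stageInput.size(); i++) {
            workers.submit(() -> {
                Integer task = taskSpace.take();   // pull a task from the space
                results.put(task * task);          // process it and publish the result
                return null;
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);

        // Master: gather the results; they become the input of the next stage.
        List<Integer> nextStageInput = new ArrayList<>();
        results.drainTo(nextStageInput);
        System.out.println("Input for next stage: " + nextStageInput);
    }
}
```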
Figure 2 shows an overview of the workflow engine supported by CometCloud. The workflow manager is responsible for coordinating the execution of the overall application workflow, based on user-defined policies, using Comet spaces. It includes a workflow planner as well as task monitors/managers. The workflow planner determines the computational tasks that can be scheduled at each stage of the workflow. Once these tasks are identified, appropriate metadata describing them (including application hints about complexity, data dependencies, affinities, etc.) is inserted into the Comet space and the autonomic scheduler is notified. The task monitor/manager then monitors and manages the execution of each of these tasks and determines when a stage has completed, so that the next stage(s) can be initiated.
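The coordination pattern described above can be sketched as follows. This is a hypothetical outline under stated assumptions, not CometCloud's actual interfaces: a planner inserts task descriptors carrying application hints into a shared space and notifies a scheduler, and a monitor tracks completions to decide when the next stage(s) can start.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class TaskDescriptor {
    final String taskId;
    final int stage;
    final Map<String, String> hints;   // e.g. complexity, data dependencies, affinities
    TaskDescriptor(String taskId, int stage, Map<String, String> hints) {
        this.taskId = taskId; this.stage = stage; this.hints = hints;
    }
}

public class WorkflowManagerSketch {
    private final BlockingQueue<TaskDescriptor> space = new LinkedBlockingQueue<>(); // stand-in for the Comet space
    private int completed = 0;

    /** Planner: insert the stage's task descriptors into the space and notify the scheduler. */
    void planStage(int stage, List<TaskDescriptor> tasks) {
        space.addAll(tasks);
        System.out.println("Scheduler notified: " + tasks.size() + " tasks for stage " + stage);
    }

    /** Monitor: record a completed task and report whether the stage has finished. */
    boolean taskCompleted(int totalTasksInStage) {
        completed++;
        return completed == totalTasksInStage;   // stage done -> next stage(s) can be initiated
    }

    public static void main(String[] args) {
        WorkflowManagerSketch mgr = new WorkflowManagerSketch();
        List<TaskDescriptor> stage1 = List.of(
            new TaskDescriptor("t1", 1, Map.of("complexity", "high")),
            new TaskDescriptor("t2", 1, Map.of("complexity", "low")));
        mgr.planStage(1, stage1);
        mgr.taskCompleted(stage1.size());
        if (mgr.taskCompleted(stage1.size())) {
            System.out.println("Stage 1 complete; initiating stage 2");
        }
    }
}
```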