One of the most critical uses of CMB is processing various simulation workflows. Currently this is done on a case-by-case basis, which has resulted in one workflow (say, surface water modeling) being able to see some aspects of other workflows (like ground water). This can cause confusion and is not ideal. What is needed is to explicitly model a workflow and then have CMB present only the tools and resources required for that workflow. In addition, since a workflow needs to coordinate between different components of the application (selection, required resources, GUI, operations, etc.), CMB needs to conceptualize these aspects and provide the necessary "management" structure.
Possible Management Architecture
CMB already has some management structure. Models and Meshes are managed by their respective manager classes. Attributes are "managed" by the attribute system class, though that class behaves more like a model or mesh class than like their managers. The Plugin Manager (which comes primarily from ParaView) deals with external plugins. The UI Manager (from SMTK) currently deals with View management for attributes. In addition, under the covers is ParaView's screen/view management system for dealing with the 3D Views of the application. Currently in 4.x there is only a single 3D view (plus some dialog views which have 3D view functionality), but in the High Energy Physics project we have created a prototype that provides multiple views. In the above design I propose that we combine these two concepts (the SMTK UI Manager and ParaView's view management) into a single UI Manager. In fact there could be two potential versions:
A simple UI Manager that assumes the application is a VTK-based one
A ParaView-based one that would be used in CMB applications
Currently each application provides its own mechanism for managing selections. This has the undesirable consequence of having to duplicate the functionality in new applications, as well as difficulty in coordinating the various sources of selection, such as:
Selecting a model entity from the 3D View and having it highlighted in an attribute association panel
Selecting a set of model entities and having an operation panel populate itself
Having an operator constrain which types of model entities are allowed to be selected
Adding new selection functionality - for example, selecting an attribute and having the model entities associated with it highlighted
Operation management is very rudimentary: when a model session is selected, all of its model operations are presented. Also, the current design assumes that only sessions can have operations, which forces other types of operation such as meshing, exporting and job submission to be managed directly by the application. In addition, the current design does not allow for asynchronous operations, which were extremely useful in CMB 3.x. The next section provides a high level description of the managers proposed in the above figure.
Right now there are two managers in CMB for dealing with models and meshes. All of the resources held by these managers are displayed in the Model Tree View, which can cause confusion about which resource is the main focus of the current task. In addition, there is no attribute manager, so various attribute systems (export, simulation, operation, etc.) are managed directly either by the application or by various components. The model and mesh managers also behave somewhat differently. The mesh manager primarily provides a mechanism for accessing a mesh (in SMTK it is referred to as a mesh collection, since multiple meshes can be contained in a MOAB file). Given a UUID it can return the correct mesh collection (if it is loaded in memory). The model manager is somewhat different in that it holds not only the model resource itself but also all of the resource's components (in this case the various model entities). In the past I have proposed that the model manager should act more like the mesh manager, holding only "model collections" (since a model file could hold multiple models). If the yet-to-be-implemented attribute manager worked the same way, all three managers could actually be merged into a single one, assuming that a mesh collection, a model collection and an attribute system were all derived from a "resource" base class.
In addition, I propose that the managers should be able to save and reload their mapping between URL and UUID so they could be later restored when the workflow is reloaded into the system.
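The merged manager and the saved URL/UUID mapping could look roughly like the following minimal sketch. It assumes a common resource base class as suggested above; the names Resource, ResourceManager, save_mapping, load_mapping and id_for_url are hypothetical and are not existing SMTK or CMB API, and JSON is used only as an illustrative storage format.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical base class for mesh collections, model collections
    and attribute systems."""
    url: str
    kind: str  # e.g. "model", "mesh" or "attribute"
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class ResourceManager:
    """Hypothetical single manager that holds resources of any kind."""

    def __init__(self):
        self._resources = {}  # UUID -> resource currently loaded in memory
        self._url_to_id = {}  # URL -> UUID, known even if the resource is not loaded

    def add(self, resource):
        self._resources[resource.id] = resource
        self._url_to_id[resource.url] = resource.id

    def find(self, resource_id):
        """Return the resource, or None if it is not loaded in memory."""
        return self._resources.get(resource_id)

    def id_for_url(self, url):
        return self._url_to_id.get(url)

    def save_mapping(self):
        """Serialize the URL-to-UUID mapping so it can be stored with a workflow."""
        return json.dumps({url: str(rid) for url, rid in self._url_to_id.items()})

    def load_mapping(self, text):
        """Restore a mapping produced by save_mapping()."""
        self._url_to_id = {url: uuid.UUID(rid) for url, rid in json.loads(text).items()}


# Usage: save the mapping from one manager and restore it in another.
manager = ResourceManager()
mesh = Resource(url="file:///tmp/example.h5m", kind="mesh")
manager.add(mesh)

restored = ResourceManager()
restored.load_mapping(manager.save_mapping())
```

In this sketch the restored manager knows which UUID belongs to each URL but does not load the resource itself; loading on demand is left to the application.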
I propose that the role of the current UI Manager be expanded to provide the following functionality:
Menu bar management - given the application's menu bar, the manager would be able to set up the appropriate Qt actions.
Status management - provide a mechanism to display various status messages
Widget management - provide a place where new widgets and layouts are registered and can be accessed by the other components of the system
Action management - provide a mechanism by which various Qt actions can be registered - this would give other components a way of turning actions on and off based on the current context of the workflow
Context menu management - provide access to the various context menus in the 3D Views
3D View management - provide a mechanism for handling ParaView's view management that is more appropriate to a CMB application.
Selection, and the coordination of the various ways things can be selected, is a core piece of CMB. As mentioned above, the selection manager provides a centralized way of doing this coordination as well as a mechanism for filtering selections. There is the added issue of mapping between VTK/ParaView objects selected on the screen and their corresponding SMTK/CMB objects. I propose that this be done by a separate class that is then registered with this manager as another way of making or visualizing a selection. This would also allow the possibility of having both a simple VTK-SMTK Selector class and the ParaView-SMTK Selector class used by CMB. Basically this class would have a slot for an object making a selection to call (or signal) and a signal for broadcasting a selection.
The manager would also be able to convert a current selection to a desired one. For example, if the user wants to assign a boundary condition to edges and the selection contains faces, the manager should be able to convert the faces into the set of edges associated with them.
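The following is a minimal sketch of the selection manager's three roles described above: broadcasting, filtering and conversion. Plain Python callables stand in for Qt signals and slots, and Entity, SelectionManager, connect, set_filter, select and converted_to are all hypothetical names chosen for illustration, not existing SMTK or CMB API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a model entity."""
    name: str
    kind: str             # "face" or "edge" in this sketch
    boundary: tuple = ()  # for a face: the edges that bound it


class SelectionManager:
    def __init__(self):
        self._listeners = []        # called whenever the selection changes
        self._allowed_kinds = None  # None means that nothing is filtered out
        self.current = []

    def connect(self, listener):
        """Register a callable(selection, source); plays the role of a signal."""
        self._listeners.append(listener)

    def set_filter(self, allowed_kinds):
        """Restrict which entity kinds may be selected (None removes the filter)."""
        self._allowed_kinds = set(allowed_kinds) if allowed_kinds else None

    def select(self, entities, source=None):
        """Slot-like entry point called by whatever made the selection."""
        if self._allowed_kinds is not None:
            entities = [e for e in entities if e.kind in self._allowed_kinds]
        self.current = list(entities)
        for listener in self._listeners:
            listener(self.current, source)

    def converted_to(self, kind):
        """Convert the current selection, e.g. faces to their bounding edges."""
        result = []
        for entity in self.current:
            if entity.kind == kind:
                candidates = [entity]
            else:
                candidates = [e for e in entity.boundary if e.kind == kind]
            for candidate in candidates:
                if candidate not in result:
                    result.append(candidate)
        return result


# Usage: two faces sharing an edge are selected, then converted to edges.
e1, e2, e3 = Entity("e1", "edge"), Entity("e2", "edge"), Entity("e3", "edge")
f1 = Entity("f1", "face", (e1, e2))
f2 = Entity("f2", "face", (e2, e3))

seen = []  # what a listening panel would receive
selection = SelectionManager()
selection.connect(lambda sel, src: seen.append([e.name for e in sel]))
selection.select([f1, f2], source="3D view")
edges = selection.converted_to("edge")
```

A VTK-SMTK or ParaView-SMTK selector class would sit in front of select(), translating picked screen objects into entities before calling it.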
This manager maintains a set of registered operations, which can be model-based, mesh-based, or other operations required by the workflow such as simulation export or job submission. Note that though this manager may need to communicate with job-oriented systems like ReMUs, MoleQueue, or Cumulus, it is not responsible for dealing with the queuing system itself. It does provide the necessary infrastructure for supporting asynchronous operations, as well as a front end that shows the user the status of running operations and lets them be terminated. This manager could be connected to the above selection manager: first, to determine which operations could be applied to the current selection and second, to restrict what should be selected based on a specific operation.
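A minimal sketch of such an operation manager is shown below, using Python's standard concurrent.futures module for the asynchronous part. OperationManager and its methods are hypothetical names, and the operations are placeholder lambdas. One limitation of this sketch: Future.cancel() only succeeds for an operation that has not started yet, so terminating an operation that is already running would need an additional mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


class OperationManager:
    """Hypothetical registry of operations with asynchronous execution."""

    def __init__(self, max_workers=2):
        self._operations = {}  # name -> (callable, accepted entity kinds)
        self._running = {}     # name -> Future of the most recent submission
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def register(self, name, func, accepts):
        self._operations[name] = (func, frozenset(accepts))

    def applicable(self, selected_kinds):
        """Names of operations that accept every entity kind in the selection."""
        kinds = set(selected_kinds)
        return sorted(
            name for name, (_, accepted) in self._operations.items()
            if kinds and kinds <= accepted
        )

    def accepted_kinds(self, name):
        """Entity kinds an operation allows; usable as a selection filter."""
        return self._operations[name][1]

    def submit(self, name, *args):
        """Start an operation in the background and return its Future."""
        func, _ = self._operations[name]
        future = self._executor.submit(func, *args)
        self._running[name] = future
        return future

    def cancel(self, name):
        """Try to cancel an operation; only works before it starts running."""
        future = self._running.get(name)
        return future.cancel() if future is not None else False

    def shutdown(self):
        self._executor.shutdown(wait=True)


# Usage with two placeholder operations.
operations = OperationManager()
operations.register("mesh faces", lambda n: f"meshed {n} faces", accepts={"face"})
operations.register("export", lambda n: f"exported {n} entities", accepts={"face", "edge"})

for_faces = operations.applicable(["face"])
for_edges = operations.applicable(["edge"])
result = operations.submit("mesh faces", 3).result()
operations.shutdown()
```

The two directions of coupling with the selection manager map onto applicable(), which answers what can be done with the current selection, and accepted_kinds(), which answers what may be selected for a given operation.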
In CMB 5 this would be the heart of the system. A workflow would coordinate between the different managers based on the current task. It would also guide the user through the various tasks (and sub-tasks).
So What is a Workflow?
A workflow is a sequence of tasks. Some tasks are simple; others are composed of sub-tasks (or sub-workflows). A task potentially has a beginning and an end, meaning a set of conditions that must be met before the task can start and a set of conditions that must be met before the task is considered completed.
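As a minimal sketch, start and end conditions can be modeled as predicates over the current application state. The Task class, the state dictionary and the has() helper are hypothetical, and the task names are borrowed from examples used elsewhere in this document.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A condition is any callable that inspects the state and returns True or False.
Condition = Callable[[dict], bool]


@dataclass
class Task:
    name: str
    start_conditions: List[Condition] = field(default_factory=list)
    end_conditions: List[Condition] = field(default_factory=list)

    def can_start(self, state):
        """True when every start condition holds (trivially true if there are none)."""
        return all(condition(state) for condition in self.start_conditions)

    def end_conditions_met(self, state):
        """True when every end condition holds (trivially true if there are none)."""
        return all(condition(state) for condition in self.end_conditions)


def has(kind):
    """Build a condition that checks for a resource of the given kind."""
    return lambda state: kind in state["resources"]


view = Task("View a model", start_conditions=[has("model")])
construct = Task("Construct a Surface Water Model",
                 end_conditions=[has("polygonal model")])

state = {"resources": set()}
view_ready = view.can_start(state)                        # no model is loaded yet
construct_ready = construct.can_start(state)              # no start conditions
construct_ended = construct.end_conditions_met(state)     # nothing has been built yet
```

Because a condition is just a callable, a condition supplied as a Python script could be wrapped in the same way.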
Some Example Tasks
View a model - starting condition: a model is available; ending condition: none
Construct a Surface Water Model from scratch - starting condition: none; ending condition: a polygonal model exists
Export an ADH Surface Water Simulation - starting condition: a discrete model, a mesh and a well defined attribute system exist; ending condition: the job has been submitted
In the last example the "well defined" attribute system condition would be represented by a Python script that would be applied to appropriate resources.
The above figure shows the initial types of tasks CMB will need to support in order to support many of our targeted workflows:
Workflow Task - a task representing a workflow (or sub-workflow). These tasks will always contain child tasks
Sequential Task Group - contains an ordered set of tasks that must be carried out in the specified sequence
Parallel Task Group - a set of tasks that can be done in any order
Selector Task Group - one and only one of the child tasks must be performed
Single Task - a task that contains no children
In addition, a task may have modifiers that affect how it is processed. The only one to be supported in CMB 5.0 would be Optional. Later on, support for conditional tasks could enable more advanced workflows.
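The group semantics and the Optional modifier can be sketched as follows. The classes are hypothetical and deliberately simplified: the Workflow Task type is left out, start and end conditions are reduced to a done flag, and a selector does not lock out its other children once one of them has been started. The task names are illustrative.

```python
class Task:
    """A single task with no children."""

    def __init__(self, name, optional=False):
        self.name = name
        self.optional = optional
        self.done = False

    def is_complete(self):
        return self.done

    def available(self):
        """Leaf tasks the user may work on next."""
        return [] if self.done else [self]


class TaskGroup(Task):
    def __init__(self, name, children, optional=False):
        super().__init__(name, optional)
        self.children = list(children)

    def is_complete(self):
        # Optional children never block completion of the group.
        return all(c.is_complete() or c.optional for c in self.children)


class SequentialTaskGroup(TaskGroup):
    def available(self):
        ready = []
        for child in self.children:
            if child.is_complete():
                continue
            ready.extend(child.available())
            if not child.optional:
                break  # later tasks must wait for this required task
        return ready


class ParallelTaskGroup(TaskGroup):
    def available(self):
        return [t for c in self.children if not c.is_complete() for t in c.available()]


class SelectorTaskGroup(TaskGroup):
    def is_complete(self):
        return any(c.is_complete() for c in self.children)

    def available(self):
        if self.is_complete():
            return []
        return [t for c in self.children for t in c.available()]


# Usage: a selector, an optional task and a required task in sequence.
load = Task("Load existing model")
new = Task("Create empty model")
create = SelectorTaskGroup("Create discrete model", [load, new])
modify = Task("Modify model", optional=True)
define = Task("Define attributes")
flow = SequentialTaskGroup("Surface water", [create, modify, define])

first = [t.name for t in flow.available()]
load.done = True
second = [t.name for t in flow.available()]
```

Before anything is done only the selector's choices are offered; once one choice is completed, the optional task and the next required task become available together.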
Reusing Tasks and Workflows
As with Attributes and their definitions, it will make sense in the future to be able to share tasks (and therefore workflows) when creating new tasks.
A Possible Surface Water Workflow
Let's see how we could model our current surface water workflow. To start off, we would need a top-level workflow task as depicted in the following figure.
Here I have decomposed the workflow into 7 tasks which must be performed in sequence. The first task deals with creating the initial discrete model. Since there are several ways this could be done, I've modeled it as a Selector Task. After we have the model, the user may optionally want to modify it, re-mesh it and change bathymetry. The user then defines the simulation attributes, runs the simulation and visualizes the results.
Let's next examine how the user is allowed to create the discrete model, which is depicted in the following figure.
The choices given to the user are the following:
He can load an existing one in
He can create a new empty one
He can import one in from an existing mesh
He can create one from a polygonal model
Note that he can only do one of these actions hence the use of the selector.
Let's assume he selects the create from polygon model workflow. The following figure shows its representation.
First he needs to create the polygon model. There are two ways to do this (create a new one or load one from file). He can then optionally modify it. For example, he could extract contours from DEM or from imagery. He then needs to mesh it and finally create a discrete model from it.
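The decomposition described so far can also be written down as plain data, which is one way a declarative task description might look. This is only a sketch: the dictionary layout, the task() helper, the kind values and the exact task names are assumptions made for illustration, and sub-workflows are represented here simply as sequential groups.

```python
def task(name, kind="single", optional=False, children=()):
    """Build one task description as a plain dictionary."""
    return {"name": name, "kind": kind, "optional": optional,
            "children": list(children)}


# Sub-workflow: create the discrete model from a polygonal model.
create_from_polygon = task("Create from polygonal model", "sequential", children=[
    task("Create polygon model", "selector", children=[
        task("Create new polygon model"),
        task("Load polygon model from file"),
    ]),
    task("Modify polygon model", optional=True),
    task("Mesh polygon model"),
    task("Create discrete model from mesh"),
])

# Selector offering the four ways of obtaining a discrete model.
create_discrete_model = task("Create discrete model", "selector", children=[
    task("Load existing model"),
    task("Create new empty model"),
    task("Import from existing mesh"),
    create_from_polygon,
])

# Top-level workflow: seven tasks performed in sequence.
surface_water = task("ADH Surface Water Workflow", "sequential", children=[
    create_discrete_model,
    task("Modify model", optional=True),
    task("Re-mesh model", optional=True),
    task("Change bathymetry", optional=True),
    task("Define simulation attributes"),
    task("Run simulation"),
    task("Visualize results"),
])


def leaf_names(node):
    """Collect the names of all single tasks below a node, in order."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


polygon_steps = leaf_names(create_from_polygon)
```

Keeping the description as data means the same structure could be saved with a workflow, shared between workflows, or loaded by an external workflow system.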
Finally, let's examine the top-level define ADH Surface Water Attributes Workflow Task, which is shown in the following figure.
Here the process is broken down into a set of tasks, but the user is free to choose the order in which they are performed, hence the use of the Parallel Task Group. Note that this workflow assumes the user is always starting from scratch. We could modify the workflow to allow the user to also load in an existing attribute file, as shown below.
Putting it All Together!
Let's see, hypothetically, how the proposed managers might be used to process parts of the workflow presented in the previous section.
The user selects the ADH Surface Water Workflow. (Prior to this, an application like ModelBuilder might not have any exposed functionality, or might have a "default" model/mesh viewer capability.)
The Workflow Manager, using the UI Manager, displays the selected workflow and shows the tasks involved. The only possible active task is the create discrete model selector task.
The selector task goes to the UI Manager to build a selection dialog listing the 4 choices.
The user selects Create From Polygon Model
The Task Manager pushes the current workflow onto the stack and begins processing the sub-workflow. The new sub-workflow is displayed and once again he is presented with a list of choices, this time 2.
The user selects create new polygon model.
Let's assume that the next required task (mesh model) requires at least one model face. The only task the user can now perform is the optional modify model task.
The Task Manager now informs the Operation Manager of the list of operations associated with the modify model task. For example, it might expose most of the polygon model's operations (perhaps renaming them to better match the user's conceptual model of the problem domain). The Operation Manager might then use the UI Manager to create a toolbar showing the available operations defined by the task (such as extract contours from DEM).
A selected operation would inform the Selection Manager of the types of model entities that can be selected to properly specify the operation.
During the course of applying the operations, the user will have created at least one model face, which will make the meshing workflow task available.
If the user selects the meshing workflow, the previously available operations will be replaced by operations pertinent to the meshing workflow.
Once the meshing workflow has been completed and we have a mesh, the last task in this workflow becomes available. Once that task is done, this workflow can be completed and the original top-level workflow resumed.
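The push and resume behavior used in the steps above can be sketched with an ordinary stack. TaskManager, enter and finish are hypothetical names, and workflows are represented by plain strings to keep the example short.

```python
class TaskManager:
    """Hypothetical manager that tracks the active workflow and its parents."""

    def __init__(self, top_level):
        self.current = top_level
        self._stack = []  # suspended parent workflows, innermost last

    def enter(self, sub_workflow):
        """Suspend the current workflow and start processing a sub-workflow."""
        self._stack.append(self.current)
        self.current = sub_workflow

    def finish(self):
        """Complete the current workflow and resume its parent, if there is one."""
        finished = self.current
        self.current = self._stack.pop() if self._stack else None
        return finished

    def depth(self):
        """Number of suspended workflows."""
        return len(self._stack)


# Usage, following the steps above.
tasks = TaskManager("ADH Surface Water Workflow")
tasks.enter("Create From Polygon Model")
tasks.enter("Meshing Workflow")
depth_while_meshing = tasks.depth()

tasks.finish()  # meshing is done, the polygon sub-workflow resumes
tasks.finish()  # the polygon sub-workflow is done, the top level resumes
resumed = tasks.current
```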
Driving a CMB Application Externally
The above discussion assumes that the user's entire workflow is managed by the CMB application. In the case of an external workflow system like Sextant, it will be possible to drive the application in a command line mode where the Task or Sub-Workflow is specified as an argument. In this case the Task Manager would process the task normally and return control to the external workflow manager when the task is completed.
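A minimal sketch of such a command line entry point, using Python's standard argparse module, is shown below. The --workflow and --task flags, the run() function and the execute callback are all hypothetical; nothing here reflects an existing CMB or Sextant interface.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(
        description="Process one task or sub-workflow and then exit.")
    parser.add_argument("--workflow", required=True,
                        help="workflow description to load")
    parser.add_argument("--task",
                        help="task or sub-workflow to process; "
                             "the whole workflow if omitted")
    return parser.parse_args(argv)


def run(argv, execute):
    """Process the requested task and return a process exit status.

    execute is a callable(workflow, task) -> bool supplied by the
    application; it stands in for the Task Manager in this sketch.
    """
    args = parse_args(argv)
    succeeded = execute(args.workflow, args.task)
    # The exit status is how control returns to the external workflow system.
    return 0 if succeeded else 1


# Usage with a stand-in for the Task Manager that records what it was asked to do.
calls = []
status = run(["--workflow", "surface_water.json", "--task", "Mesh model"],
             lambda workflow, task: calls.append((workflow, task)) or True)
```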
johnt comments 05-Jan-2018
I think this is a good starting point for designing the workflow manager.
The thing we need next is probably a representative NEAMS workflow(s) outlined with the approach described here, to go with the surface water workflow. It would also be useful to further refine the SLAC workflow discussed elsewhere.
Also -- and this might already be implied -- I'll reinforce my thinking that each task is defined by its inputs and outputs, and that the workflow tracks the data consumed by each task and the data produced by each task. (Initially these data are probably files somewhere in the desktop-cumulus-HPC environment, but we should also be able to support things like PDM/database content in the future.)
Finally, in terms of nomenclature, I am currently favoring modeling step for the user-interactive activities, and computational step for pure-execution activities.
johnt comments 07-Nov-2016
This is a good enumeration of CMB modules for performing workflows. And the extension to external workflow execution is great.
Is the thinking that each "single task" is specified with declarative syntax? That would still be my recommendation -- using the same general approach as provisioning systems like Ansible.
@bob.obara : I was thinking we would start with a declarative syntax but maybe also allow a simple task to be a Python Script as well.
The choice of a single flow sequence (with hierarchy) versus a directed graph -- was that a keep-it-simple decision? It will limit the granularity at which workflows can be specified (which might be a good thing). In the AdH simulation, for example, I would typically consider the meshing and simulation-attribute tasks to be independent and not done serially.
@bob.obara : I did want to keep the design initially simple (though expanding from a tree to a DAG would be doable). As for doing meshing and simulation-attribute specification in either order, you could group them in a Parallel Task Group, which would result in the workflow you described.
I like the idea of task hierarchy; however, encapsulation could quickly break down for things like "selector" and "parallel" tasks when the input requirements are different among the various sub-tasks. So there will be situations where some parts of a selector or parallel task can be performed but not others, so the readiness of the parent task is not a binary yes or no.
You don't mention whether feedback is supported or not, but I am presuming that a workflow can be started/restarted at any point in the sequence.
@bob.obara : Yes - you should be able to save the state of a partially completed workflow and reload it.
I'll add my usual comment that we should support workflow tasks that are executed by "external" processing, that is, scripts or other non-CMB executables.