New filter design requirements

The structure of filter classes is due for a redesign. The purpose of this document is to capture the requirements for the new filter design and the rationale for these requirements. Once these requirements are captured, we will use a separate document to capture a proposed design.

Filters can be used without using a device compiler

For example, say VTK-m is compiled with CUDA. VTK should be able to completely include all the VTK-m accelerator code without ever having to use nvcc.

Currently this is not possible because the filter headers include worklets that need to be compiled on devices. These should be removed (using some of the following requirements).

Filters are compiled into libraries

Originally, VTK-m was considered a header-only library, but those days are long gone. Having filter implementations in header files is problematic on many fronts. One of the biggest issues is that to use any filter in VTK-m, you have to use the device compiler and satisfy all the compiler flags that VTK-m wants. This is a big no-no for many downstream users including ECP customers. Rather, you should be able to use VTK-m filters using only a standard C++ compiler.

Filters should be compiled into many different libraries/modules

The granularity of the libraries should be pretty fine, and some might contain just one filter. For example, the clip filter library will have a couple of filters (clip with field and clip with implicit function). The Lagrangian filter library might have just one filter. This makes some of the following requirements possible.

It should be noted that although very large libraries can be problematic (often linkers have trouble creating them), having lots of small libraries can also be problematic. This is particularly true if using shared object libraries in HPC systems. Loading lots of small shared objects in large MPI jobs can create significant overhead during initialization.

Source for each filter library should be in its own directory

The list of filters is already becoming unwieldy, and having them all in a single directory is problematic. They should each be in their own directory. Likely the namespace will also be different for each library (e.g. vtkm::filter::clip::ClipWithField).

Compiling filter libraries can be independently turned on/off

Compiling the filters will quickly become the majority of the time spent compiling. Users who only need specific functionality should be able to cherry-pick the filters they need to compile by selecting libraries/modules.

Dependencies among filters should be managed by CMake

If one filter depends on another filter, then CMake needs to ensure that the latter library is enabled when the former one is.

Filter code should be colocated with the worklets that implement them

For historical reasons, worklet implementations are in vtkm/worklet and the filter implementations that use them are in vtkm/filter. This no longer makes sense.

Instead, vtkm/worklet should only contain code that makes creating worklets possible (e.g. WorkletMapField and Keys).

The actual worklet implementations themselves should be moved to the directory (or a subdirectory) of the filter. For example, vtkm/worklet/Clip.h should be moved to vtkm/filter/clip or some subdirectory of that such as vtkm/filter/clip/worklet or vtkm/filter/clip/internal.

A nice feature of this change is that we can reduce some redundancy in the tests. There is no real need to have one test for the worklet and another for the filter. A test for the filter should exercise both.

Exceptions could be made for worklets that are intended to be used with multiple different filters or other contexts (such as AverageByKey). But these are pretty rare.

Policy objects no longer supported

The concept of policy only works if you do a special compile of the templates with that policy. But that will not work if filters are compiled into libraries.

That said, the types of a policy have to be specified in other ways. The primary way to specify what types a filter should support is the set of type lists defined in vtkm/cont/DefaultTypes.h. This is where lists like VTKM_DEFAULT_TYPE_LIST, VTKM_DEFAULT_STORAGE_LIST, and VTKM_DEFAULT_CELL_SET_LIST are defined. More importantly, these lists can be modified by CMake configuration options.
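To make the role of a type list concrete, here is a minimal, self-contained C++ sketch of how code compiled into a library might try each type in a fixed list against a type-erased array. All names in it (TypeList, ArrayBase, TypedArray, ResolveAndCall, SumFunctor) are hypothetical stand-ins invented for illustration and are not the VTK-m API; the actual lists are the VTKM_DEFAULT_* lists named above.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for a compile-time list such as VTKM_DEFAULT_TYPE_LIST.
template <typename... Ts>
struct TypeList
{
};

// Hypothetical type-erased array: the value type is only known at run time.
struct ArrayBase
{
  virtual ~ArrayBase() = default;
};

template <typename T>
struct TypedArray : ArrayBase
{
  std::vector<T> Values;
};

// Base case: the list is exhausted, so the array type is not supported.
template <typename Functor>
bool ResolveAndCall(const ArrayBase&, TypeList<>, Functor&&)
{
  return false;
}

// Recursive case: test the first type in the list, then try the rest.
template <typename T, typename... Rest, typename Functor>
bool ResolveAndCall(const ArrayBase& array, TypeList<T, Rest...>, Functor&& functor)
{
  if (const auto* typed = dynamic_cast<const TypedArray<T>*>(&array))
  {
    functor(typed->Values); // the functor now sees the concrete value type
    return true;
  }
  return ResolveAndCall(array, TypeList<Rest...>{}, std::forward<Functor>(functor));
}

// Example functor, templated on the resolved value type.
struct SumFunctor
{
  double Sum = 0.0;

  template <typename T>
  void operator()(const std::vector<T>& values)
  {
    for (const T& value : values)
    {
      this->Sum += static_cast<double>(value);
    }
  }
};
```

Because the list is fixed when the library is built, only the listed types are instantiated. An array of any other type is reported as unsupported at run time rather than triggering a new template instantiation in the calling code.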

Filter superclasses should support resolving input field array types

In the current implementation of filters, you can implement a DoExecute method that is given the active field resolved to an ArrayHandle type. Filters commonly have to discover the type of a field and instantiate templates accordingly. This can be complicated (and we continually evolve the best way to do it). Thus, we should continue to have filter superclasses do this resolution for the subclass.

Note that this is somewhat more complicated to implement because the current implementation can rely on whatever uses the filter to instantiate the templates. When building for libraries, something has to create the code to instantiate the templates, and we don't want filter developers to have to create these by hand.

The current implementation allows the filter subclass to select which types to support in the field (via filter "traits"). This or something like it should continue to be supported.

It is also possible to bypass this behavior by overriding PrepareForExecution. Something like this should continue to be supported.

Filter superclasses should support resolving mapped field array types

Similar to DoExecute, current filter implementations can define a DoMapField and the superclass will automatically discover the type of the field array to be mapped and pass that to the DoMapField template. This behavior should also continue to be supported.

A filter should also be able to instead override the behavior of MapFieldOntoOutput to handle the field mapping using unknown arrays. (This will be relatively common.) It should also be possible to do this override of MapFieldOntoOutput and then call the superclass to discover the type for some arrays. For example, a filter might trivially pass point data but need to do some processing for cell data.
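The override-then-defer pattern described here can be sketched in self-contained C++ as follows. This is only an illustration under stated assumptions: Field, DataSet, Association, Filter, and CellSubsetFilter are simplified hypothetical types and not the VTK-m classes, and the virtual DoMapField stands in for the typed template that a real superclass would reach after resolving the array type.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified data model.
enum class Association
{
  Points,
  Cells
};

struct Field
{
  std::string Name;
  Association Assoc;
  std::vector<double> Values;
};

struct DataSet
{
  std::vector<Field> Fields;
};

// Hypothetical superclass. A real one would resolve the array type here
// before forwarding to the subclass.
class Filter
{
public:
  virtual ~Filter() = default;

  virtual bool MapFieldOntoOutput(DataSet& output, const Field& field)
  {
    return this->DoMapField(output, field);
  }

protected:
  virtual bool DoMapField(DataSet& output, const Field& field) = 0;
};

// Hypothetical filter that keeps a subset of cells and leaves points alone.
class CellSubsetFilter : public Filter
{
public:
  explicit CellSubsetFilter(std::vector<std::size_t> keptCells)
    : KeptCells(std::move(keptCells))
  {
  }

  bool MapFieldOntoOutput(DataSet& output, const Field& field) override
  {
    if (field.Assoc == Association::Points)
    {
      output.Fields.push_back(field); // point data passes through untouched
      return true;
    }
    // Cell data needs processing, so defer to the superclass path.
    return this->Filter::MapFieldOntoOutput(output, field);
  }

protected:
  bool DoMapField(DataSet& output, const Field& field) override
  {
    Field mapped{ field.Name, field.Assoc, {} };
    for (std::size_t cell : this->KeptCells)
    {
      mapped.Values.push_back(field.Values.at(cell));
    }
    output.Fields.push_back(mapped);
    return true;
  }

private:
  std::vector<std::size_t> KeptCells;
};
```

In this sketch the trivial case never reaches the type-resolution path, which is the main reason a filter would want to override MapFieldOntoOutput rather than only DoMapField.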

ApplyPolicy functionality is still necessary

Although policies are being removed, many of the ApplyPolicy functions take into account other factors such as the filter traits. When policies go away, these functions should be renamed and implemented with the policy part removed. (Perhaps they should be protected methods in vtkm::filter::Filter.)

Split filter instantiations into multiple translation units

By their nature, the filter implementations tend to compile the same worklets with different template arguments. Although it is straightforward to compile all implementations using the same source file, it is often advantageous to split the instantiations into multiple translation units. Compiling all template resolutions at once can take a long time and might even overwhelm the compiler. Splitting up the code into multiple translation units means the compiler can take less memory per instance and can help with the time in parallel compiles.
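One standard C++11 mechanism for this kind of split is an explicit instantiation declaration (extern template) in the header paired with one explicit instantiation definition per source file. The sketch below shows the idea in a single listing, with comments marking where the file boundaries would fall. ScaleField and the file names are hypothetical and chosen only for illustration; this is not necessarily the mechanism VTK-m will adopt.

```cpp
#include <vector>

// ---- ScaleField.h (hypothetical header) ----
// Stand-in for a templated worklet invocation.
template <typename T>
std::vector<T> ScaleField(const std::vector<T>& input, T factor)
{
  std::vector<T> output;
  output.reserve(input.size());
  for (const T& value : input)
  {
    output.push_back(value * factor);
  }
  return output;
}

// Explicit instantiation declarations: code including the header is told
// that these instantiations are provided elsewhere.
extern template std::vector<float> ScaleField<float>(const std::vector<float>&, float);
extern template std::vector<double> ScaleField<double>(const std::vector<double>&, double);

// ---- ScaleFieldInstantiationFloat.cxx (hypothetical source file) ----
// Explicit instantiation definition: the float version is compiled here.
template std::vector<float> ScaleField<float>(const std::vector<float>&, float);

// ---- ScaleFieldInstantiationDouble.cxx (hypothetical source file) ----
// The double version is compiled in its own translation unit.
template std::vector<double> ScaleField<double>(const std::vector<double>&, double);
```

With one instantiation per source file, a parallel build can compile the float and double versions at the same time, and each compiler process only holds one instantiation in memory.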

Thread safety

The current implementation of filters has thread safety issues. Many filters store an array during their DoExecute in the state that is later used to compute the MapField part of their execution. If multiple threads are using the same filter object, they are liable to overwrite the arrays used in the state. The problem gets even worse when considering temporal objects that have to hold the data and results from one Execute to the next.

The filter structure should have an elegant way of running a filter on multiple threads without having to perform the same initialization on every thread (which is likely not reasonable).
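As one possible illustration of the problem and a way around it (not a proposed design, which belongs in the separate design document), the self-contained C++ sketch below keeps the per-execution state in an object returned to the caller rather than in the filter. ThresholdFilter and ThresholdResult are hypothetical names, and plain std::vector stands in for field arrays.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-execution state. It is owned by the caller, so two
// threads never share it unless they choose to.
struct ThresholdResult
{
  std::vector<double> Values;           // output field values
  std::vector<std::size_t> KeptIndices; // needed later to map other fields
};

// Hypothetical filter that holds only configuration set up once.
class ThresholdFilter
{
public:
  explicit ThresholdFilter(double minimum)
    : Minimum(minimum)
  {
  }

  // const: no member is modified, so concurrent calls on one filter object
  // do not overwrite each other's intermediate arrays.
  ThresholdResult Execute(const std::vector<double>& field) const
  {
    ThresholdResult result;
    for (std::size_t i = 0; i < field.size(); ++i)
    {
      if (field[i] >= this->Minimum)
      {
        result.Values.push_back(field[i]);
        result.KeptIndices.push_back(i);
      }
    }
    return result;
  }

  // Field mapping reads the state from the result, not from the filter.
  std::vector<double> MapField(const ThresholdResult& result,
                               const std::vector<double>& other) const
  {
    std::vector<double> mapped;
    mapped.reserve(result.KeptIndices.size());
    for (std::size_t index : result.KeptIndices)
    {
      mapped.push_back(other.at(index));
    }
    return mapped;
  }

private:
  double Minimum;
};
```

In this arrangement the filter is initialized once and shared, while each thread keeps its own ThresholdResult. Temporal filters that carry data from one Execute to the next would need more than this sketch provides.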

Exporting to non-CMake projects

It needs to be considered that these filters (and VTK-m in general) are going to be used by projects that do not use CMake. These external projects will not have the luxury of importing CMake targets to generate the necessary library link flags in the proper order. To ease this, VTK-m needs to create makefile source that can be imported into external projects and that looks something like this:

VTKM_INCLUDE_FLAGS = -I/dir/to/vtk-m/include ...
VTKM_LIB_FLAGS = -L/dir/to/vtk-m/lib -lvtkm_filter1 ... -lvtkm_cont

It's worth noting that this requirement is not strictly related to the filter design. But if we are going to move to a place where we could have tens or maybe one day hundreds of filter libraries, then it is completely unreasonable to expect external projects to manage linking them all.

Edited Feb 18, 2021 by Kenneth Moreland