Distributed filters (histogram)

This MR does the following:

  1. Enhances the filter API to better support multiblock and distributed use cases.
  2. Updates the Histogram filter to work correctly in multiblock and distributed cases (by reducing results across all blocks).
  3. Adds a Histogram example that uses MPI directly, instead of DIY, for reduction across MPI ranks.

Here's an extract of the Filter.h docs that describes the updated Filter API.


This is the base class for all filters. To add a new filter, one can subclass this (or any of the existing subclasses, e.g. FilterField, FilterDataSet, FilterDataSetWithField, etc.) and implement the relevant methods.

Usage

To execute a filter, one typically calls auto result = filter.Execute(input). Typical usage is as follows:

// create the concrete subclass (e.g. MarchingCubes).
vtkm::filter::MarchingCubes marchingCubes;

// select fields to map to the output, if different from the default, which is
// to map all input fields.
marchingCubes.SetFieldToPass({"var1", "var2"});

// execute the filter on a vtkm::cont::DataSet.
vtkm::cont::DataSet dsInput = ...
auto outputDS = marchingCubes.Execute(dsInput);

// or, execute on a vtkm::cont::MultiBlock
vtkm::cont::MultiBlock mbInput = ...
auto outputMB = marchingCubes.Execute(mbInput);

Execute methods take the input dataset or multiblock to process and return the result. The result type is the same as the input type: Execute(DataSet&) returns a DataSet, while Execute(MultiBlock&) returns a MultiBlock.

The implementation of Execute(DataSet&) is provided merely for convenience. Internally, it creates a MultiBlock with a single block for the input and then forwards the call to Execute(MultiBlock&). The method returns the first block, if any, from the MultiBlock returned by the forwarded call. If the returned MultiBlock has more than one block, vtkm::cont::ErrorFilterExecution is thrown.
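The wrap, forward, unwrap behaviour described above can be sketched in a few lines. This is only an illustration: DataSet, MultiBlock, ErrorFilterExecution and PassThroughFilter below are simplified stand-ins defined for the example, not the vtkm types, and returning an empty DataSet when the result has no blocks is an assumption.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for vtkm::cont::DataSet and vtkm::cont::MultiBlock.
struct DataSet
{
  std::string Name;
};

struct MultiBlock
{
  std::vector<DataSet> Blocks;
};

// Stand-in for vtkm::cont::ErrorFilterExecution.
struct ErrorFilterExecution : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Toy filter whose multiblock Execute is the real entry point.
struct PassThroughFilter
{
  // When true, every block is emitted twice, so a single input block yields
  // two output blocks and the DataSet overload has to reject the result.
  bool DuplicateBlocks = false;

  MultiBlock Execute(const MultiBlock& input) const
  {
    MultiBlock output;
    for (const DataSet& block : input.Blocks)
    {
      output.Blocks.push_back(block);
      if (this->DuplicateBlocks)
      {
        output.Blocks.push_back(block);
      }
    }
    return output;
  }

  // Convenience overload: wrap the input, forward, then unwrap the result.
  DataSet Execute(const DataSet& input) const
  {
    MultiBlock wrapped;
    wrapped.Blocks.push_back(input);
    MultiBlock output = this->Execute(wrapped);
    if (output.Blocks.size() > 1)
    {
      throw ErrorFilterExecution("expected at most one output block");
    }
    // "The first block, if any": an empty result becomes an empty DataSet
    // here, which is an assumption of this sketch.
    return output.Blocks.empty() ? DataSet{} : output.Blocks.front();
  }
};
```

With DuplicateBlocks left at false, Execute on a single DataSet returns that block unchanged; with it set to true, the same call throws.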

Subclassing

Typically, one subclasses one of the immediate subclasses of this class such as FilterField, FilterDataSet, FilterDataSetWithField, etc. Those may impose additional constraints on the methods to implement in the subclasses. Here, we describe the things to consider when directly subclassing vtkm::filter::Filter.

PreExecute and PostExecute

Subclasses may provide implementations for either or both of the following methods.

template <typename DerivedPolicy>
void PreExecute(const vtkm::cont::MultiBlock& input,
          const vtkm::filter::PolicyBase<DerivedPolicy>& policy);

template <typename DerivedPolicy>
void PostExecute(const vtkm::cont::MultiBlock& input, vtkm::cont::MultiBlock& output,
          const vtkm::filter::PolicyBase<DerivedPolicy>& policy);

As the names suggest, these are called at the beginning and just before the end of a Filter::Execute call. Filters that don't need to handle multiblock datasets specially, e.g. clip, cut, iso-contour, need not worry about these methods or provide any implementation. If, however, your filter needs to do some initialization, e.g. allocating buffers to accumulate results, or finalization, e.g. reducing results across all blocks, then these methods provide convenient hooks for that.
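As a rough illustration of how the two hooks bracket the per-block work, here is a self-contained histogram sketch. It is not the vtkm::filter::Histogram implementation: Block, Blocks, Counts, HistogramSketch and the free function Execute are hypothetical stand-ins, the policy argument is omitted, and the choice to compute a shared value range in PreExecute is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins: a block is a list of scalar samples and a
// multiblock is a list of blocks. These are not the vtkm::cont types.
using Block = std::vector<double>;
using Blocks = std::vector<Block>;
using Counts = std::vector<std::size_t>; // one count per bin

class HistogramSketch
{
public:
  // numberOfBins is assumed to be at least 1.
  explicit HistogramSketch(std::size_t numberOfBins)
    : NumberOfBins(numberOfBins)
  {
  }

  // Initialization hook: find the value range shared by all blocks so that
  // every block is binned against the same bin edges.
  void PreExecute(const Blocks& input)
  {
    this->Min = std::numeric_limits<double>::max();
    this->Max = std::numeric_limits<double>::lowest();
    for (const Block& block : input)
    {
      for (double value : block)
      {
        this->Min = std::min(this->Min, value);
        this->Max = std::max(this->Max, value);
      }
    }
  }

  // Per-block work: bin a single block using the shared range.
  Counts PrepareForExecution(const Block& block) const
  {
    Counts counts(this->NumberOfBins, 0);
    const double width = (this->Max - this->Min) / static_cast<double>(this->NumberOfBins);
    for (double value : block)
    {
      std::size_t bin = 0;
      if (width > 0.0)
      {
        bin = static_cast<std::size_t>((value - this->Min) / width);
      }
      // Clamp so that value == Max lands in the last bin.
      counts[std::min(bin, this->NumberOfBins - 1)] += 1;
    }
    return counts;
  }

  // Finalization hook: reduce the per-block histograms into a single one.
  void PostExecute(const Blocks&, std::vector<Counts>& output) const
  {
    Counts total(this->NumberOfBins, 0);
    for (const Counts& counts : output)
    {
      for (std::size_t bin = 0; bin < this->NumberOfBins; ++bin)
      {
        total[bin] += counts[bin];
      }
    }
    output.assign(1, total);
  }

private:
  std::size_t NumberOfBins;
  double Min = 0.0;
  double Max = 0.0;
};

// Driver that mimics the order of calls: hook, per-block loop, hook.
std::vector<Counts> Execute(HistogramSketch& filter, const Blocks& input)
{
  filter.PreExecute(input);
  std::vector<Counts> output;
  for (const Block& block : input)
  {
    output.push_back(filter.PrepareForExecution(block));
  }
  filter.PostExecute(input, output);
  return output;
}
```

For two blocks holding {0, 1, 2} and {3, 4} and two bins, the shared range is [0, 4] and the single reduced result holds the counts 2 and 3.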

PrepareForExecution

A concrete subclass of Filter must provide a PrepareForExecution implementation, which holds the filter's data processing logic. There are two signatures available; which one to implement depends on the nature of the filter.

Let's consider simple filters that do not need to do anything special to handle multiblock datasets, e.g. clip, contour, etc. These are the filters where executing the filter on a MultiBlock simply means executing the filter on one block at a time and packing the output of each iteration into the result MultiBlock. For such filters, one must implement the following signature.

template <typename DerivedPolicy>
vtkm::cont::DataSet PrepareForExecution(
        const vtkm::cont::DataSet& input,
        const vtkm::filter::PolicyBase<DerivedPolicy>& policy);

The role of this method is to execute on the input dataset, generate the result, and return it. If there are any errors, the subclass must throw an exception (e.g. vtkm::cont::ErrorFilterExecution).

In this case, the Filter superclass handles iterating over multiple blocks in the input MultiBlock and calling PrepareForExecution iteratively.

The aforementioned approach is also suitable for filters that need special handling for multiblock datasets which can be modelled as PreExecute and PostExecute steps (e.g. vtkm::filter::Histogram).

For more complex filters, such as streamlines or particle tracking, where the processing of multiblock datasets cannot be modelled as a reduction of the results, one can implement the following signature.

template <typename DerivedPolicy>
vtkm::cont::MultiBlock PrepareForExecution(
        const vtkm::cont::MultiBlock& input,
        const vtkm::filter::PolicyBase<DerivedPolicy>& policy);

The responsibility of this method is the same, except that the subclass is now given full control over the execution, including any mapping of fields to the output (described in the next sub-section).
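One way a base class could choose between the two signatures is overload resolution with expression SFINAE. The sketch below is an assumption about how such a dispatch might look, not the mechanism Filter.h actually uses. DataSet, MultiBlock, CallPrepareForExecution, PerBlockFilter and WholeInputFilter are hypothetical names, and the policy argument is left out for brevity.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for vtkm::cont::DataSet and vtkm::cont::MultiBlock.
struct DataSet
{
  std::string Name;
};

struct MultiBlock
{
  std::vector<DataSet> Blocks;
};

// Preferred overload: only viable when Derived provides
// PrepareForExecution(const MultiBlock&). The int parameter makes it the
// better match for the literal 0 passed by Execute below.
template <typename Derived>
auto CallPrepareForExecution(Derived& self, const MultiBlock& input, int)
  -> decltype(self.PrepareForExecution(input))
{
  return self.PrepareForExecution(input);
}

// Fallback overload: iterate over the blocks and call the DataSet signature
// once per block, packing the results into the output MultiBlock.
template <typename Derived>
MultiBlock CallPrepareForExecution(Derived& self, const MultiBlock& input, long)
{
  MultiBlock output;
  for (const DataSet& block : input.Blocks)
  {
    output.Blocks.push_back(self.PrepareForExecution(block));
  }
  return output;
}

template <typename Derived>
MultiBlock Execute(Derived& filter, const MultiBlock& input)
{
  return CallPrepareForExecution(filter, input, 0);
}

// Simple filter: implements only the per-block signature.
struct PerBlockFilter
{
  DataSet PrepareForExecution(const DataSet& input)
  {
    return DataSet{ input.Name + "-filtered" };
  }
};

// Complex filter: takes over the whole multiblock and, in this toy case,
// merges all blocks into one.
struct WholeInputFilter
{
  MultiBlock PrepareForExecution(const MultiBlock& input)
  {
    DataSet merged;
    for (const DataSet& block : input.Blocks)
    {
      merged.Name += block.Name;
    }
    MultiBlock output;
    output.Blocks.push_back(merged);
    return output;
  }
};
```

Calling Execute with a PerBlockFilter produces one output block per input block, while a WholeInputFilter decides the output layout itself.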

MapFieldOntoOutput

Subclasses may provide a MapFieldOntoOutput method with the following signature:

template <typename DerivedPolicy>
VTKM_CONT bool MapFieldOntoOutput(vtkm::cont::DataSet& result,
                                  const vtkm::cont::Field& field,
                                  const vtkm::filter::PolicyBase<DerivedPolicy>& policy);

When present, this method will be called after each block execution to map an input field from the corresponding input block to the output block.
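To make the per-block field mapping concrete, here is a minimal sketch using stand-in types. Field, DataSet, KeepEveryOtherPoint and ExecuteBlock are hypothetical and defined only for this example, the policy argument is omitted, and treating the bool return value as "the field was mapped" is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the vtkm::cont types.
struct Field
{
  std::string Name;
  std::vector<double> Values;
};

struct DataSet
{
  std::vector<Field> Fields;
};

// Toy filter that keeps every other point of its input block.
class KeepEveryOtherPoint
{
public:
  DataSet PrepareForExecution(const DataSet& input)
  {
    // Remember which input points survive so fields can be mapped afterwards.
    this->KeptIndices.clear();
    const std::size_t numberOfPoints =
      input.Fields.empty() ? 0 : input.Fields.front().Values.size();
    for (std::size_t index = 0; index < numberOfPoints; index += 2)
    {
      this->KeptIndices.push_back(index);
    }
    // The result starts without fields; MapFieldOntoOutput adds them.
    return DataSet{};
  }

  // Copy the surviving entries of an input field onto the output block.
  // Returns false when the field does not match the block's point count.
  bool MapFieldOntoOutput(DataSet& result, const Field& field)
  {
    Field mapped;
    mapped.Name = field.Name;
    for (std::size_t index : this->KeptIndices)
    {
      if (index >= field.Values.size())
      {
        return false;
      }
      mapped.Values.push_back(field.Values[index]);
    }
    result.Fields.push_back(mapped);
    return true;
  }

private:
  std::vector<std::size_t> KeptIndices;
};

// What a base class might do for one block: execute, then map each selected
// input field onto the output of that block.
template <typename FilterType>
DataSet ExecuteBlock(FilterType& filter,
                     const DataSet& input,
                     const std::vector<std::string>& fieldsToPass)
{
  DataSet result = filter.PrepareForExecution(input);
  for (const Field& field : input.Fields)
  {
    if (std::find(fieldsToPass.begin(), fieldsToPass.end(), field.Name) != fieldsToPass.end())
    {
      filter.MapFieldOntoOutput(result, field);
    }
  }
  return result;
}
```

Passing only "var1" for a block that carries "var1" and "var2" yields an output block with a single field holding the values at indices 0, 2, 4 and so on.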

Edited by Utkarsh Ayachit
