Tags give the ability to mark specific points in history as being important
  • v1.5.0   1.5.0 is our 6th official release of VTK-m. The major changes to VTK-m from 1.4.0 can be found in: docs/changelog/1.5/release-notes.md

    VTK-m 1.5 Release Notes

    Table of Contents

    1. Core
      • Provide pre-built filters in a vtkm_filter library
      • Provide pre-built sources in a vtkm_source library
      • Add aliases for common vtkm::Vec types
      • Improve WorkletMapTopology worklet names
      • vtkm::cont::DataSet now only contains a single vtkm::cont::CellSet
      • vtkm::cont::MultiBlock renamed to vtkm::cont::PartitionedDataSet
    2. ArrayHandle
      • Add vtkm::cont::ArrayGetValues to retrieve a subset of ArrayHandle values from a device
      • Add vtkm::cont::ArrayHandleMultiplexer
      • Add vtkm::cont::ArrayHandleDecorator
      • Add vtkm::cont::ArrayHandleSOA
      • vtkm::cont::ArrayHandleCast is now writeable
      • Remove ArrayPortalShrink, behavior subsumed by vtkm::cont::ArrayHandleView
    3. Control Environment
      • vtkm::cont::CellSetExplicit refactored to remove redundant array
      • vtkm::cont::CellSets now don't have a name field
      • vtkm::cont::CellLocatorUniformGrid and vtkm::cont::CellLocatorRectilinearGrid support 2D grids
      • vtkm::cont::DataSet queries for CoordinateSystem indices don't throw exceptions
      • vtkm::cont::Fields now don't require the associated vtkm::cont::CellSet name
      • Invoker moved to vtkm::cont
      • Refactor of vtkm::cont::CellSet PrepareForInput signatures
      • Simplify creating vtkm::cont::Fields from vtkm::cont::ArrayHandles
    4. Execution Environment
      • Corrected cell derivatives for polygon cell shape
      • A ScanExtended device algorithm has been added
      • Provide base component queries to vtkm::VecTraits
    5. Worklets and Filters
      • ExecutionSignatures are now optional for simple worklets
      • Refactor topology mappings to clarify meaning
      • Simplify creating results for vtkm::filter::filters
      • Provide a simplified way to state allowed value types for vtkm::filter::filters
      • vtkm::cont::Invoker is now a member of all VTK-m filters
      • vtkm::filter::Filter now doesn't have an active vtkm::cont::CellSet
      • vtkm::filter::FilterField now provides all functionality of vtkm::filter::FilterCell
      • Add ability to get an array from a vtkm::cont::Field for a particular type
      • vtkm::worklet::WorkletPointNeighborhood can query exact neighbor offset locations
      • Add Lagrangian Coherent Structures (LCS) Filter for VTK-m
      • SurfaceNormals filter can now orient normals
      • Particle advection components have better status query support
      • vtkm::filter::Threshold now outputs a vtkm::cont::CellSetExplicit
    6. Build
      • Introduce vtkm_add_target_information cmake function to make using vtk-m easier
    7. Other
      • Simplify examples
      • vtkm::Vec const& operator[] is now constexpr

    Core

    Provide pre-built filters in a vtkm_filter library

    VTK-m now provides pre-built versions of the following filters as part of the vtkm_filter library, used when the filters are executed with the default types.

    • CellAverage
    • CleanGrid
    • ClipWithField
    • ClipWithImplicitFunction
    • Contour
    • ExternalFaces
    • ExtractStructured
    • PointAverage
    • Threshold
    • VectorMagnitude

    The decision to provide a subset of filters as a library was based on balancing the resulting library size against the cross-domain applicability of each filter. The initial set of algorithms was therefore selected by looking at what is commonly used by current VTK-m consuming applications.

    By default types, we mean that no explicit user policy has been passed to the Execute method on these filters. For example, the following will use the pre-built Threshold and CleanGrid filters:

      vtkm::cont::DataSet input = ...;
    
      //convert input to an unstructured grid
      vtkm::filter::CleanGrid clean;
      auto cleaned = clean.Execute(input);
    
      vtkm::filter::Threshold threshold;
      threshold.SetLowerThreshold(60.1);
      threshold.SetUpperThreshold(60.1);
      threshold.SetActiveField("pointvar");
      threshold.SetFieldsToPass("cellvar");
      auto output = threshold.Execute(cleaned);
      ...

    Provide pre-built sources in a vtkm_source library

    A new class hierarchy for dataset sources was added. The intention is to consolidate and refactor the various (procedural) dataset generators used in unit tests, especially the multiple copy-and-pasted implementations of the Tangle field. As these are compiled into a library rather than included as header files, we also expect the overall compile time to decrease.

    The public interface of dataset source is modeled after Filter. A new DataSet is returned by calling the Execute() method of the dataset source, for example:

    vtkm::Id3 dims(4, 4, 4);
    vtkm::source::Tangle tangle(dims);
    vtkm::cont::DataSet dataSet = tangle.Execute();

    Add aliases for common vtkm::Vec types

    Specifying Vec types can be verbose. For example, to simply express a vector in 3-space, you would need a declaration like this:

    vtkm::Vec<vtkm::FloatDefault, 3>

    This is very verbose and frankly confusing to users. To make things easier, we have introduced several aliases for common Vec types. For example, the above type can now be referenced simply with vtkm::Vec3f, which is a 3-vec of floating point values of the default width. If you want to specify the width, then you can use either vtkm::Vec3f_32 or vtkm::Vec3f_64.

    There are likewise types introduced for integers and unsigned integers (e.g. vtkm::Vec3i and vtkm::Vec3ui). You can specify the width of these all the way down to 8 bit (e.g. vtkm::Vec3ui_8, vtkm::Vec3ui_16, vtkm::Vec3ui_32, and vtkm::Vec3ui_64).

    For completeness, vtkm::Id4 was added as well as vtkm::IdComponent2, vtkm::IdComponent3, and vtkm::IdComponent4.

    Improve WorkletMapTopology worklet names

    The convenience implementations of WorkletMapTopology have been renamed for clarity as follows:

    WorkletMapPointToCell --> WorkletVisitCellsWithPoints
    WorkletMapCellToPoint --> WorkletVisitPointsWithCells

    vtkm::cont::DataSet now only contains a single vtkm::cont::CellSet

    Multiple vtkm::cont::CellSets on a dataset increased the complexity of using VTK-m correctly without providing any significant benefits.

    It had the effect that vtkm::cont::Fields representing cell fields needed to be associated with a given CellSet. This association had to be loosely coupled to allow filters to generate new output CellSets, yet at the same time it introduced errors when that output had a different name.

    It raised questions about how filters should propagate cell fields. Should a filter drop all cell fields not associated with the active CellSet, or is that too aggressive given that the algorithm may simply have misnamed the field, or an IO routine may have added a field with the wrong CellSet name?

    It increased the complexity of filters, as the developer needed to determine if the algorithm should support execution on a single CellSet or execution over all CellSets.

    Given these issues it was deemed that removing multiple CellSets was the correct way forward. People using multiple CellSets will need to move over to vtkm::cont::MultiBlock which supports shared points and fields between multiple blocks.

    vtkm::cont::MultiBlock renamed to vtkm::cont::PartitionedDataSet

    The MultiBlock class has been renamed to PartitionedDataSet, and its API has been refactored to refer to "partitions", rather than "blocks". Additionally, the AddBlocks method has been changed to AppendPartitions to more accurately reflect the operation performed. The associated AssignerMultiBlock class has also been renamed to AssignerPartitionedDataSet.

    This change is motivated by the effort to unify VTK-m's data model with VTK's. VTK has started to move away from vtkMultiBlockDataSet, which is a hierarchical tree of nested datasets, to vtkPartitionedDataSet, which is always a flat vector of datasets used to assist geometry distribution in multi-process environments. This simplifies traversal during processing and clarifies the intent of the container: the component datasets are partitions for distribution, not organizational groupings (e.g., materials).

    ArrayHandle

    Add vtkm::cont::ArrayGetValues to retrieve a subset of ArrayHandle values from a device

    An algorithm will often want to pull just a single value (or small subset of values) back from a device to check the results of a computation. Previously, there was no easy way to do this, and algorithm developers would often transfer vast quantities of data back to the host just to check a single value.

    The new vtkm::cont::ArrayGetValue and vtkm::cont::ArrayGetValues functions simplify this operation and provide a method to retrieve just a portion of an array.

    This utility provides several convenient overloads:

    A single id may be passed into ArrayGetValue, or multiple ids may be specified to ArrayGetValues as an ArrayHandle<vtkm::Id>, a std::vector<vtkm::Id>, a C array (pointer and size), or as a brace-enclosed initializer list.

    The single result from ArrayGetValue may be returned or written to an output argument. Multiple results from ArrayGetValues may be returned as an std::vector, or written to an output argument as an ArrayHandle or a std::vector.

    Examples:

    vtkm::cont::ArrayHandle<T> data = ...;
    
    // Fetch the first value in an array handle:
    T firstVal = vtkm::cont::ArrayGetValue(0, data);
    
    // Fetch the first and third values in an array handle:
    std::vector<T> firstAndThird = vtkm::cont::ArrayGetValues({0, 2}, data);
    
    // Fetch the first and last values in an array handle:
    std::vector<T> firstAndLast =
        vtkm::cont::ArrayGetValues({0, data.GetNumberOfValues() - 1}, data);
    
    // Fetch the first 4 values into an array handle:
    const std::vector<vtkm::Id> ids{0, 1, 2, 3};
    vtkm::cont::ArrayHandle<T> firstFour;
    vtkm::cont::ArrayGetValues(ids, data, firstFour);

    Add vtkm::cont::ArrayHandleMultiplexer

    vtkm::cont::ArrayHandleMultiplexer is a fancy ArrayHandle that can mimic being any one of a list of other ArrayHandles. When declared, a list of ArrayHandles is given to ArrayHandleMultiplexer. To use the ArrayHandleMultiplexer, it is set to an instance of one of these other ArrayHandles. Thus, once you compile code to use an ArrayHandleMultiplexer, you can select any of the types it supports at runtime.

    The intention is to convert the data from a vtkm::cont::VariantArrayHandle to a vtkm::cont::ArrayHandleMultiplexer of some known types. The ArrayHandleMultiplexer can be compiled statically (that is, no virtual methods are needed). Although the compiler must generate all possible implementations of the multiplexer, two or more ArrayHandleMultiplexers can be used together without having to compile every possible combination of all of them.

    Motivation

    ArrayHandle is a very flexible templated class that allows us to use the compiler to adapt our code to pretty much any type of memory layout or on-line processing. Unfortunately, the template approach requires the code to know the exact type during compile time.

    That is a problem when retrieving data from a vtkm::cont::VariantArrayHandle, which is the case, for example, when getting data from a vtkm::cont::DataSet. The actual type of the array stored in a vtkm::cont::VariantArrayHandle is generally not known at compile time at the code location where the data is pulled.

    Our first approach to this problem was to use template metaprogramming to iterate over all possible types in the VariantArrayHandle. Although this works, it means that if two or more VariantArrayHandles are dispatched in a function call, the compiler needs to generate all possible combinations of the two. This causes long compile times and large executable sizes. It has led us to limit the number of types we support, which causes problems with unsupported arrays.

    Our second approach to this problem was to create ArrayHandleVirtual to hide the array type behind a virtual method. This works very well, but is causing us significant problems on accelerators. Although virtual methods are supported by CUDA, there are numerous problems that can come up with the compiled code (such as unknown stack depths or virtual methods extending across libraries). It is also unknown what problems we will encounter with other accelerator architectures.

    ArrayHandleMultiplexer is meant to be a compromise between these two approaches. Although we are still using template metaprogramming tricks to iterate over multiple implementations, this compiler looping is localized to the code that looks up values in the array. Thus, it is a small amount of code that needs to be created for each version supported by the ArrayHandle. Also, the code paths can be created independently for each ArrayHandleMultiplexer, so you do not get into the problem of a combinatorial explosion of types that need to be addressed.

    Although ArrayHandleMultiplexer still has the problem of being unable to store a type that is not explicitly listed, the localized expression should allow us to support many types. By default, we are adding lots of ArrayHandleCasts to the list of supported types. The intention of this is to allow a filter to specify a value type it operates on and then cast everything to that type. This further allows us to reduce the combinations of types that we have to support.

    Use

    The ArrayHandleMultiplexer templated class takes a variable number of template parameters. All the template parameters are expected to be types of ArrayHandles that the ArrayHandleMultiplexer can assume.

    For example, let's say we have a use case where we need an array of indices. Normally, the indices are sequential (0, 1, 2,...), but sometimes we need to define a custom set of indices. When the indices are sequential, then an ArrayHandleIndex is the best representation. Normally if you also need to support general arrays you would first have to deep copy the indices into a physical array. However, with an ArrayHandleMultiplexer you can support both.

    vtkm::cont::ArrayHandleMultiplexer<vtkm::cont::ArrayHandleIndex,
                                       vtkm::cont::ArrayHandle<vtkm::Id>> indices;
    indices = vtkm::cont::ArrayHandleIndex(ARRAY_SIZE);

    indices can now be used like any other ArrayHandle, but for now it behaves like an ArrayHandleIndex. That is, it takes (almost) no actual space. But if you need to use explicit indices, you can set the indices array to an actual array of indices:

    vtkm::cont::ArrayHandle<vtkm::Id> indicesInMemory;
    // Fill indicesInMemory...
    
    indices = indicesInMemory;

    All the code that uses indices will continue to work.

    Variant

    To implement ArrayHandleMultiplexer, the class vtkm::internal::Variant was introduced. Although this is an internal class that is not exposed through the array handle, it is worth documenting its addition, as it will be useful for implementing other multiplexing types of objects (such as cell sets and locators).

    vtkm::internal::Variant is a simplified version of C++17's std::variant or boost's variant. One of the significant differences between VTK-m's Variant and these other versions is that VTK-m's version does not throw exceptions on error. Instead, the behavior becomes undefined. This is intentional: not all platforms support exceptions, and even on those that do, merely enabling them can have consequences.

    Like the aforementioned classes that vtkm::internal::Variant is based on, it behaves much like a union of a set of types. Those types are listed as the Variant's template parameters. The Variant can be set to any one of these types either through construction or assignment. You can also use the Emplace method to construct the object in a Variant.

    vtkm::internal::Variant<int, float, std::string> variant(5);
    // variant is now an int.
    
    variant = 5.0f;
    // variant is now a float.
    
    variant.Emplace<std::string>("Hello world");
    // variant is now an std::string.

    The Variant maintains the index of which type it is holding. It has several helpful items to manage the type and index of contained objects:

    • GetIndex(): A method to retrieve the template parameter index of the type currently held. In the previous example, the index starts at 0, becomes 1, then becomes 2.
    • GetIndexOf<T>(): A static method that returns a constexpr of the index of a given type. In the previous example, variant.GetIndexOf<float>() would return 1.
    • Get<T or I>(): Given a type, returns the contained object as that type. Given a number, returns the contained object as a type of the corresponding index. In the previous example, either variant.Get<1>() or variant.Get<float>() would return the float value. The behavior is undefined if the object is not the requested type.
    • IsValid(): A method that can be used to determine whether the Variant holds an object that can be operated on.
    • Reset(): A method to remove any contained object and restore to an invalid state.

    Finally, Variant contains a CastAndCall method. This method takes a functor followed by a list of optional arguments. The contained object is cast to the appropriate type and the functor is called with the cast object followed by the provided arguments. If the functor returns a value, that value is returned by CastAndCall.

    CastAndCall is an important functionality that makes it easy to wrap multiplexer objects around a Variant. For example, here is how you could implement executing the Value method in an implicit function multiplexer.

    class ImplicitFunctionMultiplexer
    {
      vtkm::internal::Variant<Box, Plane, Sphere> ImplicitFunctionVariant;

      // ...

      struct ValueFunctor
      {
        template <typename ImplicitFunctionType>
        vtkm::FloatDefault operator()(const ImplicitFunctionType& implicitFunction,
                                      const vtkm::Vec<vtkm::FloatDefault, 3>& point)
        {
          return implicitFunction.Value(point);
        }
      };

      vtkm::FloatDefault Value(const vtkm::Vec<vtkm::FloatDefault, 3>& point) const
      {
        return this->ImplicitFunctionVariant.CastAndCall(ValueFunctor{}, point);
      }
    };

    Add vtkm::cont::ArrayHandleDecorator

    ArrayHandleDecorator is given a DecoratorImpl class and a list of one or more source ArrayHandles. There are no restrictions on the size or type of the source ArrayHandles.

    The decorator implementation class is described below:

    struct ExampleDecoratorImplementation
    {
    
      // Takes one portal for each source array handle (only two shown).
      // Returns a functor that defines:
      //
      // ValueType operator()(vtkm::Id id) const;
      //
      // which takes an index and returns a value which should be produced by
      // the source arrays somehow. This ValueType will be the ValueType of the
      // ArrayHandleDecorator.
      //
      // Both SomeFunctor::operator() and CreateFunctor must be const.
      //
      template <typename Portal1Type, typename Portal2Type>
      SomeFunctor CreateFunctor(Portal1Type portal1, Portal2Type portal2) const;
    
      // Takes one portal for each source array handle (only two shown).
      // Returns a functor that defines:
      //
      // void operator()(vtkm::Id id, ValueType val) const;
      //
      // which takes an index and a value, which should be used to modify one
      // or more of the source arrays.
      //
      // CreateInverseFunctor is optional; if not provided, the
      // ArrayHandleDecorator will be read-only. In addition, if all of the
      // source ArrayHandles are read-only, the inverse functor will not be used
      // and the ArrayHandleDecorator will be read only.
      //
      // Both SomeInverseFunctor::operator() and CreateInverseFunctor must be
      // const.
      //
      template <typename Portal1Type, typename Portal2Type>
      SomeInverseFunctor CreateInverseFunctor(Portal1Type portal1,
                                              Portal2Type portal2) const;
    
    };

    Add vtkm::cont::ArrayHandleSOA

    ArrayHandleSOA behaves like a regular ArrayHandle (with a basic storage) except that if you specify a ValueType of a Vec or a Vec-like, it will actually store each component in a separate physical array. When data are retrieved from the array, they are reconstructed into Vec objects as expected.

    The intention of this array type is to help cover the most common ways data is laid out in memory. Typically, arrays of data are either an "array of structures," like the basic storage, where you have a single array of structures (like Vec), or a "structure of arrays," where you have an array of a basic type (like float) for each component of the data being represented. ArrayHandleSOA makes it easy to cover this second case without creating special types.

    ArrayHandleSOA can be constructed from a collection of ArrayHandle with basic storage. This allows you to construct Vec arrays from components without deep copies.

    std::vector<vtkm::Float32> accel0;
    std::vector<vtkm::Float32> accel1;
    std::vector<vtkm::Float32> accel2;
    
    // Let's say accel arrays are set to some field of acceleration vectors by
    // some other software.
    
    vtkm::cont::ArrayHandle<vtkm::Float32> accelHandle0 = vtkm::cont::make_ArrayHandle(accel0);
    vtkm::cont::ArrayHandle<vtkm::Float32> accelHandle1 = vtkm::cont::make_ArrayHandle(accel1);
    vtkm::cont::ArrayHandle<vtkm::Float32> accelHandle2 = vtkm::cont::make_ArrayHandle(accel2);
    
    vtkm::cont::ArrayHandleSOA<vtkm::Vec3f_32> accel = { accelHandle0, accelHandle1, accelHandle2 };

    Also provided are constructors and versions of make_ArrayHandleSOA that take std::vector or C arrays as either initializer lists or variable arguments.

    std::vector<vtkm::Float32> accel0;
    std::vector<vtkm::Float32> accel1;
    std::vector<vtkm::Float32> accel2;
    
    // Let's say accel arrays are set to some field of acceleration vectors by
    // some other software.
    
    vtkm::cont::ArrayHandleSOA<vtkm::Vec3f_32> accel = { accel0, accel1, accel2 };

    However, setting arrays is a little awkward because you also have to specify the length. This is done either outside the initializer list or as the first argument.

    vtkm::cont::make_ArrayHandleSOA({ array0, array1, array2 }, ARRAY_SIZE);
    vtkm::cont::make_ArrayHandleSOA(ARRAY_SIZE, array0, array1, array2);

    vtkm::cont::ArrayHandleCast is now writeable

    Previously, ArrayHandleCast was considered a read-only array handle. However, it is trivial to reverse the cast (now that ArrayHandleTransform supports an inverse transform). So now you can write to a cast array (assuming the underlying array is writable).

    One trivial consequence of this change is that you can no longer make a cast that cannot be reversed. For example, it was possible to cast a simple scalar to a Vec even though it is not possible to convert a Vec to a scalar value. This was of dubious correctness (it is more of a construction than a cast) and is easy to recreate with ArrayHandleTransform.

    Remove ArrayPortalShrink, behavior subsumed by vtkm::cont::ArrayHandleView

    ArrayPortalShrink originally allowed a user to pass in a delegate array portal and then shrink the reported array size without actually modifying the underlying allocation. An iterator was also provided that would correctly iterate over the shrunken size of the stored array.

    Instead of directly shrinking the original array, it is preferred to create an ArrayHandleView from an ArrayHandle and then specify the number of values to use in the ArrayHandleView constructor.

    Control Environment

    vtkm::cont::CellSetExplicit refactored to remove redundant array

    The CellSetExplicit class has been refactored to remove the NumIndices array. This information is now derived from the Offsets array, which has been changed to contain [numCells + 1] entries.

    Old Layout:
    -----------
    NumIndices:   [  2,  4,  3,  3,  2 ]
    IndexOffset:  [  0,  2,  6,  9, 12 ]
    Connectivity: [  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13 ]
    
    New Layout:
    -----------
    Offsets:      [  0,  2,  6,  9, 12, 14 ]
    Connectivity: [  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13 ]

    This will reduce the memory overhead of the cellset by roughly [numCells * 4] bytes.

    The IndexOffset array / typedefs / methods have been renamed to Offsets for brevity and consistency (similar members were plural, e.g. Shapes).

    The NumIndices array can be recovered from the Offsets array by using an ArrayHandleDecorator. This is done automatically by the CellSetExplicit::GetNumIndicesArray method.

    The CellSetExplicit::Fill signature has changed to remove numIndices as a parameter and to require the offsets array as a non-optional argument. To assist in porting old code, an offsets array can be generated from numIndices using the new vtkm::cont::ConvertNumIndicesToOffsets methods, defined in CellSetExplicit.h.

    vtkm::Id numPoints = ...;
    auto cellShapes = ...;
    auto numIndices = ...;
    auto connectivity = ...;
    vtkm::cont::CellSetExplicit<> cellSet = ...;
    
    // Old:
    cellSet.Fill(numPoints, cellShapes, numIndices, connectivity);
    
    // New:
    auto offsets = vtkm::cont::ConvertNumIndicesToOffsets(numIndices);
    cellSet.Fill(numPoints, cellShapes, connectivity, offsets);

    Since the offsets array now contains an additional offset at the end, it cannot be used directly with ArrayHandleGroupVecVariable and the cellset's connectivity array to create an array handle containing cell definitions. This now requires an ArrayHandleView to trim the last value off of the array:

    vtkm::cont::CellSetExplicit<> cellSet = ...;
    auto offsets = cellSet.GetOffsetsArray(vtkm::TopologyElementTagCell{},
                                           vtkm::TopologyElementTagPoint{});
    auto conn = cellSet.GetConnectivityArray(vtkm::TopologyElementTagCell{},
                                             vtkm::TopologyElementTagPoint{});
    
    // Old:
    auto cells = vtkm::cont::make_ArrayHandleGroupVecVariable(conn, offsets);
    
    // New:
    const vtkm::Id numCells = offsets.GetNumberOfValues() - 1;
    auto offsetsTrim = vtkm::cont::make_ArrayHandleView(offsets, 0, numCells);
    auto cells = vtkm::cont::make_ArrayHandleGroupVecVariable(conn, offsetsTrim);

    vtkm::cont::CellSets now don't have a name field

    The requirement that vtkm::cont::CellSets have a name existed so that cell-based vtkm::cont::Fields could be associated with the correct CellSet in a vtkm::cont::DataSet.

    Now that vtkm::cont::DataSets don't support multiple CellSets, we can remove the CellSet name member variable.

    vtkm::cont::CellLocatorUniformGrid and vtkm::cont::CellLocatorRectilinearGrid support 2D grids

    VTK-m will now allow locating containing cells for a point using CellLocatorUniformGrid and CellLocatorRectilinearGrid for 2D grids.

    Users are required to create the locator objects as they normally would. However, the FindCell method in vtkm::exec::CellLocator still requires users to pass a 3D point as an input.

    Further, the structured grid locators no longer use the vtkm::exec::WorldToParametricCoordinates method to return parametric coordinates; instead, they use fast paths for locating points in a cell of an axis-aligned grid.

    Another change for the CellLocatorRectilinearGrid is that now it uses binary search on individual component arrays to search for a point.

    vtkm::cont::DataSet queries for CoordinateSystem indices don't throw exceptions

    Asking for the index of a vtkm::cont::CoordinateSystem by name now returns a -1 when no matching item has been found instead of throwing an exception.

    This was done to make the interface of vtkm::cont::DataSet follow the guideline "Only unrepresentable things should raise exceptions". The index of a non-existent item is representable by -1, and therefore we shouldn't throw. Likewise, the methods that return references can still throw exceptions, as you can't have a reference to a non-existent item.

    vtkm::cont::Fields now don't require the associated vtkm::cont::CellSet name

    Now that a vtkm::cont::DataSet can only have a single vtkm::cont::CellSet, the requirement that cell-based vtkm::cont::Fields need a CellSet name has been lifted.

    Invoker moved to vtkm::cont

    Previously, Invoker was located in the vtkm::worklet namespace to convey that it was a replacement for using vtkm::worklet::Dispatcher*. In actuality, it belongs in vtkm::cont, as it is the proper way to launch worklets for execution, and it shouldn't exist inside the worklet namespace.

    Refactor of vtkm::cont::CellSet PrepareForInput signatures

    The From and To nomenclature for topology mapping has been confusing for both users and developers, especially at lower levels where the intention of mapping attributes from one element to another is easily conflated with the concept of mapping indices (which maps in the exact opposite direction).

    These identifiers have been renamed to VisitTopology and IncidentTopology to clarify the direction of the mapping. The order in which these template parameters are specified for PrepareForInput have also been reversed, since eventually there may be more than one IncidentTopology, and having IncidentTopology at the end will allow us to replace it with a variadic template parameter pack in the future.

    Simplify creating vtkm::cont::Fields from vtkm::cont::ArrayHandles

    VTK-m now offers make_FieldPoint and make_FieldCell functions that reduce the complexity of constructing vtkm::cont::Fields from vtkm::cont::ArrayHandles.

    Previously, to construct point and cell fields you would do:

    vtkm::cont::ArrayHandle<int> pointHandle;
    vtkm::cont::ArrayHandle<int> cellHandle;
    vtkm::cont::Field pointField("p", vtkm::cont::Field::Association::POINTS, pointHandle);
    vtkm::cont::Field cellField("c", vtkm::cont::Field::Association::CELL_SET, "cells", cellHandle);

    Now with the new make_ functions you can do:

    vtkm::cont::ArrayHandle<int> pointHandle;
    vtkm::cont::ArrayHandle<int> cellHandle;
    auto pointField = vtkm::cont::make_FieldPoint("p", pointHandle);
    auto cellField = vtkm::cont::make_FieldCell("c", "cells", cellHandle);

    Execution Environment

    Corrected cell derivatives for polygon cell shape

    For polygon cell shapes (that are not triangles or quadrilaterals), interpolations are done by finding the center point and creating a triangle fan around that point. Previously, the gradient was computed in the same way as interpolation: identifying the correct triangle and computing the gradient for that triangle.

    The problem with that approach is that it makes the gradient discontinuous at the boundaries of this implicit triangle fan. To make things worse, this discontinuity happens right at each vertex, where gradient calculations happen frequently. This means that when you ask for the gradient at a vertex, you might get wildly different answers based on floating-point imprecision.

    We get around this problem by creating a small triangle around the point in question, interpolating values to that triangle, and using that for the gradient. This makes for a smoother gradient transition across these internal boundaries.

    A ScanExtended device algorithm has been added

    This new scan algorithm produces an array that contains both an inclusive scan and an exclusive scan in the same array:

    #include <vtkm/cont/Algorithm.h>
    #include <vtkm/cont/ArrayGetValue.h>
    #include <vtkm/cont/ArrayHandle.h>
    #include <vtkm/cont/ArrayHandleView.h>
    
    vtkm::cont::ArrayHandle<T> inputData = ...;
    const vtkm::Id size = inputData.GetNumberOfValues();
    
    vtkm::cont::ArrayHandle<T> extendedScan;
    vtkm::cont::Algorithm::ScanExtended(inputData, extendedScan);
    
    // The exclusive scan is the `size` values starting at index 0:
    auto exclusiveScan = vtkm::cont::make_ArrayHandleView(extendedScan, 0, size);
    
    // The inclusive scan is the `size` values starting at index 1:
    auto inclusiveScan = vtkm::cont::make_ArrayHandleView(extendedScan, 1, size);
    
    // The total sum of the input data is the last value in the extended scan.
    const T totalSum = vtkm::cont::ArrayGetValue(size, extendedScan);

    This can also be thought of as an exclusive scan that appends the total sum, rather than returning it.
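A plain C++ sketch of these semantics (not the VTK-m implementation; the function name is illustrative):

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Plain C++ sketch of the ScanExtended semantics (not the VTK-m
// implementation). The result has size + 1 entries: the exclusive scan
// at indices [0, size), the inclusive scan at indices [1, size], and
// the total sum as the last value.
std::vector<int> ScanExtendedSketch(const std::vector<int>& input)
{
  std::vector<int> result(input.size() + 1, 0);
  // Writing the inclusive scan starting at index 1 yields both scans.
  std::partial_sum(input.begin(), input.end(), result.begin() + 1);
  return result;
}
```

For input {1, 2, 3, 4} this produces {0, 1, 3, 6, 10}: the exclusive scan, the inclusive scan shifted by one, and the total sum 10 at the end.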

    Provide base component queries to vtkm::VecTraits

    This change adds a recursive BaseComponentType to VecTraits that recursively finds the base (non-Vec) type of a Vec. This is useful when dealing with potentially nested Vecs (e.g. Vec<Vec<T, M>, N>) and you want to know the precision of the math being defined.

    using NestedVec = vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 8>;
    
    // ComponentType becomes vtkm::Vec<vtkm::Float32, 3>
    using ComponentType = typename vtkm::VecTraits<NestedVec>::ComponentType;
    
    // BaseComponentType becomes vtkm::Float32
    using BaseComponentType = typename vtkm::VecTraits<NestedVec>::BaseComponentType;

    Also added is the ability in VecTraits to change the component type of a vector. The template ReplaceComponentType resolves to a Vec of the same size with the component replaced with a new type. The template ReplaceBaseComponentType traverses down a nested type and replaces the base type.

    using NestedVec = vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 8>;
    
    // NewVec1 becomes vtkm::Vec<vtkm::Float64, 8>
    using NewVec1 =
      typename vtkm::VecTraits<NestedVec>::template ReplaceComponentType<vtkm::Float64>;
    
    // NewVec2 becomes vtkm::Vec<vtkm::Vec<vtkm::Float64, 3>, 8>
    using NewVec2 =
      typename vtkm::VecTraits<NestedVec>::template ReplaceBaseComponentType<vtkm::Float64>;

    This functionality replaces the functionality in vtkm::BaseComponent. Unfortunately, vtkm::BaseComponent did not have the ability to replace the base component and there was no straightforward way to implement that outside of VecTraits.
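
The recursive lookup and replacement can be mirrored with ordinary template specialization. This is a plain C++ sketch of the idea (toy types, not the VTK-m implementation):

```cpp
#include <type_traits>

// Toy fixed-size vector standing in for vtkm::Vec.
template <typename T, int N>
struct VecSketch { T Data[N]; };

// Primary template: a scalar is its own base component.
template <typename T>
struct VecTraitsSketch
{
  using BaseComponentType = T;
  template <typename NewT> using ReplaceBaseComponentType = NewT;
};

// Specialization that recurses through nested vectors.
template <typename T, int N>
struct VecTraitsSketch<VecSketch<T, N>>
{
  using BaseComponentType = typename VecTraitsSketch<T>::BaseComponentType;
  template <typename NewT>
  using ReplaceBaseComponentType =
    VecSketch<typename VecTraitsSketch<T>::template ReplaceBaseComponentType<NewT>, N>;
};

using NestedVec = VecSketch<VecSketch<float, 3>, 8>;
static_assert(
  std::is_same<VecTraitsSketch<NestedVec>::BaseComponentType, float>::value,
  "recursion reaches the scalar type");
static_assert(
  std::is_same<VecTraitsSketch<NestedVec>::ReplaceBaseComponentType<double>,
               VecSketch<VecSketch<double, 3>, 8>>::value,
  "replacement preserves the nesting");
```

The recursion bottoms out at the primary template, which is why any depth of nesting resolves to the scalar type.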

    Worklets and Filters

    ExecutionSignatures are now optional for simple worklets

    If a worklet doesn't explicitly state an ExecutionSignature, VTK-m assumes the worklet has no return value, and each ControlSignature argument is passed to the worklet in the same order.

    For example if we had this worklet:

    struct DotProduct : public vtkm::worklet::WorkletMapField
    {
      using ControlSignature = void(FieldIn, FieldIn, FieldOut);
      using ExecutionSignature = void(_1, _2, _3);
    
      template <typename T, vtkm::IdComponent Size>
      VTKM_EXEC void operator()(const vtkm::Vec<T, Size>& v1,
                                const vtkm::Vec<T, Size>& v2,
                                T& outValue) const
      {
        outValue = vtkm::Dot(v1, v2);
      }
    };

    It can be simplified to be:

    struct DotProduct : public vtkm::worklet::WorkletMapField
    {
      using ControlSignature = void(FieldIn, FieldIn, FieldOut);
    
      template <typename T, vtkm::IdComponent Size>
      VTKM_EXEC void operator()(const vtkm::Vec<T, Size>& v1,
                                const vtkm::Vec<T, Size>& v2,
                                T& outValue) const
      {
        outValue = vtkm::Dot(v1, v2);
      }
    };

    Refactor topology mappings to clarify meaning

    The From and To nomenclature for topology mapping has been confusing for both users and developers, especially at lower levels where the intention of mapping attributes from one element to another is easily conflated with the concept of mapping indices (which maps in the exact opposite direction).

    These identifiers have been renamed to VisitTopology and IncidentTopology to clarify the direction of the mapping. The order in which these template parameters are specified for WorkletMapTopology have also been reversed, since eventually there may be more than one IncidentTopology, and having IncidentTopology at the end will allow us to replace it with a variadic template parameter pack in the future.

    Other implementation details supporting these worklets, including Fetch tags, Connectivity classes, and methods on the various CellSet classes (such as PrepareForInput), have also reversed their template arguments.

    Provide a simplified way to state allowed value types for vtkm::filter::filters

    Previously, VTK-m filters used a specialization of vtkm::filter::FilterTraits<> to control the acceptable input value types. For example, if the WarpVector filter wanted to allow only vtkm::Vec3f_32 and vtkm::Vec3f_64, it would use:

    namespace vtkm { namespace filter {
    template <>
    class FilterTraits<WarpVector>
    {
    public:
      // WarpVector only applies to Float and Double Vec3 arrays
      using InputFieldTypeList = vtkm::TypeListTagFieldVec3;
    };
    }}

    This increases the complexity of writing filters. To make this easier, VTK-m now looks for a SupportedTypes type alias on the filter when a vtkm::filter::FilterTraits specialization doesn't exist. This allows filters to succinctly specify their supported types, as seen below for the WarpVector filter.

    class WarpVector : public vtkm::filter::FilterField<WarpVector>
    {
    public:
      using SupportedTypes = vtkm::TypeListTagFieldVec3;
    ...
    };

    vtkm::cont::Invoker is now a member of all VTK-m filters
    
    To simplify how VTK-m filters are written, each vtkm::filter now has a vtkm::cont::Invoker as a member variable. The goal with this change is to provide a uniform API for launching all worklets from within a filter.

    Let's consider the PointElevation filter. Prior to these changes, DoExecute would need to construct the correct dispatcher with the correct parameters, as seen below:

    template <typename T, typename StorageType, typename DerivedPolicy>
    inline VTKM_CONT vtkm::cont::DataSet PointElevation::DoExecute(
      const vtkm::cont::DataSet& inDataSet,
      const vtkm::cont::ArrayHandle<T, StorageType>& field,
      const vtkm::filter::FieldMetadata& fieldMetadata,
      vtkm::filter::PolicyBase<DerivedPolicy>)
    {
      vtkm::cont::ArrayHandle<vtkm::Float64> outArray;
    
      vtkm::worklet::DispatcherMapField<vtkm::worklet::PointElevation> dispatcher(this->Worklet);
      dispatcher.Invoke(field, outArray);
      ...
    }

    With these changes the filter can instead use this->Invoke and have the correct dispatcher executed. This makes it easier to teach and learn how to write new filters.

    template <typename T, typename StorageType, typename DerivedPolicy>
    inline VTKM_CONT vtkm::cont::DataSet PointElevation::DoExecute(
      const vtkm::cont::DataSet& inDataSet,
      const vtkm::cont::ArrayHandle<T, StorageType>& field,
      const vtkm::filter::FieldMetadata& fieldMetadata,
      vtkm::filter::PolicyBase<DerivedPolicy>)
    {
      vtkm::cont::ArrayHandle<vtkm::Float64> outArray;
    
      this->Invoke(this->Worklet, field, outArray);
      ...
    }

    Simplify creating results for vtkm::filter::filters

    As part of the process of making VTK-m filters easier to write for newcomers, we have made a couple of changes that make constructing the output vtkm::cont::DataSet easier.

    First, we have moved the CreateResult functions out of the internals namespace and directly into vtkm::filter. This makes it clearer to developers that this is the proper way to construct the output DataSet.

    Second, we have streamlined the collection of vtkm::filter::CreateResult methods to require the user to provide less information, and given them clearer names explaining what they do.

    To construct output identical to the input but with a new field, you now just pass the vtkm::filter::FieldMetadata as a parameter instead of explicitly stating the field association and the possible cell set name:

    return CreateResult(input, newField, name, fieldMetadata);

    To construct output identical to the input but with a cell field added, you can now pass the vtkm::cont::CellSet as a parameter instead of explicitly stating the field association and the cell set name:

    return CreateResultFieldCell(input, newCellField, name, cellset);

    vtkm::filter::Filters now don't have an active CellSet

    vtkm::filter::FilterField has removed the concept of ActiveCellSetIndex. This has been done as vtkm::cont::DataSet now only contains a single vtkm::cont::CellSet.

    vtkm::filter::FilterField now provides all functionality of vtkm::filter::FilterCell

    FilterCell was a subclass of vtkm::filter::FilterField and behaved essentially the same, but provided the pair of methods SetActiveCellSetIndex and GetActiveCellSetIndex. It was a common misconception that FilterCell was meant for cell-based algorithms rather than algorithms that required access to the active vtkm::cont::CellSet.

    By moving SetActiveCellSetIndex and GetActiveCellSetIndex to FilterField, we remove this confusion.

    vtkm::worklet::WorkletPointNeighborhood can query exact neighbor offset locations

    Add ability to test exact neighbor offset locations in BoundaryState.

    The following methods:

    BoundaryState::InXBoundary
    BoundaryState::InYBoundary
    BoundaryState::InZBoundary
    BoundaryState::InBoundary

    have been renamed to:

    BoundaryState::IsRadiusInXBoundary
    BoundaryState::IsRadiusInYBoundary
    BoundaryState::IsRadiusInZBoundary
    BoundaryState::IsRadiusInBoundary

    to distinguish them from the new methods:

    BoundaryState::IsNeighborInXBoundary
    BoundaryState::IsNeighborInYBoundary
    BoundaryState::IsNeighborInZBoundary
    BoundaryState::IsNeighborInBoundary

    which check a specific neighbor sample offset instead of a full radius.

    The method BoundaryState::ClampNeighborIndex has also been added, which clamps a 3D neighbor offset vector to the dataset boundaries.

    This allows iteration through only the valid points in a neighborhood using either of the following patterns:

    Using ClampNeighborIndex to restrict the iteration space:

    struct MyWorklet : public vtkm::worklet::WorkletPointNeighborhood
    {
    public:
      using ControlSignature = void(CellSetIn, FieldInNeighborhood, FieldOut);
      using ExecutionSignature = void(_2, Boundary, _3);
    
      template <typename InNeighborhoodT, typename OutDataT>
      VTKM_EXEC void operator()(const InNeighborhoodT& inData,
                                const vtkm::exec::BoundaryState &boundary,
                                OutDataT& outData) const
      {
        // Clamp the radius to the dataset bounds (discard out-of-bounds points).
        const auto minRadius = boundary.ClampNeighborIndex({-10, -10, -10});
        const auto maxRadius = boundary.ClampNeighborIndex({10, 10, 10});
    
        for (vtkm::IdComponent k = minRadius[2]; k <= maxRadius[2]; ++k)
        {
          for (vtkm::IdComponent j = minRadius[1]; j <= maxRadius[1]; ++j)
          {
            for (vtkm::IdComponent i = minRadius[0]; i <= maxRadius[0]; ++i)
            {
              outData = doSomeConvolution(i, j, k, outData, inData.Get(i, j, k));
            }
          }
        }
      }
    };

    or, using IsNeighborInBoundary methods to skip out-of-bounds loops:

    struct MyWorklet : public vtkm::worklet::WorkletPointNeighborhood
    {
    public:
      using ControlSignature = void(CellSetIn, FieldInNeighborhood, FieldOut);
      using ExecutionSignature = void(_2, Boundary, _3);
    
      template <typename InNeighborhoodT, typename OutDataT>
      VTKM_EXEC void operator()(const InNeighborhoodT& inData,
                                const vtkm::exec::BoundaryState &boundary,
                                OutDataT& outData) const
      {
        for (vtkm::IdComponent k = -10; k <= 10; ++k)
        {
          if (!boundary.IsNeighborInZBoundary(k))
          {
            continue;
          }
    
          for (vtkm::IdComponent j = -10; j <= 10; ++j)
          {
            if (!boundary.IsNeighborInYBoundary(j))
            {
              continue;
            }
    
            for (vtkm::IdComponent i = -10; i <= 10; ++i)
            {
              if (!boundary.IsNeighborInXBoundary(i))
              {
                continue;
              }
    
              outData = doSomeConvolution(i, j, k, outData, inData.Get(i, j, k));
            }
          }
        }
      }
    };

    The latter is useful for implementing a convolution that substitutes a constant value for out-of-bounds indices:

    struct MyWorklet : public vtkm::worklet::WorkletPointNeighborhood
    {
    public:
      using ControlSignature = void(CellSetIn, FieldInNeighborhood, FieldOut);
      using ExecutionSignature = void(_2, Boundary, _3);
    
      template <typename InNeighborhoodT, typename OutDataT>
      VTKM_EXEC void operator()(const InNeighborhoodT& inData,
                                const vtkm::exec::BoundaryState &boundary,
                                OutDataT& outData) const
      {
        for (vtkm::IdComponent k = -10; k <= 10; ++k)
        {
          for (vtkm::IdComponent j = -10; j <= 10; ++j)
          {
            for (vtkm::IdComponent i = -10; i <= 10; ++i)
            {
              if (boundary.IsNeighborInBoundary({i, j, k}))
              {
                outData = doSomeConvolution(i, j, k, outData, inData.Get(i, j, k));
              }
              else
              { // substitute zero for out-of-bounds samples:
                outData = doSomeConvolution(i, j, k, outData, 0);
              }
            }
          }
        }
      }
    };
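
The clamping that ClampNeighborIndex performs can be sketched in plain C++ (illustrative names, not the VTK-m API): each component of the offset is limited so that the sampled index stays within the dataset dimensions.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Clamp a 3D neighbor offset so that pointIndex + offset stays inside
// [0, dims) in every dimension. A sketch of the ClampNeighborIndex
// idea; illustrative names, not the VTK-m API.
std::array<int, 3> ClampNeighborOffset(const std::array<int, 3>& offset,
                                       const std::array<int, 3>& pointIndex,
                                       const std::array<int, 3>& dims)
{
  std::array<int, 3> clamped;
  for (int d = 0; d < 3; ++d)
  {
    // Valid offsets in dimension d: [-pointIndex[d], dims[d] - 1 - pointIndex[d]]
    clamped[d] = std::max(-pointIndex[d],
                          std::min(offset[d], dims[d] - 1 - pointIndex[d]));
  }
  return clamped;
}
```

For a point at (1, 1, 1) in a 16x16x16 grid, a requested radius of {-10, -10, -10} clamps to {-1, -1, -1}, which is exactly what restricts the loop bounds in the first worklet pattern above.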

    Add ability to get an array from a vtkm::cont::Field for a particular type

    Previously, whenever you got an array from a Field object via a call to ApplyPolicy, you would get back a VariantArrayHandle that allows you to cast to multiple types. To use it, you then have to cast it to multiple different types and multiple different storage types.

    Often, this is what you want. If you are operating on a field, then you want to cast to the native type. But there are also cases where you know a specific type you want. For example, if you are operating on two fields, it makes sense to find the exact type for the first field and then cast the second field to that type if necessary, rather than pointlessly unroll templates for the cross product of every possible combination. Also, we are not unrolling for different storage types or attempting to create a virtual array. Instead, we are using an ArrayHandleMultiplexer so that you only have to compile for this array once.

    This is done through a new version of ApplyPolicy. This version takes a type of the array as its first template argument, which must be specified.

    This requires having a list of potential storage to try. It will use that to construct an ArrayHandleMultiplexer containing all potential types. This list of storages comes from the policy. A StorageList item was added to the policy. It is also sometimes necessary for a filter to provide its own special storage types. Thus, an AdditionalFieldStorage type was added to Filter which is set to a ListTag of storage types that should be added to those specified by the policy.

    Types are automatically converted. So if you ask for a vtkm::Float64 and the field contains a vtkm::Float32, you will get the array wrapped in an ArrayHandleCast that gives the expected type.
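
This automatic conversion behaves like a read-through cast adapter. A minimal plain C++ sketch of the idea (not the ArrayHandleCast implementation; names are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only view presenting a vector of SourceT values as DestT,
// casting on each access. A sketch of the idea behind ArrayHandleCast;
// the real VTK-m class wraps ArrayHandles and works on devices.
template <typename DestT, typename SourceT>
class CastViewSketch
{
public:
  explicit CastViewSketch(const std::vector<SourceT>& source) : Source(source) {}

  // Values are converted lazily on access; no second array is allocated.
  DestT Get(std::size_t index) const { return static_cast<DestT>(this->Source[index]); }

  std::size_t GetNumberOfValues() const { return this->Source.size(); }

private:
  const std::vector<SourceT>& Source;
};
```

The key design point is that the cast happens per access, so asking for vtkm::Float64 values from a vtkm::Float32 field costs no extra storage.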

    Here is an example where you are doing an operation on a field and coordinate system. The superclass finds the correct type of the field. Your result is just going to follow the type of the field.

    template <typename T, typename StorageType, typename DerivedPolicy>
    inline VTKM_CONT vtkm::cont::DataSet MyFilter::DoExecute(
      const vtkm::cont::DataSet& inDataSet,
      const vtkm::cont::ArrayHandle<T, StorageType>& field,
      const vtkm::filter::FieldMetadata& fieldMetadata,
      vtkm::filter::PolicyBase<DerivedPolicy> policy)
    {
      vtkm::cont::CoordinateSystem coords = inDataSet.GetCoordinateSystem();
      auto coordsArray = vtkm::filter::ApplyPolicy<T>(coords, policy, *this);
      ...
    }

    Add Lagrangian Coherent Structures (LCS) Filter for VTK-m

    The new filter vtkm::filter::LagrangianStructures is meant for Finite Time Lyapunov Exponent (FTLE) calculation using VTK-m. The filter allows users to calculate FTLE in two ways:

    1. Provide a dataset with a vector field, which will be used to generate a flow map.
    2. Provide a dataset containing a flow map, which can be readily used for the FTLE field calculation.

    The filter returns a dataset with a point field named FTLE. If the input is structured and an auxiliary grid was not used, the filter will add the field to the original dataset; otherwise a new structured dataset is returned.

    SurfaceNormals filter can now orient normals

    The OrientNormals worklet has been added to the SurfaceNormals filter, and is enabled by turning on the AutoOrientNormals option. This feature ensures that all normals generated by the filter will point out of the dataset (or inward if the FlipNormals option is true). In addition, SurfaceNormals now has a Consistency option that forces all triangle windings to be consistent with the cell normal direction (the cell points are specified in counter-clockwise order around the normal).

    This functionality is provided by the following new worklets:

    • OrientNormals
      • RunOrientCellNormals
      • RunOrientPointNormals
      • RunOrientPointAndCellNormals
      • RunFlipNormals
    • TriangleWinding

    Particle advection components have better status query support

    There are now special statuses for Particle, Integrator, and Evaluator.

    Previously, the particle advection modules only supported statuses for particles, which made it difficult to handle advanced integrator statuses. Now each of the three important modules returns its own status type.

    Particles have vtkm::worklet::particleadvection::ParticleStatus, Integrators have vtkm::worklet::particleadvection::IntegratorStatus, and Evaluators have vtkm::worklet::particleadvection::EvaluatorStatus.

    Further, the names of the statuses in vtkm::worklet::particleadvection::ParticleStatus have changed:

    ParticleStatus::STATUS_OK is now ParticleStatus::SUCCESS, and there is another status ParticleStatus::TOOK_ANY_STEPS which is active if the particle has taken at least one step with the current data.

    There are a few more changes that allow particle advection in 2D structured grids.

    vtkm::filter::Threshold now outputs a vtkm::cont::CellSetExplicit

    Perhaps a better title for this change would be "Make the Threshold filter not totally useless."

    A long-standing issue with the Threshold filter is that its output CellSet was stored in a CellSetPermutation. This made Threshold hyper-efficient because it required hardly any data movement to implement. However, the problem was that any other unit that had to use the CellSet failed. To have VTK-m handle that output correctly in other filters and writers, they all would have to check for the existence of CellSetPermutation. And CellSetPermutation is templated on the CellSet type it is permuting, so all units would have to compile special cases for all these combinations. This is not likely to be feasible in any static solution.

    The simple solution, implemented here, is to deep copy the cells to a CellSetExplicit, which is a known type that is already used everywhere in VTK-m. The solution is a bit disappointing since it requires more memory and time to build. But it is on par with solutions in other libraries (like VTK). And it really does not matter how efficient the old solution was if it was useless.

    Build

    Introduce vtkm_add_target_information cmake function to make using vtk-m easier

    This higher-order function allows build systems that use VTK-m to keep using add_library or add_executable calls while still having an easy way to get the information required for compilation units that use VTK-m to compile correctly.

     vtkm_add_target_information(
       target[s]
       [ DROP_UNUSED_SYMBOLS ]
       [ MODIFY_CUDA_FLAGS ]
       [ EXTENDS_VTKM ]
       [ DEVICE_SOURCES <source_list> ]
       )

    Usage:

      add_library(lib_that_uses_vtkm STATIC a.cxx)
      vtkm_add_target_information(lib_that_uses_vtkm
                                  MODIFY_CUDA_FLAGS
                                  DEVICE_SOURCES a.cxx
                                  )
      target_link_libraries(lib_that_uses_vtkm PRIVATE vtkm_filter)

    Options to vtkm_add_target_information

    • DROP_UNUSED_SYMBOLS: If enabled, applies the appropriate link flags to drop unused VTK-m symbols. This works because VTK-m is compiled with -ffunction-sections, which allows the linker to remove unused functions. If you are building a program that loads runtime plugins that can call VTK-m, this most likely shouldn't be used, as symbols the plugin expects to exist will be removed. Enabling this will help keep library sizes down when using static builds of VTK-m, as only the functions you call will be kept. This can have a dramatic impact on the size of the resulting executable / shared library.

    • MODIFY_CUDA_FLAGS: If enabled, adds the required -arch= flags that VTK-m was compiled with. If you have multiple libraries that use VTK-m, calling vtkm_add_target_information multiple times with MODIFY_CUDA_FLAGS will cause duplicate compiler flags. To resolve this issue you can: pass all targets and sources to a single vtkm_add_target_information call, have only the first call use MODIFY_CUDA_FLAGS, or use the provided standalone vtkm_get_cuda_flags function.

    • DEVICE_SOURCES: The collection of source files that are used by target(s) that need to be marked as going to a special compiler for certain device adapters such as CUDA.

    • EXTENDS_VTKM: Some programming models have restrictions on how types can be used, passed across library boundaries, and derived from. For example CUDA doesn't allow device side calls across dynamic library boundaries, and requires all polymorphic classes to be reachable at dynamic library/executable link time. To accommodate these restrictions we need to handle the following allowable use-cases:

      • Object library: do nothing, zero restrictions
      • Executable: do nothing, zero restrictions
      • Static library: do nothing, zero restrictions
      • Dynamic library:
        • Wanting to use VTK-m as implementation detail, doesn't expose VTK-m types to consumers. This is supported no matter if CUDA is enabled.
        • Wanting to extend VTK-m and provide these types to consumers. This is only supported when CUDA isn't enabled. Otherwise we need to ERROR!
        • Wanting to pass known VTK-m types across library boundaries for others to use in filters/worklets. This is only supported when CUDA isn't enabled. Otherwise we need to ERROR!

      For most consumers they can ignore the EXTENDS_VTKM property as the default will be correct.

    The vtkm_add_target_information higher order function leverages the vtkm_add_drop_unused_function_flags and vtkm_get_cuda_flags functions which can be used by VTK-m consuming applications.

    The vtkm_add_drop_unused_function_flags function implements all the behavior of DROP_UNUSED_SYMBOLS for a single target.

    The vtkm_get_cuda_flags function implements a general form of MODIFY_CUDA_FLAGS, but instead of modifying CMAKE_CUDA_FLAGS it adds the flags to any variable passed to it.

    Other

    Simplify examples

    Lots of the examples were out of date or way too verbose. The examples have been simplified and brought up to modern VTK-m conventions.

    We have also added a "hello worklet" example: a minimal example of creating and using a working algorithm (wrapped in a filter) in VTK-m.

    vtkm::Vec const& operator[] is now constexpr

    This was done to allow developers to write normal operations on vtkm::Vec but have them resolved at compile time, allowing for both readable code and no runtime cost.

    Now you can do things such as:

      constexpr vtkm::Id2 dims(16,16);
      constexpr vtkm::Float64 dx = vtkm::Float64(4.0 * vtkm::Pi()) / vtkm::Float64(dims[0] - 1);
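
A plain C++ sketch of why a constexpr operator[] enables this (a toy type, not the real vtkm::Vec):

```cpp
#include <cassert>

// Toy 2-component index vector standing in for vtkm::Id2; not the real
// vtkm::Vec implementation.
struct Id2Sketch
{
  long long Components[2];

  constexpr Id2Sketch(long long x, long long y) : Components{x, y} {}

  // Because this is constexpr, indexing can happen at compile time.
  constexpr const long long& operator[](int i) const { return this->Components[i]; }
};

constexpr double Pi() { return 3.14159265358979323846; }

// dims[0] participates in a constant expression, so dx is computed at
// compile time with no runtime cost.
constexpr Id2Sketch dims(16, 16);
constexpr double dx = 4.0 * Pi() / double(dims[0] - 1);

static_assert(dims[0] == 16, "resolved at compile time");
```

Without the constexpr qualifier on operator[], the dx initializer would not be a constant expression and would fail to compile as constexpr.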
  • v1.4.0   Release VTK-m 1.4.0 1.4.0 is our fifth official release of VTK-m. The major changes to VTK-m from 1.3.0 can be found in: docs/changelog/1.4/release-notes.md
    903c2604 · Release VTK-m 1.4.0 ·

    VTK-m 1.4 Release Notes

    Table of Contents

    1. Core
      • Remove templates from ControlSignature field tags
      • Worklets can now be specialized for a specific device adapter
      • Worklets now support an execution mask
      • Redesign VTK-m Runtime Device Tracking
      • vtkm::cont::Initialize added to make setting up VTK-m runtime state easier
    2. ArrayHandle
      • Add vtkm::cont::ArrayHandleVirtual
      • vtkm::cont::ArrayHandleZip provides a consistent API even with non-writable handles
      • vtkm::cont::VariantArrayHandle replaces vtkm::cont::DynamicArrayHandle
      • vtkm::cont::VariantArrayHandle CastAndCall supports casting to concrete types
      • vtkm::cont::VariantArrayHandle::AsVirtual<T>() performs casting
      • StorageBasic::StealArray() now provides delete function to new owner
    3. Control Environment
      • vtkm::cont::CellLocatorGeneral has been added
      • vtkm::cont::CellLocatorTwoLevelUniformGrid has been renamed to vtkm::cont::CellLocatorUniformBins
      • vtkm::cont::Timer now supports asynchronous and device independent timers
      • vtkm::cont::DeviceAdapterId construction from strings are now case-insensitive
      • vtkm::cont::Initialize will only parse known arguments
    4. Execution Environment
      • VTK-m logs details about each CUDA kernel launch
      • VTK-m CUDA allocations can have managed memory (cudaMallocManaged) enabled/disabled from C++
      • VTK-m CUDA kernel scheduling improved including better defaults, and user customization support
      • VTK-m Reduction algorithm now supports differing input and output types
      • Added specialized operators for ArrayPortalValueReference
    5. Worklets and Filters
      • vtkm::worklet::Invoker now supports worklets which require a Scatter object
      • BitFields are now a support field input/out type for VTK-m worklets
      • Added a Point Merging worklet
      • vtkm::filter::CleanGrid now can do point merging
      • Added a connected component worklets and filters
    6. Build
      • CMake 3.8+ now required to build VTK-m
      • VTK-m now can verify that it installs itself correctly
      • VTK-m now requires CUDA separable compilation to build
      • VTK-m provides a vtkm_filter CMake target
      • vtkm::cont::CellLocatorBoundingIntervalHierarchy is compiled into vtkm_cont
    7. Other
      • LodePNG added as a thirdparty package
      • Optionparser added as a thirdparty package
      • Thirdparty diy now can coexist with external diy
      • Merge benchmark executables into a device dependent shared library
      • Merge rendering testing executables to a shared library
      • Merge worklet testing executables into a device dependent shared library
      • VTK-m runtime device detection properly handles busy CUDA devices

    Core

    Remove templates from ControlSignature field tags

    Previously, several of the ControlSignature tags had a template to specify a type list. This was to specify potential valid value types for an input array. The importance of this typelist was to limit the number of code paths created when resolving a vtkm::cont::VariantArrayHandle (formerly a DynamicArrayHandle). This (potentially) reduced the compile time, the size of libraries/executables, and errors from unexpected types.

    Much has changed since this feature was originally implemented. Since then, the filter infrastructure has been created, and it is through this that most dynamic worklet invocations happen. However, since the filter infrastructure does its own type resolution (and has its own policies), the type arguments in ControlSignature are now of little value.

    Script to update code

    This update requires changes to just about all code implementing a VTK-m worklet. To facilitate the update of this code to these new changes (not to mention all the code in VTK-m) a script is provided to automatically remove these template parameters from VTK-m code.

    This script is at Utilities/Scripts/update-control-signature-tags.sh. It needs to be run in a Unix-compatible shell. It takes a single argument, which is a top level directory to modify files. The script processes all C++ source files recursively from that directory.

    Selecting data types for auxiliary filter fields

    The main rationale for making these changes is that the types of the inputs to worklets are almost always already determined by the calling filter. However, although it is straightforward to specify the type of the "main" (active) scalars in a filter, it is less clear what to do for additional fields if a filter needs a second or third field.

    Typically, in the case of a second or third field, it is up to the DoExecute method in the filter implementation to apply a policy to that field. When applying a policy, you give it a policy object (nominally passed by the user) and a traits of the filter. Generally, the accepted list of types for a field should be part of the filter's traits. For example, consider the WarpVector filter. This filter only works on Vecs of size 3, so its traits class looks like this.

    template <>
    class FilterTraits<WarpVector>
    {
    public:
      // WarpVector only applies to Float and Double Vec3 arrays
      using InputFieldTypeList = vtkm::TypeListTagFieldVec3;
    };

    However, the WarpVector filter also requires two fields instead of one. The first (active) field is handled by its superclass (FilterField), but the second (auxiliary) field must be managed in the DoExecute. Generally, this can be done by simply applying the policy with the filter traits.

    The corner cases

    Most of the calls to worklets happen within filter implementations, which have their own way of narrowing down potential types (as previously described). The majority of the remainder either use static types or work with a variety of types.

    However, there is a minority of corner cases that require a reduction of types. Since the type argument of the worklet ControlSignature arguments are no longer available, the narrowing of types must be done before the call to Invoke.

    This narrowing of arguments is not particularly difficult. Such type-unsure arguments usually come from a VariantArrayHandle (or something that uses one). You can select the types from a VariantArrayHandle simply by using the ResetTypes method. For example, say you know that a variant array is supposed to be a scalar.

    dispatcher.Invoke(variantArray.ResetTypes(vtkm::TypeListTagFieldScalar()),
                      staticArray);

    Even more common is to have a vtkm::cont::Field object. A Field object internally holds a VariantArrayHandle, which is accessible via the GetData method.

    dispatcher.Invoke(field.GetData().ResetTypes(vtkm::TypeListTagFieldScalar()),
                      staticArray);

    Change in executable size

    The whole intention of these template parameters in the first place was to reduce the number of code paths compiled. The hypothesis of this change was that in the current structure the code paths were not being reduced much if at all. If that is true, the size of executables and libraries should not change.

    Here is a recording of the library and executable sizes before this change (using du -sh).

    3.0M    libvtkm_cont-1.2.1.dylib
    6.2M    libvtkm_rendering-1.2.1.dylib
    312K    Rendering_SERIAL
    312K    Rendering_TBB
     22M    Worklets_SERIAL
     23M    Worklets_TBB
     22M    UnitTests_vtkm_filter_testing
    5.7M    UnitTests_vtkm_cont_serial_testing
    6.0M    UnitTests_vtkm_cont_tbb_testing
    7.1M    UnitTests_vtkm_cont_testing

    After the changes, the executable sizes are as follows.

    3.0M    libvtkm_cont-1.2.1.dylib
    6.0M    libvtkm_rendering-1.2.1.dylib
    312K    Rendering_SERIAL
    312K    Rendering_TBB
     21M    Worklets_SERIAL
     21M    Worklets_TBB
     22M    UnitTests_vtkm_filter_testing
    5.6M    UnitTests_vtkm_cont_serial_testing
    6.0M    UnitTests_vtkm_cont_tbb_testing
    7.1M    UnitTests_vtkm_cont_testing

    As we can see, the binary sizes have not changed significantly. (If anything, the build is a little smaller.)

    Worklets can now be specialized for a specific device adapter

    This change adds an execution signature tag named Device that passes a DeviceAdapterTag to the worklet's parenthesis operator. This allows the worklet to specialize its operation. This feature is available in all worklets.

    The following example shows a worklet that specializes itself for the CUDA device.

    struct DeviceSpecificWorklet : vtkm::worklet::WorkletMapField
    {
      using ControlSignature = void(FieldIn, FieldOut);
      using ExecutionSignature = _2(_1, Device);
    
      // Specialization for the Cuda device.
      template <typename T>
      T operator()(T x, vtkm::cont::DeviceAdapterTagCuda) const
      {
        // Special cuda implementation
      }
    
      // General implementation
      template <typename T, typename Device>
      T operator()(T x, Device) const
      {
        // General implementation
      }
    };

    Effect on compile time and binary size

    This change necessitated adding a template parameter for the device that is threaded at least from the schedule call all the way down. This has the potential to duplicate several of the support methods (like DoWorkletInvokeFunctor) that would otherwise have the same type. This is especially true for the devices that run on the CPU, as they should all share the same portals from ArrayHandles. So the question is whether it causes compiles to take longer or causes a significant increase in binary size.

    To informally test, I first ran a clean debug compile on my Windows machine with the serial and tbb devices. The build itself took 3 minutes, 50 seconds. Here is a list of the binary sizes in the bin directory:

    kmorel2 0> du -sh *.exe *.dll
    200K    BenchmarkArrayTransfer_SERIAL.exe
    204K    BenchmarkArrayTransfer_TBB.exe
    424K    BenchmarkAtomicArray_SERIAL.exe
    424K    BenchmarkAtomicArray_TBB.exe
    440K    BenchmarkCopySpeeds_SERIAL.exe
    580K    BenchmarkCopySpeeds_TBB.exe
    4.1M    BenchmarkDeviceAdapter_SERIAL.exe
    5.3M    BenchmarkDeviceAdapter_TBB.exe
    7.9M    BenchmarkFieldAlgorithms_SERIAL.exe
    7.9M    BenchmarkFieldAlgorithms_TBB.exe
    22M     BenchmarkFilters_SERIAL.exe
    22M     BenchmarkFilters_TBB.exe
    276K    BenchmarkRayTracing_SERIAL.exe
    276K    BenchmarkRayTracing_TBB.exe
    4.4M    BenchmarkTopologyAlgorithms_SERIAL.exe
    4.4M    BenchmarkTopologyAlgorithms_TBB.exe
    712K    Rendering_SERIAL.exe
    712K    Rendering_TBB.exe
    708K    UnitTests_vtkm_cont_arg_testing.exe
    1.7M    UnitTests_vtkm_cont_internal_testing.exe
    13M     UnitTests_vtkm_cont_serial_testing.exe
    14M     UnitTests_vtkm_cont_tbb_testing.exe
    18M     UnitTests_vtkm_cont_testing.exe
    13M     UnitTests_vtkm_cont_testing_mpi.exe
    736K    UnitTests_vtkm_exec_arg_testing.exe
    136K    UnitTests_vtkm_exec_internal_testing.exe
    196K    UnitTests_vtkm_exec_serial_internal_testing.exe
    196K    UnitTests_vtkm_exec_tbb_internal_testing.exe
    2.0M    UnitTests_vtkm_exec_testing.exe
    83M     UnitTests_vtkm_filter_testing.exe
    476K    UnitTests_vtkm_internal_testing.exe
    148K    UnitTests_vtkm_interop_internal_testing.exe
    1.3M    UnitTests_vtkm_interop_testing.exe
    2.9M    UnitTests_vtkm_io_reader_testing.exe
    548K    UnitTests_vtkm_io_writer_testing.exe
    792K    UnitTests_vtkm_rendering_testing.exe
    3.7M    UnitTests_vtkm_testing.exe
    320K    UnitTests_vtkm_worklet_internal_testing.exe
    65M     UnitTests_vtkm_worklet_testing.exe
    11M     vtkm_cont-1.3.dll
    2.1M    vtkm_interop-1.3.dll
    21M     vtkm_rendering-1.3.dll
    3.9M    vtkm_worklet-1.3.dll

    After making the singular change to the Invocation object to add the DeviceAdapterTag as a template parameter (the change that should trigger any extra compiled instances), the compile took 4 minutes and 5 seconds. Here is the new list of binaries.

    kmorel2 0> du -sh *.exe *.dll
    200K    BenchmarkArrayTransfer_SERIAL.exe
    204K    BenchmarkArrayTransfer_TBB.exe
    424K    BenchmarkAtomicArray_SERIAL.exe
    424K    BenchmarkAtomicArray_TBB.exe
    440K    BenchmarkCopySpeeds_SERIAL.exe
    580K    BenchmarkCopySpeeds_TBB.exe
    4.1M    BenchmarkDeviceAdapter_SERIAL.exe
    5.3M    BenchmarkDeviceAdapter_TBB.exe
    7.9M    BenchmarkFieldAlgorithms_SERIAL.exe
    7.9M    BenchmarkFieldAlgorithms_TBB.exe
    22M     BenchmarkFilters_SERIAL.exe
    22M     BenchmarkFilters_TBB.exe
    276K    BenchmarkRayTracing_SERIAL.exe
    276K    BenchmarkRayTracing_TBB.exe
    4.4M    BenchmarkTopologyAlgorithms_SERIAL.exe
    4.4M    BenchmarkTopologyAlgorithms_TBB.exe
    712K    Rendering_SERIAL.exe
    712K    Rendering_TBB.exe
    708K    UnitTests_vtkm_cont_arg_testing.exe
    1.7M    UnitTests_vtkm_cont_internal_testing.exe
    13M     UnitTests_vtkm_cont_serial_testing.exe
    14M     UnitTests_vtkm_cont_tbb_testing.exe
    19M     UnitTests_vtkm_cont_testing.exe
    13M     UnitTests_vtkm_cont_testing_mpi.exe
    736K    UnitTests_vtkm_exec_arg_testing.exe
    136K    UnitTests_vtkm_exec_internal_testing.exe
    196K    UnitTests_vtkm_exec_serial_internal_testing.exe
    196K    UnitTests_vtkm_exec_tbb_internal_testing.exe
    2.0M    UnitTests_vtkm_exec_testing.exe
    86M     UnitTests_vtkm_filter_testing.exe
    476K    UnitTests_vtkm_internal_testing.exe
    148K    UnitTests_vtkm_interop_internal_testing.exe
    1.3M    UnitTests_vtkm_interop_testing.exe
    2.9M    UnitTests_vtkm_io_reader_testing.exe
    548K    UnitTests_vtkm_io_writer_testing.exe
    792K    UnitTests_vtkm_rendering_testing.exe
    3.7M    UnitTests_vtkm_testing.exe
    320K    UnitTests_vtkm_worklet_internal_testing.exe
    68M     UnitTests_vtkm_worklet_testing.exe
    11M     vtkm_cont-1.3.dll
    2.1M    vtkm_interop-1.3.dll
    21M     vtkm_rendering-1.3.dll
    3.9M    vtkm_worklet-1.3.dll

    So far the increase is quite negligible.

    Worklets now support an execution mask

    There have recently been use cases where it would be helpful to mask out some of the invocations of a worklet. The idea is that when invoking a worklet with a mask array on the input domain, you might implement your worklet more-or-less like the following.

    VTKM_EXEC void operator()(bool mask, /* other parameters */)
    {
      if (mask)
      {
        // Do interesting stuff
      }
    }

    This works, but what if your mask has mostly false values? In that case, you are spending tons of time loading data to and from memory where fields are stored for no reason.

    You could potentially get around this problem by adding a scatter to the worklet. However, that will compress the output arrays to only values that are active in the mask. That is problematic if you want the masked output in the appropriate place in the original arrays. You will have to do some complex (and annoying and possibly expensive) permutations of the output arrays.

    Thus, we would like a new feature similar to scatter that instead masks out invocations so that the worklet is simply not run on those outputs.

    New Interface

    This change adds a new "Mask" feature that is similar (and orthogonal) to the existing "Scatter" feature. Worklet objects now define a MaskType that provides an object managing the selection of which invocations are skipped. The following Mask objects are defined.

    • MaskNone - This removes any mask of the output. All outputs are generated. This is the default if no MaskType is explicitly defined.
    • MaskSelect - The user provides an array that specifies whether each output is created, with a 1 meaning that the output should be created and a 0 meaning that it should not.
    • MaskIndices - The user provides an array with a list of indices for all outputs that should be created.

    It will be straightforward to implement other versions of masks. (For example, you could make a mask class that selects every Nth entry.) Those could be made on an as-needed basis.

    Implementation

    The implementation follows the same basic idea of how scatters are implemented.

    Mask Classes

    The mask class is required to implement the following items.

    • ThreadToOutputType - A type for an array that maps a thread index (an index in the array) to an output index. A reasonable type for this could be vtkm::cont::ArrayHandle<vtkm::Id>.
    • GetThreadToOutputMap - Given the range for the output (e.g. the number of items in the output domain), returns an array of type ThreadToOutputType that is the actual map.
    • GetThreadRange - Given a range for the output (e.g. the number of items in the output domain), returns the range for the threads (e.g. the number of times the worklet will be invoked).

    Dispatching

    The vtkm::worklet::internal::DispatcherBase manages a mask class in the same way it manages the scatter class. It gets the MaskType from the worklet it is templated on. It requires a MaskType object during its construction.

    Previously the dispatcher (and downstream) had to manage the range and indices of inputs and threads. They now have to also manage a separate output range/index as now all three may be different.

    The vtkm::Invocation is changed to hold the ThreadToOutputMap array from the mask. It likewise has a templated ChangeThreadToOutputMap method added (similar to those already existing for the arrays from a scatter). This method is used in DispatcherBase::InvokeTransportParameters to add the mask's array to the invocation before calling InvokeSchedule.

    Thread Indices

    With the addition of masks, the ThreadIndices classes are changed to manage the actual output index. Previously, the output index was always the same as the thread index. However, now these two can be different. The GetThreadIndices methods of the worklet base classes have an argument added that is the portal to the ThreadToOutputMap.

    The worklet GetThreadIndices is called from the Task classes. These classes are changed to pass in this additional argument. Since the Task classes get an Invocation object from the dispatcher, which contains the ThreadToOutputMap, this change is trivial.

    Interaction Between Mask and Scatter

    Although it seems weird, it should work fine to mix scatters and masks. The scatter will first be applied to the input to generate a (potential) list of output elements. The mask will then be applied to these output elements.

    Redesign VTK-m Runtime Device Tracking

    The device tracking infrastructure in VTK-m has been redesigned to remove multiple redundant code paths and to simplify reasoning about what an instance of RuntimeDeviceTracker will modify.

    vtkm::cont::RuntimeDeviceTracker tracks runtime information on a per-user thread basis. This is done to allow multiple calling threads to use different vtk-m backends such as seen in this example:

      vtkm::cont::DeviceAdapterTagCuda cuda;
      vtkm::cont::DeviceAdapterTagOpenMP openmp;
      { // thread 1
        auto& tracker = vtkm::cont::GetRuntimeDeviceTracker();
        tracker->ForceDevice(cuda);
        vtkm::worklet::Invoker invoke;
        invoke(LightTask{}, input, output);
        vtkm::cont::Algorithm::Sort(output);
        invoke(HeavyTask{}, output);
      }
    
     { // thread 2
        auto& tracker = vtkm::cont::GetRuntimeDeviceTracker();
        tracker->ForceDevice(openmp);
        vtkm::worklet::Invoker invoke;
        invoke(LightTask{}, input, output);
        vtkm::cont::Algorithm::Sort(output);
        invoke(HeavyTask{}, output);
      }

    Note: GetGlobalRuntimeDeviceTracker has been refactored to GetRuntimeDeviceTracker, as it always returned a unique instance for each control-side thread. This design allows different threads to have different runtime device settings. Removing the term Global from the name makes the scope of this class clearer.

    While this addresses the ability for threads to specify what device they should run on, it doesn't make it easy to toggle the status of a device programmatically. For example, the following block forces execution to occur only on CUDA and doesn't restore the previously active devices afterward:

      {
      vtkm::cont::DeviceAdapterTagCuda cuda;
      auto& tracker = vtkm::cont::GetRuntimeDeviceTracker();
      tracker->ForceDevice(cuda);
      vtkm::worklet::Invoker invoke;
      invoke(LightTask{}, input, output);
      }
      //openmp/tbb/... still inactive

    To resolve these issues we have vtkm::cont::ScopedRuntimeDeviceTracker, which has the same interface as vtkm::cont::RuntimeDeviceTracker but additionally resets any per-user-thread modifications when it goes out of scope. By switching the previous example over to ScopedRuntimeDeviceTracker, we correctly restore the thread's RuntimeDeviceTracker state when tracker goes out of scope.

      {
      vtkm::cont::DeviceAdapterTagCuda cuda;
      vtkm::cont::ScopedRuntimeDeviceTracker tracker(cuda);
      vtkm::worklet::Invoker invoke;
      invoke(LightTask{}, input, output);
      }
      //openmp/tbb/... are now again active

    The vtkm::cont::ScopedRuntimeDeviceTracker is not limited to forcing execution to occur on a single device. When constructed it can force execution to a device, disable a device, or enable a device. These options also work with DeviceAdapterTagAny.

      {
      //enable all devices
      vtkm::cont::DeviceAdapterTagAny any;
      vtkm::cont::ScopedRuntimeDeviceTracker tracker(any,
                                                     vtkm::cont::RuntimeDeviceTrackerMode::Enable);
      ...
      }
    
      {
      //disable only cuda
      vtkm::cont::DeviceAdapterTagCuda cuda;
      vtkm::cont::ScopedRuntimeDeviceTracker tracker(cuda,
                                                     vtkm::cont::RuntimeDeviceTrackerMode::Disable);
    
      ...
      }

    vtkm::cont::Initialize added to make setting up VTK-m runtime state easier

    A new initialization function, vtkm::cont::Initialize, has been added. Initialization is not required, but it will configure the logging utilities (when enabled) and allow forcing a device via a -d or --device command line option.

    Usage:

    #include <vtkm/cont/Initialize.h>
    
    int main(int argc, char *argv[])
    {
      auto config = vtkm::cont::Initialize(argc, argv);
    
      ...
    }

    ArrayHandle

    Add vtkm::cont::ArrayHandleVirtual

    Added a new class named vtkm::cont::ArrayHandleVirtual that allows you to type-erase an ArrayHandle's storage type by using virtual calls. This simplification makes storing Fields and Coordinates significantly easier, as VTK-m doesn't need to deduce both the storage and value type when executing worklets.

    To construct a vtkm::cont::ArrayHandleVirtual one can do the following:

    vtkm::cont::ArrayHandle<vtkm::Float32> pressure;
    vtkm::cont::ArrayHandleConstant<vtkm::Float32> constant(42.0f);
    
    
    // construct from an array handle
    vtkm::cont::ArrayHandleVirtual<vtkm::Float32> v(pressure);
    
    // or assign from an array handle
    v = constant;
    

    To help maintain performance vtkm::cont::ArrayHandleVirtual provides a collection of helper functions/methods to query and cast back to the concrete storage and value type:

    vtkm::cont::ArrayHandleConstant<vtkm::Float32> constant(42.0f);
    vtkm::cont::ArrayHandleVirtual<vtkm::Float32> v = constant;
    
    const bool isConstant = vtkm::cont::IsType< decltype(constant) >(v);
    if(isConstant)
      vtkm::cont::ArrayHandleConstant<vtkm::Float32> t = vtkm::cont::Cast< decltype(constant) >(v);
    

    Lastly, calling code using ArrayHandleVirtual commonly wants to construct a new instance of an existing virtual handle with the same storage type. This can be done by using the NewInstance method as seen below.

    vtkm::cont::ArrayHandle<vtkm::Float32> pressure;
    vtkm::cont::ArrayHandleVirtual<vtkm::Float32> v = pressure;
    
    vtkm::cont::ArrayHandleVirtual<vtkm::Float32> newArray = v.NewInstance();
    bool isConstant = vtkm::cont::IsType< vtkm::cont::ArrayHandle<vtkm::Float32> >(newArray); //will be true

    vtkm::cont::ArrayHandleZip provides a consistent API even with non-writable handles

    Previously vtkm::cont::ArrayHandleZip could not wrap an implicit handle and provide a consistent experience. The primary issue was that if you tried to use the PortalType returned by GetPortalControl() you would get a compile failure, as the returned PortalType would try to call Set on an ImplicitPortal, which doesn't have a Set method.

    With this change, the ZipPortal uses SFINAE to determine whether Set and Get should call the underlying zipped portals.

    vtkm::cont::VariantArrayHandle replaces vtkm::cont::DynamicArrayHandle

    vtkm::cont::VariantArrayHandle replaces vtkm::cont::DynamicArrayHandle as the primary method for holding onto a type-erased vtkm::cont::ArrayHandle. The major difference between the two implementations is how they handle the Storage component of an array handle.

    The vtkm::cont::DynamicArrayHandle approach was to find the fully deduced type of the ArrayHandle, meaning it would check all value and storage types it knew about until it found a match. This cross product of values and storages caused significant compilation times when a DynamicArrayHandle had multiple storage types.

    The vtkm::cont::VariantArrayHandle approach is to only deduce the value type of the ArrayHandle and return a vtkm::cont::ArrayHandleVirtual, which uses polymorphism to hide the actual storage type. This approach allows for better compile times and lets calling code always expect an ArrayHandleVirtual instead of the fully deduced type. This conversion to ArrayHandleVirtual is usually done internally within VTK-m when a worklet or filter is invoked.

    In certain cases users of VariantArrayHandle want to access the concrete ArrayHandle<T,S> and not have it wrapped in an ArrayHandleVirtual. For those occurrences VariantArrayHandle provides a collection of helper functions/methods to query and cast back to the concrete storage and value type:

    vtkm::cont::ArrayHandleConstant<vtkm::Float32> constant(42.0f);
    vtkm::cont::VariantArrayHandle v(constant);
    
    const bool isConstant = vtkm::cont::IsType< decltype(constant) >(v);
    if(isConstant)
      vtkm::cont::ArrayHandleConstant<vtkm::Float32> t = vtkm::cont::Cast< decltype(constant) >(v);
    

    Lastly, calling code using VariantArrayHandle commonly wants to construct a new instance of an existing virtual handle with the same storage type. This can be done by using the NewInstance method as seen below:

    vtkm::cont::ArrayHandle<vtkm::Float32> pressure;
    vtkm::cont::VariantArrayHandle v(pressure);
    
    vtkm::cont::VariantArrayHandle newArray = v.NewInstance();
    const bool isConstant = vtkm::cont::IsType< decltype(pressure) >(newArray); //will be true

    vtkm::cont::VariantArrayHandle CastAndCall supports casting to concrete types

    Previously, the VariantArrayHandle::CastAndCall (and indirect calls through vtkm::cont::CastAndCall) attempted to cast to only vtkm::cont::ArrayHandleVirtual with different value types. That worked, but it meant that whatever was called had to operate through virtual functions.

    Under most circumstances, it is worthwhile to also check for some common storage types that, when encountered, can be accessed much faster. This change provides the casting to concrete storage types and now uses vtkm::cont::ArrayHandleVirtual as a fallback when no concrete storage type is found.

    By default, CastAndCall checks all the storage types in VTKM_DEFAULT_STORAGE_LIST_TAG, which typically contains only the basic storage. The VariantArrayHandle::CastAndCall method also allows you to override this behavior by specifying a different type list in the first argument. If the first argument is a list type, CastAndCall assumes that all the types in the list are storage tags. If you pass in vtkm::ListTagEmpty, then CastAndCall will always cast to an ArrayHandleVirtual (the previous behavior). Alternately, you can pass in storage tags that might be likely under the current usage.

    As an example, consider the following simple code.

    vtkm::cont::VariantArrayHandle array;
    
    // stuff happens
    
    array.CastAndCall(myFunctor);

    Previously, myFunctor would be called with vtkm::cont::ArrayHandleVirtual<T> with different type Ts. After this change, myFunctor will be called with that and with vtkm::cont::ArrayHandle<T> of the same type Ts.

    If you want to only call myFunctor with vtkm::cont::ArrayHandleVirtual<T>, then replace the previous line with

    array.CastAndCall(vtkm::ListTagEmpty(), myFunctor);

    Let's say that additionally using vtkm::cont::ArrayHandleIndex was also common. If you want to also specialize for that array, you can do so with the following line.

    array.CastAndCall(vtkm::ListTagBase<vtkm::cont::StorageBasic,
                                        vtkm::cont::ArrayHandleIndex::StorageTag>(),
                      myFunctor);

    Note that myFunctor will be called with vtkm::cont::ArrayHandle<T,vtkm::cont::ArrayHandleIndex::StorageTag>, not vtkm::cont::ArrayHandleIndex.

    vtkm::cont::VariantArrayHandle::AsVirtual<T>() performs casting

    The AsVirtual<T> method of vtkm::cont::VariantArrayHandle now works for any arithmetic type, not just the actual type of the underlying array. This works by inserting an ArrayHandleCast between the underlying concrete array and the new ArrayHandleVirtual when needed.

    StorageBasic::StealArray() now provides delete function to new owner

    Memory that is stolen from VTK-m has to be freed correctly. This is required as the memory could have been allocated with new, malloc or even cudaMallocManaged.

    Previously it was very easy to transfer ownership of memory out of VTK-m and either fail to capture the free function, or ask for it after the transfer operation which would return a nullptr. Now stealing an array also provides the free function reducing one source of memory leaks.

    To properly steal memory from VTK-m you do the following:

      vtkm::cont::ArrayHandle<T> arrayHandle;
    
      ...
    
      auto stolen = arrayHandle.StealArray();
      T* ptr = stolen.first;
      auto free_function = stolen.second;
    
      ...
    
      free_function(ptr);

    Control Environment

    vtkm::cont::CellLocatorGeneral has been added

    vtkm::cont::CellLocatorUniformBins can work with all kinds of datasets, but there are cell locators that are more efficient for specific datasets. Therefore, a new cell locator, vtkm::cont::CellLocatorGeneral, has been implemented that can be configured to use specialized cell locators based on its input data. A "configurator" function object can be specified using the SetConfigurator() function. The configurator should have the following signature:

    void (std::unique_ptr<vtkm::cont::CellLocator>&,
          const vtkm::cont::DynamicCellSet&,
          const vtkm::cont::CoordinateSystem&);

    The configurator is invoked whenever the Update method is called and the input has changed. The current cell locator is passed in a std::unique_ptr. Based on the types of the input cellset and coordinates, and possibly some heuristics on their values, the current cell locator's parameters can be updated, or a different cell locator can be instantiated and transferred to the unique_ptr. The default configurator configures a vtkm::cont::CellLocatorUniformGrid for uniform grid datasets, a vtkm::cont::CellLocatorRectilinearGrid for rectilinear datasets, and vtkm::cont::CellLocatorUniformBins for all other dataset types.

    The class CellLocatorHelper that implemented similar functionality to CellLocatorGeneral has been removed.

    vtkm::cont::CellLocatorTwoLevelUniformGrid has been renamed to vtkm::cont::CellLocatorUniformBins

    CellLocatorTwoLevelUniformGrid has been renamed to CellLocatorUniformBins for brevity. It has been modified to be a subclass of vtkm::cont::CellLocator and can be used wherever a CellLocator is accepted.

    vtkm::cont::Timer now supports asynchronous and device independent timers

    vtkm::cont::Timer can now track execution time on a single device or across all enabled devices as seen below:

    vtkm::cont::Timer tbb_timer{vtkm::cont::DeviceAdapterTagTBB()};
    vtkm::cont::Timer all_timer;
    
    all_timer.Start();
    tbb_timer.Start();
    // Run blocking algorithm on tbb
    tbb_timer.Stop();
    // Run async algorithms on cuda
    all_timer.Stop();
    
    // Do more work
    
    //Now get the time for the tbb work, and for all (tbb + cuda) work
    auto tbb_time = tbb_timer.GetElapsedTime();
    auto all_time = all_timer.GetElapsedTime();

    When Timer is constructed without an explicit vtkm::cont::DeviceAdapterId it will track all device adapters and return the maximum elapsed time over all devices when GetElapsedTime is called.

    vtkm::cont::DeviceAdapterId construction from strings is now case-insensitive

    You can now construct a vtkm::cont::DeviceAdapterId from a string regardless of its case. The following will all construct the same vtkm::cont::DeviceAdapterId.

    vtkm::cont::DeviceAdapterId id1 = vtkm::cont::make_DeviceAdapterId("cuda");
    vtkm::cont::DeviceAdapterId id2 = vtkm::cont::make_DeviceAdapterId("CUDA");
    vtkm::cont::DeviceAdapterId id3 = vtkm::cont::make_DeviceAdapterId("Cuda");
    
    auto& tracker = vtkm::cont::GetRuntimeDeviceTracker();
    vtkm::cont::DeviceAdapterId id4 = tracker.GetDeviceAdapterId("cuda");
    vtkm::cont::DeviceAdapterId id5 = tracker.GetDeviceAdapterId("CUDA");
    vtkm::cont::DeviceAdapterId id6 = tracker.GetDeviceAdapterId("Cuda");

    vtkm::cont::Initialize will only parse known arguments

    When a library requires reading some command line arguments through a function like Initialize, it is typical that it will parse through arguments it supports and then remove those arguments from argc and argv so that the remaining arguments can be parsed by the calling program. Recent changes to the vtkm::cont::Initialize function support that.

    Use Case

    Say you are creating a simple benchmark where you want to provide a command line option --size that allows you to adjust the size of the data that you are working on. However, you also want to support flags like --device and -v that are handled by vtkm::cont::Initialize. Rather than re-implement all of Initialize's parsing, you can now first call Initialize to handle its arguments and then parse the remaining arguments.

    The following is a simple (and rather incomplete) example:

    int main(int argc, char** argv)
    {
      vtkm::cont::InitializeResult initResult = vtkm::cont::Initialize(argc, argv);
    
      if ((argc > 1) && (strcmp(argv[1], "--size") == 0))
      {
        if (argc < 3)
        {
          std::cerr << "--size option requires a numeric argument" << std::endl;
          std::cerr << "USAGE: " << argv[0] << " [options]" << std::endl;
          std::cerr << "Options are:" << std::endl;
          std::cerr << "  --size <number>\tSpecify the size of the data." << std::endl;
          std::cerr << initResult.Usage << std::endl;
          exit(1);
        }

        g_size = atoi(argv[2]);
      }

      std::cout << "Using device: " << initResult.Device.GetName() << std::endl;

    Additional Initialize Options

    Because vtkm::cont::Initialize no longer has the assumption that it is responsible for parsing all arguments, some options have been added to vtkm::cont::InitializeOptions to manage these different use cases. The following options are now supported.

    • None A placeholder for having all options off, which is the default. (Same as before this change.)
    • RequireDevice Issue an error if the device argument is not specified. (Same as before this change.)
    • DefaultAnyDevice If no device is specified, treat it as if the user gave --device=Any. This means that DeviceAdapterTagUndefined will never be returned in the result.
    • AddHelp Add a help argument. If -h or --help is provided, prints a usage statement. Of course, the usage statement will only print out arguments processed by VTK-m.
    • ErrorOnBadOption If an unknown option is encountered, the program terminates with an error and a usage statement is printed. If this option is not provided, any unknown options are returned in argv. If this option is used, it is a good idea to use AddHelp as well.
    • ErrorOnBadArgument If an extra argument is encountered, the program terminates with an error and a usage statement is printed. If this option is not provided, any unknown arguments are returned in argv.
    • Strict If supplied, Initialize treats its own arguments as the only ones supported by the application and provides an error if not followed exactly. This is a convenience option that is a combination of ErrorOnBadOption, ErrorOnBadArgument, and AddHelp.

    InitializeResult Changes

    The changes in Initialize have also necessitated the changing of some of the fields in the InitializeResult structure. The following fields are now provided in the InitializeResult struct.

    • Device Returns the device selected in the command line arguments as a DeviceAdapterId. If no device was selected, DeviceAdapterTagUndefined is returned. (Same as before this change.)
    • Usage Returns a string containing the usage for the options recognized by Initialize. This can be used to build larger usage statements containing options for both Initialize and the calling program. See the example above.

    Note that the Arguments field has been removed from InitializeResult. This is because the unparsed arguments are now returned in the modified argc and argv, which provides a more complete result than the Arguments field did.

    Execution Environment

    VTK-m logs details about each CUDA kernel launch

    The VTK-m logging infrastructure has been extended with a new log level KernelLaunches which exists between MemTransfer and Cast.

    This log level reports the number of blocks, threads per block, and the PTX version of each CUDA kernel launched.

    This logging level was primarily introduced to help developers track down issues that occur when VTK-m components have been built with different sm_XX flags, and to help people looking to do kernel performance tuning.

    VTK-m CUDA allocations can have managed memory (cudaMallocManaged) enabled/disabled from C++

    Previously it was impossible for calling code to explicitly disable CUDA managed memory. This can be desirable for projects that know they don't need managed memory and are extremely performance critical.

    const bool usingManagedMemory = vtkm::cont::cuda::internal::CudaAllocator::UsingManagedMemory();
    if (usingManagedMemory)
    {
      // disable managed memory
      vtkm::cont::cuda::internal::CudaAllocator::ForceManagedMemoryOff();
    }

    VTK-m CUDA kernel scheduling improved including better defaults, and user customization support

    VTK-m now offers a more GPU-aware set of defaults for kernel scheduling. When VTK-m first launches a kernel, we do system introspection to determine what GPUs are on the machine and then match this information against a preset table of values. The implementation is designed in a way that allows VTK-m to offer both specific presets for a given GPU (V100) and presets for an entire generation of cards (Pascal).

    Currently VTK-m offers preset tables for the following GPUs:

    • Tesla V100
    • Tesla P100

    If the hardware doesn't match a specific GPU card, we then try to find the nearest known hardware generation and use those defaults. Currently we offer defaults for:

    • Older than Pascal Hardware
    • Pascal Hardware
    • Volta+ Hardware
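
    The preset lookup described above can be pictured with a small standard-C++ sketch (the names and numeric values here are hypothetical, not the actual VTK-m presets): match an exact device name first, then fall back to the nearest known hardware generation.

```cpp
#include <map>
#include <string>

// Hypothetical launch presets; values are invented for illustration.
struct LaunchPresets
{
  int OneDBlocks;
  int OneDThreadsPerBlock;
};

LaunchPresets LookupPresets(const std::string& deviceName, int majorCapability)
{
  // Exact-name presets for specific cards.
  static const std::map<std::string, LaunchPresets> presetsByName = {
    { "Tesla V100", { 32, 256 } },
    { "Tesla P100", { 32, 128 } },
  };
  auto it = presetsByName.find(deviceName);
  if (it != presetsByName.end())
  {
    return it->second;
  }
  // No exact match: fall back to the nearest known hardware generation,
  // keyed here by the major compute capability.
  if (majorCapability >= 7) { return { 32, 256 }; }  // Volta and newer
  if (majorCapability == 6) { return { 32, 128 }; }  // Pascal
  return { 16, 128 };                                // older than Pascal
}
```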

    Some users have workloads that don't align with the defaults provided by VTK-m. When that is the case, it is possible to override the defaults by binding a custom function to vtkm::cont::cuda::InitScheduleParameters, as shown below:

      ScheduleParameters CustomScheduleValues(char const* name,
                                              int major,
                                              int minor,
                                              int multiProcessorCount,
                                              int maxThreadsPerMultiProcessor,
                                              int maxThreadsPerBlock)
      {
    
        ScheduleParameters params  {
            64 * multiProcessorCount,  //1d blocks
            64,                        //1d threads per block
            64 * multiProcessorCount,  //2d blocks
            { 8, 8, 1 },               //2d threads per block
            64 * multiProcessorCount,  //3d blocks
            { 4, 4, 4 } };             //3d threads per block
        return params;
      }
      vtkm::cont::cuda::InitScheduleParameters(&CustomScheduleValues);

    VTK-m Reduction algorithm now supports differing input and output types

    It is common to want to perform a reduction where the input and output types differ. A basic example would be when the input is vtkm::UInt8 but the output is vtkm::UInt64. This has been supported since v1.2, as long as the input type is implicitly convertible to the output type.

    What we now support is when the input type is not implicitly convertible to the output type, such as when the output type is vtkm::Pair<vtkm::UInt64, vtkm::UInt64>. For this to work, we require that the custom binary operator also implements an operator() that handles the unary transformation from input to output.

    An example of a custom reduction operator for differing input and output types is:

    
      struct CustomMinAndMax
      {
        using OutputType = vtkm::Pair<vtkm::Float64, vtkm::Float64>;
    
        VTKM_EXEC_CONT
        OutputType operator()(vtkm::Float64 a) const
        {
          return OutputType(a, a);
        }
    
        VTKM_EXEC_CONT
        OutputType operator()(vtkm::Float64 a, vtkm::Float64 b) const
        {
          return OutputType(vtkm::Min(a, b), vtkm::Max(a, b));
        }
    
        VTKM_EXEC_CONT
        OutputType operator()(const OutputType& a, const OutputType& b) const
        {
          return OutputType(vtkm::Min(a.first, b.first), vtkm::Max(a.second, b.second));
        }
    
        VTKM_EXEC_CONT
        OutputType operator()(vtkm::Float64 a, const OutputType& b) const
        {
          return OutputType(vtkm::Min(a, b.first), vtkm::Max(a, b.second));
        }
    
        VTKM_EXEC_CONT
        OutputType operator()(const OutputType& a, vtkm::Float64 b) const
        {
          return OutputType(vtkm::Min(a.first, b), vtkm::Max(a.second, b));
        }
      };
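
    The same pattern can be exercised in plain standard C++ (this sketch is not VTK-m API; it only illustrates why the unary overload is required): the reduction lifts raw input values into the output type with the unary operator() before combining partial results pairwise.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Standard-C++ sketch of a min/max reduction with differing input and
// output types, mirroring the CustomMinAndMax example above.
struct MinAndMax
{
  using OutputType = std::pair<double, double>;

  // Unary: lift a single input value into the output type.
  OutputType operator()(double a) const { return { a, a }; }

  // Binary: combine two partial results.
  OutputType operator()(const OutputType& a, const OutputType& b) const
  {
    return { std::min(a.first, b.first), std::max(a.second, b.second) };
  }
};

// Serial reduce over a non-empty vector; the unary overload seeds and
// widens each raw input before the binary combination.
MinAndMax::OutputType ReduceMinMax(const std::vector<double>& values)
{
  MinAndMax op;
  MinAndMax::OutputType result = op(values.front());
  for (std::size_t i = 1; i < values.size(); ++i)
  {
    result = op(result, op(values[i]));
  }
  return result;
}
```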
    
    

    Added specialized operators for ArrayPortalValueReference

    The ArrayPortalValueReference is supposed to behave just like the value it encapsulates and does so by automatically converting to the base type when necessary. However, when that base type can itself be converted to something else, you can get errors about ambiguous overloads. To avoid these, specialized versions of the operators have been added to specify which overloads should be used.

    The CUDA version of ArrayPortalValueReference has also been consolidated with the standard one. The two implementations were equivalent, and we would like changes to apply to both.

    Worklets and Filters

    vtkm::worklet::Invoker now supports worklets which require a Scatter object

    This change allows the Invoker class to support launching worklets that require a custom scatter operation. This is done by providing the scatter as the second argument when launching a worklet with the () operator.

    The following example shows a scatter being provided with a worklet launch.

    struct CheckTopology : vtkm::worklet::WorkletMapPointToCell
    {
      using ControlSignature = void(CellSetIn cellset, FieldOutCell);
      using ExecutionSignature = _2(FromIndices);
      using ScatterType = vtkm::worklet::ScatterPermutation<>;
      ...
    };
    
    
    vtkm::worklet::Invoker invoke;
    invoke( CheckTopology{}, vtkm::worklet::ScatterPermutation{}, cellset, result );

    BitFields are now a supported field input/output type for VTK-m worklets

    BitFields are:

    • Stored in memory using a contiguous buffer of bits.
    • Accessible via portals, a la ArrayHandle.
    • Portals operate on individual bits or words.
    • Operations may be atomic for safe use from concurrent kernels.

    The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle containing the indices of all set bits, in no particular order.
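
    A serial standard-C++ sketch of what BitFieldToUnorderedSet computes (the real algorithm runs data-parallel on a device and makes no ordering guarantee; this illustrative version happens to produce the indices in order):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Given a contiguous buffer of bits, stored here as 32-bit words,
// return the indices of all set bits.
std::vector<std::size_t> SetBitIndices(const std::vector<std::uint32_t>& words)
{
  std::vector<std::size_t> indices;
  for (std::size_t w = 0; w < words.size(); ++w)
  {
    for (std::size_t b = 0; b < 32; ++b)
    {
      if (words[w] & (std::uint32_t(1) << b))
      {
        indices.push_back(w * 32 + b);
      }
    }
  }
  return indices;
}
```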

    The new AtomicInterface classes provide an abstraction into bitwise atomic operations across control and execution environments and are used to implement the BitPortals.

    BitFields may be used as boolean-typed ArrayHandles using the ArrayHandleBitField adapter. vtkm::cont::ArrayHandleBitField uses atomic operations to read and write bits in the BitField, and is safe to use in concurrent code.

    For example, a simple worklet that merges two arrays based on a boolean condition:

    class ConditionalMergeWorklet : public vtkm::worklet::WorkletMapField
    {
    public:
    using ControlSignature = void(FieldIn cond,
                                  FieldIn trueVals,
                                  FieldIn falseVals,
                                  FieldOut result);
    using ExecutionSignature = _4(_1, _2, _3);
    
    template <typename T>
    VTKM_EXEC T operator()(bool cond, const T& trueVal, const T& falseVal) const
    {
      return cond ? trueVal : falseVal;
    }
    
    };
    
    BitField bits = ...;
    auto condArray = vtkm::cont::make_ArrayHandleBitField(bits);
    auto trueArray = vtkm::cont::make_ArrayHandleCounting<vtkm::Id>(20, 2, NUM_BITS);
    auto falseArray = vtkm::cont::make_ArrayHandleCounting<vtkm::Id>(13, 2, NUM_BITS);
    vtkm::cont::ArrayHandle<vtkm::Id> output;
    
    vtkm::worklet::Invoker invoke( vtkm::cont::DeviceAdapterTagTBB{} );
    invoke(ConditionalMergeWorklet{}, condArray, trueArray, falseArray, output);
    

    Added a Point Merging worklet

    We have added vtkm::worklet::PointMerge which uses a virtual grid approach to identify nearby points. The worklet works by creating a very fine but sparsely represented locator grid. It then groups points by grid bins and finds those within a specified radius.
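
    The virtual-grid idea can be sketched in standard C++ (this is not the PointMerge implementation; the names are invented for illustration): hash each point to a bin whose width matches the merge radius and store only the occupied bins, so the very fine grid stays sparse and candidate merges are limited to points in the same or adjacent bins.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Integer bin coordinates used as a sparse key: only occupied bins end
// up in the map, even though the conceptual grid is very fine.
struct BinKey
{
  std::int64_t X, Y, Z;
  bool operator<(const BinKey& other) const
  {
    return std::tie(X, Y, Z) < std::tie(other.X, other.Y, other.Z);
  }
};

// Group point indices by the bin containing each point. A subsequent
// pass would compare points within a bin (and its neighbors) against
// the merge radius.
std::map<BinKey, std::vector<std::size_t>>
BinPoints(const std::vector<std::array<double, 3>>& points, double radius)
{
  std::map<BinKey, std::vector<std::size_t>> bins;
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    BinKey key{ static_cast<std::int64_t>(std::floor(points[i][0] / radius)),
                static_cast<std::int64_t>(std::floor(points[i][1] / radius)),
                static_cast<std::int64_t>(std::floor(points[i][2] / radius)) };
    bins[key].push_back(i);
  }
  return bins;
}
```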

    vtkm::filter::CleanGrid now can do point merging

    The CleanGrid filter has been extended to use vtkm::worklet::PointMerge to allow for point merging. The following flags have been added to CleanGrid to modify the behavior of point merging.

    • Set/GetMergePoints - a flag to turn on/off the merging of duplicated coincident points. This extra operation will find points spatially located near each other and merge them together.
    • Set/GetTolerance - Defines the tolerance used when determining whether two points are considered coincident. If the ToleranceIsAbsolute flag is false (the default), then this tolerance is scaled by the diagonal of the bounds of the points. This parameter is only used when merge points is on.
    • Set/GetToleranceIsAbsolute - When ToleranceIsAbsolute is false (the default) then the tolerance is scaled by the diagonal of the bounds of the dataset. If true, then the tolerance is taken as the actual distance to use. This parameter is only used when merge points is on.
    • Set/GetFastMerge - When FastMerge is true (the default), some corners are cut when computing coincident points. The point merge will go faster but the tolerance will not be strictly followed.

    Added connected component worklets and filters

    We have added the vtkm::filter::ImageConnectivity and vtkm::filter::CellSetConnectivity filters, along with the corresponding worklets, to identify connected components in DataSets. ImageConnectivity identifies connected components in a vtkm::cont::CellSetStructured based on neighboring cells having the same field value. CellSetConnectivity identifies connected components based on cell connectivity.

    Currently the Moore neighborhood (i.e. 8 neighboring pixels for 2D and 26 neighboring voxels for 3D) is used for ImageConnectivity. For CellSetConnectivity, the neighborhood is defined as cells sharing a common edge.
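
    The core idea behind CellSetConnectivity can be sketched with a minimal serial union-find (the actual worklet performs this data-parallel; this sketch is only illustrative): union every pair of cells that share an edge, and cells that end up with the same root form one connected component.

```cpp
#include <numeric>
#include <vector>

// Minimal union-find over cell ids with path halving.
struct UnionFind
{
  std::vector<int> Parent;

  explicit UnionFind(int n) : Parent(n)
  {
    // Every cell starts as its own component.
    std::iota(this->Parent.begin(), this->Parent.end(), 0);
  }

  int Find(int i)
  {
    while (this->Parent[i] != i)
    {
      // Path halving keeps the trees shallow.
      i = this->Parent[i] = this->Parent[this->Parent[i]];
    }
    return i;
  }

  // Called once per pair of cells sharing a common edge.
  void Union(int a, int b) { this->Parent[this->Find(a)] = this->Find(b); }
};
```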

    Build

    CMake 3.8+ now required to build VTK-m

    Historically VTK-m has offered the ability to build a small subset of device adapters with CMake 3.3. As both of our primary consumers have moved to CMake 3.8, and HPC machines continue to provide newer CMake versions, we have decided to simplify our CMake build system by requiring CMake 3.8 everywhere.

    VTK-m now can verify that it installs itself correctly

    It was fairly common for VTK-m to have a broken install tree, as there was no easy way to verify that all headers would be installed.

    Now VTK-m offers a testing infrastructure that creates a temporary installed version and compiles tests that build against the installed VTK-m version. Currently we have tests that verify that each header listed in VTK-m is installed, that users can compile a custom vtkm::filter that uses diy, and that users can call vtkm::rendering.

    VTK-m now requires CUDA separable compilation to build

    With the introduction of vtkm::cont::ArrayHandleVirtual and the related infrastructure, VTK-m now requires that all CUDA code be compiled using separable compilation (-rdc).

    VTK-m provides a vtkm_filter CMake target

    VTK-m now provides a vtkm_filter target that contains pre-built components of filters for consuming projects.

    vtkm::cont::CellLocatorBoundingIntervalHierarchy is compiled into vtkm_cont

    All of the methods in CellLocatorBoundingIntervalHierarchy were implemented in header files. This is sometimes problematic with virtual methods. Since everything implemented in it can just be embedded in a library, the code has been moved into the vtkm_cont library.

    These changes caused some warnings in clang to show up based on virtual methods in other cell locators. Hence, the rest of the cell locators have also had some of their code moved to vtkm_cont.

    Other

    LodePNG added as a thirdparty package

    The lodepng library was brought in as a third-party library. This has allowed the VTK-m rendering library to have robust PNG decode functionality.

    Optionparser added as a thirdparty package

    Previously we just took the optionparser.h file and stuck it right in our source code. That was problematic for a variety of reasons.

    • It incorrectly assigned our license to external code.
    • It made lots of unnecessary changes to the original source (like reformatting).
    • It made it near impossible to track patches we make and updates to the original software.

    Now we use the third-party system to track changes to optionparser.h in the https://gitlab.kitware.com/third-party/optionparser repository.

    Thirdparty diy now can coexist with external diy

    Previously VTK-m would leak macros that would cause an external diy to be incorrectly mangled breaking consumers of VTK-m that used diy.

    Going forward, to use diy from VTK-m, all calls must use the vtkmdiy namespace instead of the diy namespace. This allows VTK-m to properly forward calls to either the external or internal version, as appropriate.

    Merge benchmark executables into a device dependent shared library

    VTK-m has been updated to replace old per device benchmark executables with a single multi-device executable. Selection of the device adapter is done at runtime through the --device= argument.

    Merge rendering testing executables to a shared library

    VTK-m has been updated to replace old per device rendering testing executables with a single multi-device executable. Selection of the device adapter is done at runtime through the --device= argument.

    Merge worklet testing executables into a device dependent shared library

    VTK-m has been updated to replace old per device worklet testing executables with a single multi-device executable. Selection of the device adapter is done at runtime through the --device= argument.

    VTK-m runtime device detection properly handles busy CUDA devices

    When an application that uses VTK-m is first launched, it does a check to see if CUDA is supported at runtime. Previously, if for some reason the CUDA card was not allowing kernel execution, VTK-m would report that the hardware doesn't have CUDA support.

    This was problematic, as it was overly aggressive in disabling CUDA support for hardware that could support kernel execution in the future. Given that every VTK-m worklet is executed through a TryExecute, it is no longer necessary to be so aggressive in disabling CUDA support.

    Now the behavior is that VTK-m considers a machine to have CUDA runtime support if it has one or more GPUs of Kepler or newer hardware (SM_30+).

  • v1.3.0   Release VTK-m 1.3.0 1.3.0 is our fourth official release of VTK-m. The major changes to VTK-m from 1.2.0 can be found in: docs/changelog/1.3/release-notes.md
    a8da749e · Release VTK-m 1.3.0 ·

    VTK-m 1.3 Release Notes

    Table of Contents

    1. Core
      • OpenMP Device Adapter
      • Make all worklets dispatches invoke using a TryExecute
      • Remove TryExecute from each filter
      • DeviceAdapterTags are usable for runtime device selection
      • New geometry classes
      • Logging support via loguru.
    2. ArrayHandle
      • ArrayHandle can now take ownership of a user allocated memory location
      • StorageBasic now can provide or be provided a delete function
      • ArrayHandleTransform works with ExecObject
      • Add ArrayHandleView fancy array
      • ArrayHandleCompositeVector simplified and made writable.
      • ArrayHandleExtractComponent target component is now set at runtime
      • ArrayHandleSwizzle component maps are now set at runtime
    3. Control Environment
      • Interfaces for VTK-m spatial search structures added
      • vtkm::cont::Algorithm now can be told which device to use at runtime
      • Support ExecArg behavior in vtkm::cont::Algorithm methods
      • vtkm::cont::TryExecuteOnDevice allows for runtime selection of which device to execute on
      • vtkm::cont::CellSetExplicit now caches CellToPoint table when used with Invoke
      • vtkm::cont::RuntimeDeviceInformation doesn't require a device template
      • vtkm::cont::VirtualObjectHandle can transfer to a device using runtime DeviceAdapterId value
      • Add new vtkm::exec and vtkm::cont execution objects
      • Added a ReleaseResources API to CellSet and its derived classes
      • Added a ReleaseResourcesExecution API for Field to unload execution resources
      • Added a ReleaseResourcesExecution API for CoordinateSystem to unload execution resources
      • Use the strong typed enums for vtkm::cont::Field
      • vtkm::cont::DeviceAdapterId has become a real constexpr type and not an alias to vtkm::UInt8
    4. Execution Environment
      • User defined execution objects now usable with runtime selection of device adapter
      • Dot function name changed
      • Added float version operations for vtkm::Math Pi()
      • vtkm::Math Pi functions are now constexpr
      • CellDerivativeFor3DCell has a better version for Vec of Vec fields.
      • Add a CUDA-safe vtkm::Swap method
    5. Worklets and Filters
      • Worklets are now asynchronous on CUDA
      • Worklets now execute on CUDA using grid stride loops
      • Scatter class moved to dispatcher
      • VTK-m now supports dispatcher parameters being pointers
      • Histogram filter supports custom types
      • Cell measure functions, worklet, and filter
      • Add a WaveletGenerator worklet (e.g. vtkRTAnalyticSource)
      • Add a filter to support Lagrangian analysis capabilities
      • Add a warp vector worklet and filter
      • Add a warp scalar worklet and filter
      • Add a split sharp edge filter
      • Time-varying "oscillator" filter and example
    6. Build
      • Build System Redesign and new minimum CMake
      • Add none option to VTKm_CUDA_Architecture
      • Add a new cmake option: VTKm_ENABLE_DEVELOPER_FLAGS
      • Add a new cmake option: VTKm_INSTALL_ONLY_LIBRARIES
    7. Other
      • Control CUDA managed memory with an environment variable
      • Add support for deferred freeing of CUDA memory
      • Allow variable arguments to VTKM_TEST_ASSERT
      • Support constexpr and variadic constructor for Vec
      • vtkm::Vec< vtkm::Vec<T> > can't be constructed from vtkm::Vec<U>
      • Use std::call_once to construct singletons
      • Use thread_local in vtkm::cont::GetGlobalRuntimeDeviceTracker function if possible
      • Replace std::random_shuffle with std::shuffle

    Core

    OpenMP Device Adapter

    A device adapter that leverages OpenMP 4.0 for threading is now available. The new adapter is enabled using the CMake option VTKm_ENABLE_OPENMP and its performance is comparable to the TBB device adapter.

    Performance comparisons of OpenMP against the TBB and Serial device adapters can be found at: #223

    Make all worklets dispatches invoke using a TryExecute

    Rather than force all dispatchers to be templated on a device adapter, instead use a TryExecute internally within the invoke to select a device adapter.

    This change touches quite a bit of code. The first pass of the change usually does the minimum amount of work, which is to change the compile-time specification of the device to a run-time call to SetDevice on the dispatcher. Although functionally equivalent, this can result in a TryExecute being called from within another TryExecute.

    Remove TryExecute from each filter

    The recent change to dispatchers has embedded a TryExecute internally within the Invoke function of all dispatchers. This means that it is no longer necessary to specify a device when invoking a worklet.

    Previously, this TryExecute was in the filter layer. The filter superclasses would do a TryExecute and use that to pass a device to subclasses in methods like DoExecute and DoMapField. Since the dispatcher no longer needs a device, this TryExecute is now redundant, so it has been removed.

    Because of this, the device argument to DoExecute and DoMapField has been removed. This will cause current implementations of filter to change, but it usually simplifies code. That said, there might be some code that needs to be wrapped into a vtkm::cont::ExecObjectBase.

    No changes need to be made to code that uses filters.

    DeviceAdapterTags are usable for runtime device selection

    VTK-m DeviceAdapterTags are now both a compile-time representation of which device to use and the runtime representation of that device. Previously the runtime representation was handled by vtkm::cont::DeviceAdapterId. This was done by making DeviceAdapterTags constexpr types that inherit from the constexpr vtkm::cont::DeviceAdapterId type.

    At a ten-thousand-foot level, this change means that in general, instead of using vtkm::cont::DeviceAdapterTraits<DeviceTag>, you can simply use DeviceTag (or an instance of it: DeviceTag runtimeDeviceId;).

    Previously if you wanted to get the runtime representation of a device you would do the following:

    template<typename DeviceTag>
    vtkm::cont::DeviceAdapterId getDeviceId()
    {
      using Traits = vtkm::cont::DeviceAdapterTraits<DeviceTag>;
      return Traits::GetId();
    }
    ...
    vtkm::cont::DeviceAdapterId runtimeId = getDeviceId<DeviceTag>();

    Now with the updates you could do the following.

    vtkm::cont::DeviceAdapterId runtimeId = DeviceTag();

    More importantly, this conversion is unnecessary, as you can pass instances of DeviceAdapterTags into methods or functions that want a vtkm::cont::DeviceAdapterId, since they are that type!

    Previously if you wanted to see if a DeviceAdapter was enabled you would do the following:

    using Traits = vtkm::cont::DeviceAdapterTraits<DeviceTag>;
    constexpr auto isValid = std::integral_constant<bool, Traits::Valid>();

    Now you would do:

    constexpr auto isValid = std::integral_constant<bool, DeviceTag::IsEnabled>();
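
    The mechanics behind this can be pictured with a simplified standard-C++ sketch (hypothetical types, not the actual VTK-m definitions): the tag is a constexpr type that inherits from the runtime id type, so a tag instance can be passed anywhere the id is expected.

```cpp
#include <cstdint>

// Simplified stand-in for the runtime device id.
struct DeviceAdapterId
{
  constexpr explicit DeviceAdapterId(std::int8_t value) : Value(value) {}
  constexpr std::int8_t GetValue() const { return this->Value; }

private:
  std::int8_t Value;
};

// The compile-time tag inherits from the id, carrying both the
// compile-time information (IsEnabled) and its runtime identity.
struct DeviceAdapterTagSerial : DeviceAdapterId
{
  static constexpr bool IsEnabled = true;
  constexpr DeviceAdapterTagSerial() : DeviceAdapterId(1) {}
};

// A function taking the runtime id happily accepts a tag instance.
constexpr std::int8_t GetDeviceValue(DeviceAdapterId id) { return id.GetValue(); }
```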

    So why did VTK-m make these changes?

    That is a good question, and the answer is twofold. The VTK-m project is working on better support for ArrayHandles that leverage runtime polymorphism (aka virtuals), and on the ability to construct vtkm::worklet::Dispatchers without specifying the explicit device they should run on. Both of these designs push more of the VTK-m logic to operate at runtime rather than compile time. These changes are designed to allow for consistent object usage between runtime and compile time instead of having to convert between compile-time and runtime types.

    New geometry classes

    There are now some additional structures available in both the control and execution environments for representing geometric entities (mostly of dimensions 2 and 3). These new structures are in vtkm/Geometry.h and demonstrated/tested in vtkm/testing/TestingGeometry.h:

    • Ray<CoordType, Dimension, IsTwoSided>. Instances of this struct represent a semi-infinite line segment in a 2-D plane or in a 3-D space, depending on the integer dimension specified as a template parameter. Its state is the point at the start of the ray (Origin) plus the ray's Direction, a unit-length vector. If the third template parameter (IsTwoSided) is true, then the ray serves as an infinite line. Otherwise, the ray will only report intersections in its positive halfspace.
    • LineSegment<CoordType, Dimension>. Instances of this struct represent a finite line segment in a 2-D plane or in a 3-D space, depending on the integer dimension specified as a template parameter. Its state is the coordinates of its Endpoints.
    • Plane<CoordType>. Instances of this struct represent a plane in 3-D. Its state is the coordinates of a base point (Origin) and a unit-length normal vector (Normal).
    • Sphere<CoordType, Dimension>. Instances of this struct represent a d-dimensional sphere. Its state is the coordinates of its center plus a radius. It is also aliased with a using statement to Circle<CoordType> for the specific case of 2-D.

    These structures provide useful queries and generally interact with one another. For instance, it is possible to intersect lines and planes and compute distances.

    For ease of use, there are also several using statements that alias these geometric structures to names that specialize them for a particular dimension or other template parameter. As an example, Ray<CoordType, Dimension, true> is aliased to Line<CoordType, Dimension> and Ray<CoordType, 3, true> is aliased to Line3<CoordType> and Ray<FloatDefault, 3, true> is aliased to Line3d.

    Design patterns

    If you plan to add a new geometric entity type, please adopt these conventions:

    • Each geometric entity may be default-constructed. The default constructor will initialize the state to some valid unit-length entity, usually with some part of its state at the origin of the coordinate system.
    • Entities may always be constructed by passing in values for their internal state. Alternate construction methods are declared as free functions such as make_CircleFrom3Points().
    • Use template metaprogramming to make methods available only when the template dimension gives them semantic meaning. For example, a 2-D line segment's perpendicular bisector is another line segment, but a 3-D line segment's perpendicular line segment is a plane. Note how this is accomplished and apply this pattern to new geometric entities or new methods on existing entities.
    • Some entities may have invalid state. If this is possible, the entity will have an IsValid() method. For example, a sphere may be invalid because the user or some construction technique specified a zero or negative radius.
    • When signed distance is semantically meaningful, provide it in favor of or in addition to unsigned distance.
    • Accept a tolerance parameter when appropriate, but provide a sensible default value. You may want to perform exact arithmetic versions of tests, but please provide fast, tolerance-based versions as well.

    Logging support via loguru.

    The loguru project has been integrated with VTK-m to provide runtime logging facilities. A sample of the log output can be found at https://gitlab.kitware.com/snippets/427.

    Logging is enabled by setting the CMake variable VTKm_ENABLE_LOGGING. When this flag is enabled, any messages logged to the Info, Warn, Error, and Fatal levels are printed to stderr by default.

    Additional logging features are enabled by calling vtkm::cont::InitLogging in an executable. This will:

    • Set human-readable names for the log levels in the output.
    • Allow the stderr logging level to be set at runtime by passing a '-v [level]' argument to the executable.
    • Name the main thread.
    • Print a preamble with details of the program's startup (args, etc).
    • Install signal handlers to automatically print stacktraces and error contexts (linux only) on crashes.

    The main logging entry points are the macros VTKM_LOG_S and VTKM_LOG_F, which use C++ stream and printf syntax, respectively. Other variants exist, including conditional logging and special-purpose logs for writing specific events, such as DynamicObject cast results and TryExecute failures.

    The logging backend supports the concept of "Scopes". By creating a new scope with the macros VTKM_LOG_SCOPE or VTKM_LOG_SCOPE_FUNCTION, a new "logging scope" is opened within the C++ scope the macro is called from. New messages will be indented in the log until the scope ends, at which point a message is logged with the elapsed time that the scope was active. Scopes may be nested to arbitrary depths.

    The logging implementation is thread-safe. When working in a multithreaded environment, each thread may be assigned a human-readable name using vtkm::cont::SetThreadName. This will appear in the log output so that per-thread messages can be easily tracked.

    By default, only Info, Warn, Error, and Fatal messages are printed to stderr. This can be changed at runtime by passing the '-v' flag to an executable that calls vtkm::cont::InitLogging. Alternatively, the application can explicitly call vtkm::cont::SetStderrLogLevel to change the verbosity. When specifying a verbosity, all log levels with enum values less-than-or-equal-to the requested level are printed. vtkm::cont::LogLevel::Off (or "-v Off") may be used to silence the log completely.

    The helper functions vtkm::cont::GetHumanReadableSize and vtkm::cont::GetSizeString assist in formatting byte sizes in a more readable format. Similarly, the vtkm::cont::TypeName template functions provide RTTI-based type-name information. When logging is enabled, these use the logging backend to demangle symbol names on supported platforms.

    The more verbose VTK-m log levels are:

    • Perf: Logs performance information, using the scopes feature to track execution time of filters, worklets, and device algorithms with microsecond resolution.
    • MemCont / MemExec: These levels log memory allocations in the control and execution environments, respectively.
    • MemTransfer: This level logs memory transfers between the control and execution environments.
    • Cast: Logs details of dynamic object resolution.

    The log may be shared and extended by applications that use VTK-m. There are two log level ranges left available for applications: User and UserVerbose. The User levels may be enabled without showing any of the verbose VTK-m levels, while UserVerbose levels will also enable all VTK-m levels.

    ArrayHandle

    ArrayHandle can now take ownership of a user allocated memory location

    Previously memory that was allocated outside of VTK-m was impossible to transfer to VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors to support a Storage object that is being moved, we can clearly express that the ArrayHandle now owns memory it didn't allocate.

    Here is an example of how this is done:

      T* buffer = new T[100];
      auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };
    
      vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
          storage(buffer, 100, user_free_function);
      vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));

    StorageBasic now can provide or be provided a delete function

    Memory that was allocated outside of VTK-m was impossible to transfer to VTK-m as we didn't know how to free it. This is now resolved by allowing the user to specify a free function to be called on release.

    Memory that was allocated by VTK-m and stolen by the user needed the proper free function. When running CUDA on hardware that supports concurrent managed access, the free function of the storage could be cudaFree.

    To properly steal memory from VTK-m you do the following:

      vtkm::cont::ArrayHandle<T> arrayHandle;
      //fill arrayHandle
    
      //you must get the free function before calling steal array
      auto free_function = arrayHandle.GetDeleteFunction();
      T* ptr = arrayHandle.StealArray();
      //use ptr
    
    
      free_function(ptr);

    ArrayHandleTransform works with ExecObject

    Previously, the ArrayHandleTransform class only worked with plain old data (POD) objects as its functors. For simple transforms, this makes sense since all the data comes from a target ArrayHandle that will be sent to the device through a different path. However, this also requires the transform to be known at compile time.

    However, there are cases where the functor cannot be a POD object and has to be built for a specific device. There are numerous reasons for this. One might be that you need some lookup tables. Another might be that you want to support a virtual object, which has to be initialized for a particular device. The standard way to implement this in VTK-m is to create an "execution object." This actually means that we create a wrapper around execution objects that inherits from vtkm::cont::ExecutionAndControlObjectBase and contains a PrepareForExecution method and a PrepareForControl method.

    As an example, consider the use case of a special ArrayHandle that takes the value in one array and returns the index of that value in another sorted array. We can do that by creating a functor that finds a value in an array and returns the index.

    template <typename ArrayPortalType>
    struct FindValueFunctor
    {
      ArrayPortalType SortedArrayPortal;

      FindValueFunctor() = default;

      VTKM_CONT FindValueFunctor(const ArrayPortalType& sortedPortal)
        : SortedArrayPortal(sortedPortal)
      { }

      VTKM_EXEC vtkm::Id operator()(const typename ArrayPortalType::ValueType& value) const
      {
        // Binary search (lower bound) for the index of value in the sorted array.
        vtkm::Id leftIndex = 0;
        vtkm::Id rightIndex = this->SortedArrayPortal.GetNumberOfValues();
        while (leftIndex < rightIndex)
        {
          vtkm::Id middleIndex = (leftIndex + rightIndex) / 2;
          auto middleValue = this->SortedArrayPortal.Get(middleIndex);
          if (middleValue < value)
          {
            leftIndex = middleIndex + 1;
          }
          else
          {
            rightIndex = middleIndex;
          }
        }
        return leftIndex;
      }
    };

    Simple enough, except that the type of ArrayPortalType depends on what device the functor runs on (not to mention its memory might need to be moved to different hardware). We can now solve this problem by creating an execution object that sets this up for a device. ArrayHandles also need to be able to provide portals that run in the control environment, and for that we need a special version of the functor for the control environment.

    template <typename ArrayHandleType>
    struct FindValueExecutionObject : vtkm::cont::ExecutionAndControlObjectBase
    {
      VTKM_IS_ARRAY_HANDLE(ArrayHandleType);
    
      ArrayHandleType SortedArray;
    
      FindValueExecutionObject() = default;
    
      VTKM_CONT FindValueExecutionObject(const ArrayHandleType& sortedArray)
        : SortedArray(sortedArray)
      { }
    
      template <typename Device>
      VTKM_CONT
      FindValueFunctor<decltype(std::declval<ArrayHandleType>().PrepareForInput(Device()))>
      PrepareForExecution(Device device)
      {
        using FunctorType =
          FindValueFunctor<decltype(std::declval<ArrayHandleType>().PrepareForInput(Device()))>;

        return FunctorType(this->SortedArray.PrepareForInput(device));
      }

      VTKM_CONT
      FindValueFunctor<typename ArrayHandleType::PortalConstControl>
      PrepareForControl()
      {
        using FunctorType =
          FindValueFunctor<typename ArrayHandleType::PortalConstControl>;

        return FunctorType(this->SortedArray.GetPortalConstControl());
      }
    };

    Now you can use this execution object in an ArrayHandleTransform. It will automatically be detected as an execution object and be converted to a functor in the execution environment.

    auto transformArray =
      vtkm::cont::make_ArrayHandleTransform(
        inputArray, FindValueExecutionObject<decltype(sortedArray)>(sortedArray));

    Add ArrayHandleView fancy array

    Added a new class named ArrayHandleView that allows you to get a subset of an array. You use the ArrayHandleView by giving it a target array, a starting index, and a length. Here is a simple example of usage:

    vtkm::cont::ArrayHandle<vtkm::Id> sourceArray;
    
    vtkm::cont::ArrayCopy(vtkm::cont::ArrayHandleIndex(10), sourceArray);
    // sourceArray has [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
    vtkm::cont::ArrayHandleView<vtkm::cont::ArrayHandle<vtkm::Id>>
      viewArray(sourceArray, 3, 5);
    // viewArray has [3, 4, 5, 6, 7]

    There is also a convenience make_ArrayHandleView function to create view arrays. The following makes the same view array as before.

    auto viewArray = vtkm::cont::make_ArrayHandleView(sourceArray, 3, 5);

    ArrayHandleCompositeVector simplified and made writable.

    ArrayHandleCompositeVector is now easier to use, as its type has a more straightforward definition: ArrayHandleCompositeVector<Array1, Array2, ...>. Previously, a helper metaprogramming struct was needed to determine the type of the array handle.

    In addition, the new implementation supports both reading and writing, whereas the original version was read-only.

    Another notable change is that the ArrayHandleCompositeVector no longer supports component extraction from the source arrays. While the previous version could take a source array with a vtkm::Vec ValueType and use only a single component in the output, the new version requires that all input arrays have the same ValueType, which becomes the ComponentType of the output vtkm::Vec.

    When component extraction is needed, the classes ArrayHandleSwizzle and ArrayHandleExtractComponent have been introduced to allow the previous use cases to continue working efficiently.

    ArrayHandleExtractComponent target component is now set at runtime

    Rather than embedding the extracted component in a template parameter, the extract operation is now defined at runtime.

    This is easier to use and keeps compile times / sizes / memory requirements down.

    ArrayHandleSwizzle component maps are now set at runtime

    Rather than embedding the component map in the template parameters, the swizzle operation is now defined at runtime using a vtkm::Vec<vtkm::IdComponent, N> that maps the input components to the output components.

    This is easier to use and keeps compile times / sizes / memory requirements down.
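    The idea behind a runtime component map can be sketched in plain C++ (the names below are illustrative, not VTK-m's actual implementation): the map is ordinary data consulted on each access, so no new template instantiation is needed for each distinct swizzle.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical read-through "swizzle" view whose component map is a
// runtime value rather than a template parameter. VTK-m's
// ArrayHandleSwizzle follows the same idea using a
// vtkm::Vec<vtkm::IdComponent, N> component map.
template <typename T, std::size_t NIn, std::size_t NOut>
class SwizzleView
{
public:
  SwizzleView(const std::vector<std::array<T, NIn>>& source,
              const std::array<std::size_t, NOut>& componentMap)
    : Source(source)
    , ComponentMap(componentMap)
  {
  }

  // Each output tuple picks its components from the input tuple
  // according to the runtime component map.
  std::array<T, NOut> Get(std::size_t index) const
  {
    std::array<T, NOut> result;
    for (std::size_t c = 0; c < NOut; ++c)
    {
      result[c] = this->Source[index][this->ComponentMap[c]];
    }
    return result;
  }

private:
  const std::vector<std::array<T, NIn>>& Source;
  std::array<std::size_t, NOut> ComponentMap;
};
```

    Because the map is runtime data, two differently swizzled views of the same array share one compiled type, which is the compile-time/size benefit described above.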

    Control Environment

    Interfaces for VTK-m spatial search structures added

    The objective of this feature was to add a common interface to the VTK-m spatial search structures to make them easier to use. VTK-m now distinguishes locators into two types: cell locators and point locators. Cell locators can be used to query for the cell containing a point, and point locators can be used to search for other points that are close to the given point.

    All cell locators are now required to inherit from the interface vtkm::cont::CellLocator, and all point locators are required to inherit from the interface vtkm::cont::PointLocator.

    These interfaces describe the features required from either a cell locator or a point locator and provide an easy way to use them in the execution environment.

    Deriving new search structures from these locator interfaces also makes it easier for users to build the underlying structures, abstracting away complicated details. After providing all the required data from a vtkm::cont::DataSet object, the user only needs to call the Update method on the vtkm::cont::CellLocator or vtkm::cont::PointLocator object.

    For example, building the cell locator that uses a Bounding Interval Hierarchy tree as a search structure, provided by the class vtkm::cont::BoundingIntervalHierarchy (which inherits from vtkm::cont::CellLocator), only requires a few steps:

      // Build a bounding interval hierarchy with 5 splitting planes,
      // and a maximum of 10 cells in the leaf node.
      vtkm::cont::BoundingIntervalHierarchy locator(5, 10);
      // Provide the cell set required by the search structure.
      locator.SetCellSet(cellSet);
      // Provide the coordinate system required by the search structure.
      locator.SetCoordinates(coords);
      // Call the Update method to finish building the underlying tree.
      locator.Update();

    Similarly, users can easily build available point locators as well.

    When using an object of vtkm::cont::CellLocator, or vtkm::cont::PointLocator in the execution environment, they need to be passed to the worklet as an ExecObject argument. In the execution environment, users will receive a pointer to an object of type vtkm::exec::CellLocator, or vtkm::exec::PointLocator respectively. vtkm::exec::CellLocator provides a method FindCell to use in the execution environment to query the containing cell of a point. vtkm::exec::PointLocator provides a method FindNearestNeighbor to query for the nearest point.

    As of now, VTK-m provides only one implementation for each of the given interfaces. vtkm::cont::BoundingIntervalHierarchy which is an implementation of vtkm::cont::CellLocator, and vtkm::cont::PointLocatorUniformGrid, which is an implementation of vtkm::cont::PointLocator.

    vtkm::cont::Algorithm now can be told which device to use at runtime

    The vtkm::cont::Algorithm has been extended to support the user specifying which device to use at runtime. Previously, Algorithm would only use the first enabled device, requiring users to modify the vtkm::cont::GlobalRuntimeDeviceTracker if they wanted a specific device used.

    To select a specific device with vtkm::cont::Algorithm, pass the vtkm::cont::DeviceAdapterId as the first parameter.

    vtkm::cont::ArrayHandle<double> values;
    
    //call with no tag, will run on first enabled device
    auto result = vtkm::cont::Algorithm::Reduce(values, 0.0);
    
    //call with an explicit device tag, will only run on serial
    vtkm::cont::DeviceAdapterTagSerial serial;
    result = vtkm::cont::Algorithm::Reduce(serial, values, 0.0);
    
    //call with a runtime device id, will only run on serial
    vtkm::cont::DeviceAdapterId device = serial;
    result = vtkm::cont::Algorithm::Reduce(device, values, 0.0);
    

    Support ExecArg behavior in vtkm::cont::Algorithm methods

    vtkm::cont::Algorithm is a wrapper around DeviceAdapterAlgorithm that internally uses TryExecutes to select an appropriate device. The intention is that you can run parallel algorithms (outside of worklets) without having to specify a particular device.

    Most of the arguments given to device adapter algorithms are actually control-side arguments that get converted to execution objects internally (usually a vtkm::cont::ArrayHandle). However, some algorithms take an argument that is passed directly to the execution environment, such as the predicate argument of Sort. If the argument is a plain-old-data (POD) type, which is common enough, then you can just pass the object straight through. However, if the object has any special elements that have to be transferred to the execution environment, such as internal arrays, passing it to the vtkm::cont::Algorithm functions becomes problematic.

    To cover this use case, all the vtkm::cont::Algorithm functions now support automatically transferring objects that support the ExecObject worklet convention. If any argument to any of the vtkm::cont::Algorithm functions inherits from vtkm::cont::ExecutionObjectBase, then the PrepareForExecution method is called with the device the algorithm is running on, which allows these device-specific objects to be used without the hassle of creating a TryExecute.

    vtkm::cont::TryExecuteOnDevice allows for runtime selection of which device to execute on

    VTK-m now offers vtkm::cont::TryExecuteOnDevice to allow for the user to select which device to execute a function on at runtime. The original vtkm::cont::TryExecute used the first valid device, which meant users had to modify the runtime state through the RuntimeTracker which was verbose and unwieldy.

    Here is an example of how you can execute a function on the device that an array handle was last executed on:

    
    struct ArrayCopyFunctor
    {
      template <typename Device, typename InArray, typename OutArray>
      VTKM_CONT bool operator()(Device, const InArray& src, OutArray& dest)
      {
        vtkm::cont::DeviceAdapterAlgorithm<Device>::Copy(src, dest);
        return true;
      }
    };
    
    template<typename T, typename InStorage, typename OutStorage>
    void SmartCopy(const vtkm::cont::ArrayHandle<T, InStorage>& src,
                   vtkm::cont::ArrayHandle<T, OutStorage>& dest)
    {
      vtkm::cont::DeviceAdapterId devId = src.GetDeviceAdapterId();
      bool success = vtkm::cont::TryExecuteOnDevice(devId, ArrayCopyFunctor(), src, dest);
      if (!success)
      {
        vtkm::cont::TryExecute(ArrayCopyFunctor(), src, dest);
      }
    }

    vtkm::cont::CellSetExplicit now caches CellToPoint table when used with Invoke

    Issue #268 highlighted an issue where the expensive CellToPoint table update was not properly cached when a CellSetExplicit was used with a filter. This has been corrected by ensuring that the metadata associated with the table survives shallow copying of the CellSet.

    New methods are also added to check whether the CellToPoint table exists, and also to reset it if needed (e.g. for benchmarking):

    vtkm::cont::CellSetExplicit<> cellSet = ...;
    // Check if the CellToPoint table has already been computed:
    if (cellSet.HasConnectivity(vtkm::TopologyElementTagCell{},
                                vtkm::TopologyElementTagPoint{}))
    {
      // Reset it:
      cellSet.ResetConnectivity(vtkm::TopologyElementTagCell{},
                                vtkm::TopologyElementTagPoint{});
    }

    vtkm::cont::RuntimeDeviceInformation doesn't require a device template

    By making the RuntimeDeviceInformation class template independent, VTK-m is able to detect device information at runtime for a runtime-specified deviceId. Previously this was impossible because the CRTP pattern does not allow function overloading (the compiler would complain that DeviceAdapterRuntimeDetector does not have an Exists() function defined).

    vtkm::cont::VirtualObjectHandle can transfer to a device using runtime DeviceAdapterId value

    Previously, VirtualObjectHandle required the caller to know a compile-time device adapter tag to transfer data. This was problematic since in parts of VTK-m you would only have the runtime vtkm::cont::DeviceAdapterId value of the desired device. To then transfer the VirtualObjectHandle you would have to call FindDeviceAdapterTagAndCall. All this extra work was unneeded, as VirtualObjectHandle internally was immediately converting the compile-time type to a runtime value.

    Here is an example of how you can now transfer a VirtualObjectHandle to a device using a runtime value:

    
    template<typename BaseType>
    const BaseType* moveToDevice(VirtualObjectHandle<BaseType>& handle,
                                 vtkm::cont::DeviceAdapterId deviceId)
    {
      return handle.PrepareForExecution(deviceId);
    }

    Add new vtkm::exec and vtkm::cont execution objects

    Recent changes have execution objects behave as factories that create an object specific to a particular device. Sometimes, you also need to be able to get an object that behaves properly in the control environment. For these cases, a subclass of vtkm::cont::ExecutionObjectBase was created.

    This subclass is called vtkm::cont::ExecutionAndControlObjectBase. In addition to the PrepareForExecution method required by its superclass, these objects also need to provide a PrepareForControl method to get an equivalent object that works in the control environment.

    See the changelog for ArrayHandleTransform works with ExecObject for an example of using a vtkm::cont::ExecutionAndControlObjectBase.

    Added a ReleaseResources API to CellSet and its derived classes

    We now offer the ability to unload execution memory from vtkm::cont::CellSet and its derived classes (CellSetExplicit, CellSetPermutation, and CellSetStructured) using the ReleaseResourcesExecution method.

    Added a ReleaseResourcesExecution API for Field to unload execution resources

    We now offer the ability to unload execution memory from vtkm::cont::Field using the ReleaseResourcesExecution method.

    Added a ReleaseResourcesExecution API for CoordinateSystem to unload execution resources

    We now offer the ability to unload execution memory from vtkm::cont::ArrayHandleVirtualCoordinates and vtkm::cont::CoordinateSystem using the ReleaseResourcesExecution method.

    Use the strong typed enums for vtkm::cont::Field

    With scoped enums, the compiler will not implicitly convert these enum values to ints, which could cause unexpected behavior.
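    A small standalone illustration of the difference (the enum names here are hypothetical, not VTK-m's actual Field enums):

```cpp
#include <type_traits>

// A plain enum converts implicitly to int, so the compiler silently
// accepts accidental mixing with integers or unrelated enumerations.
enum UnscopedAssociation
{
  ANY,
  POINTS,
  CELL_SET
};

// A scoped enum (enum class) does not convert implicitly, so such
// mixing becomes a compile error unless an explicit cast is written.
enum class ScopedAssociation
{
  Any,
  Points,
  CellSet
};

// Compiles silently with the unscoped enum:
int unscopedToInt = POINTS; // implicit conversion, value 1

// The scoped enum requires an explicit cast:
int scopedToInt = static_cast<int>(ScopedAssociation::Points);

static_assert(!std::is_convertible<ScopedAssociation, int>::value,
              "enum class values do not implicitly convert to int");
```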

    vtkm::cont::DeviceAdapterId is now a real constexpr type and not an alias to vtkm::UInt8

    As part of the ability to support vtkm::cont::TryExecuteOnDevice, VTK-m has made DeviceAdapterId a real constexpr type instead of an alias to vtkm::UInt8.

    The benefits of a real type are as follows:

    • Easier to add functionality like range verification, which previously had to be located in each user of DeviceAdapterId

    • Inability to have ambiguous arguments. Previously it wasn't perfectly clear what a method parameter of vtkm::UInt8 represented. Was it actually the DeviceAdapterId or something else?

    • Ability to add subclasses that represent things such as Undefined, Error, or Any.

    The implementation of DeviceAdapterId is:

    struct DeviceAdapterId
    {
      constexpr explicit DeviceAdapterId(vtkm::Int8 id)
        : Value(id)
      {
      }
    
      constexpr bool operator==(DeviceAdapterId other) const { return this->Value == other.Value; }
      constexpr bool operator!=(DeviceAdapterId other) const { return this->Value != other.Value; }
      constexpr bool operator<(DeviceAdapterId other) const { return this->Value < other.Value; }
    
      constexpr bool IsValueValid() const
      {
        return this->Value > 0 && this->Value < VTKM_MAX_DEVICE_ADAPTER_ID;
      }
    
      constexpr vtkm::Int8 GetValue() const { return this->Value; }
    
    private:
      vtkm::Int8 Value;
    };

    Execution Environment

    User defined execution objects now usable with runtime selection of device adapter

    Changed how execution objects are created and passed from the control environment to the execution environment. We now fill out a class with a PrepareForExecution() method that creates the execution object for the execution environment. This way the class that extends vtkm::cont::ExecutionObjectBase does not have to be templated on the device.

    Example of new execution object:

    template <typename Device>
    struct ExecutionObject
    {
      vtkm::Int32 Number;
    };
    
    struct TestExecutionObject : public vtkm::cont::ExecutionObjectBase
    {
      vtkm::Int32 Number;
    
      template <typename Device>
      VTKM_CONT ExecutionObject<Device> PrepareForExecution(Device) const
      {
        ExecutionObject<Device> object;
        object.Number = this->Number;
        return object;
      }
    };

    Dot function name changed

    The free function vtkm::dot() has been renamed to vtkm::Dot() to be consistent with other vtk-m function names. Aliases are provided for backwards compatibility but will be removed in the next release.

    Added float version operations for vtkm::Math Pi()

    vtkm::Pi<T> now supports float and double as T.

    vtkm::Math Pi functions are now constexpr

    Pi-related functions are now evaluated at compile time as constexpr functions.
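    The benefit can be sketched with plain constexpr functions (the names below are illustrative, not VTK-m's actual implementation): because the function is constexpr, its result can be used wherever a compile-time constant is required, such as in a static_assert or an array bound.

```cpp
// Sketch of compile-time Pi functions in the style of vtkm::Pi<T>().
template <typename T = double>
constexpr T Pi()
{
  return static_cast<T>(3.14159265358979323846);
}

template <typename T = double>
constexpr T TwoPi()
{
  // Derived constants compose at compile time as well.
  return static_cast<T>(2) * Pi<T>();
}

// Because the functions are constexpr, the compiler can evaluate them
// in constant expressions:
static_assert(Pi<float>() > 3.14f && Pi<float>() < 3.15f,
              "evaluated at compile time");
```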

    CellDerivativeFor3DCell has a better version for Vec of Vec fields.

    Previously we would compute a 3x3 matrix where each element was a Vec. Using the Jacobian of a single component is sufficient instead of computing it for each component. This approach saves anywhere from 2 to 3 times the memory space.

    Add a CUDA-safe vtkm::Swap method

    Added a swap implementation that is safe to call from all backends.

    It is not legal to call std functions from CUDA code, and the new vtkm::Swap implements a naive swap when compiled under NVCC while falling back to a std/ADL swap otherwise.
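    A minimal sketch of such a swap, assuming the usual NVCC-defined __CUDA_ARCH__ macro distinguishes device-code compilation (illustrative, not the exact vtkm::Swap source):

```cpp
#include <utility>

// Device-safe swap: a naive hand-written swap when compiling device
// code under NVCC, deferring to std/ADL swap on the host path.
template <typename T>
inline void SafeSwap(T& a, T& b)
{
#ifdef __CUDA_ARCH__
  // Device path: std::swap may not be callable from device code,
  // so swap through a temporary by hand.
  T temp = a;
  a = b;
  b = temp;
#else
  // Host path: let argument-dependent lookup pick up user-provided
  // swap overloads, falling back to std::swap.
  using std::swap;
  swap(a, b);
#endif
}
```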

    Worklets and Filters

    Worklets are now asynchronous on CUDA

    Worklets are now fully asynchronous in the CUDA backend. This means that worklet errors are reported asynchronously. Existing errors are checked for before invocation of a new worklet and at explicit synchronization points like DeviceAdapterAlgorithm<>::Synchronize().

    An important effect of this change is that functions that are synchronization points, like ArrayHandle::GetPortalControl() and ArrayHandle::GetPortalConstControl(), may now throw exception for errors from previously executed worklets.

    Worklet invocations, synchronization, and error reporting happen independently on different threads. Therefore, synchronization on one thread does not affect any other threads.

    Worklets now execute on CUDA using grid stride loops

    Previously VTK-m Worklets used what is referred to as a monolithic kernel pattern for worklet execution. This assumes a single large grid of threads to process an entire array in a single pass. This resulted in launches that looked like:

    template<typename F>
    void TaskSingular(F f, vtkm::Id end)
    {
      const vtkm::Id index = static_cast<vtkm::Id>(blockDim.x * blockIdx.x + threadIdx.x);
      if (index < end)
      {
        f(index);
      }
    }
    
    Schedule1DIndexKernel<TaskSingular><<<totalBlocks, 128, 0, CUDAStreamPerThread>>>(
           functor, numInstances);

    This was problematic as it had the drawbacks of:

    • Not being able to reuse any infrastructure between kernel executions.
    • Making it harder to tune performance for the current hardware.

    The solution was to move to a grid-stride loop strategy with a grid size based on the number of SMs on the executing GPU. The result looks like:

    template<typename F>
    void TaskStrided(F f, vtkm::Id end)
    {
      const vtkm::Id start = blockIdx.x * blockDim.x + threadIdx.x;
      const vtkm::Id inc = blockDim.x * gridDim.x;
      for (vtkm::Id index = start; index < end; index += inc)
      {
        f(index);
      }
    }
    Schedule1DIndexKernel<TaskStrided><<<32*numSMs, 128, 0, CUDAStreamPerThread>>>(
           functor, numInstances);

    With a loop stride equal to the grid size, we maintain the optimal memory coalescing patterns we had with the monolithic version. These changes also allow VTK-m to optimize TaskStrided so that it can reuse infrastructure between iterations.
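    The coverage property of the grid-stride pattern can be checked with a host-side simulation, where nested loops stand in for the CUDA thread grid (an illustrative sketch, not VTK-m code): every (block, thread) pair starts at its global index and strides by the total thread count, so a fixed-size launch visits each element of an arbitrarily long array exactly once.

```cpp
#include <vector>

// Host-side simulation of TaskStrided: the outer two loops play the
// role of blockIdx.x and threadIdx.x, and each simulated thread runs
// the same strided loop the CUDA kernel would.
template <typename F>
void SimulateTaskStrided(F f, int end, int gridDim, int blockDim)
{
  for (int blockIdx = 0; blockIdx < gridDim; ++blockIdx)
  {
    for (int threadIdx = 0; threadIdx < blockDim; ++threadIdx)
    {
      const int start = blockIdx * blockDim + threadIdx;
      const int inc = blockDim * gridDim; // total number of threads
      for (int index = start; index < end; index += inc)
      {
        f(index);
      }
    }
  }
}
```

    With 4 blocks of 8 threads (32 "threads" total) over 100 elements, each element is processed exactly once even though the launch size is fixed.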

    Scatter class moved to dispatcher

    Scatter classes are special objects associated with a worklet that adjust the standard 1:1 mapping of input to output in the worklet execution to some other mapping, such as multiple outputs for a single input or skipping over input values. A classic use case is the Marching Cubes algorithm, where different cube cases produce different numbers of output polygons. A scatter object allows you to specify, for each output polygon, which source cube it comes from.

    Scatter objects have been in VTK-m for some time now (since before the 1.0 release). The way they used to work is that the worklet completely managed the scatter object. It would declare the ScatterType, keep a copy as part of its state, and provide a GetScatter method so that the dispatcher could use it for scheduling.

    The problem with this approach is that it put control-environment-specific state into the worklet. The scatter object would be pushed into the execution environment (like a CUDA device) like the rest of the worklet where it could not be used. It also meant that worklets that defined their own scatter had to declare a bunch more code to manage the scatter.

    This behavior has been changed so that the dispatcher object manages the scatter object. The worklet still declares the type of scatter by declaring a ScatterType (defaulting to ScatterUniform for 1:1 mapping), but its responsibility ends there. When the dispatcher is constructed, it must be given a scatter object that matches the ScatterType of the associated worklet. (If ScatterType has a default constructor, then one can be created automatically.) A worklet may declare a static MakeScatter method for convenience, but this is not necessary.

    As an example, a worklet may declare a custom scatter like this.

      class Generate : public vtkm::worklet::WorkletMapField
      {
      public:
        typedef void ControlSignature(FieldIn<Vec3> inPoints,
                                      FieldOut<Vec3> outPoints);
        typedef void ExecutionSignature(_1, _2);
        using InputDomain = _1;
    
        using ScatterType = vtkm::worklet::ScatterCounting;
    
        template<typename CountArrayType, typename DeviceAdapterTag>
        VTKM_CONT
        static ScatterType MakeScatter(const CountArrayType &countArray,
                                       DeviceAdapterTag)
        {
          VTKM_IS_ARRAY_HANDLE(CountArrayType);
          return ScatterType(countArray, DeviceAdapterTag());
        }

    Note that the ScatterCounting needs to be created with the appropriate indexing arrays to make the scatter behave as the worklet expects, so the worklet provides a helpful MakeScatter method to make it more clear how to construct the scatter.

    This worklet can be invoked as follows.

        auto generateScatter =
            ClipPoints::Generate::MakeScatter(countArray, DeviceAdapterTag());
        vtkm::worklet::DispatcherMapField<ClipPoints::Generate, DeviceAdapterTag>
            dispatcherGenerate(generateScatter);
        dispatcherGenerate.Invoke(pointArray, clippedPointsArray);

    Because the ScatterCounting class does not have a default constructor, you would get a compiler error if you failed to provide one to the dispatcher's constructor. The compiler error will probably not be too helpful to the user, but there is a detailed comment in the dispatcher's code where the compiler error occurs describing the issue.

    VTK-m now supports dispatcher parameters being pointers

    Previously it was only possible to pass values to a dispatcher when you wanted to invoke a VTK-m worklet. This caused problems when designing new types that used inheritance, as the types couldn't be passed as their base type to the dispatcher. To fix this issue we now support invoking worklets with pointers, as seen below.

      vtkm::cont::ArrayHandle<T> input;
      //fill input
    
      vtkm::cont::ArrayHandle<T> output;
      vtkm::worklet::DispatcherMapField<WorkletType> dispatcher;
    
      dispatcher.Invoke(&input, output);
      dispatcher.Invoke(input, &output);
      dispatcher.Invoke(&input, &output);

    Histogram filter supports custom types

    By passing TypeList and StorageList types into FieldRangeGlobalCompute, upstream users (such as VTK) can pass custom types into the histogram filter.

    Cell measure functions, worklet, and filter

    VTK-m now provides free functions, a worklet, and a filter for computing the integral measure of a cell (i.e., its arc length, area, or volume).

    The free functions are located in vtkm/exec/CellMeasure.h and share the same signature:

      template<typename OutType, typename PointCoordVecType>
      OutType CellMeasure(
        const vtkm::IdComponent& numPts,
        const PointCoordVecType& pts,
        CellShapeTag,
        const vtkm::exec::FunctorBase& worklet);

    The number-of-points argument is provided for cell types such as lines, which allow an arbitrary number of points per cell. See the worklet for examples of their use.

    The worklet is named vtkm::worklet::CellMeasure and takes a template parameter that is a tag list of measures to include. Cells that are not selected by the tag list return a measure of 0. Some convenient tag lists are predefined for you:

    • vtkm::ArcLength will only compute the measure of cells with a 1-dimensional parameter-space.
    • vtkm::Area will only compute the measure of cells with a 2-dimensional parameter-space.
    • vtkm::Volume will only compute the measure of cells with a 3-dimensional parameter-space.
    • vtkm::AllMeasures will compute all of the above.

    The filter version, named vtkm::filter::CellMeasures (plural, since it produces a cell-centered array of measures), takes the same template parameter and tag lists as the worklet. By default, the output array of measure values is named "measure", but the filter accepts other names via the SetCellMeasureName() method.

    The only cell type that is not supported is the polygon; you must triangulate polygons before running this filter. See the unit tests for examples of how to use the worklet and filter.

    The cell measures are all signed: negative measures indicate that the cell is inverted. Simplicial cells (points, lines, triangles, tetrahedra) cannot be inverted by definition and thus always return values greater than or equal to 0.0. Negative values indicate either that the order in which vertices appear in the connectivity array is improper or that the relative locations of the vertices in world coordinates result in a cell with a negative Jacobian somewhere in its interior. Finally, note that cell measures may return invalid (NaN) or infinite (Inf, -Inf) values if the cell is poorly defined, e.g., has coincident vertices or a parametric dimension larger than the space spanned by its world-coordinate vertices.
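    The sign convention can be illustrated with the simplest orientation-sensitive measure, a tetrahedron's signed volume from the scalar triple product (a generic sketch, not VTK-m's verdict-based code): swapping two vertices in the connectivity order flips the sign, which is how an improper vertex ordering shows up as a negative value.

```cpp
// Minimal 3D point type for the sketch.
struct Vec3d
{
  double x, y, z;
};

// Signed volume of tetrahedron (p0, p1, p2, p3):
//   volume = ((p1-p0) x (p2-p0)) . (p3-p0) / 6
// Positive when the vertex ordering follows the convention, negative
// when the ordering is inverted.
double SignedTetraVolume(Vec3d p0, Vec3d p1, Vec3d p2, Vec3d p3)
{
  const Vec3d a{ p1.x - p0.x, p1.y - p0.y, p1.z - p0.z };
  const Vec3d b{ p2.x - p0.x, p2.y - p0.y, p2.z - p0.z };
  const Vec3d c{ p3.x - p0.x, p3.y - p0.y, p3.z - p0.z };
  const double det = a.x * (b.y * c.z - b.z * c.y)
                   - a.y * (b.x * c.z - b.z * c.x)
                   + a.z * (b.x * c.y - b.y * c.x);
  return det / 6.0;
}
```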

    The verdict mesh quality library was used as the source of the methods for approximating the cell measures.

    Add a WaveletGenerator worklet (e.g. vtkRTAnalyticSource)

    Add a VTK-m implementation of VTK's vtkRTAnalyticSource, or "Wavelet" source as it is known in ParaView. This is a customizable dataset with properties that make it useful for testing and benchmarking various algorithms.

    Add a filter to support Lagrangian analysis capabilities

    Lagrangian analysis operates in two phases: phase one involves the extraction of flow field information; phase two involves calculating new particle trajectories using the saved information.

    The Lagrangian filter can be used to extract flow field information from a time-varying vector field. The extracted information is in the form of particle trajectories.

    The filter is first configured with a step size, the interval at which information should be saved (the write frequency), and the number of seeds to be placed in the domain (specified as a reduction factor along each axis of the original dimensions). The step size should be equivalent to the time between vector field data inputs. The write frequency corresponds to the number of cycles between saves.

    Filter execution is called for each cycle of the simulation data. Each filter execution call requires a velocity field to advect particles forward.

    The extracted particle trajectories, referred to as basis flows, exist in the domain for the specified interval (the write frequency). Particles are then reset along a uniform grid and new particle trajectories are calculated.

    An example of using the Lagrangian filter is at vtk-m/examples/lagrangian. The basis flows are saved into a folder named output, which needs to be created in the directory from which the program is executed.

    The basis flows can be interpolated using barycentric coordinate interpolation or a form of linear interpolation to calculate new particle trajectories post hoc.

    An example of using basis flows generated by the Lagrangian filter is at vtk-m/examples/posthocinterpolation. The folder contains a script that specifies the parameters that need to be provided to use the example.

    Add a warp vector worklet and filter

    This commit adds a worklet as well as a filter that modify point coordinates by moving points along a vector field, scaled by a constant factor. It's a simplified version of the vtkWarpVector class in VTK. The filter doesn't modify the input point coordinates; instead it creates new point coordinates that have been warped. This is useful for showing flow profiles or mechanical deformation.

    Add a warp scalar worklet and filter

    This commit adds a worklet as well as a filter that modify point coordinates by moving points along point normals by the scalar amount times the scale factor. It's a simplified version of the vtkWarpScalar class in VTK. Additionally, the filter doesn't modify the input point coordinates; instead it creates new point coordinates that have been warped.

    Add a split sharp edge filter

    This filter splits sharp manifold edges where the feature angle between the adjacent surfaces is larger than the threshold value. When an edge is split, it adds a new point to the coordinates and updates the connectivity of an adjacent surface. For example, given two adjacent triangles (0,1,2) and (2,1,3) where edge (1,2) needs to be split, two new points 4 (a duplicate of point 1) and 5 (a duplicate of point 2) would be added, and the latter triangle's connectivity would be changed to (5,4,3). By default, all fields of the old points are copied to the new points. Use with caution.
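    The sharp-edge test itself reduces to comparing the angle between the normals of the two adjacent faces against the feature-angle threshold. A hypothetical standalone sketch of that computation (not the filter's actual code):

```cpp
#include <cmath>

struct V3
{
  double x, y, z;
};

static V3 Sub(V3 a, V3 b)
{
  return { a.x - b.x, a.y - b.y, a.z - b.z };
}

static V3 Cross(V3 a, V3 b)
{
  return { a.y * b.z - a.z * b.y,
           a.z * b.x - a.x * b.z,
           a.x * b.y - a.y * b.x };
}

static double Dot(V3 a, V3 b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

static double Norm(V3 a)
{
  return std::sqrt(Dot(a, a));
}

// Normal of triangle (a, b, c) from the cross product of its edges.
static V3 TriangleNormal(V3 a, V3 b, V3 c)
{
  return Cross(Sub(b, a), Sub(c, a));
}

// Angle in degrees between the normals of two adjacent faces. An edge
// is considered "sharp" when this exceeds the feature-angle threshold.
double FeatureAngleDegrees(V3 n0, V3 n1)
{
  double cosAngle = Dot(n0, n1) / (Norm(n0) * Norm(n1));
  // Clamp to guard against round-off outside [-1, 1].
  if (cosAngle > 1.0) { cosAngle = 1.0; }
  if (cosAngle < -1.0) { cosAngle = -1.0; }
  return std::acos(cosAngle) * 180.0 / 3.14159265358979323846;
}
```

    Coplanar neighbors give an angle of 0 and are never split; perpendicular faces give 90 degrees.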

    Time-varying "oscillator" filter and example

    The oscillator is a simple analytical source of time-varying data. It provides a function value at each point of a uniform grid that is computed as a sum of Gaussian kernels — each with a specified position, amplitude, frequency, and phase.

    The example (in examples/oscillator) generates volumetric Cinema datasets that can be viewed in a web browser with ArcticViewer.
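    A minimal sketch of the kind of field the oscillator evaluates (the names and the exact time modulation here are illustrative; see examples/oscillator for the real source): the value at a point is a sum of Gaussian kernels, each modulated in time by its own frequency and phase.

```cpp
#include <cmath>

// One time-modulated Gaussian kernel (hypothetical parameterization).
struct Kernel
{
  double cx, cy, cz;  // center position
  double amplitude;
  double width;       // Gaussian standard deviation
  double frequency;
  double phase;
};

// Field value at (x, y, z) and time t: the sum of all kernels, each a
// spatial Gaussian scaled by a sinusoidal time factor.
double Evaluate(double x, double y, double z, double t,
                const Kernel* kernels, int numKernels)
{
  double sum = 0.0;
  for (int i = 0; i < numKernels; ++i)
  {
    const Kernel& k = kernels[i];
    const double dx = x - k.cx;
    const double dy = y - k.cy;
    const double dz = z - k.cz;
    const double dist2 = dx * dx + dy * dy + dz * dz;
    const double gauss = std::exp(-dist2 / (2.0 * k.width * k.width));
    sum += k.amplitude * gauss * std::sin(k.frequency * t + k.phase);
  }
  return sum;
}
```

    Evaluating such a function at every point of a uniform grid for a sequence of time values yields the time-varying volumetric data the example writes out.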

    Build

    Build System Redesign and new minimum CMake

VTK-m's CMake build system was redesigned to be more declarative for consumers. This was done by moving away from the previous component-based design to explicit targets. Additionally, VTK-m now uses the native CUDA support introduced in CMake 3.8 and has the following minimum CMake versions:

    • Visual Studio Generator requires CMake 3.11+
    • CUDA support requires CMake 3.9+
    • OpenMP support requires CMake 3.9+
    • Otherwise CMake 3.3+ is supported

When VTK-m is found via find_package, it defines the following targets:

    • vtkm_cont

      • contains all common core functionality
      • always exists
    • vtkm_rendering

      • contains all the rendering code
      • exists only when rendering is enabled
      • rendering also provides a vtkm_find_gl function
        • allows you to find the GL (EGL,MESA,Hardware), GLUT, and GLEW versions that VTK-m was built with.

VTK-m also provides targets that represent the device adapters it was built to support. The pattern for these targets is vtkm::<device>. Currently we don't provide a target for the serial device.

    • vtkm::tbb

  • Target that contains TBB-related link information; implicitly linked to by vtkm_cont if TBB was enabled
    • vtkm::openmp

  • Target that contains OpenMP-related link information; implicitly linked to by vtkm_cont if OpenMP was enabled
    • vtkm::CUDA

  • Target that contains CUDA-related link information; implicitly linked to by vtkm_cont if CUDA was enabled

    VTK-m can be built with specific CPU architecture vectorization/optimization flags. Consumers of the project can find these flags by looking at the vtkm_vectorization_flags target.

A project that wants to build an executable that uses VTK-m would then look like:

    
    cmake_minimum_required(VERSION 3.3 FATAL_ERROR)
    project(HelloWorld CXX)
    
    #Find the VTK-m package.
    #Will automatically enable the CUDA language if needed (and bump the CMake minimum).
    
    find_package(VTKm REQUIRED)
    
    add_executable(HelloWorld HelloWorld.cxx)
    target_link_libraries(HelloWorld PRIVATE vtkm_cont)
    
    if(TARGET vtkm::CUDA)
      set_source_files_properties(HelloWorld.cxx PROPERTIES LANGUAGE CUDA)
    endif()
    

    Add none option to VTKm_CUDA_Architecture

    A new VTKm_CUDA_Architecture option called none has been added. This will disable all VTK-m generated CUDA architecture flags, allowing the user to specify their own custom flags.

This is useful when VTK-m is used as a library in another project and that project wants to use its own architecture flags.
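For example, a consuming project might do something along these lines (a sketch; the -gencode values are arbitrary examples, not recommendations):

```cmake
# Ask VTK-m not to add any CUDA architecture flags of its own.
set(VTKm_CUDA_Architecture "none" CACHE STRING "" FORCE)

# Then supply custom architecture flags directly (example values only).
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -gencode arch=compute_60,code=sm_60")
```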

    Add a new cmake option: VTKm_ENABLE_DEVELOPER_FLAGS

The new CMake option VTKm_ENABLE_DEVELOPER_FLAGS can be used to enable/disable warnings in VTK-m. It is useful to disable VTK-m's warning flags when VTK-m is directly embedded in a project as a subproject (via add_subdirectory) and the warnings are too strict for that project. This does not apply when using an installed version of VTK-m.

    For example, this flag is disabled in VTK.

    This flag is enabled by default.

    Add a new cmake option: VTKm_INSTALL_ONLY_LIBRARIES

The new CMake option VTKm_INSTALL_ONLY_LIBRARIES, when enabled, causes VTK-m to install only its libraries. This is useful for projects that produce an application and don't want to ship headers or CMake infrastructure.

    For example, this flag is enabled by ParaView for releases.

    This flag is disabled by default.

    Other

    Control CUDA managed memory with an environment variable

By setting the environment variable "VTKM_MANAGEDMEMO_DISABLED" to 1, users can disable CUDA managed memory even when the hardware is capable of it.

    Add support for deferred freeing of CUDA memory

A new function, void CudaAllocator::FreeDeferred(void* ptr, std::size_t numBytes), has been added that can be used to defer the freeing of CUDA memory to a later point. This is useful because cudaFree causes a global synchronization across all CUDA streams. The function internally maintains a pool of to-be-freed pointers that are freed together when a size threshold is reached. This way, a number of global synchronizations are batched together at one point.

    Allow variable arguments to VTKM_TEST_ASSERT

The VTKM_TEST_ASSERT macro is a very useful tool for performing checks in tests. However, it is rather annoying to always have to specify a message for the assert. Often the failure is self-evident from the condition (which is already printed out), and specifying a message is both repetitive and annoying.

    Also, it is often equally annoying to print out additional information in the case of an assertion failure. In that case, you have to either attach a debugger or add a printf, see the problem, and remove the printf.

This change solves both of these problems. VTKM_TEST_ASSERT now takes a condition and a variable number of message arguments. If no message arguments are given, a default message (along with the condition) is output. If multiple message arguments are given, they are appended together in the result. The messages do not have to be strings; any object that can be sent to a stream will be printed correctly. This lets you print out the values that caused the issue.

The old behavior of VTKM_TEST_ASSERT still works, so you can have a statement like

    VTKM_TEST_ASSERT(array.GetNumberOfValues() != 0, "Array is empty");

    As before, if this assertion failed, you would get the following error message.

    Array is empty (array.GetNumberOfValues() != 0)

However, in the statement above you may feel it is self-evident that array.GetNumberOfValues() == 0 means the array is empty, and that typing this message into your tests over and over is needless work. You can save yourself some work by dropping the message.

    VTKM_TEST_ASSERT(array.GetNumberOfValues() != 0);

    In this case if the assertion fails, you will get a message like this.

    Test assertion failed (array.GetNumberOfValues() != 0)

But perhaps you have the opposite problem. Perhaps you need to output more information. Let's say that you expected a particular operation to halve the length of an array. If the operation fails, it could be helpful to know how big the array actually is. You can now output that on failure by adding more message arguments.

VTKM_TEST_ASSERT(outarray.GetNumberOfValues() == inarray.GetNumberOfValues()/2,
                 "Expected array size ",
                 inarray.GetNumberOfValues()/2,
                 " but got ",
                 outarray.GetNumberOfValues());

    In this case, if the test failed, you might get an error like this.

Expected array size 5 but got 6 (outarray.GetNumberOfValues() == inarray.GetNumberOfValues()/2)

    Support constexpr and variadic constructor for Vec

    Add variadic constructors to the vtkm::Vec classes. The main advantage of this addition is that it makes it much easier to initialize Vecs of arbitrary length.

Meanwhile, Vec classes constructed with values listed in their parameters are, up to size 4, constructed as constant expressions at compile time to reduce runtime overhead. Sizes greater than 4 cannot yet be constructed at compile time via initializer lists, since C++11 constexpr does not allow for loops. Only on the Windows platform with a compiler older than Visual Studio 2017 version 15.0 are users allowed to use initializer lists to construct a Vec with size > 4.

vtkm::make_Vec always constructs the Vec at compile time if possible.

    vtkm::Vec<vtkm::Float64, 3> vec1{1.1, 2.2, 3.3};  // New better initializer since
                                                      // it does not allow type narrowing
    
    vtkm::Vec<vtkm::Float64, 3> vec2 = {1.1, 2.2, 3.3}; // Nice syntax also supported by
                                                        // initializer lists.
    
    vtkm::Vec<vtkm::Float64, 3> vec3 = vtkm::make_Vec(1.1, 2.2, 3.3); // Old style that still works.
    
vtkm::Vec<vtkm::Float64, 3> vec4(1.1, 2.2, 3.3); // Old style that still works but
                                                 // should be deprecated. Reason listed below.

    Nested initializer lists work to initialize Vec of Vecs. If the size is no more than 4, it's always constructed at compile time if possible.

    vtkm::Vec<vtkm::Vec<vtkm::Float64, 2>, 3> vec{ {1.1, 2.2}, {3.3, 4.4}, {5.5, 6.6} };
                                                               //Constructed at compile time

One drawback of the std::initializer_list implementation is that it constructs larger Vecs (size > 4) of scalars or Vecs at run time.

    vtkm::Vec<vtkm::Float64, 5> vec1{1.1, 2.2, 3.3, 4.4, 5.5}; // Constructed at run time.
    
    vtkm::Vec<vtkm::Vec<vtkm::Float64, 2>, 5> vec2{{1.1, 1.1},{2.2, 2.2},{3.3, 3.3},
                                          {4.4, 4.4}, {5.5, 5.5}}; // Constructed at run time.

The parenthesis constructor reports an error if the size is larger than 4 when used to construct a Vec of Vecs. If it is used to construct a Vec of scalars, it is fine.

    vtkm::Vec<vtkm::Vec<vtkm::Float64, 2>, 5> vec2({1.1, 1.1},{2.2, 2.2},{3.3, 3.3},
                                          {4.4, 4.4}, {5.5, 5.5}); // ERROR! This type of
                                                                   // constructor not implemented!
    
    vtkm::Vec<vtkm::Float64, 5> vec1(1.1, 2.2, 3.3, 4.4, 5.5); // Constructed at compile time.

    If a vtkm::Vec is initialized with a list of size one, then that one value is replicated for all components.

    vtkm::Vec<vtkm::Float64, 3> vec{1.1};  // vec gets [ 1.1, 1.1, 1.1 ]

    This "scalar" initialization also works for Vec of Vecs.

    vtkm::Vec<vtkm::Vec<vtkm::Float64, 2>, 3> vec1{ { 1.1, 2.2 } };
    // vec1 is [[1.1, 2.2], [1.1, 2.2], [1.1, 2.2]]
    
    vtkm::Vec<vtkm::Vec<vtkm::Float64, 2>, 3> vec2{ { 3.3}, { 4.4 }, { 5.5 } };
    // vec2 is [[3.3, 3.3], [4.4, 4.4], [5.5, 5.5]]

vtkm::make_Vec is also updated to support an arbitrary number of initial values, which are constructed at compile time.

    // Creates a vtkm::Vec<vtkm::Float64, 5>
    auto vec = vtkm::make_Vec(1.1, 2.2, 3.3, 4.4, 5.5);

    This is super convenient when dealing with variadic function arguments.

template <typename... Ts>
void ExampleVariadicFunction(const Ts&... params)
{
  auto vec = vtkm::make_Vec(params...);
  // ... use vec ...
}

    Of course, this assumes that the type of all the parameters is the same. If not, you could run into compiler trouble.

vtkm::make_Vec does not accept a std::initializer_list:

    // Creates a vtkm::Vec<vtkm::Float64, 3>
    auto vec1 = vtkm::make_Vec<3>({1.1, 2.2, 3.3}); // ERROR
    
    // Creates exactly the same thing but compiles
    auto vec1 = vtkm::make_Vec<3>(1.1, 2.2, 3.3);

A limitation of the initializer list constructor is that the compiler has no way to check the length of the list or force it to a particular length. Thus it is entirely possible to construct a Vec with the wrong number of arguments. Or, more to the point, the compiler will let you do it, but there is an assert in the constructor to catch the mistake. (Of course, asserts are not compiled in release builds.)

// This will compile, but its results are undefined when it is run.
// In debug builds, it will fail an assert.
    vtkm::Vec<vtkm::Float64, 3> vec{1.1, 1.2};

    vtkm::Vec< vtkm::Vec<T> > can't be constructed from vtkm::Vec<U>

Previously, it was possible to incorrectly initialize a Vec<Vec<float,3>> with the contents of a Vec<double,3>. An example of this is:

    using Vec3d = vtkm::Vec<double, 3>;
    using Vec3f = vtkm::Vec<float, 3>;
    using Vec3x3f = vtkm::Vec<Vec3f, 3>;
    
    Vec3d x(0.0, 1.0, 2.0);
    Vec3x3f b(x); // becomes [[0,0,0],[1,1,1],[2,2,2]]
    Vec3x3f c(x, x, x); // becomes [[0,1,2],[0,1,2],[0,1,2]]
Vec3x3f d(Vec3f(0.0f, 1.0f, 2.0f)); // becomes [[0,1,2],[0,1,2],[0,1,2]]

    So the solution we have chosen is to disallow the construction of objects such as b. This still allows the free implicit cast to go from double to float.

Use std::call_once to construct singletons

By using std::call_once from C++11, we can simplify the logic in code where multiple threads query variables that must be initialized to the same value exactly once.

    Use thread_local in vtkm::cont::GetGlobalRuntimeDeviceTracker function if possible

This reduces the cost of getting the thread runtime device tracker, and lowers the runtime overhead when a user constructs many short-lived threads that use VTK-m.

    Replace std::random_shuffle with std::shuffle

std::random_shuffle is deprecated in C++14 because it uses std::rand, which draws from a non-uniform distribution, and because its underlying algorithm is unspecified. Using std::shuffle with an explicitly seeded 64-bit engine provides a reliable result.

  • v1.2.0   VTK-m 1.2.0 is our third official release. This release is stable and should compile across many platforms. However, note that VTK-m is still under heavy development and each release contains backward-incompatible changes.
    9cb15223 · Release VTK-m 1.2.0
  • v1.1.0   VTK-m 1.1.0 is our second official release. This release is stable and should compile across many platforms. However, note that VTK-m is still under heavy development and each release contains backward-incompatible changes.
  • v1.0.0   VTK-m 1.0.0

    VTK-m 1.0.0 is our first official release. This release is stable and should compile across many platforms. However, note that VTK-m is still under heavy development. We will be following up this release soon with a version 1.1, which may contain some backward-incompatible changes.