Handling a vtkDataSet in a vtkImplicitArray backend
In short
Using vtkImplicitArray reduces memory usage, at the cost of computing the value each time one is requested.
In some use cases, this computation is done by forwarding the request to a vtkDataSet, but then we have a vtkDataArray that holds a vtkDataSet.
Example
For instance, in !11052, we try to modify CellValidator.
In pseudo-code, the backend looks like this:
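struct ValidatorBackend
{
  // Called by the implicit array each time the value at `index` is requested.
  short operator()(const vtkIdType index) const
  {
    vtkNew<vtkGenericCell> cell;
    this->DataSet->GetCell(index, cell);
    vtkCellValidator::State state = vtkCellValidator::Check(cell, this->Tolerance);
    return static_cast<short>(state);
  }
  vtkDataSet* DataSet; // raw pointer: nothing ties its lifetime to the array
  double Tolerance;
};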
Problem
This leads to several questions about the lifetime of the vtkDataSet and of the vtkImplicitArray itself.
From @spiros.tsalikis :
1. If you keep the dataset pointer raw, it's gonna be a dangling pointer if the dataset is deallocated.
2. If you use a weak pointer, then the array won't be able to return results if the dataset is deallocated.
3. If you use a smart pointer, but you have requested the dataset to be deallocated, then it won't be deallocated, which I don't think is a great idea. Also you would need to resolve the cyclic dependency between the two.
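To make the trade-off between the weak and the owning option concrete, here is a minimal sketch using only the C++ standard library. `MockDataSet`, `WeakBackend` and `OwningBackend` are hypothetical names invented for illustration, and `std::weak_ptr` / `std::shared_ptr` only approximate what VTK's own weak and smart pointers would do; this is not the code from the MR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for vtkDataSet, so the sketch compiles without VTK.
struct MockDataSet
{
  std::vector<short> CellStates;
};

// Weak reference: the dataset can be freed, but then no value can be computed.
struct WeakBackend
{
  std::weak_ptr<MockDataSet> DataSet;

  // Returns false when the dataset is gone; `value` is left untouched then.
  bool TryGet(std::size_t index, short& value) const
  {
    auto ds = this->DataSet.lock();
    if (!ds)
    {
      return false;
    }
    value = ds->CellStates[index];
    return true;
  }
};

// Owning reference: values are always available, but the dataset stays
// alive for as long as the backend does.
struct OwningBackend
{
  std::shared_ptr<MockDataSet> DataSet;

  short operator()(std::size_t index) const
  {
    return this->DataSet->CellStates[index];
  }
};
```

With the weak variant the array silently loses its values once the dataset is released; with the owning variant, releasing the dataset elsewhere does not actually free its memory.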
Suggestion
Options 1 and 2 both lead to an invalid array once the dataset is gone, so neither is acceptable on its own. We therefore need a way to properly handle this cyclic dependency.
We propose to add some logic in the backend:
- using a weak pointer, so one can still destroy the dataset (this breaks the cycle)
- observing the DeleteEvent of the dataset: on dataset deallocation, the backend allocates the full array in memory, which keeps the array valid. This only happens if the array is still held by someone else.
This solution allows us to:
- keep the memory optimization
- have no impact on the "classic" case where the data array is only owned by the dataset (through its dataset attributes)
- keep the array usable and valid when the dataset is destroyed
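The mechanism can be sketched with the standard library alone. Everything here is an assumption made for illustration: `MockDataSet` emulates the DeleteEvent by running callbacks at the start of its destructor, `CheckCell` is a placeholder for the real per-cell computation, and `MaterializingBackend` is a hypothetical name. A real implementation would rely on VTK's observer mechanism and weak pointer instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for vtkDataSet: its observers run at the start of
// destruction, which plays the role of DeleteEvent in this sketch.
struct MockDataSet
{
  std::vector<double> CellVolumes;
  std::vector<std::function<void()>> DeleteObservers;

  ~MockDataSet()
  {
    for (auto& observer : this->DeleteObservers)
    {
      observer();
    }
  }

  // Placeholder for the real per-cell check: 0 = valid, 1 = invalid.
  short CheckCell(std::size_t index) const
  {
    return this->CellVolumes[index] > 0.0 ? 0 : 1;
  }
};

class MaterializingBackend
{
  struct State
  {
    const MockDataSet* DataSet = nullptr; // non-owning, cleared on deletion
    std::vector<short> Values;            // filled only when the dataset dies
  };
  std::shared_ptr<State> Shared = std::make_shared<State>();

public:
  explicit MaterializingBackend(MockDataSet& dataSet)
  {
    this->Shared->DataSet = &dataSet;
    std::weak_ptr<State> weak = this->Shared;
    dataSet.DeleteObservers.push_back([weak]() {
      auto state = weak.lock();
      if (!state)
      {
        return; // the backend is already gone: nothing to keep valid
      }
      // Evaluate every value once, then drop the dataset reference.
      const MockDataSet& ds = *state->DataSet;
      state->Values.reserve(ds.CellVolumes.size());
      for (std::size_t i = 0; i < ds.CellVolumes.size(); ++i)
      {
        state->Values.push_back(ds.CheckCell(i));
      }
      state->DataSet = nullptr;
    });
  }

  // Computed on the fly while the dataset lives, read from memory afterwards.
  short operator()(std::size_t index) const
  {
    const State& s = *this->Shared;
    return s.DataSet ? s.DataSet->CheckCell(index) : s.Values[index];
  }

  bool IsMaterialized() const { return this->Shared->DataSet == nullptr; }
};
```

In this sketch the full array is allocated only if the backend outlives the dataset; if the backend is destroyed first, the callback finds an expired weak reference and does nothing.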
Implementation details
As we have several occurrences of this problem, we will create a dedicated vtkImplicitArray backend that handles this logic in one place. The relevant filters can then create their own backend that only adds the operator() logic, without duplicating the dataset management.
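One possible shape for that split, as a std-only sketch: a shared template owns the dataset handling, and each filter supplies only a small functor with the per-index computation. All names (`MockDataSet`, `DataSetBackend`, `ValidityCompute`, `VolumeCompute`, `OnDataSetDeleted`) are hypothetical, and the observer wiring is left out: `OnDataSetDeleted()` stands for whatever the DeleteEvent callback would invoke.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for vtkDataSet.
struct MockDataSet
{
  std::vector<double> CellVolumes;
};

// Shared part: handles the dataset reference and the fallback storage, and
// knows nothing about which quantity is being computed.
template <typename ValueT, typename ComputeT>
class DataSetBackend
{
public:
  DataSetBackend(const MockDataSet* dataSet, ComputeT compute)
    : DataSet(dataSet)
    , Compute(std::move(compute))
  {
  }

  // Entry point an implicit array would call for each value.
  ValueT operator()(std::size_t index) const
  {
    return this->DataSet ? this->Compute(*this->DataSet, index) : this->Values[index];
  }

  // Stands for the DeleteEvent callback: evaluate every value once, then
  // drop the dataset reference. Must be called before the dataset is freed.
  void OnDataSetDeleted()
  {
    if (!this->DataSet)
    {
      return;
    }
    const std::size_t count = this->DataSet->CellVolumes.size();
    this->Values.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
      this->Values.push_back(this->Compute(*this->DataSet, i));
    }
    this->DataSet = nullptr;
  }

private:
  const MockDataSet* DataSet;
  ComputeT Compute;
  std::vector<ValueT> Values;
};

// Filter-specific parts: only the per-index computation.
struct ValidityCompute
{
  short operator()(const MockDataSet& ds, std::size_t index) const
  {
    return ds.CellVolumes[index] > 0.0 ? 0 : 1; // 0 = valid, 1 = invalid
  }
};

struct VolumeCompute
{
  double operator()(const MockDataSet& ds, std::size_t index) const
  {
    return ds.CellVolumes[index];
  }
};
```

A filter would then instantiate, for example, `DataSetBackend<short, ValidityCompute>` and never touch the lifetime logic itself.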
This was discussed in the following MR: