Break reference cycle in vtk pipeline
The goal is to empower asynchronous VTK applications by removing the need for the VTK garbage collector. That GC has a severe limitation: it can collect garbage only on the main thread.
This issue documents the current design of the VTK pipeline and explains why it requires a GC. It dives right into the guts of vtkAlgorithm and vtkExecutive. I find this helps me recap the problem from time to time.
Understand the requirement for GC:
The VTK GC was required to deal with complex reference loops in the current design of the VTK pipeline and elsewhere. A producer keeps a hard reference to itself within its output information objects per port (say info_op0) via the vtkExecutive::PRODUCER key. This is the most complex issue here. A consumer hard references its producer's output information object per port in one of its input information objects. This is how a consumer keeps a strong reference to its producer.
Example
Observe the evolution of info_op0 in a simple pipeline setup with snk->SetInputConnection(0, src->GetOutputPort()):
 -------                         -------
|       |                       |       |
|  src  |-[op0]--------->[inp0]-|  snk  |
|       |                       |       |
 -------                         -------
src->GetOutputInformation(port) is called somewhere near the beginning of ::SetInputConnection.
The producer (src) adds a hard reference to itself in its output information object for that port. info_op0 now looks like
info_op0 (vtkInformation 0xabcde):
PRODUCER: <src (vtkExecutive 0x12345), 0>
CONSUMERS: []
Towards the end of ::SetInputConnection, the consumer goes on to add a hard reference to itself in the very same info_op0 object.
info_op0 (vtkInformation 0xabcde):
PRODUCER: <src (vtkExecutive 0x12345), 0>
CONSUMERS: [<snk (vtkExecutive 0x6789a), 0>]
The consumer then sets the info_op0 object as its input information object (say info_ip0) for the specified input port. There's a similar setup in ::AddInputConnection. Whatever is set or appended in either of these functions is removed in ::RemoveInputConnection.
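The bookkeeping described above can be mimicked with a small toy model. This is only a sketch in plain Python: ToyInfo, ToyExecutive and their methods are hypothetical stand-ins for vtkInformation and vtkExecutive, not VTK API, and the real implementation does considerably more.

```python
# Toy model only: none of these classes or methods exist in VTK.

class ToyInfo:
    """Stands in for one output information object (e.g. info_op0)."""
    def __init__(self, producer, port):
        self.producer = (producer, port)   # PRODUCER: strong ref to the producer
        self.consumers = []                # CONSUMERS: strong refs to consumers


class ToyExecutive:
    def __init__(self, name, n_inputs=0, n_outputs=0):
        self.name = name
        self.inputs = [[] for _ in range(n_inputs)]  # per input port: info objects
        self.outputs = [None] * n_outputs            # created on first request

    def get_output_information(self, port):
        # The producer records itself in its own output information object.
        if self.outputs[port] is None:
            self.outputs[port] = ToyInfo(self, port)
        return self.outputs[port]

    def set_input_connection(self, port, producer, producer_port):
        # Drop whatever was connected before, then record the new connection.
        for old in list(self.inputs[port]):
            self.remove_input_connection(port, old)
        info = producer.get_output_information(producer_port)
        info.consumers.append((self, port))   # consumer registers itself
        self.inputs[port].append(info)        # info_op0 doubles as info_ip0

    def remove_input_connection(self, port, info):
        # Undo both halves of set_input_connection.
        info.consumers.remove((self, port))
        self.inputs[port].remove(info)


src = ToyExecutive("src", n_outputs=1)
snk = ToyExecutive("snk", n_inputs=1)
snk.set_input_connection(0, src, 0)
info_op0 = src.get_output_information(0)
```

After the call, info_op0 holds strong references to both src and snk, while src and snk each hold a strong reference to info_op0. That is the loop the rest of this issue is about.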
Assume vtkExecutive and vtkAlgorithm did not use a GC. If one were to
- delete snk by decref'ing, it is still kept alive by src's info_op0.
- delete src by decref'ing, it is still kept alive by snk's info_ip0.
When the application exits, both src and snk keep each other alive and leak memory very nicely. With a GC, on the other hand, this strongly connected component is identified and freed.
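The same effect can be reproduced without VTK. The sketch below is an analogy only: it uses CPython's own reference counting and cycle collector, not vtkGarbageCollector, and ToyNode is a hypothetical stand-in for an executive.

```python
import gc
import weakref


class ToyNode:
    """Hypothetical stand-in for an executive; not VTK API."""
    def __init__(self, name):
        self.name = name
        self.info = {}


def build_cycle():
    src, snk = ToyNode("src"), ToyNode("snk")
    # One shared information object referencing both ends of the connection.
    info_op0 = {"PRODUCER": src, "CONSUMERS": [snk]}
    src.info["out0"] = info_op0   # src -> info_op0 -> src
    snk.info["in0"] = info_op0    # snk -> info_op0 -> snk
    # Only weak references escape, so the cycle is all that keeps them alive.
    return weakref.ref(src), weakref.ref(snk)


gc.disable()   # reference counting only, like a build without the GC
src_ref, snk_ref = build_cycle()
alive_without_gc = src_ref() is not None and snk_ref() is not None

gc.collect()   # an explicit cycle collection reclaims the whole component
alive_after_gc = src_ref() is not None or snk_ref() is not None
gc.enable()
```

With collection disabled, both objects survive even though no outside reference remains. The explicit collection then frees them together.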
Here's a real example of that leak when you build VTK with garbage collection disabled for vtkExecutive and vtkAlgorithm.
>>> from vtkmodules.vtkCommonExecutionModel import *
>>> from vtkmodules.vtkFiltersSources import *
>>> src = vtkPlaneSource()
>>> snk = vtkTrivialConsumer()
>>> src.UsesGarbageCollector()
False
>>> snk.UsesGarbageCollector()
False
>>> src.GetExecutive().UsesGarbageCollector()
False
>>> snk.GetExecutive().UsesGarbageCollector()
False
>>> snk.SetInputConnection(0, src.GetOutputPort())
>>> snk.Update()
>>> exit()
vtkDebugLeaks has detected LEAKS!
..
Class "vtkPlaneSource" has 1 instance still around.
..
Class "vtkTrivialConsumer" has 1 instance still around.
..
Secondary problem
The CONSUMERS: [] key-value pair is really unnecessary. Its only use case is a very unusual application where an algorithm modifies its number of output ports after the pipeline has been constructed. See vtkAlgorithm::SetNumberOfOutputPorts(). Here, the algorithm visits each previously connected consumer and removes itself from the consumer's inputs. I don't know of a use case where a derived algorithm would call ::SetNumberOfOutputPorts() outside the constructor. We could remove all occurrences of vtkExecutive::CONSUMERS and ideally nothing would change.
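To make that single use case concrete, here is a toy sketch of the behaviour described above. ToyInfo, ToyAlgorithm, connect and set_number_of_output_ports are hypothetical names in plain Python, and the assumption that only the removed ports are visited is mine, not something this issue states.

```python
# Toy model only: none of these classes or methods exist in VTK.

class ToyInfo:
    def __init__(self, producer, port):
        self.producer = (producer, port)
        self.consumers = []   # list of (consumer, input_port)


class ToyAlgorithm:
    def __init__(self, n_inputs=0, n_outputs=0):
        self.inputs = [[] for _ in range(n_inputs)]
        self.outputs = [ToyInfo(self, p) for p in range(n_outputs)]

    def connect(self, port, producer, producer_port):
        info = producer.outputs[producer_port]
        info.consumers.append((self, port))
        self.inputs[port].append(info)

    def set_number_of_output_ports(self, n):
        # Walk the consumer list of every port about to disappear and
        # detach this producer from each consumer's inputs.
        for info in self.outputs[n:]:
            for consumer, in_port in list(info.consumers):
                consumer.inputs[in_port].remove(info)
            info.consumers.clear()
        del self.outputs[n:]
        # Grow with fresh information objects if n exceeds the current count.
        self.outputs.extend(ToyInfo(self, p) for p in range(len(self.outputs), n))


src = ToyAlgorithm(n_outputs=2)
snk = ToyAlgorithm(n_inputs=1)
snk.connect(0, src, 1)               # snk consumes src's second output port
src.set_number_of_output_ports(1)    # that port is removed after construction
```

Without the consumer list, the producer in this sketch would have no way to find snk and clear its now-dangling input. That is the only job the list performs here.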
Possible solutions have been moved to !8975
Test scripts:
- This should not leak upon exit.
from vtkmodules.vtkCommonExecutionModel import *
from vtkmodules.vtkFiltersSources import *
src = vtkPlaneSource()
snk = vtkTrivialConsumer()
snk.SetInputConnection(0, src.GetOutputPort())
- This should not leak upon exit and must print the expected output.
from vtkmodules.vtkCommonExecutionModel import *
snk = vtkTrivialConsumer()
print(snk.GetReferenceCount())  # expected: 1
print(snk.GetExecutive().GetReferenceCount())  # expected: 2
print(snk.GetReferenceCount())  # expected: 1