Candidate fix for dsize=positive and flipping warps (!81) · Merge requests · computer-vision / kwimage

Jon Crall requested to merge dev/fix_warp_affine_positive_dsize into dev/0.9.22 Oct 10, 2023

@matthew.bernstein If you have some time I would appreciate a review here.

For flip/rotation augmentations I wanted to add a helper constructor to kwimage.Affine. The trick with these augmentations is that if you use a sequence of np.flip / np.rot90, there are implicit translations that keep the image in the positive quadrant (e.g. otherwise a flip over the x-axis would put the entire image in a negative part of the Cartesian plane).
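To make the implicit translation concrete, here is a minimal numpy-only sketch (it does not use kwimage, and the 4x3 canvas size is an arbitrary assumption). It negates the x coordinate of the canvas corners with and without a compensating shift by the canvas width.

```python
import numpy as np

w, h = 4, 3  # assumed canvas width and height, chosen only for illustration

# Canvas corners as homogeneous (x, y, 1) columns.
corners = np.array([[0, w, w, 0],
                    [0, 0, h, h],
                    [1, 1, 1, 1]], dtype=float)

# Pure reflection: x -> -x. The image lands at negative x.
reflect = np.array([[-1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

# Reflection plus a shift by the canvas width: x -> w - x.
# This is the translation that np.flip applies implicitly.
reflect_shifted = np.array([[-1.0, 0.0, w],
                            [0.0, 1.0, 0.0],
                            [0.0, 0.0, 1.0]])

naive = reflect @ corners          # x extent is [-w, 0]
fixed = reflect_shifted @ corners  # x extent is [0, w]
```

The shift depends on the canvas size, which is why a flip expressed as an affine matrix cannot be built without knowing that size.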

While in a vacuum the numpy methods will be faster than a full affine warp, if you are already doing an affine warp it makes a lot of sense to fuse the operations together and do a single affine transform that takes care of everything (effectively what delayed-image does). So that's the motivation: I want an affine transform to represent flips and rotations.

Towards this end in dev/0.9.22 I implemented a new classmethod constructor: kwimage.Affine.fliprot which - given the size of the canvas - will construct the affine transformation I want.
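As a rough sketch of the idea (not the actual kwimage.Affine.fliprot implementation or signature), the function below builds a single 3x3 matrix for an optional horizontal flip followed by some number of 90-degree rotations. The name `fliprot_matrix` and its parameters are hypothetical. It assumes continuous coordinates where pixel (col, row) covers the square from (col, row) to (col + 1, row + 1), and it follows the np.flip(axis=1) / np.rot90 conventions.

```python
import numpy as np


def fliprot_matrix(flip_x, rot_k, dsize):
    """
    Hypothetical helper: returns (matrix, output_dsize) for a horizontal
    flip followed by rot_k quarter turns (np.rot90 convention).
    """
    w, h = dsize
    mat = np.eye(3)
    if flip_x:
        # x -> w - x keeps the flipped canvas in the positive quadrant.
        flip = np.array([[-1.0, 0.0, w],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0]])
        mat = flip @ mat
    for _ in range(rot_k % 4):
        # One np.rot90 step: (x, y) -> (y, w - x), using the current width.
        rot = np.array([[0.0, 1.0, 0.0],
                        [-1.0, 0.0, w],
                        [0.0, 0.0, 1.0]])
        mat = rot @ mat
        # A quarter turn swaps the canvas width and height.
        w, h = h, w
    return mat, (w, h)


# Example: flip then rotate once on a 4x3 (width x height) canvas.
mat, out_dsize = fliprot_matrix(flip_x=True, rot_k=1, dsize=(4, 3))
```

Because the result is an ordinary 3x3 matrix, it can be multiplied with any other affine matrix (a scale, for instance) so that only one warp is needed.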

When it came time to implement a test for this, I wrote something that applies all variants of this new transform to an image with an annotation and visualizes the result. This is where I encountered the first issue: applying this transform requires knowing both the original canvas size (so fliprot can add the appropriate translation adjustments) and the output canvas size.

My initial pass at this just set the output canvas size to the maximum of the input w / h, which gives you something big enough, but does have extra blank space.

Then I remembered the dsize='positive' keyword argument to warp_affine we worked on a while ago. That's exactly the thing needed for this application. However, I quickly found out that that did not work. I got this:

What was happening is that when a kwimage.Boxes is flipped (i.e. the affine transform has a negative scale component), it ends up with a negative width / height, and kwimage.warp_affine dutifully provides a negative dsize, which opencv (or something else) squashes to zero.
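The effect can be reproduced with plain numpy, without kwimage.Boxes. This is only an illustration of the arithmetic, under the assumption that a box is stored as a top-left and a bottom-right corner and that the warp transforms each corner independently; the 4x3 canvas is arbitrary.

```python
import numpy as np

w, h = 4, 3  # assumed canvas size for illustration

# Horizontal flip that keeps the canvas in the positive quadrant.
flip_x = np.array([[-1.0, 0.0, w],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

# Box covering the whole canvas, as two homogeneous corner points.
top_left = np.array([0.0, 0.0, 1.0])
bottom_right = np.array([w, h, 1.0])

new_tl = flip_x @ top_left
new_br = flip_x @ bottom_right

# If the warped corners are still treated as "top-left" and "bottom-right",
# the width comes out negative because the flip swapped their x order.
naive_width = new_br[0] - new_tl[0]
naive_height = new_br[1] - new_tl[1]

# Taking the extent from min / max of the warped corners stays non-negative.
xs = np.array([new_tl[0], new_br[0]])
ys = np.array([new_tl[1], new_br[1]])
extent_width = xs.max() - xs.min()
extent_height = ys.max() - ys.min()
```

Here `naive_width` is negative while `extent_width` equals the canvas width, which is the value an output size should be derived from.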

This is where I could use a second brain: I don't think I set up kwimage.Boxes to handle the case where width/height is negative, and I've completely neglected this sort of flipping / reflection transform in my tests.

I've added an experimental function to Boxes called _ensure_nonnegative_extent, which will "fix" these boxes, but I'm nervous about applying it without thinking through all of the consequences. I suspect I'm losing some sort of orientation information if I "rectify" boxes in this way. Perhaps that's fine. In any case, having a function that can be called to "rectify" them won't hurt anything. In this MR I only use the function in a specific place to fix flips for warp affine, and it does the right thing in the case I was testing for:
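For reference, one way such a rectification could look on a plain numpy array of corner-format boxes is sketched below. This is not the code behind _ensure_nonnegative_extent; the function name `rectify_ltrb` and the (x1, y1, x2, y2) row layout are assumptions made for the example.

```python
import numpy as np


def rectify_ltrb(ltrb):
    """
    Hypothetical sketch: reorder corners so x1 <= x2 and y1 <= y2.

    ltrb is an (N, 4) array of rows (x1, y1, x2, y2).
    """
    ltrb = np.asarray(ltrb, dtype=float)
    x1, y1, x2, y2 = ltrb.T
    return np.stack([
        np.minimum(x1, x2),
        np.minimum(y1, y2),
        np.maximum(x1, x2),
        np.maximum(y1, y2),
    ], axis=1)


# A flipped box (x1 > x2) and its unflipped counterpart.
flipped = np.array([[4.0, 0.0, 0.0, 3.0]])
unflipped = np.array([[0.0, 0.0, 4.0, 3.0]])

rectified_flipped = rectify_ltrb(flipped)
rectified_unflipped = rectify_ltrb(unflipped)
```

Both inputs rectify to the same box, which is exactly the orientation information that gets discarded: after rectification there is no way to tell whether the box had been reflected.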

I'm mainly writing this so I have a record of my thought process, but also to:

- solicit other potential areas where these negative-width boxes could be impacting the library
- check whether this limited change in warp_affine when dsize='positive' has any unintended consequences I'm not testing for here.

Doctest for the above example is here: https://gitlab.kitware.com/computer-vision/kwimage/-/blob/dev/0.9.22/kwimage/transform.py#L2064

Implementation of _ensure_nonnegative_extent is here: https://gitlab.kitware.com/computer-vision/kwimage/-/blob/dev/0.9.22/kwimage/structs/boxes.py#L3357

Edited Oct 10, 2023 by Jon Crall
