
Conversation

@justincdavis

Summary

Implement the CV-CUDA backend kernel for gaussian_blur

How to use

import cvcuda
import torchvision.transforms.v2.functional as F

cv_tensor = cvcuda.Tensor((1, 224, 224, 3), cvcuda.Type.U8, cvcuda.TensorLayout.NHWC)
# dispatched to F.gaussian_blur_cvcuda
blurred = F.gaussian_blur(cv_tensor, (5, 5))
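
For context, a round trip from an existing torch CUDA tensor might look like the sketch below. The cvcuda.as_tensor wrapping step is an assumption and not part of this PR's diff (the tests build inputs with a make_image_cvcuda helper); F.cvcuda_to_tensor is the conversion helper the tests use.

import cvcuda
import torch
import torchvision.transforms.v2.functional as F

# Wrap an NHWC uint8 CUDA tensor without a copy (assumed API; torch CUDA
# tensors expose __cuda_array_interface__, which cvcuda.as_tensor accepts).
img = torch.randint(0, 256, (1, 224, 224, 3), dtype=torch.uint8, device="cuda")
cv_img = cvcuda.as_tensor(img, "NHWC")

# Dispatched to the CV-CUDA kernel added in this PR.
blurred = F.gaussian_blur(cv_img, (5, 5))

# Convert back to a torch tensor for comparison or further processing.
blurred_torch = F.cvcuda_to_tensor(blurred)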


pytorch-bot bot commented Nov 19, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9280

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 754223f with merge base aa35ca1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


meta-cla bot commented Nov 19, 2025

Hi @justincdavis!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@AntoineSimoulin (Member) left a comment

Thanks for the PR @justincdavis. Left a few comments, looking good otherwise!

actual_torch = F.cvcuda_to_tensor(actual)

if dtype.is_floating_point:
    torch.testing.assert_close(actual_torch, expected, rtol=0, atol=0.3)
Member:

Why set atol=0.3 here?

@justincdavis (Author) commented Nov 25, 2025:

Good question! I added a comment explaining atol=0.3; it most likely stems from floating-point differences between the underlying filter2d in CV-CUDA and torch.conv2d. Let me know if you want more explanation and/or something else here.
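
For context on where such differences come from: a discrete Gaussian kernel is just exp(-x²/2σ²) sampled on the window and normalized, and a full 2-D filter versus two separable 1-D passes accumulate in different orders in float32. A minimal sketch of the math (not the torchvision or CV-CUDA internals):

import torch

def gaussian_kernel1d(kernel_size: int, sigma: float) -> torch.Tensor:
    # Sample the Gaussian at integer offsets centered on the window.
    x = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
    weights = torch.exp(-0.5 * (x / sigma) ** 2)
    return weights / weights.sum()  # normalize so the weights sum to 1

k = gaussian_kernel1d(5, 1.0)
# filter2d-style: one pass with the explicit 2-D kernel below;
# conv2d-style separable: two passes with k. Mathematically equal,
# but float32 accumulation order differs, so outputs can disagree
# by a few ULPs -- hence a small atol on floating-point outputs.
k2d = torch.outer(k, k)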

Member:

Should we set it as in test_functional_image_correctness, i.e. torch.testing.assert_close(actual, expected, rtol=0, atol=1), for consistency?

@justincdavis (Author):

@AntoineSimoulin I ended up rewriting the test setup and moved all the tests into a single block. Both CV-CUDA and torchvision share the same assert statement now. LMK if you think it looks like a good change.
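
The shared tail of the test might look roughly like this (a sketch reconstructed from the snippets quoted later in this thread, not the exact diff):

# After computing `actual` for either backend:
if input_type == "cvcuda.Tensor":
    actual = F.cvcuda_to_tensor(actual)
    actual = actual.squeeze(0).to(device=device)
torch.testing.assert_close(actual, expected, rtol=0, atol=1)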

@meta-cla meta-cla bot added the cla signed label Dec 2, 2025

meta-cla bot commented Dec 2, 2025

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@zy1git (Contributor) left a comment

I’ve left some comments on this PR.
Please feel free to address them or reach out if you’d like to discuss any points further.

if dtype is torch.float16 and device == "cpu":
    pytest.skip("The CPU implementation of float16 on CPU differs from opencv")
if (dtype != torch.float32 and dtype != torch.uint8) and input_type == "cvcuda.Tensor":
    pytest.skip("CVCUDA does not support non-float32 or uint8 dtypes for gaussian blur")
Contributor:

I feel that this skip message is a bit confusing:

  • Does it mean "non-(float32 or uint8)" → neither float32 nor uint8?

  • Or "(non-float32) or uint8" → something else entirely?

I therefore recommend using "CVCUDA only supports float32 and uint8 dtypes for gaussian blur". See the sketch after this comment.
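
The condition itself also reads more clearly as a membership test; a sketch of the suggested wording (not the committed diff):

if dtype not in (torch.float32, torch.uint8) and input_type == "cvcuda.Tensor":
    pytest.skip("CVCUDA only supports float32 and uint8 dtypes for gaussian blur")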


if input_type == "cvcuda.Tensor":
    actual = F.cvcuda_to_tensor(actual)
    actual = actual.squeeze(0).to(device=device)
Contributor:

We can also use actual = actual[0].to(device=device) since the batch size is guaranteed to be 1 in this case (see the sketch below). Not sure we need to be consistent with the implementation here: https://github.com/pytorch/vision/pull/9277/changes#diff-9c2dde92db86c123fee225e39b7c1ef96e08a3e79a9dcc9a2d68b21ed51a81d0R1315
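
A quick check of the equivalence, and of the one behavioral difference between the two spellings:

import torch

x = torch.randn(1, 3, 224, 224)
assert torch.equal(x.squeeze(0), x[0])  # identical when the batch dim is exactly 1

y = torch.randn(2, 3, 224, 224)
print(y.squeeze(0).shape)  # torch.Size([2, 3, 224, 224]) -- squeeze(0) is a no-op
print(y[0].shape)          # torch.Size([3, 224, 224])    -- indexing always drops dim 0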


make_image,
make_video,
pytest.param(
    make_image_cvcuda, marks=pytest.mark.skipif(not CVCUDA_AVAILABLE, reason="CVCUDA is not available")
Contributor:

See this PR: https://github.com/pytorch/vision/pull/9305/changes

There are other parts with similar issues that also need to be addressed.
