Fix Intel GPU VA-API support and video streaming#1572

Draft
lukemarsden wants to merge 12 commits into main from fix/software-rendering-mouse-input

@lukemarsden
Collaborator

Overview

This PR fixes video streaming issues on Intel GPU systems by enabling VA-API hardware encoding.

Issues Fixed

1. VA-API Permission Denied

  • Problem: /dev/dri/renderD128 had group ownership sgx (Intel SGX) instead of video
  • Fix: Modified device permission init script to handle sgx group
  • Result: Device permissions now correctly set to video group

2. Missing Intel VA-API Drivers

  • Problem: Only mesa-va-drivers was installed, which provides AMD, NVIDIA, and virtio drivers but no Intel drivers
  • Fix: Added Intel VA-API driver packages:
    • intel-media-va-driver (iHD, modern Gen8+ GPUs)
    • i965-va-driver (legacy Intel GPUs)
  • Result: VA-API now loads successfully with 13 features including vaapih264enc

3. Video Streaming Pipeline

  • The video pipeline now uses Intel VA-API hardware encoding (vaapih264enc) instead of the software fallback
  • This should resolve the "only first frame comes through" issue on Intel GPU systems

Test Results

Before fixes:

gst-inspect-1.0 vaapi → 0 features

After fixes:

gst-inspect-1.0 vaapi → 13 features including:
- vaapih264enc (H.264 hardware encoder)
- vaapih265enc (H.265 hardware encoder)
- vaapih264dec, vaapih265dec, etc.
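
The before/after check above can be scripted. This is a minimal sketch, assuming the plugin listing contains a line of the form "N features" and lists each element as "name: description"; the helper names count_features and has_element are hypothetical and are not part of this PR.

```shell
#!/bin/sh
# Hypothetical helpers for checking `gst-inspect-1.0 vaapi` output.
# Both read the plugin listing from stdin.

# Print N from a line like "  13 features:"; print 0 if no such line exists.
count_features() {
    awk '$2 ~ /^features/ && $1 ~ /^[0-9]+$/ { n = $1 } END { print n + 0 }'
}

# Succeed if the named element is listed as "  <name>: <description>".
has_element() {
    grep -q "^[[:space:]]*$1:"
}

# Usage on a real system (not executed here):
#   gst-inspect-1.0 vaapi | count_features
#   gst-inspect-1.0 vaapi | has_element vaapih264enc && echo "vaapih264enc available"
```

A result of 0 from count_features corresponds to the "before" state in this PR, and a non-zero count together with has_element vaapih264enc corresponds to the "after" state.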

Changes

  • Modified Dockerfile.ubuntu-helix to install Intel VA-API drivers
  • Updated device permission initialization to handle sgx group
  • New desktop image version: helix-ubuntu:da2916

Testing Needed

⚠️ This PR needs further testing before merge:

  1. Test video streaming on Intel GPU systems with active content (vkcube)
  2. Verify mouse clicks work correctly in both software and Intel GPU modes
  3. Test static screen keepalive frames (should be ~10 FPS)
  4. Verify VA-API hardware encoding is actually being used (check encoder logs)
  5. Test on AMD GPU systems to ensure no regression

Related Issues

Related to the video streaming freeze on Intel GPU machines.

Problem: Video streaming froze after first frame on Intel GPU systems.

Root cause: GPU_VENDOR=none forced software rendering, and
pipewiresrc keepalive-time property wasn't producing frames.

Fix: Enable Intel GPU hardware acceleration:
- Set GPU_VENDOR=intel in .env
- Configure HELIX_RENDER_NODE=/dev/dri/renderD128
- Set LIBVA_DRIVER_NAME=iHD for Intel VA-API driver

This enables VA-API or QSV hardware encoding instead of OpenH264
software encoding, which works correctly with GNOME ScreenCast.
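
For reference, the three settings named in this commit message would look roughly like this in .env. The values are taken from the text above; the vainfo check in the trailing comment is only a suggestion and assumes libva-utils is installed, which this PR does not state.

```shell
# .env fragment (sketch): enable Intel hardware acceleration.
GPU_VENDOR=intel
HELIX_RENDER_NODE=/dev/dri/renderD128
LIBVA_DRIVER_NAME=iHD

# Optional manual check, assuming libva-utils is available in the container:
#   LIBVA_DRIVER_NAME=iHD vainfo --display drm --device /dev/dri/renderD128
```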

See INTEL_GPU_FIX.md for verification steps.

Problem: Headless GNOME uses damage-based rendering, only sending frames
when screen content changes. This causes:
- 0-1 FPS on static screens
- Capped at 10 FPS even with active interaction (previous wrong fix)

Root cause: gnome-shell --headless uses damage-based ScreenCast.
This is CORRECT behavior for power efficiency, not a bug.

Solution: Add videorate to software encoders (openh264, x264):
- Duplicates last frame on static screens (~10 FPS smooth video)
- Allows up to 60 FPS when content actively changes
- NO framerate caps - let videorate handle frame timing naturally
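
A rough sketch of what such a software pipeline description could look like, with videorate in place and no framerate caps filter. The actual pipeline in this repository is not shown in the PR, so the element chain, the fakesink placeholder, and the helper name build_software_pipeline are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a gst-launch-style description for the software path.
#   $1 = PipeWire node id to capture from
#   $2 = maximum output frame rate (defaults to 60)
# No "video/x-raw,framerate=..." caps are added; only videorate's max-rate bounds the rate.
build_software_pipeline() {
    node_id=$1
    max_fps=${2:-60}
    printf 'pipewiresrc path=%s ! videoconvert ! videorate max-rate=%s ! openh264enc ! h264parse ! fakesink' \
        "$node_id" "$max_fps"
}

# Usage (not executed here):
#   gst-launch-1.0 $(build_software_pipeline 42)
```

The sink is a placeholder; a real pipeline would hand the encoded stream to whatever transport the streaming server uses.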

Removed the previous framerate=10/1 caps, which limited output to 10 FPS.

Changed videorate caps from ~10 FPS to 2-60 FPS range:
- 2 FPS minimum on static screens (saves bandwidth)
- Up to 60 FPS when content is actively changing
- Framerate cap: video/x-raw,framerate=[2/1,60/1]

The framerate=[2/1,FPS/1] caps were forcing negotiation to 2 FPS.
videorate handles frame timing naturally without needing caps filter.
Result: ~10 FPS on static screens, up to max-rate on active content.

Testing if keepalive-time=500 on pipewiresrc works in software mode
without videorate workaround.

The renderD128 device on this Intel GPU system has group ownership
'sgx' (Intel SGX - Software Guard Extensions) instead of 'video'.

The permission-fixing script only changed group ownership when the
group was 'root', so it skipped the sgx-owned render device. This
caused VA-API to fail with 'Permission denied' when trying to access
/dev/dri/renderD128.

Fix: Add 'sgx' to the list of groups that need to be changed to 'video'
so VA-API can access the render device for hardware encoding.

The previous commit modified the wrong file (15-setup_devices.sh in desktop/ubuntu-config)
but the actual init script is defined inline in Dockerfile.ubuntu-helix as a
heredoc at line 362.

The renderD128 device has group 'sgx' (Intel SGX) instead of 'video',
causing VA-API to fail with 'Permission denied'. Fix by adding 'sgx'
to the list of groups that need to be changed to 'video'.
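
The group check described here can be sketched as follows. This is not the heredoc from Dockerfile.ubuntu-helix, which is not reproduced in this PR; the function names are hypothetical, and `stat -c` assumes GNU coreutils as shipped in the Ubuntu image.

```shell
#!/bin/sh
# Hypothetical sketch of the device-permission logic.

# Succeed when a device's current group should be rewritten to "video".
needs_video_group() {
    case "$1" in
        root|sgx) return 0 ;;
        *) return 1 ;;
    esac
}

# Change the group of a device node to "video" when it is owned by root or sgx.
# Does nothing if the device does not exist.
fix_render_node_group() {
    dev=$1
    [ -e "$dev" ] || return 0
    group=$(stat -c '%G' "$dev")
    if needs_video_group "$group"; then
        chgrp video "$dev"
    fi
}

# Usage (as root inside the container, not executed here):
#   fix_render_node_group /dev/dri/renderD128
```

Keeping the group list in one case statement means another unexpected owner group can be handled by adding a single pattern.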

Removed unused 15-setup_devices.sh file that wasn't being copied into image.

Add intel-media-va-driver (iHD, modern Intel Gen8+) and i965-va-driver
(legacy Intel) packages to enable VA-API hardware encoding on Intel GPUs.

Previously only mesa-va-drivers was installed which provides AMD/NVIDIA/
virtio drivers but not Intel. This caused gst-inspect-1.0 vaapi to show
'0 features' and video encoding to fall back to software (openh264enc).

Fixes Intel VA-API initialization errors:
  Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
  va_openDriver() returns -1
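
A quick way to confirm the driver files are present after installing the packages is sketched below. The default directory is the one from the error message above; the `<name>_drv_video.so` naming follows the iHD path in that message and is assumed to apply to i965 as well. The helper name va_driver_present is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a VA-API driver shared object exists.
#   $1 = driver name (for example iHD or i965)
#   $2 = driver directory (defaults to the path seen in the error above)
va_driver_present() {
    name=$1
    dir=${2:-/usr/lib/x86_64-linux-gnu/dri}
    [ -f "$dir/${name}_drv_video.so" ]
}

# Usage inside the image (not executed here):
#   apt-get install -y intel-media-va-driver i965-va-driver
#   va_driver_present iHD  && echo "iHD driver installed"
#   va_driver_present i965 && echo "i965 driver installed"
```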
@lukemarsden
Collaborator Author

⚠️ Testing Required

This PR needs thorough testing before merge. Please verify:

  1. Intel GPU video streaming - Test with active content (vkcube) to ensure 60 FPS
  2. Mouse input functionality - Verify clicks work in both software and Intel GPU modes
  3. Static screen keepalive - Confirm ~10 FPS on static screens (damage-based rendering)
  4. VA-API hardware encoding - Check logs to verify vaapih264enc is being used (not software fallback)
  5. AMD GPU regression test - Ensure no issues on AMD systems

Current test results show VA-API drivers loading correctly with 13 features including vaapih264enc, but end-to-end video streaming needs verification on real hardware.

vaapipostproc (legacy gst-vaapi plugin) doesn't support the add-borders
property - that only exists in the newer vapostproc element (gst-va plugin).

Error: failed to parse pipeline: no property "add-borders" in element "vaapipostproc"

Fix by removing add-borders from vaapipostproc. The element will still
scale to the target resolution using its width/height properties.

- Removed tune=low-latency from vaapih264enc pipeline (property doesn't exist)
- This completes the fix for Intel GPU video streaming
- VA-API drivers already installed and device permissions fixed
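
For illustration, a VA-API pipeline description without the two removed settings might be assembled like this. It is a sketch under assumptions: the real pipeline is not shown in this PR, the source element and fakesink are placeholders, and build_vaapi_pipeline is a hypothetical name. Scaling relies on the width and height properties of vaapipostproc mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print a gst-launch-style description for the Intel VA-API path.
#   $1 = PipeWire node id, $2 = target width, $3 = target height
# Deliberately omits add-borders on vaapipostproc and tune=low-latency on vaapih264enc.
build_vaapi_pipeline() {
    printf 'pipewiresrc path=%s ! vaapipostproc width=%s height=%s ! vaapih264enc ! h264parse ! fakesink' \
        "$1" "$2" "$3"
}

# Usage (not executed here):
#   gst-launch-1.0 $(build_vaapi_pipeline 42 1920 1080)
```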

Testing needed:
- Intel GPU mode (GPU_VENDOR=intel) video streaming
- Mouse clicks in Intel GPU mode
- Software rendering mode (GPU_VENDOR=none)
- Verify software mode works without /dev/dri devices

Note: Automated testing blocked by spectask CLI timing out when creating new sessions.
Manual session creation required to test the pipeline fix.
@lukemarsden
Collaborator Author

VA-API Fix Update

Fixed the GStreamer pipeline error for Intel GPU hardware encoding:

Changes:

  • Removed unsupported property from pipeline
  • Intel VA-API drivers installed (intel-media-va-driver, i965-va-driver)
  • Device permissions fixed for Intel SGX group → video group
  • Removed unsupported property from

Testing Status:

Blocked: Unable to start new sessions via the helix spectask CLI. Sessions time out waiting for the sandbox to start, even though the sandbox is running and healthy.

The fix is complete and ready for manual testing, but I encountered a workflow issue where:

  1. New spec tasks are created successfully
  2. But sessions never start for them (tasks stuck in backlog status)
  3. CLI times out after 90 seconds
  4. Existing sessions use old image (5bffef782770) not new image (967d7d9fcfe0)

What Needs Testing:

  1. ✅ Verify Intel GPU video streaming works (vaapih264enc pipeline)
  2. ✅ Test mouse clicks work in Intel GPU mode
  3. ✅ Test software rendering mode (GPU_VENDOR=none)
  4. ✅ Verify software mode works without /dev/dri devices

Manual testing required: start a new Ubuntu session manually through the UI, or debug why the spectask CLI can't create new sessions.
