vgpu apis #130

Open
FreeMasen wants to merge 4 commits into rust-nvml:main from FreeMasen:feat/some-vgpu-apis

@FreeMasen
Contributor

This PR is an attempt to fully cover the vGPU APIs defined by NVML. To achieve this, I've followed the patterns defined for Device and VgpuType to cover all functions in the unwrapped_functions.txt file that are named nvmlVgpu*.

The one exception is nvmlVgpuInstanceGetLicenseStatus, which is deprecated and not documented at this time. If you'd like me to dig through older versions of the docs, I can find the documentation for how that API works.

Please let me know if I've misunderstood any of the patterns I've tried to emulate or the API documentation. I would be happy to follow up with additional changes as needed.

Note: this is currently marked as a draft because I based these changes on #129; I will rebase once that merges or is declined.

@FreeMasen FreeMasen force-pushed the feat/some-vgpu-apis branch 4 times, most recently from 2b5367f to f3f0229, on March 5, 2026 21:03
@FreeMasen FreeMasen force-pushed the feat/some-vgpu-apis branch from f3f0229 to 970027b on March 5, 2026 21:15
Contributor Author

@FreeMasen FreeMasen left a comment

I believe this is ready for an initial review to make sure I am not doing anything that goes against the goals of this project. I need to dig a little deeper into how testing works for this project to add additional tests for these additions.

I am going to leave the Draft status in place until I can get through the tests. Any feedback would be welcome!

#[derive(Debug, Clone, Copy, Eq, PartialEq, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[repr(u32)]
pub enum VgpuLicenseState {
Contributor Author

This is a bit of an odd scenario since it is effectively a C enum in NVML, but no C enum is actually defined for it. Let me know if you'd prefer this live in src/vgpu.rs instead of here.

use wrapcenum_derive::EnumWrapper;

#[derive(Debug, Clone, Eq, PartialEq, Hash)]
pub enum VmId {
Contributor Author

This may be a misinterpretation of the *_wrappers modules. Let me know if you'd prefer this be moved to a different location.

Licensed = NVML_GRID_LICENSE_STATE_LICENSED,
}

impl From<u32> for VgpuLicenseState {
Contributor Author

Since one of the cases was NVML_GRID_LICENSE_STATE_UNKNOWN, it seemed like we wouldn't need TryFrom here, but let me know if you'd prefer to use the UnexpectedVariant error instead.
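To make the trade-off concrete, here is a minimal sketch of the two options, not the PR's actual code. The constants, the three-variant LicenseState enum, the UnexpectedVariant struct, and the strict_license_state helper are all hypothetical stand-ins, and the numeric values are illustrative rather than taken from the NVML headers.

```rust
// Hypothetical stand-ins for the NVML_GRID_LICENSE_STATE_* constants.
// The numeric values are illustrative only.
const STATE_UNKNOWN: u32 = 0;
const STATE_UNLICENSED: u32 = 1;
const STATE_LICENSED: u32 = 2;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LicenseState {
    Unknown,
    Unlicensed,
    Licensed,
}

/// Hypothetical error carrying the raw value that matched no variant.
#[derive(Debug, PartialEq, Eq)]
pub struct UnexpectedVariant(pub u32);

// Option 1: infallible conversion. Any unrecognized value collapses
// into `Unknown`, so the caller never sees an error.
impl From<u32> for LicenseState {
    fn from(raw: u32) -> Self {
        match raw {
            STATE_UNLICENSED => LicenseState::Unlicensed,
            STATE_LICENSED => LicenseState::Licensed,
            _ => LicenseState::Unknown,
        }
    }
}

// Option 2: strict conversion. `Unknown` is returned only when the
// library actually reported it; anything else is surfaced as an error.
pub fn strict_license_state(raw: u32) -> Result<LicenseState, UnexpectedVariant> {
    match raw {
        STATE_UNKNOWN => Ok(LicenseState::Unknown),
        STATE_UNLICENSED => Ok(LicenseState::Unlicensed),
        STATE_LICENSED => Ok(LicenseState::Licensed),
        other => Err(UnexpectedVariant(other)),
    }
}
```

The strict version is written as a free function because a type that implements From<u32> already receives a blanket TryFrom<u32> implementation from the standard library, so the two cannot both be hand-written for the same type.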

pub hour: u8,
pub min: u8,
pub sec: u8,
pub status: u8,
Contributor Author

I couldn't find any documentation about what the expected values are here, so I left the type effectively unchanged. Let me know if there is something I missed and I can add another enum to capture the status.

Comment on lines +39 to +44
year: u16::try_from(value.year).unwrap_or(u16::MAX),
month: u8::try_from(value.month).unwrap_or(u8::MAX),
day: u8::try_from(value.day).unwrap_or(u8::MAX),
hour: u8::try_from(value.hour).unwrap_or(u8::MAX),
min: u8::try_from(value.min).unwrap_or(u8::MAX),
sec: u8::try_from(value.sec).unwrap_or(u8::MAX),
Contributor Author

These all seem to carry little risk of failing, and the TryFromIntError they return isn't already included in the error variants. Let me know if you'd prefer that another variant be added there instead of using these defaults.
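For comparison, the two approaches could look roughly like the sketch below. RawDate, Date, ConvertError, and both function names are hypothetical and reduced to two fields; the field widths of RawDate are an assumption made for illustration, not a copy of the NVML struct.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

// Hypothetical raw struct with wider integer fields than the wrapper.
pub struct RawDate {
    pub year: u32,
    pub month: u16,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Date {
    pub year: u16,
    pub month: u8,
}

// Hypothetical error variant wrapping the standard library error.
#[derive(Debug, PartialEq, Eq)]
pub enum ConvertError {
    OutOfRange(TryFromIntError),
}

impl From<TryFromIntError> for ConvertError {
    fn from(err: TryFromIntError) -> Self {
        ConvertError::OutOfRange(err)
    }
}

// Approach used in the diff: out-of-range values become the MAX sentinel.
pub fn to_date_saturating(raw: &RawDate) -> Date {
    Date {
        year: u16::try_from(raw.year).unwrap_or(u16::MAX),
        month: u8::try_from(raw.month).unwrap_or(u8::MAX),
    }
}

// Alternative: propagate the failure through a dedicated error variant.
pub fn to_date_checked(raw: &RawDate) -> Result<Date, ConvertError> {
    Ok(Date {
        year: u16::try_from(raw.year)?,
        month: u8::try_from(raw.month)?,
    })
}
```

The saturating form keeps the signature infallible at the cost of a sentinel value that callers cannot distinguish from real data; the checked form makes the failure visible but needs the extra variant.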

Comment on lines +7963 to +7968
Ok(device
.active_vgpus()?
.into_iter()
.map(|v| v.instance)
.collect::<Vec<_>>())
})
Contributor Author

Using a lifetime generic argument in the return value here made these test helpers fail due to the lifetime requirements. If there is a better way to handle this, please let me know.

impl TryFrom<nvmlVgpuMetadata_t> for VgpuMetadata {
type Error = NvmlError;
fn try_from(value: nvmlVgpuMetadata_t) -> Result<Self, Self::Error> {
let convert_c_str = |c_str: &[c_char]| {
Contributor Author

This one is a bit odd. I wanted to avoid using any unsafe, so this manually copies the string byte by byte, which may not be the most efficient. Let me know if you'd prefer a different method for this, or even whether lossy conversion is acceptable here.
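As a point of reference, a safe conversion of a fixed-size c_char buffer can be sketched as follows. Both helper names are hypothetical and this is not the closure from the diff; it only illustrates the strict and lossy variants side by side using standard library calls.

```rust
use std::os::raw::c_char;
use std::string::FromUtf8Error;

// Collects the bytes before the first NUL terminator. `c_char` is `i8`
// on some targets and `u8` on others, so the cast reinterprets the byte.
fn bytes_before_nul(buf: &[c_char]) -> Vec<u8> {
    buf.iter()
        .take_while(|&&c| c != 0)
        .map(|&c| c as u8)
        .collect()
}

/// Strict variant: fails if the bytes are not valid UTF-8.
pub fn c_chars_to_string(buf: &[c_char]) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes_before_nul(buf))
}

/// Lossy variant: invalid sequences become U+FFFD replacement characters.
pub fn c_chars_to_string_lossy(buf: &[c_char]) -> String {
    String::from_utf8_lossy(&bytes_before_nul(buf)).into_owned()
}
```

Both variants copy each byte once into a Vec<u8>, which for buffers of this size is unlikely to matter; the choice between them is mainly about whether invalid bytes should surface as an error.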

@swlynch99
Contributor

I'm happy to review this once it is ready. Once you're happy with it lmk.

@FreeMasen FreeMasen force-pushed the feat/some-vgpu-apis branch from afce533 to c6b1acb on March 11, 2026 01:10
let mut count = 0;
unsafe {
nvml_try_count(sym(self.instance, std::ptr::null_mut(), &mut count))?;
metadata = vec![std::mem::zeroed(); count as usize];
Contributor Author

@FreeMasen FreeMasen Mar 11, 2026

I am not sure about this one. I tried running this function on a machine with an A16, and count was a number that was much larger than the expected buffer size (468) but not a multiple of std::mem::size_of::<nvmlVgpuMetadata_t>(), which is 212 according to rust-analyzer.
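One possible reading, which is an assumption to verify against the NVML documentation rather than a confirmed fact, is that the value written back by the first call is a size in bytes (for example, a struct followed by variable-length trailing data) rather than an element count. Under that assumption, a buffer could be sized as in the sketch below; both function names are hypothetical.

```rust
use std::mem::size_of;

/// Number of `u64` words needed to hold at least `byte_len` bytes.
pub fn words_for_bytes(byte_len: usize) -> usize {
    (byte_len + size_of::<u64>() - 1) / size_of::<u64>()
}

/// Allocates zeroed, 8-byte-aligned storage covering `byte_len` bytes.
/// ASSUMPTION: `byte_len` is a byte size reported by a query call,
/// not a count of structs.
pub fn alloc_aligned_bytes(byte_len: u32) -> Vec<u64> {
    vec![0u64; words_for_bytes(byte_len as usize)]
}
```

If that reading holds, a value of 468 would need 59 words (472 bytes), whereas vec![std::mem::zeroed(); count as usize] would allocate 468 elements of 212 bytes each. Backing the buffer with u64 words is one way to keep the allocation suitably aligned for a struct whose alignment is at most 8 bytes.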

@FreeMasen FreeMasen marked this pull request as ready for review March 11, 2026 01:17
@FreeMasen
Contributor Author

I believe this is ready for review. I have tried to comment on any of the places where I wasn't quite sure about how something should be implemented.
