
Add support for allocating memory that has the same iova as its virtual address #58

Open
lsgunth wants to merge 13 commits into SamsungDS:main from Eideticom:same_iova
lsgunth commented May 20, 2026

In some situations it may not be practical to pass around an iova address along with its virtual address. To support these situations this PR aims to add a way to allocate memory where the iova is the same as the virtual address. It does this by dedicating a region in the iova address space well above the addresses that would be used by existing code and using the MAP_FIXED_NOREPLACE flag with mmap() to attempt to allocate memory with an address in this region.

In addition to the core feature, this PR also adds type definitions and checking with sparse around iova addresses, fixes a parameter swap bug that was caught by the extra type checking, and adds some improvements to the examples to help demonstrate and test the feature.

.dma_unmap() takes an iova followed by the length, but the length
was erroneously specified first.

This was caught while creating the next two patches.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>

birkelund (Collaborator) left a comment

Nice catch!

Comment thread include/vfn/support/compiler.h Outdated

#define __static_assert(x) static_assert(x, #x)

typedef uint64_t iova_t;
Collaborator

Can this be moved to include/vfn/iommu/dma.h without too much trouble?

Author

Thanks!

I'll have to double check next week, but I believe I put it in support because it was a little awkward to define it in iommu/dma.h: most of the headers in iommu need the type, and they otherwise have no reason to include each other.

I'll try changing it next week and if it's not too problematic I'll update this PR.

birkelund self-requested a review May 22, 2026 09:28
lsgunth added 12 commits May 27, 2026 12:23
Instead of using a plain uint64_t to refer to iovas, use a specific
type (iova_t) to make the code a bit more self-documenting. This
can make some usages a bit clearer.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Adding the nocast attribute to iova_t discourages the type from
being implicitly cast to other types. This helps ensure that
iova_t is used in the correct places, at the cost of needing an
occasional explicit cast when doing math with the iova or passing
an iova to the skiplist.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
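
For readers unfamiliar with sparse, the following is a minimal sketch of how a nocast typedef can be wired up, and of why it could surface a swapped (length, iova) pair like the one fixed in the first patch. The macro name `DEMO_NOCAST` and the two `demo_*` functions are hypothetical and are not taken from this PR; only the `nocast` attribute and the `__CHECKER__` define come from sparse itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * sparse defines __CHECKER__ while it runs. Regular compilers do not know
 * the nocast attribute, so it is compiled out for them.
 * DEMO_NOCAST is a made-up name used only for this sketch.
 */
#ifdef __CHECKER__
# define DEMO_NOCAST __attribute__((nocast))
#else
# define DEMO_NOCAST
#endif

typedef uint64_t DEMO_NOCAST iova_t;

/* Hypothetical unmap hook using the (iova, len) order described above. */
static int demo_dma_unmap(iova_t iova, size_t len)
{
	return (iova != 0 && len != 0) ? 0 : -1;
}

static int demo_release(iova_t iova, size_t len)
{
	/*
	 * Writing demo_dma_unmap(len, iova) here would still compile with a
	 * plain C compiler because both arguments are integers. Under sparse,
	 * the swapped call implicitly converts between size_t and iova_t,
	 * which is the kind of conversion nocast is intended to report.
	 */
	return demo_dma_unmap(iova, len);
}
```

With this arrangement an intentional conversion, such as `demo_release((iova_t)0x1000, 4096)`, needs an explicit cast, which matches the trade-off described in the commit message.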
Store a convenient way to get the maximum iova value. This will
be useful to create buffers with the same iova and virtual address.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Add helper function to allocate memory where the iova and the virtual
address are the same.

While the skiplist to translate a virtual address to an iova is
relatively performant, in some applications it introduces extra
undesirable overhead. To avoid this overhead, it is desirable to
be able to allocate memory where the iova and the virtual address
are the same.

To do this, mmap() memory with a hint to locate it at a region starting
at a quarter of the maximum iova value. Then use IOMMU_MAP_FIXED_IOVA to
assign an iova that matches the virtual address. If mmap() fails to
assign an address in the desired region (which is pretty unlikely),
advance to the next candidate iova and retry.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
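
The allocate-and-retry loop described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `demo_alloc_same_iova()`, `demo_map_fixed_fn` and `demo_map_identity()` are hypothetical names, the callback stands in for whatever performs the fixed-iova mapping, and the sketch uses `MAP_FIXED_NOREPLACE` as mentioned in the PR description. The caller is assumed to pass a page-aligned starting address.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical callback standing in for the fixed-iova mapping step. */
typedef int (*demo_map_fixed_fn)(void *vaddr, size_t len, uint64_t iova);

/* Stub for experiments: accepts any mapping whose iova equals the address. */
static int demo_map_identity(void *vaddr, size_t len, uint64_t iova)
{
	(void)len;
	return (uint64_t)(uintptr_t)vaddr == iova ? 0 : -1;
}

/*
 * Try to allocate len bytes whose virtual address lies in [*next, limit).
 * *next must be page aligned; it is advanced past every candidate tried.
 */
static void *demo_alloc_same_iova(uint64_t *next, uint64_t limit, size_t len,
				  demo_map_fixed_fn map_fixed)
{
	size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);

	/* Round the length up so every later candidate stays page aligned. */
	len = (len + pagesz - 1) & ~(pagesz - 1);

	while (len && *next + len <= limit) {
		uint64_t candidate = *next;
		void *want = (void *)(uintptr_t)candidate;
		void *got = mmap(want, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
				 -1, 0);

		/* Whatever happens, the next attempt starts past this candidate. */
		*next = candidate + len;

		if (got == MAP_FAILED) {
			if (errno == EEXIST)
				continue; /* range already in use: try the next one */
			return NULL;
		}

		if (got != want) {
			/* Kernels that ignore the flag treat the address as a hint. */
			munmap(got, len);
			continue;
		}

		/* Ask for an iova equal to the virtual address just obtained. */
		if (map_fixed(got, len, candidate) != 0) {
			munmap(got, len);
			return NULL;
		}

		return got;
	}

	errno = ENOMEM;
	return NULL;
}
```

In terms of the commit message, `*next` would start at a quarter of the maximum iova value, and `limit` would be bounded by that maximum so that every returned address is also a valid iova.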
In all cases, the callers of nvme_mapv_prp() and nvme_mapv_sgl() know
the iova address of the segment, as it typically comes from the
request.

Instead of looking these up in the skiplist, just pass the known value
directly.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Introduce nvme_mapv_iova_prp() and nvme_mapv_iova_sgl(), which
take an iova_vec pointer that can be used to avoid calling
iommu_translate_vaddr() when mapping vectored commands.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
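
To illustrate why a vector of known iovas removes the need for a per-segment lookup, here is a small sketch that expands such a vector into page-sized PRP entries using arithmetic only. The `struct demo_iova_vec` layout, the 4 KiB `DEMO_PAGE` size and `demo_fill_prps()` are assumptions made for this example; the PR's actual iova_vec definition and the way nvme_mapv_iova_prp() places entries in the command are not shown in this conversation.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE 4096ULL /* assumed controller memory page size */

/* Hypothetical layout: one already-translated segment. */
struct demo_iova_vec {
	uint64_t iova;
	size_t len;
};

/*
 * Expand the segments into one PRP entry per page. Returns the number of
 * entries written, or -1 if the vector cannot be expressed as PRPs or the
 * output array is too small. No address translation is performed.
 */
static int demo_fill_prps(const struct demo_iova_vec *vec, size_t nvec,
			  uint64_t *prps, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nvec; i++) {
		uint64_t iova = vec[i].iova;
		uint64_t end = iova + vec[i].len;

		if (vec[i].len == 0)
			return -1;

		/* Only the first segment may start in the middle of a page. */
		if (i > 0 && (iova % DEMO_PAGE))
			return -1;

		/* Only the last segment may end in the middle of a page. */
		if (i + 1 < nvec && (end % DEMO_PAGE))
			return -1;

		if (n == max)
			return -1;
		prps[n++] = iova; /* the first entry keeps its page offset */

		/* Remaining entries are the following page-aligned addresses. */
		for (uint64_t page = (iova & ~(DEMO_PAGE - 1)) + DEMO_PAGE;
		     page < end; page += DEMO_PAGE) {
			if (n == max)
				return -1;
			prps[n++] = page;
		}
	}

	return (int)n;
}
```

A real implementation would additionally split the entries between the command's PRP fields and a PRP list page; the sketch only shows that the iovas can be consumed directly once the caller supplies them.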
Add simple wrappers for nvme_rq_mapv_iova_prp() and
nvme_rq_mapv_iova_sgl() similar to the non-iova versions.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Add the -S flag to the io example that, when specified, allocates the
memory buffer using memory that has the same iova as its virtual address.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Replace several 0x1000 values with a single io_sz variable.

This will allow changing the size in the next patch.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Add the -v flag for vectored SGL and the -s flag for SGL. This
helps test the nvme_rq_mapv_[prp|sgl]() functions.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Add the -i/--iova option to test passing iova values directly to
the vector functions to avoid the skiplist.

Signed-off-by: Logan Gunthorpe <logan.gunthorpe@eideticom.com>
Add tests for nvme_rq_mapv_iova_prp() and nvme_rq_mapv_iova_sgl().

These are duplicates of the non-iova versions.
Author

lsgunth commented May 27, 2026

I was able to move those defines without any difficulties. I've force pushed with it updated.
