An IOMMU group is a set of devices for which the IOMMU cannot
distinguish transactions. For PCI devices, a group often occurs
when a PCI bridge is involved. Transactions from any device
behind the bridge appear to be sourced from the bridge itself.
We leave it to the IOMMU driver to define the grouping constraints
for its platform.
With this new interface, the group for a device can be retrieved
using the iommu_device_group() callback. Users will compare the
value returned against the value returned for other devices to
determine whether they are part of the same group. Devices with
no group are not translated by the IOMMU. There should be no
expectations about the group numbers as they may be arbitrarily
assigned by the IOMMU driver and may not be persistent across boots.
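
The comparison described above can be pictured with a small user-space
model. This is only a sketch: the model_* names, the struct layout and
the -ENODEV return for a device without a group are assumptions made
for illustration, not the kernel interface itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct model_device {
	bool has_group;        /* false: device is not translated by the IOMMU */
	unsigned int groupid;  /* opaque value, only meaningful for comparison */
};

/* Model of a group lookup: 0 on success, -ENODEV (assumed) if no group. */
int model_device_group(const struct model_device *dev, unsigned int *groupid)
{
	if (!dev->has_group)
		return -ENODEV;
	*groupid = dev->groupid;
	return 0;
}

/* Two devices share a group only if both have one and the ids match. */
bool model_same_group(const struct model_device *a,
		      const struct model_device *b)
{
	unsigned int ga, gb;

	if (model_device_group(a, &ga) || model_device_group(b, &gb))
		return false;
	return ga == gb;
}
```

Note that the sketch only ever compares ids for equality; it never
interprets the number itself, in line with the caveat about arbitrary,
non-persistent group numbers.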
We also provide a sysfs interface to the group numbers here so
that userspace can understand IOMMU dependencies between devices
for managing safe userspace drivers.
[Some code changes by Joerg Roedel <joerg.roedel@amd.com>]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
When mapping a memory region, split it to page sizes as supported
by the iommu hardware. Always prefer bigger pages, when possible,
in order to reduce the TLB pressure.
The logic to do that is now added to the IOMMU core, so neither the iommu
drivers themselves nor users of the IOMMU API have to duplicate it.
This allows a more lenient granularity of mappings; traditionally the
IOMMU API took 'order' (of a page) as a mapping size, and directly let
the low level iommu drivers handle the mapping, but now that the IOMMU
core can split arbitrary memory regions into pages, we can remove this
limitation, so users don't have to split those regions by themselves.
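
As an illustration of the splitting described above, here is a
user-space sketch, not the code added by this patch. It assumes the
supported page sizes are given as a bitmap in which bit n set means a
page of 2^n bytes is supported; the model_* names and the callback
type are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback invoked once per chunk that would be mapped. */
typedef void (*model_map_fn)(uint64_t iova, uint64_t paddr,
			     uint64_t pgsize, void *ctx);

/*
 * Pick the biggest supported page size that fits in the remaining size
 * and to which both addresses are aligned. Returns 0 if none fits.
 */
uint64_t model_pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova,
			   uint64_t paddr, uint64_t size)
{
	uint64_t addr_merge = iova | paddr;
	int bit;

	for (bit = 63; bit >= 0; bit--) {
		uint64_t pgsize = (uint64_t)1 << bit;

		if (!(pgsize_bitmap & pgsize))
			continue;	/* size not supported */
		if (pgsize > size)
			continue;	/* too big for what is left */
		if (addr_merge & (pgsize - 1))
			continue;	/* iova or paddr not aligned */
		return pgsize;
	}
	return 0;
}

/*
 * Split a region into supported pages, preferring bigger pages.
 * Returns the number of chunks, or -1 if the region cannot be covered.
 */
int model_split_region(uint64_t pgsize_bitmap, uint64_t iova,
		       uint64_t paddr, uint64_t size,
		       model_map_fn map_one, void *ctx)
{
	int chunks = 0;

	while (size) {
		uint64_t pgsize = model_pick_pgsize(pgsize_bitmap, iova,
						    paddr, size);
		if (!pgsize)
			return -1;
		if (map_one)
			map_one(iova, paddr, pgsize, ctx);
		iova += pgsize;
		paddr += pgsize;
		size -= pgsize;
		chunks++;
	}
	return chunks;
}
```

With 4 KiB, 64 KiB and 1 MiB pages supported, a 64 KiB-aligned region
of 132 KiB would be covered by two 64 KiB pages followed by one 4 KiB
page in this model.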
Currently the supported page sizes are advertised once and they then
remain static. That works well for OMAP and MSM but it would probably
not fly well with Intel's hardware, where the page size capabilities
seem to have the potential to be different between several DMA
remapping devices.
register_iommu() currently sets a default pgsize behavior, so we can convert
the IOMMU drivers in subsequent patches. After all the drivers
are converted, the temporary default settings will be removed.
Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted
to deal with bytes instead of page order.
Many thanks to Joerg Roedel <Joerg.Roedel@amd.com> for significant review!
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <Joerg.Roedel@amd.com>
Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
Cc: KyongHo Cho <pullip.cho@samsung.com>
Cc: Hiroshi DOYU <hdoyu@nvidia.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Express sizes in bytes rather than in page order, to eliminate the
size->order->size conversions we have whenever the IOMMU API is calling
the low level drivers' map/unmap methods.
Adapt all existing drivers.
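
To picture the size->order->size round trip mentioned above, here is a
small user-space sketch. It assumes 4 KiB pages, and the model_* helper
names are hypothetical; they only mimic the usual "order of a page"
arithmetic.

```c
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Smallest order such that (page size << order) covers size bytes.
 * Intended for small sizes only; very large values would overflow
 * the shift.
 */
unsigned int model_size_to_order(size_t size)
{
	unsigned int order = 0;

	while (((size_t)1 << (MODEL_PAGE_SHIFT + order)) < size)
		order++;
	return order;
}

/* Number of bytes described by a page order. */
size_t model_order_to_size(unsigned int order)
{
	return (size_t)1 << (MODEL_PAGE_SHIFT + order);
}
```

In this model a 12 KiB request becomes order 2 and then 16 KiB again,
so a caller and a driver that each convert may end up with different
numbers than the one originally asked for. Passing bytes end to end
avoids the conversion altogether.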
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <Joerg.Roedel@amd.com>
Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
Cc: KyongHo Cho <pullip.cho@samsung.com>
Cc: Hiroshi DOYU <hdoyu@nvidia.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
With all IOMMU drivers being converted to bus_set_iommu the
global iommu_ops are no longer required. The same is true
for the deprecated register_iommu function.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
With per-bus iommu_ops the iommu_found function needs to
work on a bus_type too. This patch adds a bus_type parameter
to that function and converts all call sites.
The function is also renamed to iommu_present because the
function now checks if an iommu is present for a given bus
and does not check for a global iommu anymore.
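
The per-bus check can be modelled in user space as follows. This is a
sketch under assumptions: the model_* types are hypothetical stand-ins
for bus_type and iommu_ops, and returning -EBUSY when a bus already has
ops is a choice made for this illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct model_iommu_ops {
	const char *name;
};

struct model_bus_type {
	const char *name;
	const struct model_iommu_ops *iommu_ops;  /* NULL: no IOMMU on bus */
};

/* Attach ops to one bus; refuses (assumed -EBUSY) if already set. */
int model_bus_set_iommu(struct model_bus_type *bus,
			const struct model_iommu_ops *ops)
{
	if (bus->iommu_ops != NULL)
		return -EBUSY;
	bus->iommu_ops = ops;
	return 0;
}

/* The presence check is per bus; there is no global state to consult. */
bool model_iommu_present(const struct model_bus_type *bus)
{
	return bus->iommu_ops != NULL;
}
```

In this model, registering ops on one bus leaves every other bus
reporting that no IOMMU is present, which is the behaviour the rename
is meant to convey.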
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This is necessary to store a pointer to the bus-specific
iommu_ops in the iommu-domain structure. It will be used
later to call into bus-specific iommu-ops.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This is the starting point to make the iommu_ops used for
the iommu-api a per-bus-type structure. It is required to
easily implement bus-specific setup in the iommu-layer.
The first user will be the iommu-group attribute in sysfs.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This makes it impossible to compile an iommu driver into the
kernel without selecting CONFIG_IOMMU_API.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Make report_iommu_fault() return -ENOSYS whenever an iommu fault
handler isn't installed, so IOMMU drivers can then apply their own
platform-specific default behavior if they want to.
Fault handlers can still return -ENOSYS in case they want to elicit the
default behavior of the IOMMU drivers.
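
The resulting control flow can be sketched with a user-space model.
The model_* names, the handler signature and the return values of the
driver helper are assumptions made for illustration; only the -ENOSYS
convention is taken from the description above.

```c
#include <errno.h>
#include <stddef.h>

struct model_domain;

/* Hypothetical handler type: 0 if handled, negative errno otherwise. */
typedef int (*model_fault_handler_t)(struct model_domain *dom,
				     unsigned long iova, int flags);

struct model_domain {
	model_fault_handler_t handler;	/* NULL: no handler installed */
};

void model_set_fault_handler(struct model_domain *dom,
			     model_fault_handler_t handler)
{
	dom->handler = handler;
}

/* Core side: -ENOSYS when nobody is listening, else ask the handler. */
int model_report_fault(struct model_domain *dom, unsigned long iova,
		       int flags)
{
	if (!dom->handler)
		return -ENOSYS;
	return dom->handler(dom, iova, flags);
}

/*
 * Driver side: returns 1 if the driver's own default behavior would
 * run, 0 if the report was consumed by a handler.
 */
int model_driver_fault(struct model_domain *dom, unsigned long iova,
		       int flags)
{
	return model_report_fault(dom, iova, flags) == -ENOSYS ? 1 : 0;
}

/* Sample handler that fully handles the fault. */
int model_handler_ok(struct model_domain *dom, unsigned long iova,
		     int flags)
{
	(void)dom; (void)iova; (void)flags;
	return 0;
}

/* Sample handler that asks for the driver's default behavior. */
int model_handler_defer(struct model_domain *dom, unsigned long iova,
			int flags)
{
	(void)dom; (void)iova; (void)flags;
	return -ENOSYS;
}
```

Both the "no handler" case and a handler that itself returns -ENOSYS
lead to the driver's default path in this model.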
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add an iommu fault report mechanism to the IOMMU API, so implementations
can report MMU faults (translation errors, hardware errors,
etc.).
Fault reports can be used in several ways:
- mere logging
- reset the device that accessed the faulting address (may be necessary
in case the device is a remote processor for example)
- implement dynamic PTE/TLB loading
A dedicated iommu_set_fault_handler() API has been added to allow
users who are interested in receiving such reports to provide
their own handler.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
If CONFIG_IOMMU_API is not defined some functions will just
return -ENODEV. Add errno.h for the definition of ENODEV.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch allows IOMMU users to determine whether the
hardware and software support safe, isolated interrupt
remapping. Not all Intel IOMMUs have the hardware, and the
software for AMD is not there yet.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
These functions are no longer used and can be removed
safely. Their functionality is now provided by the
iommu_{un}map functions, which are also capable of handling multiple
page sizes.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds new callbacks for mapping and unmapping
pages to the iommu_ops structure. These callbacks are aware
of page sizes, which makes them different from the
->{un}map_range callbacks.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
These two functions provide support for mapping and
unmapping physical addresses to io virtual addresses. The
difference from iommu_(un)map_range() is that the new
functions take a gfp_order parameter instead of a size. This
allows the IOMMU backend implementations to detect more easily
whether a given range can be mapped by larger page sizes.
These new functions should replace the old ones in the long
term.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The new function pointer names match better with the
top-level functions of the iommu-api which are using them.
The main intention of this change is to free the ->{un}map
pointer names for two new mapping functions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The user can request to enable snooping control through VT-d page table.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This iommu_op can tell whether a domain has a specific capability, such as
snooping control for Intel IOMMU, which other components of the kernel can
use to adjust their behaviour.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch introduces the API to abstract the exported VT-d functions
for KVM into a generic API. This way the AMD IOMMU implementation can
plug into this API later.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>