
Commit 0cc6f45

Merge tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
 "Core:
   - IOMMU memory usage observability - This will make the memory used
     for IO page tables explicitly visible.
   - Simplify arch_setup_dma_ops()

  Intel VT-d:
   - Consolidate domain cache invalidation
   - Remove private data from page fault message
   - Allocate DMAR fault interrupts locally
   - Cleanup and refactoring

  ARM-SMMUv2:
   - Support for fault debugging hardware on Qualcomm implementations
   - Re-land support for the ->domain_alloc_paging() callback

  ARM-SMMUv3:
   - Improve handling of MSI allocation failure
   - Drop support for the "disable_bypass" cmdline option
   - Major rework of the CD creation code, following on directly from
     the STE rework merged last time around.
   - Add unit tests for the new STE/CD manipulation logic

  AMD-Vi:
   - Final part of SVA changes with generic IO page fault handling

  Renesas IPMMU:
   - Add support for R8A779H0 hardware

  ... and a couple smaller fixes and updates across the sub-tree"

* tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits)
  iommu/arm-smmu-v3: Make the kunit into a module
  arm64: Properly clean up iommu-dma remnants
  iommu/amd: Enable Guest Translation after reading IOMMU feature register
  iommu/vt-d: Decouple igfx_off from graphic identity mapping
  iommu/amd: Fix compilation error
  iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry
  iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
  iommu/arm-smmu-v3: Move the CD generation for SVA into a function
  iommu/arm-smmu-v3: Allocate the CD table entry in advance
  iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
  iommu/arm-smmu-v3: Consolidate clearing a CD table entry
  iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
  iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
  iommu/arm-smmu-v3: Add an ops indirection to the STE code
  iommu/arm-smmu-qcom: Don't build debug features as a kernel module
  iommu/amd: Add SVA domain support
  iommu: Add ops->domain_alloc_sva()
  iommu/amd: Initial SVA support for AMD IOMMU
  iommu/amd: Add support for enable/disable IOPF
  iommu/amd: Add IO page fault notifier handler
  ...
2 parents f0cd69b + 2bd5059 commit 0cc6f45


72 files changed: +3536 -1598 lines changed

Documentation/admin-guide/cgroup-v2.rst

+1 -1
@@ -1435,7 +1435,7 @@ PAGE_SIZE multiple when read back.
   sec_pagetables
 	Amount of memory allocated for secondary page tables,
 	this currently includes KVM mmu allocations on x86
-	and arm64.
+	and arm64 and IOMMU page tables.
 
   percpu (npn)
 	Amount of memory used for storing per-cpu kernel

Documentation/devicetree/bindings/iommu/qcom,tbu.yaml

+69
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/qcom,tbu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm TBU (Translation Buffer Unit)
+
+maintainers:
+  - Georgi Djakov <quic_c_gdjako@quicinc.com>
+
+description:
+  The Qualcomm SMMU500 implementation consists of TCU and TBU. The TBU contains
+  a Translation Lookaside Buffer (TLB) that caches page tables. TBUs provides
+  debug features to trace and trigger debug transactions. There are multiple TBU
+  instances with each client core.
+
+properties:
+  compatible:
+    enum:
+      - qcom,sc7280-tbu
+      - qcom,sdm845-tbu
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    maxItems: 1
+
+  interconnects:
+    maxItems: 1
+
+  power-domains:
+    maxItems: 1
+
+  qcom,stream-id-range:
+    description: |
+      Phandle of a SMMU device and Stream ID range (address and size) that
+      is assigned by the TBU
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    items:
+      - items:
+          - description: phandle of a smmu node
+          - description: stream id base address
+          - description: stream id size
+
+required:
+  - compatible
+  - reg
+  - qcom,stream-id-range
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/clock/qcom,gcc-sdm845.h>
+    #include <dt-bindings/interconnect/qcom,icc.h>
+    #include <dt-bindings/interconnect/qcom,sdm845.h>
+
+    tbu@150e1000 {
+        compatible = "qcom,sdm845-tbu";
+        reg = <0x150e1000 0x1000>;
+        clocks = <&gcc GCC_AGGRE_NOC_PCIE_TBU_CLK>;
+        interconnects = <&system_noc MASTER_GNOC_SNOC QCOM_ICC_TAG_ACTIVE_ONLY
+                         &config_noc SLAVE_IMEM_CFG QCOM_ICC_TAG_ACTIVE_ONLY>;
+        power-domains = <&gcc HLOS1_VOTE_AGGRE_NOC_MMU_PCIE_TBU_GDSC>;
+        qcom,stream-id-range = <&apps_smmu 0x1c00 0x400>;
+    };
+...

Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml

+1
@@ -50,6 +50,7 @@ properties:
           - renesas,ipmmu-r8a779a0 # R-Car V3U
           - renesas,ipmmu-r8a779f0 # R-Car S4-8
           - renesas,ipmmu-r8a779g0 # R-Car V4H
+          - renesas,ipmmu-r8a779h0 # R-Car V4M
       - const: renesas,rcar-gen4-ipmmu-vmsa # R-Car Gen4
 
   reg:

Documentation/filesystems/proc.rst

+2 -2
@@ -1110,8 +1110,8 @@ KernelStack
 PageTables
               Memory consumed by userspace page tables
 SecPageTables
-              Memory consumed by secondary page tables, this currently
-              currently includes KVM mmu allocations on x86 and arm64.
+              Memory consumed by secondary page tables, this currently includes
+              KVM mmu and IOMMU allocations on x86 and arm64.
 NFS_Unstable
               Always zero. Previous counted pages which had been written to
               the server, but has not been committed to stable storage.
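
For context on the documentation change above: SecPageTables is reported through /proc/meminfo, so a minimal userspace sketch for checking the counter could look like the following (illustrative only, not part of this commit):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return 1;
	/* Print the SecPageTables field documented in proc.rst above. */
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "SecPageTables:", 14))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}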

arch/arc/mm/dma.c

+1 -2
@@ -90,8 +90,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 /*
  * Plug in direct dma map ops.
  */
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	/*
 	 * IOC hardware snoops all DMA traffic keeping the caches consistent

arch/arm/mm/dma-mapping-nommu.c

+1 -2
@@ -33,8 +33,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 	}
 }
 
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	if (IS_ENABLED(CONFIG_CPU_V7M)) {
 		/*

arch/arm/mm/dma-mapping.c

+9 -7
@@ -1709,11 +1709,15 @@ void arm_iommu_detach_device(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(arm_iommu_detach_device);
 
-static void arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				    bool coherent)
+static void arm_setup_iommu_dma_ops(struct device *dev)
 {
 	struct dma_iommu_mapping *mapping;
+	u64 dma_base = 0, size = 1ULL << 32;
 
+	if (dev->dma_range_map) {
+		dma_base = dma_range_map_min(dev->dma_range_map);
+		size = dma_range_map_max(dev->dma_range_map) - dma_base;
+	}
 	mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);
 	if (IS_ERR(mapping)) {
 		pr_warn("Failed to create %llu-byte IOMMU mapping for device %s\n",
@@ -1744,17 +1748,15 @@ static void arm_teardown_iommu_dma_ops(struct device *dev)
 
 #else
 
-static void arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				    bool coherent)
+static void arm_setup_iommu_dma_ops(struct device *dev)
 {
 }
 
 static void arm_teardown_iommu_dma_ops(struct device *dev) { }
 
 #endif /* CONFIG_ARM_DMA_USE_IOMMU */
 
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	/*
 	 * Due to legacy code that sets the ->dma_coherent flag from a bus
@@ -1774,7 +1776,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 		return;
 
 	if (device_iommu_mapped(dev))
-		arm_setup_iommu_dma_ops(dev, dma_base, size, coherent);
+		arm_setup_iommu_dma_ops(dev);
 
 	xen_setup_dma_ops(dev);
 	dev->archdata.dma_ops_setup = true;

arch/arm64/Kconfig

-1
@@ -46,7 +46,6 @@ config ARM64
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYSCALL_WRAPPER
-	select ARCH_HAS_TEARDOWN_DMA_OPS if IOMMU_SUPPORT
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_ZONE_DMA_SET if EXPERT
 	select ARCH_HAVE_ELF_PROT

arch/arm64/mm/dma-mapping.c

+1 -12
@@ -7,7 +7,6 @@
 #include <linux/gfp.h>
 #include <linux/cache.h>
 #include <linux/dma-map-ops.h>
-#include <linux/iommu.h>
 #include <xen/xen.h>
 
 #include <asm/cacheflush.h>
@@ -39,15 +38,7 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 	dcache_clean_poc(start, start + size);
 }
 
-#ifdef CONFIG_IOMMU_DMA
-void arch_teardown_dma_ops(struct device *dev)
-{
-	dev->dma_ops = NULL;
-}
-#endif
-
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	int cls = cache_line_size_of_cpu();
 
@@ -58,8 +49,6 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			   ARCH_DMA_MINALIGN, cls);
 
 	dev->dma_coherent = coherent;
-	if (device_iommu_mapped(dev))
-		iommu_setup_dma_ops(dev, dma_base, dma_base + size - 1);
 
 	xen_setup_dma_ops(dev);
 }

arch/loongarch/kernel/dma.c

+2 -7
@@ -8,17 +8,12 @@
 void acpi_arch_dma_setup(struct device *dev)
 {
 	int ret;
-	u64 mask, end = 0;
+	u64 mask, end;
 	const struct bus_dma_region *map = NULL;
 
 	ret = acpi_dma_get_range(dev, &map);
 	if (!ret && map) {
-		const struct bus_dma_region *r = map;
-
-		for (end = 0; r->size; r++) {
-			if (r->dma_start + r->size - 1 > end)
-				end = r->dma_start + r->size - 1;
-		}
+		end = dma_range_map_max(map);
 
 		mask = DMA_BIT_MASK(ilog2(end) + 1);
 		dev->bus_dma_limit = end;
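
For reference, dma_range_map_max() is used here in place of the open-coded maximum-address loop being removed; a standalone sketch of that logic, under the assumption that the helper computes the same inclusive upper bound (struct layout simplified, names illustrative):

#include <stdint.h>

/* Simplified stand-in for the kernel's struct bus_dma_region. */
struct bus_dma_region {
	uint64_t dma_start;
	uint64_t size;		/* a zero size terminates the array */
};

/* Highest DMA address covered by the map, mirroring the removed loop. */
uint64_t dma_range_map_max_sketch(const struct bus_dma_region *map)
{
	uint64_t end = 0;

	for (const struct bus_dma_region *r = map; r->size; r++) {
		if (r->dma_start + r->size - 1 > end)
			end = r->dma_start + r->size - 1;
	}
	return end;
}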

arch/mips/mm/dma-noncoherent.c

+1 -2
@@ -137,8 +137,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 #endif
 
 #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	dev->dma_coherent = coherent;
 }

arch/riscv/mm/dma-noncoherent.c

+1 -2
@@ -128,8 +128,7 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 	ALT_CMO_OP(FLUSH, flush_addr, size, riscv_cbom_block_size);
 }
 
-void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-			bool coherent)
+void arch_setup_dma_ops(struct device *dev, bool coherent)
 {
 	WARN_TAINT(!coherent && riscv_cbom_block_size > ARCH_DMA_MINALIGN,
 		   TAINT_CPU_OUT_OF_SPEC,

drivers/acpi/arm64/dma.c

+4 -13
@@ -8,7 +8,6 @@ void acpi_arch_dma_setup(struct device *dev)
 {
 	int ret;
 	u64 end, mask;
-	u64 size = 0;
 	const struct bus_dma_region *map = NULL;
 
 	/*
@@ -23,31 +22,23 @@
 	}
 
 	if (dev->coherent_dma_mask)
-		size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
+		end = dev->coherent_dma_mask;
 	else
-		size = 1ULL << 32;
+		end = (1ULL << 32) - 1;
 
 	ret = acpi_dma_get_range(dev, &map);
 	if (!ret && map) {
-		const struct bus_dma_region *r = map;
-
-		for (end = 0; r->size; r++) {
-			if (r->dma_start + r->size - 1 > end)
-				end = r->dma_start + r->size - 1;
-		}
-
-		size = end + 1;
+		end = dma_range_map_max(map);
 		dev->dma_range_map = map;
 	}
 
 	if (ret == -ENODEV)
-		ret = iort_dma_get_ranges(dev, &size);
+		ret = iort_dma_get_ranges(dev, &end);
 	if (!ret) {
 		/*
 		 * Limit coherent and dma mask based on size retrieved from
 		 * firmware.
 		 */
-		end = size - 1;
 		mask = DMA_BIT_MASK(ilog2(end) + 1);
 		dev->bus_dma_limit = end;
 		dev->coherent_dma_mask = min(dev->coherent_dma_mask, mask);

drivers/acpi/arm64/iort.c

+10 -10
@@ -1367,7 +1367,7 @@ int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
 { return -ENODEV; }
 #endif
 
-static int nc_dma_get_range(struct device *dev, u64 *size)
+static int nc_dma_get_range(struct device *dev, u64 *limit)
 {
 	struct acpi_iort_node *node;
 	struct acpi_iort_named_component *ncomp;
@@ -1384,13 +1384,13 @@ static int nc_dma_get_range(struct device *dev, u64 *size)
 		return -EINVAL;
 	}
 
-	*size = ncomp->memory_address_limit >= 64 ? U64_MAX :
-			1ULL<<ncomp->memory_address_limit;
+	*limit = ncomp->memory_address_limit >= 64 ? U64_MAX :
+			(1ULL << ncomp->memory_address_limit) - 1;
 
 	return 0;
 }
 
-static int rc_dma_get_range(struct device *dev, u64 *size)
+static int rc_dma_get_range(struct device *dev, u64 *limit)
 {
 	struct acpi_iort_node *node;
 	struct acpi_iort_root_complex *rc;
@@ -1408,25 +1408,25 @@ static int rc_dma_get_range(struct device *dev, u64 *size)
 		return -EINVAL;
 	}
 
-	*size = rc->memory_address_limit >= 64 ? U64_MAX :
-			1ULL<<rc->memory_address_limit;
+	*limit = rc->memory_address_limit >= 64 ? U64_MAX :
+			(1ULL << rc->memory_address_limit) - 1;
 
 	return 0;
 }
 
 /**
  * iort_dma_get_ranges() - Look up DMA addressing limit for the device
  * @dev: device to lookup
- * @size: DMA range size result pointer
+ * @limit: DMA limit result pointer
  *
  * Return: 0 on success, an error otherwise.
  */
-int iort_dma_get_ranges(struct device *dev, u64 *size)
+int iort_dma_get_ranges(struct device *dev, u64 *limit)
 {
 	if (dev_is_pci(dev))
-		return rc_dma_get_range(dev, size);
+		return rc_dma_get_range(dev, limit);
 	else
-		return nc_dma_get_range(dev, size);
+		return nc_dma_get_range(dev, limit);
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
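
As a quick illustration of the size-to-limit conversion above: turning an IORT memory_address_limit given in bits into an inclusive DMA limit can be sketched as follows (standalone example, helper name hypothetical):

#include <stdint.h>
#include <stdio.h>

/* Inclusive DMA address limit for an address width given in bits. */
static uint64_t addr_bits_to_limit(unsigned int bits)
{
	/* 1ULL << 64 would be undefined, so saturate at the full 64-bit range. */
	return bits >= 64 ? UINT64_MAX : (1ULL << bits) - 1;
}

int main(void)
{
	/* A 32-bit address width corresponds to an inclusive limit of 0xffffffff. */
	printf("32 bits -> %#llx\n", (unsigned long long)addr_bits_to_limit(32));
	printf("64 bits -> %#llx\n", (unsigned long long)addr_bits_to_limit(64));
	return 0;
}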

drivers/acpi/scan.c

+1 -6
@@ -1675,12 +1675,7 @@ int acpi_dma_configure_id(struct device *dev, enum dev_dma_attr attr,
 	if (ret == -EPROBE_DEFER)
 		return -EPROBE_DEFER;
 
-	/*
-	 * Historically this routine doesn't fail driver probing due to errors
-	 * in acpi_iommu_configure_id().
-	 */
-
-	arch_setup_dma_ops(dev, 0, U64_MAX, attr == DEV_DMA_COHERENT);
+	arch_setup_dma_ops(dev, attr == DEV_DMA_COHERENT);
 
 	return 0;
 }

drivers/hv/hv_common.c

+1 -5
@@ -561,11 +561,7 @@ EXPORT_SYMBOL_GPL(hv_query_ext_cap);
 
 void hv_setup_dma_ops(struct device *dev, bool coherent)
 {
-	/*
-	 * Hyper-V does not offer a vIOMMU in the guest
-	 * VM, so pass 0/NULL for the IOMMU settings
-	 */
-	arch_setup_dma_ops(dev, 0, 0, coherent);
+	arch_setup_dma_ops(dev, coherent);
 }
 EXPORT_SYMBOL_GPL(hv_setup_dma_ops);
 
