Compare commits

10 Commits

Author SHA1 Message Date
openeuler-ci-bot
254389bd10
!1107 [sync] PR-1104: QEMU update to version 8.2.0-32:
From: @openeuler-sync-bot 
Reviewed-by: @imxcc 
Signed-off-by: @imxcc
2025-05-16 06:04:00 +00:00
Jiabo Feng
0900a0da5f QEMU update to version 8.2.0-32:
- target/i386: csv: Release CSV3 shared pages after unmapping DMA
- target/i386: Add new CPU model ClearwaterForest
- target/i386: add sha512, sm3, sm4 feature bits
- docs: Add GNR, SRF and CWF CPU models
- target/i386: Export BHI_NO bit to guests
- target/i386: Introduce SierraForest-v2 model
- vdpa/iommufd:Implement DMA mapping through the iommufd interface
- vdpa/iommufd:Introduce vdpa-iommufd module
- vdpa/iommufd:support associating iommufd backend for vDPA devices
- Kconfig/iommufd/VDPA: Update IOMMUFD module configuration dependencies. The vDPA module can also use IOMMUFD like the VFIO module.
- backends/iommufd: Get rid of qemu_open_old()
- backends/iommufd: Make iommufd_backend_*() return bool
- backends/iommufd: Fix missing ERRP_GUARD() for error_prepend()
- backends/iommufd: Remove mutex
- backends/iommufd: Remove check on number of backend users
- hw/intc: Add extioi ability of 256 vcpu interrupt routing
- hw/rtc: Fixed loongson rtc emulation errors
- hw/loongarch/boot: Adjust the loading position of the initrd
- target/loongarch: Fix the cpu unplug resource leak
- target/loongarch: fix vcpu reset command word issue
- vdpa:Fix dirty page bitmap synchronization not done after suspend for vdpa devices

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit a5212066e7516ff2a316e1b2feaa75dd5ee4d17a)
2025-05-15 17:01:41 +08:00
openeuler-ci-bot
c2e9f7fce2
!1102 [sync] PR-1092: QEMU update to version 8.2.0-31:
From: @openeuler-sync-bot 
Reviewed-by: @imxcc 
Signed-off-by: @imxcc
2025-05-15 01:13:49 +00:00
Jiabo Feng
20f9134cd6 QEMU update to version 8.2.0-31:
- target/arm: Change arm_cpu_mp_affinity when enabled IPIV feature
- fw_cfg: Don't set callback_opaque NULL in fw_cfg_modify_bytes_read()

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit 519065adc4ba430c349a235e25b346829814f0d9)
2025-05-14 17:13:12 +08:00
openeuler-ci-bot
c5c7b3546b
!1098 [sync] PR-1091: QEMU update to version 8.2.0-30:
From: @openeuler-sync-bot 
Reviewed-by: @imxcc 
Signed-off-by: @imxcc
2025-05-14 09:11:42 +00:00
Jiabo Feng
2608c0d3b4 QEMU update to version 8.2.0-30:
- Revert "linux-user: Print tid not pid with strace"
- gpex-acpi: Remove duplicate DSM #5
- smmuv3: Use default bus for arm-smmuv3-accel
- smmuv3: Change arm-smmuv3-nested name to arm-smmuv3-accel
- smmu-common: Return sysmem address space only for vfio-pci
- smmuv3: realize get_pasid_cap and set ssidsize with pasid
- vfio: Synthesize vPASID capability to VM
- backend/iommufd: Report PASID capability
- pci: Get pasid capability from vIOMMU
- smmuv3: Add support for page fault handling
- kvm: Translate MSI doorbell address only if it is valid
- hw/arm/smmuv3: Enable sva/stall IDR features
- iommufd.h: Updated to openeuler olk-6.6 kernel
- tests/data/acpi/virt: Update IORT acpi table
- hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested binding
- tests/qtest: Allow IORT acpi table to change
- hw/arm/virt-acpi-build: Build IORT with multiple SMMU nodes
- hw/arm/smmuv3: Associate a pci bus with a SMMUv3 Nested device
- hw/arm/smmuv3: Add initial support for SMMUv3 Nested device
- hw/arm/virt: Add an SMMU_IO_LEN macro
- hw/pci-host/gpex: [needs kernel fix] Allow to generate preserve boot config DSM #5
- tests/data/acpi: Update DSDT acpi tables
- acpi/gpex: Fix PCI Express Slot Information function 0 returned value
- tests/qtest: Allow DSDT acpi tables to change
- hw/arm/smmuv3: Forward cache invalidate commands via iommufd
- hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev
- hw/arm/smmuv3: Add missing STE invalidation
- hw/arm/smmuv3: Add smmu_dev_install_nested_ste() for CFGI_STE
- hw/arm/smmuv3: Check idr registers for STE_S1CDMAX and STE_S1STALLD
- hw/arm/smmuv3: Read host SMMU device info
- hw/arm/smmuv3: Ignore IOMMU_NOTIFIER_MAP for nested-smmuv3
- hw/arm/smmu-common: Return sysmem if stage-1 is bypassed
- hw/arm/smmu-common: Add iommufd helpers
- hw/arm/smmu-common: Add set/unset_iommu_device callback
- hw/arm/smmu-common: Extract smmu_get_sbus and smmu_get_sdev helpers
- hw/arm/smmu-common: Bypass emulated IOTLB for a nested SMMU
- hw/arm/smmu-common: Add a nested flag to SMMUState
- backends/iommufd: Introduce iommufd_viommu_invalidate_cache
- backends/iommufd: Introduce iommufd_vdev_alloc
- backends/iommufd: Introduce iommufd_backend_alloc_viommu
- vfio/iommufd: Implement [at|de]tach_hwpt handlers
- vfio/iommufd: Implement HostIOMMUDeviceClass::realize_late() handler
- HostIOMMUDevice: Introduce realize_late callback
- vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
- backends/iommufd: Add helpers for invalidating user-managed HWPT
- Update iommufd.h header for vSVA
- vfio/common: Allow disabling device dirty page tracking
- vfio/migration: Don't block migration device dirty tracking is unsupported
- vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
- vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
- vfio/iommufd: Probe and request hwpt dirty tracking capability
- vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
- vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
- vfio/{iommufd,container}: Remove caps::aw_bits
- HostIOMMUDevice: Store the VFIO/VDPA agent
- vfio/iommufd: Introduce auto domain creation
- vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
- vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
- vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
- backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
- vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
- vfio/pci: Extract mdev check into an helper
- intel_iommu: Check compatibility with host IOMMU capabilities
- intel_iommu: Implement [set|unset]_iommu_device() callbacks
- intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
- vfio/pci: Pass HostIOMMUDevice to vIOMMU
- hw/pci: Introduce pci_device_[set|unset]_iommu_device()
- hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
- vfio: Create host IOMMU device instance
- backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
- vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
- vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
- backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
- vfio/container: Implement HostIOMMUDeviceClass::realize() handler
- range: Introduce range_get_last_bit()
- backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices
- vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
- backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
- backends: Introduce HostIOMMUDevice abstract
- vfio/iommufd: Remove CONFIG_IOMMUFD usage
- vfio/spapr: Extend VFIOIOMMUOps with a release handler
- vfio/spapr: Only compile sPAPR IOMMU support when needed
- vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface
- vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface
- vfio/container: Intoduce a new VFIOIOMMUClass::setup handler
- vfio/container: Introduce a VFIOIOMMU legacy QOM interface
- vfio/container: Introduce a VFIOIOMMU QOM interface
- vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()
- vfio/container: Introduce vfio_legacy_setup() for further cleanups
- docs/devel: Add VFIO iommufd backend documentation
- vfio: Introduce a helper function to initialize VFIODevice
- vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init
- vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init
- vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init
- vfio/pci: Move VFIODevice initializations in vfio_instance_init
- hw/i386: Activate IOMMUFD for q35 machines
- kconfig: Activate IOMMUFD for s390x machines
- hw/arm: Activate IOMMUFD for virt machines
- vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks
- vfio/ccw: Make vfio cdev pre-openable by passing a file handle
- vfio/ccw: Allow the selection of a given iommu backend
- vfio/ap: Make vfio cdev pre-openable by passing a file handle
- vfio/ap: Allow the selection of a given iommu backend
- vfio/platform: Make vfio cdev pre-openable by passing a file handle
- vfio/platform: Allow the selection of a given iommu backend
- vfio/pci: Make vfio cdev pre-openable by passing a file handle
- vfio/pci: Allow the selection of a given iommu backend
- vfio/iommufd: Enable pci hot reset through iommufd cdev interface
- vfio/pci: Introduce a vfio pci hot reset interface
- vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info
- vfio/iommufd: Add support for iova_ranges and pgsizes
- vfio/iommufd: Relax assert check for iommufd backend
- vfio/iommufd: Implement the iommufd backend
- vfio/common: return early if space isn't empty
- util/char_dev: Add open_cdev()
- backends/iommufd: Introduce the iommufd object
- vfio/spapr: Move hostwin_list into spapr container
- vfio/spapr: Move prereg_listener into spapr container
- vfio/spapr: switch to spapr IOMMU BE add/del_section_window
- vfio/spapr: Introduce spapr backend and target interface
- vfio/container: Implement attach/detach_device
- vfio/container: Move iova_ranges to base container
- vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container
- vfio/container: Move listener to base container
- vfio/container: Move vrdl_list to base container
- vfio/container: Move pgsizes and dma_max_mappings to base container
- vfio/container: Convert functions to base container
- vfio/container: Move per container device list in base container
- vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API
- vfio/container: Move space field to base container
- vfio/common: Move giommu_list in base container
- vfio/common: Introduce vfio_container_init/destroy helper
- vfio/container: Switch to dma_map|unmap API
- vfio/container: Introduce a empty VFIOIOMMUOps
- vfio: Introduce base object for VFIOContainer and targeted interface
- cryptodev: Fix error handling in cryptodev_lkcf_execute_task()
- hw/xen: Fix xen_bus_realize() error handling
- hw/misc/aspeed_hace: Fix buffer overflow in has_padding function
- target/s390x: Fix a typo in s390_cpu_class_init()
- hw/sd/sdhci: free irq on exit
- hw/ufs: free irq on exit
- hw/pci-host/designware: Fix ATU_UPPER_TARGET register access
- target/i386: Make invtsc migratable when user sets tsc-khz explicitly
- target/i386: Construct CPUID 2 as stateful iff times > 1
- target/i386: Enable fdp-excptn-only and zero-fcs-fds
- target/i386: Don't construct a all-zero entry for CPUID[0xD 0x3f]
- i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
- target/i386: pass X86CPU to x86_cpu_get_supported_feature_word
- target/i386: Raise the highest index value used for any VMCS encoding
- target/i386: Add VMX control bits for nested FRED support
- target/i386: Delete duplicated macro definition CR4_FRED_MASK
- target/i386: Add get/set/migrate support for FRED MSRs
- target/i386: enumerate VMX nested-exception support
- vmxcap: add support for VMX FRED controls
- target/i386: mark CR4.FRED not reserved
- target/i386: add support for FRED in CPUID enumeration
- target/i386: fix feature dependency for WAITPKG
- target/i386: Add more features enumerated by CPUID.7.2.EDX
- net: fix build when libbpf is disabled, but libxdp is enabled
- hw/nvme: fix invalid endian conversion
- hw/nvme: fix invalid check on mcl
- backends/cryptodev: Do not ignore throttle/backends Errors
- backends/cryptodev: Do not abort for invalid session ID
- virtcca: add kvm isolation when get tmi version.
- qga: Don't daemonize before channel is initialized
- qga: Add log to guest-fsfreeze-thaw command
- backends: VirtCCA: cvm_gpa_start supports both 1GB and 3GB
- BUGFIX: Enforce isolation for virtcca_shared_hugepage
- arm: VirtCCA: qemu CoDA support UEFI boot
- arm: VirtCCA: Compatibility with older versions of TMM and the kernel
- arm: VirtCCA: qemu uefi boot support kae
- arm: VirtCCA: CVM support UEFI boot

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit 85fd7a435d8203dde56fedc4c8f500e41faf132c)
2025-05-14 15:07:17 +08:00
openeuler-ci-bot
91ced6f504
!1081 [sync] PR-1077: QEMU update to version 8.2.0-29:
From: @openeuler-sync-bot 
Reviewed-by: @imxcc 
Signed-off-by: @imxcc
2025-02-26 02:59:56 +00:00
Jiabo Feng
40f2c98783 QEMU update to version 8.2.0-29:
- target/i386: csv: Support inject secret for CSV3 guest only if the extension is enabled
- target/i386: csv: Support load kernel hashes for CSV3 guest only if the extension is enabled
- target/i386: csv: Request to set private memory of CSV3 guest if the extension is enabled
- target/i386: kvm: Support to get and enable extensions for Hygon CoCo guest
- qapi/qom,target/i386: csv-guest: Introduce secret-header-file=str and secret-file=str options
- bakcend: VirtCCA:resolve hugepage memory waste issue in vhost-user scenario
- parallels: fix ext_off assertion failure due to overflow
- backends/cryptodev-vhost-user: Fix local_error leaks
- hw/usb/hcd-ehci: Fix debug printf format string
- target/riscv/vector_helper.c: fix 'vmvr_v' memcpy endianess
- target/riscv/vector_helper.c: optimize loops in ldst helpers
- target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
- target/hexagon: don't look for static glib
- virtio-net: Fix network stall at the host side waiting for kick
- Add if condition to avoid assertion failed error in blockdev_init
- target/arm: Use float_status copy in sme_fmopa_s
- target/arm: take HSTR traps of cp15 accesses to EL2, not EL1
- target/arm: Reinstate "vfp" property on AArch32 CPUs
- target/i386/cpu: Fix notes for CPU models
- target/arm: LDAPR should honour SCTLR_ELx.nAA
- target/riscv: Avoid bad shift in riscv_cpu_do_interrupt()
- hvf: remove unused but set variable
- hw/misc/nrf51_rng: Don't use BIT_MASK() when we mean BIT()
- Avoid taking address of out-of-bounds array index
- target/arm: Fix VCMLA Dd, Dn, Dm[idx]
- target/arm: Fix UMOPA/UMOPS of 16-bit values
- target/arm: Fix SVE/SME gross MTE suppression checks
- target/arm: Fix nregs computation in do_{ld,st}_zpa
- crypto: fix error check on gcry_md_open
- Change vmstate_cpuhp_sts vmstateDescription version_id
- hw/pci: Remove unused pci_irq_pulse() method
- hw/intc: Don't clear pending bits on IRQ lowering
- target/arm: Drop user-only special case in sve_stN_r
- migration: Ensure vmstate_save() sets errp
- target/i386: fix hang when using slow path for ptw_setl
- contrib/plugins: add compat for g_memdup2
- hw/audio/hda: fix memory leak on audio setup
- crypto: perform runtime check for hash/hmac support in gcrypt
- target/arm: Fix incorrect aa64_tidcp1 feature check
- target/arm: fix exception syndrome for AArch32 bkpt insn
- target/arm: Don't get MDCR_EL2 in pmu_counter_enabled() before checking ARM_FEATURE_PMU
- linux-user: Print tid not pid with strace
- target/arm: Fix A64 scalar SQSHRN and SQRSHRN
- target/arm: Don't assert for 128-bit tile accesses when SVL is 128
- hw/timer/exynos4210_mct: fix possible int overflow
- target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
- hw/audio/virtio-snd: Always use little endian audio format
- target/riscv: Fix vcompress with rvv_ta_all_1s
- usb-hub: Fix handling port power control messages

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit d4a20b24ff377fd07fcbf2b72eecaf07a3ac4cc0)
2025-02-26 09:57:09 +08:00
openeuler-ci-bot
cc48fa494a
!1073 [sync] PR-1070: QEMU update to version 8.2.0-28:
From: @openeuler-sync-bot 
Reviewed-by: @imxcc 
Signed-off-by: @imxcc
2025-02-26 01:56:46 +00:00
Jiabo Feng
c870ecf326 QEMU update to version 8.2.0-28:
- hw/misc/mos6522: Fix bad class definition of the MOS6522 device
- target/i386: Fix minor typo in NO_NESTED_DATA_BP feature bit
- cpu: ensure we don't call start_exclusive from cpu_exec
- Avoid unaligned fetch in ladr_match()
- audio/audio.c: remove trailing newline in error_setg
- acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN event from bits VM
- linux-user: Tolerate CONFIG_LSM_MMAP_MIN_ADDR
- accel/tcg: Fix user-only probe_access_internal plugin
- linux-user: Honor elf alignment when placing images
- Reserve address for MSI mapping in the CVM scenario.

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit 3ab56c27fe6b593be9a24f27b52b2730efa05304)
2025-02-21 17:42:49 +08:00
256 changed files with 27139 additions and 1 deletions

@@ -0,0 +1,26 @@
From b78860242162ab5ef1e73973eeca36e0261bfeb5 Mon Sep 17 00:00:00 2001
From: xiaoyuliang <xiaoyuliang@kylinos.cn>
Date: Wed, 21 Aug 2024 11:26:41 +0800
Subject: [PATCH] Add if condition to avoid assertion failed error in
blockdev_init
---
blockdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/blockdev.c b/blockdev.c
index bc7a947dea..d2fe5c361c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -588,7 +588,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
read_only = qemu_opt_get_bool(opts, BDRV_OPT_READ_ONLY, false);
- if (!file || !*file) {
+ if ((!file || !*file) && qdict_size(bs_opts) == 2) {
cache = qdict_get_try_str(bs_opts, BDRV_OPT_CACHE_NO_FLUSH);
if (cache && !strcmp(cache, "on")) {
bdrv_flags |= BDRV_O_NO_FLUSH;
--
2.41.0.windows.1

@@ -0,0 +1,39 @@
From 8ac5c38a54d407b363d6633eb01806b0e9aaa15e Mon Sep 17 00:00:00 2001
From: yinxiuxiu <yinxiuxiu_yewu@cmss.chinamobile.com>
Date: Fri, 22 Nov 2024 14:45:09 +0800
Subject: [PATCH] Avoid taking address of out-of-bounds array index
Signed-off-by: yinxiuxiu <yinxiuxiu_yewu@cmss.chinamobile.com>
---
hw/intc/openpic.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/hw/intc/openpic.c b/hw/intc/openpic.c
index 0f99b77a17..d74ec11af4 100644
--- a/hw/intc/openpic.c
+++ b/hw/intc/openpic.c
@@ -1031,13 +1031,14 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
s_IRQ = IRQ_get_next(opp, &dst->servicing);
/* Check queued interrupts. */
n_IRQ = IRQ_get_next(opp, &dst->raised);
- src = &opp->src[n_IRQ];
- if (n_IRQ != -1 &&
- (s_IRQ == -1 ||
- IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
- DPRINTF("Raise OpenPIC INT output cpu %d irq %d",
- idx, n_IRQ);
- qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+ if (n_IRQ != -1) {
+ src = &opp->src[n_IRQ];
+ if (s_IRQ == -1 ||
+ IVPR_PRIORITY(src->ivpr) > dst->servicing.priority) {
+ DPRINTF("Raise OpenPIC INT output cpu %d irq %d",
+ idx, n_IRQ);
+ qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+ }
}
break;
default:
--
2.41.0.windows.1
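
The openpic change above moves the `&opp->src[n_IRQ]` computation inside the `n_IRQ != -1` check, so the array is never indexed with -1. A minimal sketch of the same guard-then-index pattern follows; `struct irq_source`, `MAX_SRC` and `should_raise` are hypothetical names invented for illustration and are not part of QEMU.

```c
#define MAX_SRC 4   /* hypothetical table size, for illustration only */

/* Hypothetical stand-in for an interrupt source entry. */
struct irq_source {
    int priority;
};

/*
 * Returns 1 if the pending source n_irq should raise the output,
 * 0 otherwise. A value of -1 means "none" for both n_irq and s_irq.
 */
static int should_raise(const struct irq_source *table, int n_irq,
                        int s_irq, int servicing_priority)
{
    if (n_irq == -1) {
        return 0;   /* bail out before forming &table[-1] */
    }

    /* The index is known to be valid here, so taking the address is safe. */
    const struct irq_source *src = &table[n_irq];

    return s_irq == -1 || src->priority > servicing_priority;
}
```

Forming a pointer to an element before the start of an array is undefined behaviour in C even if the pointer is never dereferenced, which is why the check has to come first rather than merely before the dereference.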

@@ -0,0 +1,37 @@
From d2ee29691b6d6b48ba8da179e97572f5a6684a9d Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 14:47:25 +0800
Subject: [PATCH] Avoid unaligned fetch in ladr_match()
cherry-pick from 6a5287ce80470bb8df95901d73ee779a64e70c3a
There is no guarantee that the PCNetState is allocated such that
csr[8] is allocated on an 8-byte boundary. Since not all hosts are
capable of unaligned fetches the 16-bit elements need to be fetched
individually to avoid a potential fault. Closes issue #2143
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2143
Signed-off-by: Nick Briggs <nicholas.h.briggs@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
hw/net/pcnet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/net/pcnet.c b/hw/net/pcnet.c
index a7e123e60d..7d574f487b 100644
--- a/hw/net/pcnet.c
+++ b/hw/net/pcnet.c
@@ -632,7 +632,7 @@ static inline int ladr_match(PCNetState *s, const uint8_t *buf, int size)
{
struct qemu_ether_header *hdr = (void *)buf;
if ((*(hdr->ether_dhost)&0x01) &&
- ((uint64_t *)&s->csr[8])[0] != 0LL) {
+ (s->csr[8] | s->csr[9] | s->csr[10] | s->csr[11]) != 0) {
uint8_t ladr[8] = {
s->csr[8] & 0xff, s->csr[8] >> 8,
s->csr[9] & 0xff, s->csr[9] >> 8,
--
2.41.0.windows.1
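
The pcnet change above replaces a single 64-bit load through a cast pointer with four separate 16-bit reads that are OR-ed together. A minimal sketch of that test in isolation, assuming the CSRs are held as an array of `uint16_t` as the hunk suggests; the function name `ladr_filter_nonzero` is hypothetical.

```c
#include <stdint.h>

/*
 * Returns non-zero if any of the four 16-bit words csr[8..11] is set.
 * Each element is read at its natural 2-byte alignment, so no 8-byte
 * alignment of &csr[8] is required.
 */
static int ladr_filter_nonzero(const uint16_t *csr)
{
    return (csr[8] | csr[9] | csr[10] | csr[11]) != 0;
}
```

The result is the same as comparing the combined 64 bits against zero, but it does not depend on how the containing structure happens to be laid out in memory.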

@@ -0,0 +1,43 @@
From 458d90e226d5833661f9257f6af57c14f9b9bdfe Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:52:21 -0400
Subject: [PATCH] BUGFIX: Enforce isolation for virtcca_shared_hugepage
Add memory isolation enforcement when virtcca hugepage is disabled.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
hw/core/numa.c | 3 ++-
hw/virtio/vhost.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/hw/core/numa.c b/hw/core/numa.c
index e7c48dab61..c691578ef5 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -728,7 +728,8 @@ void numa_complete_configuration(MachineState *ms)
memory_region_init(ms->ram, OBJECT(ms), mc->default_ram_id,
ms->ram_size);
numa_init_memdev_container(ms, ms->ram);
- if (virtcca_cvm_enabled() && virtcca_shared_hugepage->ram_block) {
+ if (virtcca_cvm_enabled() && virtcca_shared_hugepage &&
+ virtcca_shared_hugepage->ram_block) {
virtcca_shared_memory_configuration(ms);
}
}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 8b95558013..4bf0b03977 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1617,7 +1617,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
hdev->log_size = 0;
hdev->log_enabled = false;
hdev->started = false;
- if (virtcca_cvm_enabled()) {
+ if (virtcca_cvm_enabled() && virtcca_shared_hugepage && virtcca_shared_hugepage->ram_block) {
memory_listener_register(&hdev->memory_listener,
&address_space_virtcca_shared_memory);
} else {
--
2.41.0.windows.1

Binary file not shown.

@@ -0,0 +1,30 @@
From 0fc0686798aba89c4d4d94f7e0c8e513cfc473b1 Mon Sep 17 00:00:00 2001
From: lijunwei <lijunwei@kylinos.cn>
Date: Fri, 22 Nov 2024 17:09:17 +0800
Subject: [PATCH] Change vmstate_cpuhp_sts vmstateDescription version_id
fix live migration failed error message:
"qemu-kvm: Missing section footer for 0000:00:01.3/piix4_pm"
change vmstate_cpuhp_sts vmstateDescription version_id
Signed-off-by: lijunwei <lijunwei@kylinos.cn>
---
hw/acpi/cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 292e1daca2..4ab27ac66e 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -316,7 +316,7 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
static const VMStateDescription vmstate_cpuhp_sts = {
.name = "CPU hotplug device state",
- .version_id = 1,
+ .version_id = 2,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
--
2.41.0.windows.1
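
The change above raises `version_id` to 2 while leaving `minimum_version_id` at 1. The sketch below assumes the usual vmstate loading rule, under which a destination accepts an incoming section only if its version lies between the destination's `minimum_version_id` and `version_id`. `struct state_desc` and `can_load` are hypothetical names and do not reproduce QEMU's actual loader.

```c
#include <stdbool.h>

/* Hypothetical reduction of a state description to its version fields. */
struct state_desc {
    int version_id;          /* newest format this build writes and reads */
    int minimum_version_id;  /* oldest format this build still reads */
};

/* Accept a stream only if its version falls inside the supported window. */
static bool can_load(const struct state_desc *d, int stream_version)
{
    return stream_version >= d->minimum_version_id &&
           stream_version <= d->version_id;
}
```

Under that assumption, a build with the patch still loads version 1 streams from older sources, whereas an unpatched destination would refuse a version 2 stream sent by a patched source.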

@@ -0,0 +1,93 @@
From 53a82c6a5a22bb41e9bd3f754479baf4ce0845bf Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Mon, 5 Aug 2024 09:29:00 +0800
Subject: [PATCH] HostIOMMUDevice: Introduce realize_late callback
Previously we have a realize() callback which is called before attachment.
But there are still some elements e.g., ioas not ready before attachment.
So we need a realize_late() callback to further initialize them.
Currently, this callback is only useful for iommufd backend. For legacy
backend nothing needs to be initialized after attachment.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/common.c | 18 +++++++++++++++---
include/sysemu/host_iommu_device.h | 17 +++++++++++++++++
2 files changed, 32 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a8bc1c6055..0be63c5fbc 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1654,6 +1654,7 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
const VFIOIOMMUClass *ops =
VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
HostIOMMUDevice *hiod = NULL;
+ HostIOMMUDeviceClass *hiod_ops = NULL;
int ret;
if (vbasedev->iommufd) {
@@ -1664,17 +1665,28 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
if (!vbasedev->mdev) {
hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
+ hiod_ops = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
vbasedev->hiod = hiod;
}
ret = ops->attach_device(name, vbasedev, as, errp);
if (ret) {
- object_unref(hiod);
- vbasedev->hiod = NULL;
- return ret;
+ goto err_attach;
+ }
+
+ if (hiod_ops && hiod_ops->realize_late &&
+ !hiod_ops->realize_late(hiod, vbasedev, errp)) {
+ ops->detach_device(vbasedev);
+ ret = -EINVAL;
+ goto err_attach;
}
return 0;
+
+err_attach:
+ object_unref(hiod);
+ vbasedev->hiod = NULL;
+ return ret;
}
void vfio_detach_device(VFIODevice *vbasedev)
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index e4d8300350..84131f5495 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -64,6 +64,23 @@ struct HostIOMMUDeviceClass {
* Returns: true on success, false on failure.
*/
bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+ /**
+ * @realize_late: initialize host IOMMU device instance after attachment,
+ * some elements e.g., ioas are ready only after attachment.
+ * This callback initialize them.
+ *
+ * Optional callback.
+ *
+ * @hiod: pointer to a host IOMMU device instance.
+ *
+ * @opaque: pointer to agent device of this host IOMMU device,
+ * e.g., VFIO base device or VDPA device.
+ *
+ * @errp: pass an Error out when realize fails.
+ *
+ * Returns: true on success, false on failure.
+ */
+ bool (*realize_late)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
/**
* @get_cap: check if a host IOMMU device capability is supported.
*
--
2.41.0.windows.1
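
The `vfio_attach_device()` hunk above adds an optional second initialisation step after attachment and routes both failure cases through one `err_attach` label. A simplified sketch of that control flow follows. `struct dev_ops`, the `demo_*` callbacks and the `refs` counter are hypothetical stand-ins (the counter replaces `object_new()`/`object_unref()`), and the sketch ignores the `mdev` case in which no host IOMMU device is created.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operations table; realize_late is optional and may be NULL. */
struct dev_ops {
    int  (*attach)(void *dev);
    void (*detach)(void *dev);
    bool (*realize_late)(void *dev);
};

/* Demo callbacks, invented for this sketch. */
static int  demo_attach_ok(void *dev) { (void)dev; return 0; }
static void demo_detach(void *dev)    { ++*(int *)dev; }  /* counts detach calls */
static bool demo_late_fail(void *dev) { (void)dev; return false; }

static int attach_with_late_init(const struct dev_ops *ops, void *dev,
                                 int *refs)
{
    int ret;

    ++*refs;                        /* stands in for object_new() */

    ret = ops->attach(dev);
    if (ret) {
        goto err_attach;
    }

    /* Run the late step only when the backend provides one. */
    if (ops->realize_late && !ops->realize_late(dev)) {
        ops->detach(dev);           /* undo the successful attach first */
        ret = -EINVAL;
        goto err_attach;
    }
    return 0;

err_attach:
    --*refs;                        /* stands in for object_unref() */
    return ret;
}
```

Checking the function pointer before calling it is what lets the legacy backend leave the callback unset, as the commit message describes.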

@@ -0,0 +1,57 @@
From 35f33bf18826286c9e9fc739a893b9915c71f43c Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Fri, 14 Jun 2024 11:52:51 +0200
Subject: [PATCH] HostIOMMUDevice: Store the VFIO/VDPA agent
Store the agent device (VFIO or VDPA) in the host IOMMU device.
This will allow easy access to some of its resources.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/vfio/container.c | 1 +
hw/vfio/iommufd.c | 2 ++
include/sysemu/host_iommu_device.h | 1 +
3 files changed, 4 insertions(+)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 10f7635425..8a5a112b6b 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1259,6 +1259,7 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
hiod->name = g_strdup(vdev->name);
hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
+ hiod->agent = opaque;
return true;
}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 3b75cba26c..7a069ca576 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -735,6 +735,8 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
} data;
uint64_t hw_caps;
+ hiod->agent = opaque;
+
if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
&type, &data, sizeof(data),
&hw_caps, errp)) {
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index a57873958b..3e5f058e7b 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -34,6 +34,7 @@ struct HostIOMMUDevice {
Object parent_obj;
char *name;
+ void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
HostIOMMUDeviceCaps caps;
};
--
2.41.0.windows.1

@@ -0,0 +1,38 @@
From 08a4aa240587fed26c17271bf9af87f0a5997f4a Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Wed, 26 Mar 2025 18:59:33 +0800
Subject: [PATCH] Kconfig/iommufd/VDPA: Update IOMMUFD module configuration
dependencies The vDPA module can also use IOMMUFD like the VFIO module.
Therefore, adjust Kconfig to remove the dependency of IOMMUFD on VFIO and add
a reverse dependency on IOMMUFD for vDPA
Signed-off-by: libai <libai12@huawei.com>
---
Kconfig.host | 1 +
backends/Kconfig | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/Kconfig.host b/Kconfig.host
index f496475f8e..faf58d9af5 100644
--- a/Kconfig.host
+++ b/Kconfig.host
@@ -28,6 +28,7 @@ config VHOST_USER
config VHOST_VDPA
bool
+ select IOMMUFD
config VHOST_KERNEL
bool
diff --git a/backends/Kconfig b/backends/Kconfig
index 2cb23f62fa..8d0be5a263 100644
--- a/backends/Kconfig
+++ b/backends/Kconfig
@@ -2,4 +2,3 @@ source tpm/Kconfig
config IOMMUFD
bool
- depends on VFIO
--
2.41.0.windows.1

@@ -0,0 +1,41 @@
From e698238a5fa6e78fdffc8269d59884df69da3434 Mon Sep 17 00:00:00 2001
From: chenzheng <chenzheng71@huawei.com>
Date: Thu, 5 Dec 2024 11:06:57 +0000
Subject: [PATCH] Reserve address for MSI mapping in the CVM scenario.
Signed-off-by: yangxiangkai@huawei.com
---
hw/arm/virt.c | 3 ++-
include/hw/arm/virt.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9efcec85e..8823f2ed1c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -162,8 +162,9 @@ static const MemMapEntry base_memmap[] = {
[VIRT_PVTIME] = { 0x090a0000, 0x00010000 },
[VIRT_SECURE_GPIO] = { 0x090b0000, 0x00001000 },
[VIRT_CPUHP_ACPI] = { 0x090c0000, ACPI_CPU_HOTPLUG_REG_LEN},
- /* In the virtCCA scenario, this space is used for MSI interrupt mapping */
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
+ /* In the virtCCA scenario, this space is used for MSI interrupt mapping */
+ [VIRT_CVM_MSI] = { 0x0a001000, 0x00fff000 },
[VIRT_CPUFREQ] = { 0x0b000000, 0x00010000 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
[VIRT_PLATFORM_BUS] = { 0x0c000000, 0x02000000 },
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 4b7dc61c24..345b2d5594 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -121,6 +121,7 @@ enum {
VIRT_UART,
VIRT_CPUFREQ,
VIRT_MMIO,
+ VIRT_CVM_MSI,
VIRT_RTC,
VIRT_FW_CFG,
VIRT_PCIE,
--
2.41.0.windows.1
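
The new `VIRT_CVM_MSI` entry above is placed between the `VIRT_MMIO` and `VIRT_CPUFREQ` entries of `base_memmap`. The sketch below checks the three base/size pairs exactly as they appear in the hunk for overlap. `struct mem_entry` and `entries_overlap` are hypothetical names, and only the single listed `VIRT_MMIO` entry is considered, not any repetition of it per virtio transport, since that count is not visible in this hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of a { base, size } memory map entry. */
struct mem_entry {
    uint64_t base;
    uint64_t size;
};

/* Two half-open ranges [base, base + size) overlap if each starts
 * before the other ends. */
static bool entries_overlap(struct mem_entry a, struct mem_entry b)
{
    return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

With the values from the hunk, `0x0a001000 + 0x00fff000` equals `0x0b000000`, so the reserved MSI window ends exactly where `VIRT_CPUFREQ` begins.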

@@ -0,0 +1,32 @@
From c0717e82e34f96af456309b3786a6808e8e324e4 Mon Sep 17 00:00:00 2001
From: huangyan <huangyan@cdjrlc.com>
Date: Wed, 16 Apr 2025 00:43:27 +0800
Subject: [PATCH] Revert "linux-user: Print tid not pid with strace"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This reverts commit 2f37362de1d971cc90c35405705bfa22a33f6cd8.
* this change is incomplete, "get_task_state" lacks the implementation.
* Moreover, it requires all calls to the "getpid" function to be changed to use "get_task_state", which would cause too much disruption, and it has not been applied in the upstream 8.2.0.
---
linux-user/strace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/linux-user/strace.c b/linux-user/strace.c
index ac9177ebe4..cf26e55264 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -4176,7 +4176,7 @@ print_syscall(CPUArchState *cpu_env, int num,
if (!f) {
return;
}
- fprintf(f, "%d ", get_task_state(env_cpu(cpu_env))->ts_tid);
+ fprintf(f, "%d ", getpid());
for (i = 0; i < nsyscalls; i++) {
if (scnames[i].nr == num) {
--
2.41.0.windows.1

@@ -0,0 +1,514 @@
From ac715e361fdb6d92169b3b3f5964405c816a13ac Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Tue, 14 Jan 2025 10:29:24 +0000
Subject: [PATCH] Update iommufd.h header for vSVA
This is based on Linaro UADK branch:
https://github.com/Linaro/linux-kernel-uadk/tree/6.12-wip-10.26
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
linux-headers/linux/iommufd.h | 394 ++++++++++++++++++++++++++++++++--
1 file changed, 371 insertions(+), 23 deletions(-)
diff --git a/linux-headers/linux/iommufd.h b/linux-headers/linux/iommufd.h
index 806d98d09c..41559c6064 100644
--- a/linux-headers/linux/iommufd.h
+++ b/linux-headers/linux/iommufd.h
@@ -37,18 +37,22 @@
enum {
IOMMUFD_CMD_BASE = 0x80,
IOMMUFD_CMD_DESTROY = IOMMUFD_CMD_BASE,
- IOMMUFD_CMD_IOAS_ALLOC,
- IOMMUFD_CMD_IOAS_ALLOW_IOVAS,
- IOMMUFD_CMD_IOAS_COPY,
- IOMMUFD_CMD_IOAS_IOVA_RANGES,
- IOMMUFD_CMD_IOAS_MAP,
- IOMMUFD_CMD_IOAS_UNMAP,
- IOMMUFD_CMD_OPTION,
- IOMMUFD_CMD_VFIO_IOAS,
- IOMMUFD_CMD_HWPT_ALLOC,
- IOMMUFD_CMD_GET_HW_INFO,
- IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING,
- IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP,
+ IOMMUFD_CMD_IOAS_ALLOC = 0x81,
+ IOMMUFD_CMD_IOAS_ALLOW_IOVAS = 0x82,
+ IOMMUFD_CMD_IOAS_COPY = 0x83,
+ IOMMUFD_CMD_IOAS_IOVA_RANGES = 0x84,
+ IOMMUFD_CMD_IOAS_MAP = 0x85,
+ IOMMUFD_CMD_IOAS_UNMAP = 0x86,
+ IOMMUFD_CMD_OPTION = 0x87,
+ IOMMUFD_CMD_VFIO_IOAS = 0x88,
+ IOMMUFD_CMD_HWPT_ALLOC = 0x89,
+ IOMMUFD_CMD_GET_HW_INFO = 0x8a,
+ IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING = 0x8b,
+ IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
+ IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
+ IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
+ IOMMUFD_CMD_VIOMMU_ALLOC = 0x8f,
+ IOMMUFD_CMD_VDEVICE_ALLOC = 0x90,
};
/**
@@ -355,10 +359,13 @@ struct iommu_vfio_ioas {
* the parent HWPT in a nesting configuration.
* @IOMMU_HWPT_ALLOC_DIRTY_TRACKING: Dirty tracking support for device IOMMU is
* enforced on device attachment
+ * @IOMMU_HWPT_FAULT_ID_VALID: The fault_id field of hwpt allocation data is
+ * valid.
*/
enum iommufd_hwpt_alloc_flags {
IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0,
IOMMU_HWPT_ALLOC_DIRTY_TRACKING = 1 << 1,
+ IOMMU_HWPT_FAULT_ID_VALID = 1 << 2,
};
/**
@@ -389,14 +396,34 @@ struct iommu_hwpt_vtd_s1 {
__u32 __reserved;
};
+/**
+ * struct iommu_hwpt_arm_smmuv3 - ARM SMMUv3 Context Descriptor Table info
+ * (IOMMU_HWPT_DATA_ARM_SMMUV3)
+ *
+ * @ste: The first two double words of the user space Stream Table Entry for
+ * a user stage-1 Context Descriptor Table. Must be little-endian.
+ * Allowed fields: (Refer to "5.2 Stream Table Entry" in SMMUv3 HW Spec)
+ * - word-0: V, Cfg, S1Fmt, S1ContextPtr, S1CDMax
+ * - word-1: EATS, S1DSS, S1CIR, S1COR, S1CSH, S1STALLD
+ *
+ * -EIO will be returned if @ste is not legal or contains any non-allowed field.
+ * Cfg can be used to select a S1, Bypass or Abort configuration. A Bypass
+ * nested domain will translate the same as the nesting parent.
+ */
+struct iommu_hwpt_arm_smmuv3 {
+ __aligned_le64 ste[2];
+};
+
/**
* enum iommu_hwpt_data_type - IOMMU HWPT Data Type
* @IOMMU_HWPT_DATA_NONE: no data
* @IOMMU_HWPT_DATA_VTD_S1: Intel VT-d stage-1 page table
+ * @IOMMU_HWPT_DATA_ARM_SMMUV3: ARM SMMUv3 Context Descriptor Table
*/
enum iommu_hwpt_data_type {
- IOMMU_HWPT_DATA_NONE,
- IOMMU_HWPT_DATA_VTD_S1,
+ IOMMU_HWPT_DATA_NONE = 0,
+ IOMMU_HWPT_DATA_VTD_S1 = 1,
+ IOMMU_HWPT_DATA_ARM_SMMUV3 = 2,
};
/**
@@ -404,12 +431,15 @@ enum iommu_hwpt_data_type {
* @size: sizeof(struct iommu_hwpt_alloc)
* @flags: Combination of enum iommufd_hwpt_alloc_flags
* @dev_id: The device to allocate this HWPT for
- * @pt_id: The IOAS or HWPT to connect this HWPT to
+ * @pt_id: The IOAS or HWPT or vIOMMU to connect this HWPT to
* @out_hwpt_id: The ID of the new HWPT
* @__reserved: Must be 0
* @data_type: One of enum iommu_hwpt_data_type
* @data_len: Length of the type specific data
* @data_uptr: User pointer to the type specific data
+ * @fault_id: The ID of IOMMUFD_FAULT object. Valid only if flags field of
+ * IOMMU_HWPT_FAULT_ID_VALID is set.
+ * @__reserved2: Padding to 64-bit alignment. Must be 0.
*
* Explicitly allocate a hardware page table object. This is the same object
* type that is returned by iommufd_device_attach() and represents the
@@ -420,11 +450,13 @@ enum iommu_hwpt_data_type {
* IOMMU_HWPT_DATA_NONE. The HWPT can be allocated as a parent HWPT for a
* nesting configuration by passing IOMMU_HWPT_ALLOC_NEST_PARENT via @flags.
*
- * A user-managed nested HWPT will be created from a given parent HWPT via
- * @pt_id, in which the parent HWPT must be allocated previously via the
- * same ioctl from a given IOAS (@pt_id). In this case, the @data_type
- * must be set to a pre-defined type corresponding to an I/O page table
- * type supported by the underlying IOMMU hardware.
+ * A user-managed nested HWPT will be created from a given vIOMMU (wrapping a
+ * parent HWPT) or a parent HWPT via @pt_id, in which the parent HWPT must be
+ * allocated previously via the same ioctl from a given IOAS (@pt_id). In this
+ * case, the @data_type must be set to a pre-defined type corresponding to an
+ * I/O page table type supported by the underlying IOMMU hardware. The device
+ * via @dev_id and the vIOMMU via @pt_id must be associated to the same IOMMU
+ * instance.
*
* If the @data_type is set to IOMMU_HWPT_DATA_NONE, @data_len and
* @data_uptr should be zero. Otherwise, both @data_len and @data_uptr
@@ -440,6 +472,8 @@ struct iommu_hwpt_alloc {
__u32 data_type;
__u32 data_len;
__aligned_u64 data_uptr;
+ __u32 fault_id;
+ __u32 __reserved2;
};
#define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)
@@ -474,15 +508,50 @@ struct iommu_hw_info_vtd {
__aligned_u64 ecap_reg;
};
+/**
+ * struct iommu_hw_info_arm_smmuv3 - ARM SMMUv3 hardware information
+ * (IOMMU_HW_INFO_TYPE_ARM_SMMUV3)
+ *
+ * @flags: Must be set to 0
+ * @__reserved: Must be 0
+ * @idr: Implemented features for ARM SMMU Non-secure programming interface
+ * @iidr: Information about the implementation and implementer of ARM SMMU,
+ * and architecture version supported
+ * @aidr: ARM SMMU architecture version
+ *
+ * For the details of @idr, @iidr and @aidr, please refer to the chapters
+ * from 6.3.1 to 6.3.6 in the SMMUv3 Spec.
+ *
+ * User space should read the underlying ARM SMMUv3 hardware information for
+ * the list of supported features.
+ *
+ * Note that these values reflect the raw HW capability, without any insight if
+ * any required kernel driver support is present. Bits may be set indicating the
+ * HW has functionality that is lacking kernel software support, such as BTM. If
+ * a VMM is using this information to construct emulated copies of these
+ * registers it should only forward bits that it knows it can support.
+ *
+ * In future, presence of required kernel support will be indicated in flags.
+ */
+struct iommu_hw_info_arm_smmuv3 {
+ __u32 flags;
+ __u32 __reserved;
+ __u32 idr[6];
+ __u32 iidr;
+ __u32 aidr;
+};
+
/**
* enum iommu_hw_info_type - IOMMU Hardware Info Types
* @IOMMU_HW_INFO_TYPE_NONE: Used by the drivers that do not report hardware
* info
* @IOMMU_HW_INFO_TYPE_INTEL_VTD: Intel VT-d iommu info type
+ * @IOMMU_HW_INFO_TYPE_ARM_SMMUV3: ARM SMMUv3 iommu info type
*/
enum iommu_hw_info_type {
- IOMMU_HW_INFO_TYPE_NONE,
- IOMMU_HW_INFO_TYPE_INTEL_VTD,
+ IOMMU_HW_INFO_TYPE_NONE = 0,
+ IOMMU_HW_INFO_TYPE_INTEL_VTD = 1,
+ IOMMU_HW_INFO_TYPE_ARM_SMMUV3 = 2,
};
/**
@@ -494,9 +563,17 @@ enum iommu_hw_info_type {
* IOMMU_HWPT_GET_DIRTY_BITMAP
* IOMMU_HWPT_SET_DIRTY_TRACKING
*
+ * @IOMMU_HW_CAP_PCI_PASID_EXEC: Execute Permission Supported, user ignores it
+ * when the struct iommu_hw_info::out_max_pasid_log2
+ * is zero.
+ * @IOMMU_HW_CAP_PCI_PASID_PRIV: Privileged Mode Supported, user ignores it
+ * when the struct iommu_hw_info::out_max_pasid_log2
+ * is zero.
*/
enum iommufd_hw_capabilities {
IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0,
+ IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1,
+ IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2,
};
/**
@@ -512,6 +589,9 @@ enum iommufd_hw_capabilities {
* iommu_hw_info_type.
* @out_capabilities: Output the generic iommu capability info type as defined
* in the enum iommu_hw_capabilities.
+ * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support.
+ * PCI devices turn to out_capabilities to check if the
+ * specific capabilities is supported or not.
* @__reserved: Must be 0
*
* Query an iommu type specific hardware information data from an iommu behind
@@ -535,7 +615,8 @@ struct iommu_hw_info {
__u32 data_len;
__aligned_u64 data_uptr;
__u32 out_data_type;
- __u32 __reserved;
+ __u8 out_max_pasid_log2;
+ __u8 __reserved[3];
__aligned_u64 out_capabilities;
};
#define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO)
@@ -613,4 +694,271 @@ struct iommu_hwpt_get_dirty_bitmap {
#define IOMMU_HWPT_GET_DIRTY_BITMAP _IO(IOMMUFD_TYPE, \
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP)
+/**
+ * enum iommu_hwpt_invalidate_data_type - IOMMU HWPT Cache Invalidation
+ * Data Type
+ * @IOMMU_HWPT_INVALIDATE_DATA_VTD_S1: Invalidation data for VTD_S1
+ * @IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3: Invalidation data for ARM SMMUv3
+ */
+enum iommu_hwpt_invalidate_data_type {
+ IOMMU_HWPT_INVALIDATE_DATA_VTD_S1 = 0,
+ IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3 = 1,
+};
+
+/**
+ * enum iommu_hwpt_vtd_s1_invalidate_flags - Flags for Intel VT-d
+ * stage-1 cache invalidation
+ * @IOMMU_VTD_INV_FLAGS_LEAF: Indicates whether the invalidation applies
+ * to all-levels page structure cache or just
+ * the leaf PTE cache.
+ */
+enum iommu_hwpt_vtd_s1_invalidate_flags {
+ IOMMU_VTD_INV_FLAGS_LEAF = 1 << 0,
+};
+
+/**
+ * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
+ * (IOMMU_HWPT_INVALIDATE_DATA_VTD_S1)
+ * @addr: The start address of the range to be invalidated. It needs to
+ * be 4KB aligned.
+ * @npages: Number of contiguous 4K pages to be invalidated.
+ * @flags: Combination of enum iommu_hwpt_vtd_s1_invalidate_flags
+ * @__reserved: Must be 0
+ *
+ * The Intel VT-d specific invalidation data for user-managed stage-1 cache
+ * invalidation in nested translation. Userspace uses this structure to
+ * tell the impacted cache scope after modifying the stage-1 page table.
+ *
+ * Invalidating all the caches related to the page table by setting @addr
+ * to be 0 and @npages to be U64_MAX.
+ *
+ * The device TLB will be invalidated automatically if ATS is enabled.
+ */
+struct iommu_hwpt_vtd_s1_invalidate {
+ __aligned_u64 addr;
+ __aligned_u64 npages;
+ __u32 flags;
+ __u32 __reserved;
+};
+
+/**
+ * struct iommu_viommu_arm_smmuv3_invalidate - ARM SMMUv3 cache invalidation
+ * (IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3)
+ * @cmd: 128-bit cache invalidation command that runs in SMMU CMDQ.
+ * Must be little-endian.
+ *
+ * Supported command list only when passing in a vIOMMU via @hwpt_id:
+ * CMDQ_OP_TLBI_NSNH_ALL
+ * CMDQ_OP_TLBI_NH_VA
+ * CMDQ_OP_TLBI_NH_VAA
+ * CMDQ_OP_TLBI_NH_ALL
+ * CMDQ_OP_TLBI_NH_ASID
+ * CMDQ_OP_ATC_INV
+ * CMDQ_OP_CFGI_CD
+ * CMDQ_OP_CFGI_CD_ALL
+ *
+ * -EIO will be returned if the command is not supported.
+ */
+struct iommu_viommu_arm_smmuv3_invalidate {
+ __aligned_le64 cmd[2];
+};
+
+/**
+ * struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE)
+ * @size: sizeof(struct iommu_hwpt_invalidate)
+ * @hwpt_id: ID of a nested HWPT or a vIOMMU, for cache invalidation
+ * @data_uptr: User pointer to an array of driver-specific cache invalidation
+ * data.
+ * @data_type: One of enum iommu_hwpt_invalidate_data_type, defining the data
+ * type of all the entries in the invalidation request array. It
+ * should be a type supported by the hwpt pointed by @hwpt_id.
+ * @entry_len: Length (in bytes) of a request entry in the request array
+ * @entry_num: Input the number of cache invalidation requests in the array.
+ * Output the number of requests successfully handled by kernel.
+ * @__reserved: Must be 0.
+ *
+ * Invalidate iommu cache for user-managed page table or vIOMMU. Modifications
+ * on a user-managed page table should be followed by this operation, if a HWPT
+ * is passed in via @hwpt_id. Other caches, such as device cache or descriptor
+ * cache can be flushed if a vIOMMU is passed in via the @hwpt_id field.
+ *
+ * Each ioctl can support one or more cache invalidation requests in the array
+ * that has a total size of @entry_len * @entry_num.
+ *
+ * An empty invalidation request array by setting @entry_num==0 is allowed, and
+ * @entry_len and @data_uptr would be ignored in this case. This can be used to
+ * check if the given @data_type is supported or not by kernel.
+ */
+struct iommu_hwpt_invalidate {
+ __u32 size;
+ __u32 hwpt_id;
+ __aligned_u64 data_uptr;
+ __u32 data_type;
+ __u32 entry_len;
+ __u32 entry_num;
+ __u32 __reserved;
+};
+#define IOMMU_HWPT_INVALIDATE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_INVALIDATE)
+
+/**
+ * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
+ * valid.
+ * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
+ */
+enum iommu_hwpt_pgfault_flags {
+ IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0),
+ IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1),
+};
+
+/**
+ * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_PERM_READ: request for read permission
+ * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
+ * @IOMMU_PGFAULT_PERM_EXEC: (PCIE 10.4.1) request with a PASID that has the
+ * Execute Requested bit set in PASID TLP Prefix.
+ * @IOMMU_PGFAULT_PERM_PRIV: (PCIE 10.4.1) request with a PASID that has the
+ * Privileged Mode Requested bit set in PASID TLP
+ * Prefix.
+ */
+enum iommu_hwpt_pgfault_perm {
+ IOMMU_PGFAULT_PERM_READ = (1 << 0),
+ IOMMU_PGFAULT_PERM_WRITE = (1 << 1),
+ IOMMU_PGFAULT_PERM_EXEC = (1 << 2),
+ IOMMU_PGFAULT_PERM_PRIV = (1 << 3),
+};
+
+/**
+ * struct iommu_hwpt_pgfault - iommu page fault data
+ * @flags: Combination of enum iommu_hwpt_pgfault_flags
+ * @dev_id: id of the originated device
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @perm: Combination of enum iommu_hwpt_pgfault_perm
+ * @addr: Fault address
+ * @length: a hint of how much data the requestor is expecting to fetch. For
+ * example, if the PRI initiator knows it is going to do a 10MB
+ * transfer, it could fill in 10MB and the OS could pre-fault in
+ * 10MB of IOVA. It's default to 0 if there's no such hint.
+ * @cookie: kernel-managed cookie identifying a group of fault messages. The
+ * cookie number encoded in the last page fault of the group should
+ * be echoed back in the response message.
+ */
+struct iommu_hwpt_pgfault {
+ __u32 flags;
+ __u32 dev_id;
+ __u32 pasid;
+ __u32 grpid;
+ __u32 perm;
+ __u64 addr;
+ __u32 length;
+ __u32 cookie;
+};
+
+/**
+ * enum iommufd_page_response_code - Return status of fault handlers
+ * @IOMMUFD_PAGE_RESP_SUCCESS: Fault has been handled and the page tables
+ * populated, retry the access. This is the
+ * "Success" defined in PCI 10.4.2.1.
+ * @IOMMUFD_PAGE_RESP_INVALID: Could not handle this fault, don't retry the
+ * access. This is the "Invalid Request" in PCI
+ * 10.4.2.1.
+ */
+enum iommufd_page_response_code {
+ IOMMUFD_PAGE_RESP_SUCCESS = 0,
+ IOMMUFD_PAGE_RESP_INVALID = 1,
+};
+
+/**
+ * struct iommu_hwpt_page_response - IOMMU page fault response
+ * @cookie: The kernel-managed cookie reported in the fault message.
+ * @code: One of response code in enum iommufd_page_response_code.
+ */
+struct iommu_hwpt_page_response {
+ __u32 cookie;
+ __u32 code;
+};
+
+/**
+ * struct iommu_fault_alloc - ioctl(IOMMU_FAULT_QUEUE_ALLOC)
+ * @size: sizeof(struct iommu_fault_alloc)
+ * @flags: Must be 0
+ * @out_fault_id: The ID of the new FAULT
+ * @out_fault_fd: The fd of the new FAULT
+ *
+ * Explicitly allocate a fault handling object.
+ */
+struct iommu_fault_alloc {
+ __u32 size;
+ __u32 flags;
+ __u32 out_fault_id;
+ __u32 out_fault_fd;
+};
+#define IOMMU_FAULT_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_FAULT_QUEUE_ALLOC)
+
+/**
+ * enum iommu_viommu_type - Virtual IOMMU Type
+ * @IOMMU_VIOMMU_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_VIOMMU_TYPE_ARM_SMMUV3: ARM SMMUv3 driver specific type
+ */
+enum iommu_viommu_type {
+ IOMMU_VIOMMU_TYPE_DEFAULT = 0,
+ IOMMU_VIOMMU_TYPE_ARM_SMMUV3 = 1,
+};
+
+/**
+ * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
+ * @size: sizeof(struct iommu_viommu_alloc)
+ * @flags: Must be 0
+ * @type: Type of the virtual IOMMU. Must be defined in enum iommu_viommu_type
+ * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU
+ * @hwpt_id: ID of a nesting parent HWPT to associate to
+ * @out_viommu_id: Output virtual IOMMU ID for the allocated object
+ *
+ * Allocate a virtual IOMMU object, representing the underlying physical IOMMU's
+ * virtualization support that is a security-isolated slice of the real IOMMU HW
+ * that is unique to a specific VM. Operations global to the IOMMU are connected
+ * to the vIOMMU, such as:
+ * - Security namespace for guest owned ID, e.g. guest-controlled cache tags
+ * - Access to a sharable nesting parent pagetable across physical IOMMUs
+ * - Non-affiliated event reporting (e.g. an invalidation queue error)
+ * - Virtualization of various platforms IDs, e.g. RIDs and others
+ * - Delivery of paravirtualized invalidation
+ * - Direct assigned invalidation queues
+ * - Direct assigned interrupts
+ */
+struct iommu_viommu_alloc {
+ __u32 size;
+ __u32 flags;
+ __u32 type;
+ __u32 dev_id;
+ __u32 hwpt_id;
+ __u32 out_viommu_id;
+};
+#define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC)
+
+/**
+ * struct iommu_vdevice_alloc - ioctl(IOMMU_VDEVICE_ALLOC)
+ * @size: sizeof(struct iommu_vdevice_alloc)
+ * @viommu_id: vIOMMU ID to associate with the virtual device
+ * @dev_id: The physical device to allocate a virtual instance on the vIOMMU
+ * @__reserved: Must be 0
+ * @virt_id: Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID
+ * of AMD IOMMU, and vID of a nested Intel VT-d to a Context Table.
+ * @out_vdevice_id: Output virtual instance ID for the allocated object
+ * @__reserved2: Must be 0
+ *
+ * Allocate a virtual device instance (for a physical device) against a vIOMMU.
+ * This instance holds the device's information (related to its vIOMMU) in a VM.
+ */
+struct iommu_vdevice_alloc {
+ __u32 size;
+ __u32 viommu_id;
+ __u32 dev_id;
+ __u32 __reserved;
+ __aligned_u64 virt_id;
+ __u32 out_vdevice_id;
+ __u32 __reserved2;
+};
+#define IOMMU_VDEVICE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VDEVICE_ALLOC)
#endif
--
2.41.0.windows.1
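
The header update above replaces the 32-bit __reserved field of struct iommu_hw_info with a one-byte out_max_pasid_log2 plus three reserved bytes. A minimal sketch of why that keeps the ioctl layout unchanged follows. The two structs are local mirrors written for illustration (they are not the real header), and pasid_space() is a hypothetical helper, not part of the iommufd API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the layout before the patch (illustration only). */
struct hw_info_old {
    uint32_t size;
    uint32_t flags;
    uint32_t dev_id;
    uint32_t data_len;
    _Alignas(8) uint64_t data_uptr;
    uint32_t out_data_type;
    uint32_t reserved;
    _Alignas(8) uint64_t out_capabilities;
};

/* Local mirror of the layout after the patch (illustration only). */
struct hw_info_new {
    uint32_t size;
    uint32_t flags;
    uint32_t dev_id;
    uint32_t data_len;
    _Alignas(8) uint64_t data_uptr;
    uint32_t out_data_type;
    uint8_t out_max_pasid_log2;
    uint8_t reserved[3];
    _Alignas(8) uint64_t out_capabilities;
};

/* The new byte reuses the first byte of the old reserved word. */
static_assert(sizeof(struct hw_info_old) == sizeof(struct hw_info_new),
              "struct size must not change");
static_assert(offsetof(struct hw_info_old, reserved) ==
              offsetof(struct hw_info_new, out_max_pasid_log2),
              "new field must sit inside the old reserved word");
static_assert(offsetof(struct hw_info_old, out_capabilities) ==
              offsetof(struct hw_info_new, out_capabilities),
              "later fields must not move");

/*
 * Hypothetical helper: number of PASIDs implied by a reported width.
 * A width of 0 means no PASID support; widths of 32 or more are
 * treated as invalid here to keep the shift well defined.
 */
static uint64_t pasid_space(uint8_t max_pasid_log2)
{
    if (max_pasid_log2 == 0 || max_pasid_log2 >= 32) {
        return 0;
    }
    return UINT64_C(1) << max_pasid_log2;
}
```

Because the old field was documented as "Must be 0", a kernel that predates the new field would be expected to leave it at 0, which reads as "no PASID support" under the new definition.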


@ -0,0 +1,42 @@
From b611bd7f3f4525c8373f2e504594414e1ed5b058 Mon Sep 17 00:00:00 2001
From: guping <guping_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 02:50:17 +0000
Subject: [PATCH] accel/tcg: Fix user-only probe_access_internal plugin check
cherry-pick from 2a339fee450638b512c5122281cb5ab49331cfb8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The acc_flag check for write should have been against PAGE_WRITE_ORG,
not PAGE_WRITE. But it is better to combine two acc_flag checks
to a single check against access_type. This matches the system code
in cputlb.c.
Cc: qemu-stable@nongnu.org
Resolves: #2647
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: 20241111145002.144995-1-richard.henderson@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: guping <guping_yewu@cmss.chinamobile.com>
---
accel/tcg/user-exec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 68b252cb8e..e87848a5e2 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -794,7 +794,7 @@ static int probe_access_internal(CPUArchState *env, vaddr addr,
if (guest_addr_valid_untagged(addr)) {
int page_flags = page_get_flags(addr);
if (page_flags & acc_flag) {
- if ((acc_flag == PAGE_READ || acc_flag == PAGE_WRITE)
+ if (access_type != MMU_INST_FETCH
&& cpu_plugin_mem_cbs_enabled(env_cpu(env))) {
return TLB_MMIO;
}
--
2.41.0.windows.1
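
The commit message above says the old check compared against PAGE_WRITE where it should have used PAGE_WRITE_ORG. The following is a small stand-alone model of that condition, not the QEMU code itself. The flag values and the access-type-to-flag mapping in acc_flag_for() are assumptions made for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; treat them as assumptions, not QEMU's header. */
enum {
    PAGE_READ      = 0x01,
    PAGE_WRITE     = 0x02,
    PAGE_EXEC      = 0x04,
    PAGE_WRITE_ORG = 0x10,
};

typedef enum {
    MMU_DATA_LOAD,
    MMU_DATA_STORE,
    MMU_INST_FETCH,
} MMUAccessType;

/* Assumed mapping: stores are checked against PAGE_WRITE_ORG. */
static int acc_flag_for(MMUAccessType access_type)
{
    switch (access_type) {
    case MMU_DATA_STORE:
        return PAGE_WRITE_ORG;
    case MMU_INST_FETCH:
        return PAGE_EXEC;
    default:
        return PAGE_READ;
    }
}

/* Old condition: never true for stores, because acc_flag is PAGE_WRITE_ORG. */
static bool plugin_check_old(MMUAccessType access_type)
{
    int acc_flag = acc_flag_for(access_type);
    return acc_flag == PAGE_READ || acc_flag == PAGE_WRITE;
}

/* New condition: every data access qualifies, instruction fetches do not. */
static bool plugin_check_new(MMUAccessType access_type)
{
    return access_type != MMU_INST_FETCH;
}
```

Under these assumptions the two checks agree for loads and instruction fetches and differ only for stores, which is the case the fix targets.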


@ -0,0 +1,32 @@
From 237fdc8ddb0598234aace9c88ac4c8387119a12a Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Thu, 7 Jul 2022 11:55:25 -0400
Subject: [PATCH] acpi/gpex: Fix PCI Express Slot Information function 0
returned value
At the moment we do not support other function than function 0.
So according to ACPI spec "_DSM (Device Specific Method)"
description, bit 0 should rather be 0, meaning no other function is
supported than function 0.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/pci-host/gpex-acpi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/pci-host/gpex-acpi.c b/hw/pci-host/gpex-acpi.c
index 1092dc3b70..ac5d229757 100644
--- a/hw/pci-host/gpex-acpi.c
+++ b/hw/pci-host/gpex-acpi.c
@@ -113,7 +113,7 @@ static void acpi_dsdt_add_pci_osc(Aml *dev)
UUID = aml_touuid("E5C937D0-3553-4D7A-9117-EA4D19C3434D");
ifctx = aml_if(aml_equal(aml_arg(0), UUID));
ifctx1 = aml_if(aml_equal(aml_arg(2), aml_int(0)));
- uint8_t byte_list[1] = {1};
+ uint8_t byte_list[1] = {0};
buf = aml_buffer(1, byte_list);
aml_append(ifctx1, aml_return(buf));
aml_append(ifctx, ifctx1);
--
2.41.0.windows.1
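
For readers unfamiliar with the _DSM query convention the commit message refers to: function 0 returns a bitmask, and bit 0 states whether any function other than function 0 is supported. A minimal sketch of how a caller might interpret the single byte returned here follows. dsm_function_supported() is a hypothetical helper written for this illustration and does not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: interpret the one-byte bitmask returned by
 * _DSM function 0. Bit 0 says whether any function besides function 0
 * exists; bit N (N >= 1) says whether function N is supported.
 */
static bool dsm_function_supported(uint8_t query_mask, unsigned fn)
{
    if (fn == 0) {
        return true;            /* the query function itself */
    }
    if (fn > 7 || !(query_mask & 1u)) {
        return false;           /* out of range, or nothing beyond function 0 */
    }
    return (query_mask >> fn) & 1u;
}
```

With the old value {1}, bit 0 claimed that other functions exist although no function bit was set; with {0} the mask reports only function 0.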


@ -0,0 +1,66 @@
From 1f6dde2350209e937a5676c6775d1500136caea2 Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 13:48:37 +0800
Subject: [PATCH] acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN
event from bits VM
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 7ef4c41e91d59d72a3b8bc022a6cb3e81787a50a
By default, the timeout to receive any specified event from the QEMU VM is 60
seconds set by the python avocado test framework. Please see event_wait() and
events_wait() in python/qemu/machine/machine.py. If the matching event is not
triggered within that interval, an asyncio.TimeoutError is generated. Since the
timeout for the bits avocado test is 200 secs, we need to make event_wait()
timeout of the same value as well so that an early timeout is not triggered by
the avocado framework.
CC: peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2077
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240117042556.3360190-1-anisinha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
tests/avocado/acpi-bits.py | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tests/avocado/acpi-bits.py b/tests/avocado/acpi-bits.py
index 68b9e98d4e..efe4f52ee0 100644
--- a/tests/avocado/acpi-bits.py
+++ b/tests/avocado/acpi-bits.py
@@ -54,6 +54,8 @@
deps = ["xorriso", "mformat"] # dependent tools needed in the test setup/box.
supported_platforms = ['x86_64'] # supported test platforms.
+# default timeout of 120 secs is sometimes not enough for bits test.
+BITS_TIMEOUT = 200
def which(tool):
""" looks up the full path for @tool, returns None if not found
@@ -133,7 +135,7 @@ class AcpiBitsTest(QemuBaseTest): #pylint: disable=too-many-instance-attributes
"""
# in slower systems the test can take as long as 3 minutes to complete.
- timeout = 200
+ timeout = BITS_TIMEOUT
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
@@ -400,7 +402,8 @@ def test_acpi_smbios_bits(self):
# biosbits has been configured to run all the specified test suites
# in batch mode and then automatically initiate a vm shutdown.
- # Rely on avocado's unit test timeout.
- self._vm.event_wait('SHUTDOWN')
+ # Set timeout to BITS_TIMEOUT for SHUTDOWN event from bits VM at par
+ # with the avocado test timeout.
+ self._vm.event_wait('SHUTDOWN', timeout=BITS_TIMEOUT)
self._vm.wait(timeout=None)
self.parse_log()
--
2.41.0.windows.1


@ -0,0 +1,189 @@
From 9eacd1a6df6861b76663e98133adb15059bf65cc Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:40:50 -0400
Subject: [PATCH] arm: VirtCCA: CVM support UEFI boot
1. Add UEFI boot support for Confidential VMs.
2. Modify the base memory address of Confidential VMs from 3GB to 1GB.
3. Disable pflash boot support for Confidential VMs; use the `-bios` option to specify `QEMU_EFI.fd` during launch.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
hw/arm/boot.c | 38 ++++++++++++++++++++++++++++++++++++--
hw/arm/virt.c | 33 ++++++++++++++++++++++++++++++++-
include/hw/arm/boot.h | 3 +++
3 files changed, 71 insertions(+), 3 deletions(-)
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 42110b0f18..6b2f46af4d 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -43,6 +43,9 @@
#define BOOTLOADER_MAX_SIZE (4 * KiB)
+#define UEFI_MAX_SIZE 0x8000000
+#define UEFI_LOADER_START 0x0
+#define DTB_MAX 0x200000
AddressSpace *arm_boot_address_space(ARMCPU *cpu,
const struct arm_boot_info *info)
{
@@ -1155,7 +1158,31 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
}
}
-static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info)
+static void arm_setup_confidential_firmware_boot(ARMCPU *cpu,
+ struct arm_boot_info *info,
+ const char *firmware_filename)
+{
+ ssize_t fw_size;
+ const char *fname;
+ AddressSpace *as = arm_boot_address_space(cpu, info);
+
+ fname = qemu_find_file(QEMU_FILE_TYPE_BIOS, firmware_filename);
+ if (!fname) {
+ error_report("Could not find firmware image '%s'", firmware_filename);
+ exit(EXIT_FAILURE);
+ }
+
+ fw_size = load_image_targphys_as(firmware_filename,
+ info->firmware_base,
+ info->firmware_max_size, as);
+
+ if (fw_size <= 0) {
+ error_report("could not load firmware '%s'", firmware_filename);
+ exit(EXIT_FAILURE);
+ }
+}
+
+static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info, const char *firmware_filename)
{
/* Set up for booting firmware (which might load a kernel via fw_cfg) */
@@ -1166,6 +1193,8 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info)
* DTB to the base of RAM for the bootloader to pick up.
*/
info->dtb_start = info->loader_start;
+ if (info->confidential)
+ tmm_add_ram_region(UEFI_LOADER_START, UEFI_MAX_SIZE, info->dtb_start, DTB_MAX , true);
}
if (info->kernel_filename) {
@@ -1206,6 +1235,11 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info)
}
}
+ if (info->confidential) {
+ arm_setup_confidential_firmware_boot(cpu, info, firmware_filename);
+ kvm_load_user_data(UEFI_LOADER_START, UEFI_MAX_SIZE, info->loader_start, info->loader_start + DTB_MAX, info->ram_size,
+ (struct kvm_numa_info *)info->numa_info);
+ }
/*
* We will start from address 0 (typically a boot ROM image) in the
* same way as hardware. Leave env->boot_info NULL, so that
@@ -1282,7 +1316,7 @@ void arm_load_kernel(ARMCPU *cpu, MachineState *ms, struct arm_boot_info *info)
/* Load the kernel. */
if (!info->kernel_filename || info->firmware_loaded) {
- arm_setup_firmware_boot(cpu, info);
+ arm_setup_firmware_boot(cpu, info, ms->firmware);
} else {
arm_setup_direct_kernel_boot(cpu, info);
}
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 8823f2ed1c..6ffb26e7e6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1398,6 +1398,9 @@ static void virt_flash_map1(PFlashCFI01 *flash,
qdev_prop_set_uint32(dev, "num-blocks", size / VIRT_FLASH_SECTOR_SIZE);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+ if (virtcca_cvm_enabled()) {
+ return;
+ }
memory_region_add_subregion(sysmem, base,
sysbus_mmio_get_region(SYS_BUS_DEVICE(dev),
0));
@@ -1433,6 +1436,10 @@ static void virt_flash_fdt(VirtMachineState *vms,
MachineState *ms = MACHINE(vms);
char *nodename;
+ if (virtcca_cvm_enabled()) {
+ return;
+ }
+
if (sysmem == secure_sysmem) {
/* Report both flash devices as a single node in the DT */
nodename = g_strdup_printf("/flash@%" PRIx64, flashbase);
@@ -1468,6 +1475,23 @@ static void virt_flash_fdt(VirtMachineState *vms,
}
}
+static bool virt_confidential_firmware_init(VirtMachineState *vms,
+ MemoryRegion *sysmem)
+{
+ MemoryRegion *fw_ram;
+ hwaddr fw_base = vms->memmap[VIRT_FLASH].base;
+ hwaddr fw_size = vms->memmap[VIRT_FLASH].size;
+
+ if (!MACHINE(vms)->firmware) {
+ return false;
+ }
+
+ fw_ram = g_new(MemoryRegion, 1);
+ memory_region_init_ram(fw_ram, NULL, "fw_ram", fw_size, NULL);
+ memory_region_add_subregion(sysmem, fw_base, fw_ram);
+ return true;
+}
+
static bool virt_firmware_init(VirtMachineState *vms,
MemoryRegion *sysmem,
MemoryRegion *secure_sysmem)
@@ -1486,6 +1510,10 @@ static bool virt_firmware_init(VirtMachineState *vms,
pflash_blk0 = pflash_cfi01_get_blk(vms->flash[0]);
+ if (virtcca_cvm_enabled()) {
+ return virt_confidential_firmware_init(vms, sysmem);
+ }
+
bios_name = MACHINE(vms)->firmware;
if (bios_name) {
char *fname;
@@ -2023,7 +2051,7 @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
vms->memmap[VIRT_PCIE_MMIO] = (MemMapEntry) { 0x10000000, 0x2edf0000 };
vms->memmap[VIRT_KAE_DEVICE] = (MemMapEntry) { 0x3edf0000, 0x00200000 };
- vms->memmap[VIRT_MEM].base = 3 * GiB;
+ vms->memmap[VIRT_MEM].base = 1 * GiB;
vms->memmap[VIRT_MEM].size = ms->ram_size;
info_report("[qemu] fix VIRT_MEM range 0x%llx - 0x%llx\n", (unsigned long long)(vms->memmap[VIRT_MEM].base),
(unsigned long long)(vms->memmap[VIRT_MEM].base + ms->ram_size));
@@ -2822,6 +2850,9 @@ static void machvirt_init(MachineState *machine)
vms->bootinfo.get_dtb = machvirt_dtb;
vms->bootinfo.skip_dtb_autoload = true;
vms->bootinfo.firmware_loaded = firmware_loaded;
+ vms->bootinfo.firmware_base = vms->memmap[VIRT_FLASH].base;
+ vms->bootinfo.firmware_max_size = vms->memmap[VIRT_FLASH].size;
+ vms->bootinfo.confidential = virtcca_cvm_enabled();
vms->bootinfo.psci_conduit = vms->psci_conduit;
arm_load_kernel(ARM_CPU(first_cpu), machine, &vms->bootinfo);
diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index 4491b1f85b..06ca1d90b2 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -133,6 +133,9 @@ struct arm_boot_info {
bool secure_board_setup;
arm_endianness endianness;
+ hwaddr firmware_base;
+ hwaddr firmware_max_size;
+ bool confidential;
};
/**
--
2.41.0.windows.1


@ -0,0 +1,117 @@
From 5ed17a43a4cc7fc76397d6d8cad8246063b5b2f3 Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:43:55 -0400
Subject: [PATCH] arm: VirtCCA: Compatibility with older versions of TMM and
the kernel
Since the base memory address of Confidential VMs in QEMU was changed
from 3GB to 1GB, corresponding adjustments are required in both the TMM
and kernel components. To maintain backward compatibility, the following
modifications were implemented:
1. **TMM Versioning**: The TMM version number was incremented to
reflect the update.
2. **Kernel Interface**: A new interface was exposed in the kernel
to retrieve the TMM version number.
3. **QEMU Compatibility Logic**: During initialization, QEMU checks
the TMM version via the kernel interface. If the TMM version is **<2.1** (legacy),
QEMU sets the Confidential VM's base memory address to **3GB**. For TMM versions
**2.1** or later (updated), the address is configured to **1GB** to align with the new memory layout.
This approach ensures seamless backward compatibility while transitioning
to the revised memory addressing scheme.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
accel/kvm/kvm-all.c | 3 +--
hw/arm/boot.c | 9 +++++++++
hw/arm/virt.c | 9 +++++++--
linux-headers/asm-arm64/kvm.h | 2 ++
linux-headers/linux/kvm.h | 3 +++
5 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a8e29f148e..38a48cc031 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2390,6 +2390,7 @@ static int kvm_init(MachineState *ms)
qemu_mutex_init(&kml_slots_lock);
s = KVM_STATE(ms->accelerator);
+ kvm_state = s;
/*
* On systems where the kernel can support different base page
@@ -2609,8 +2610,6 @@ static int kvm_init(MachineState *ms)
#endif
}
- kvm_state = s;
-
ret = kvm_arch_init(ms, s);
if (ret < 0) {
goto err;
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 6b2f46af4d..ca9f69fd3d 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -1162,6 +1162,15 @@ static void arm_setup_confidential_firmware_boot(ARMCPU *cpu,
struct arm_boot_info *info,
const char *firmware_filename)
{
+ uint64_t tmi_version = 0;
+ if (kvm_ioctl(kvm_state, KVM_GET_TMI_VERSION, &tmi_version) < 0) {
+ error_report("please check the kernel version!");
+ exit(EXIT_FAILURE);
+ }
+ if (tmi_version < MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM) {
+ error_report("please check the tmi version!");
+ exit(EXIT_FAILURE);
+ }
ssize_t fw_size;
const char *fname;
AddressSpace *as = arm_boot_address_space(cpu, info);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6ffb26e7e6..39dfec0877 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2050,8 +2050,13 @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
/* support kae vf device tree nodes */
vms->memmap[VIRT_PCIE_MMIO] = (MemMapEntry) { 0x10000000, 0x2edf0000 };
vms->memmap[VIRT_KAE_DEVICE] = (MemMapEntry) { 0x3edf0000, 0x00200000 };
-
- vms->memmap[VIRT_MEM].base = 1 * GiB;
+ uint64_t tmi_version = 0;
+ if (kvm_ioctl(kvm_state, KVM_GET_TMI_VERSION, &tmi_version) < 0) {
+ warn_report("can not get tmi version");
+ }
+ if (tmi_version < MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM) {
+ vms->memmap[VIRT_MEM].base = 3 * GiB;
+ }
vms->memmap[VIRT_MEM].size = ms->ram_size;
info_report("[qemu] fix VIRT_MEM range 0x%llx - 0x%llx\n", (unsigned long long)(vms->memmap[VIRT_MEM].base),
(unsigned long long)(vms->memmap[VIRT_MEM].base + ms->ram_size));
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 552fdcb18f..d69a71cbec 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -597,4 +597,6 @@ struct kvm_cap_arm_tmm_populate_region_args {
#endif
+#define MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM 0x20001
+
#endif /* __ARM_KVM_H__ */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 84cec64b88..7a08f9b1e9 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -2422,4 +2422,7 @@ struct kvm_s390_zpci_op {
/* flags for kvm_s390_zpci_op->u.reg_aen.flags */
#define KVM_S390_ZPCIOP_REGAEN_HOST (1 << 0)
+/* get tmi version */
+#define KVM_GET_TMI_VERSION _IOR(KVMIO, 0xd2, uint64_t)
+
#endif /* __LINUX_KVM_H */
--
2.41.0.windows.1
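
The version gate introduced by this patch can be sketched in isolation. This is a minimal illustration, not QEMU code: `cvm_ram_base` is a hypothetical helper, and the result of the `KVM_GET_TMI_VERSION` query is passed in as plain arguments so the fallback path is visible. The threshold `0x20001` is copied from `MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM` in the patch; the patch does not say how that value is split into major and minor fields, so the sketch treats it as an opaque number.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)
/* Value copied from the patch (linux-headers/asm-arm64/kvm.h). */
#define MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM 0x20001ULL

/*
 * Hypothetical helper mirroring the logic added to virt_set_memmap():
 * the default RAM base is 1 GiB, and legacy TMM versions get 3 GiB.
 * If the version query failed, tmi_version stays 0, so the legacy
 * layout is selected, which is also what the patch does after warning.
 */
static uint64_t cvm_ram_base(bool query_ok, uint64_t reported_version)
{
    uint64_t tmi_version = query_ok ? reported_version : 0;

    if (tmi_version < MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM) {
        return 3 * GiB;
    }
    return 1 * GiB;
}
```

Note that a failed query and a legacy TMM are indistinguishable in this scheme: both end up with the 3 GiB base.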


@ -0,0 +1,137 @@
From 0119389040e4d78c6238875b812827d4f07b5f0f Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:51:16 -0400
Subject: [PATCH] arm: VirtCCA: qemu CoDA support UEFI boot
1. Expose PCIe MMIO region from QEMU memory map.
2. Refactor struct kvm_user_data: data_start and data_size represent
the base address and size of the MMIO region in UEFI boot mode, and
the base address and size of the DTB in direct boot mode.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
accel/kvm/kvm-all.c | 8 ++++----
hw/arm/boot.c | 10 ++++++----
hw/arm/virt.c | 6 ++++++
linux-headers/linux/kvm.h | 12 +++++++++---
target/arm/kvm_arm.h | 2 ++
5 files changed, 27 insertions(+), 11 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 38a48cc031..57c6718b77 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3527,7 +3527,7 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target)
return r;
}
-int kvm_load_user_data(hwaddr loader_start, hwaddr image_end, hwaddr initrd_start, hwaddr dtb_end, hwaddr ram_size,
+int kvm_load_user_data(hwaddr loader_start, hwaddr dtb_info, hwaddr data_start, hwaddr data_size, hwaddr ram_size,
struct kvm_numa_info *numa_info)
{
KVMState *state = kvm_state;
@@ -3535,9 +3535,9 @@ int kvm_load_user_data(hwaddr loader_start, hwaddr image_end, hwaddr initrd_star
int ret;
data.loader_start = loader_start;
- data.image_end = image_end;
- data.initrd_start = initrd_start;
- data.dtb_end = dtb_end;
+ data.dtb_info = dtb_info;
+ data.data_start = data_start;
+ data.data_size = data_size;
data.ram_size = ram_size;
memcpy(&data.numa_info, numa_info, sizeof(struct kvm_numa_info));
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index ca9f69fd3d..a3e0dbb68c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -1149,10 +1149,10 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
if (kvm_enabled() && virtcca_cvm_enabled()) {
if (info->dtb_limit == 0) {
- info->dtb_limit = info->dtb_start + 0x200000;
+ info->dtb_limit = info->dtb_start + DTB_MAX;
}
- kvm_load_user_data(info->loader_start, image_high_addr, info->initrd_start,
- info->dtb_limit, info->ram_size, (struct kvm_numa_info *)info->numa_info);
+ kvm_load_user_data(info->loader_start, 0x1, info->dtb_start,
+ info->dtb_limit - info->dtb_start, info->ram_size, (struct kvm_numa_info *)info->numa_info);
tmm_add_ram_region(info->loader_start, image_high_addr - info->loader_start,
info->initrd_start, info->dtb_limit - info->initrd_start, true);
}
@@ -1193,6 +1193,7 @@ static void arm_setup_confidential_firmware_boot(ARMCPU *cpu,
static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info, const char *firmware_filename)
{
+ hwaddr mmio_start, mmio_size;
/* Set up for booting firmware (which might load a kernel via fw_cfg) */
if (have_dtb(info)) {
@@ -1246,7 +1247,8 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info, con
if (info->confidential) {
arm_setup_confidential_firmware_boot(cpu, info, firmware_filename);
- kvm_load_user_data(UEFI_LOADER_START, UEFI_MAX_SIZE, info->loader_start, info->loader_start + DTB_MAX, info->ram_size,
+ virtcca_kvm_get_mmio_addr(&mmio_start, &mmio_size);
+ kvm_load_user_data(info->loader_start, DTB_MAX, mmio_start, mmio_size, info->ram_size,
(struct kvm_numa_info *)info->numa_info);
}
/*
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 39dfec0877..6c5611826c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -176,6 +176,12 @@ static const MemMapEntry base_memmap[] = {
[VIRT_MEM] = { GiB, LEGACY_RAMLIMIT_BYTES },
};
+void virtcca_kvm_get_mmio_addr(hwaddr *mmio_start, hwaddr *mmio_size)
+{
+ *mmio_start = base_memmap[VIRT_PCIE_MMIO].base;
+ *mmio_size = base_memmap[VIRT_PCIE_MMIO].size;
+}
+
/*
* Highmem IO Regions: This memory map is floating, located after the RAM.
* Each MemMapEntry base (GPA) will be dynamically computed, depending on the
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 7a08f9b1e9..c9ec7f862a 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1510,9 +1510,15 @@ struct kvm_numa_info {
struct kvm_user_data {
__u64 loader_start;
- __u64 image_end;
- __u64 initrd_start;
- __u64 dtb_end;
+ /*
+ * When the lowest bit of dtb_info is 0, the value of dtb_info represents the size of the DTB,
+ * and data_start and data_size represent the address base and size of the MMIO.
+ * When the lowest bit of dtb_info is 1, data_start and data_size represent the address base
+ * and size of the DTB.
+ */
+ __u64 dtb_info;
+ __u64 data_start;
+ __u64 data_size;
__u64 ram_size;
struct kvm_numa_info numa_info;
};
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 31457a57f7..62fbb713f4 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -73,6 +73,8 @@ int kvm_arm_vcpu_finalize(CPUState *cs, int feature);
void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,
uint64_t attr, int dev_fd, uint64_t addr_ormask);
+void virtcca_kvm_get_mmio_addr(hwaddr *mmio_start, hwaddr *mmio_size);
+
/**
* kvm_arm_init_cpreg_list:
* @cpu: ARMCPU
--
2.41.0.windows.1
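
The dual meaning of the refactored fields is easier to see from the consumer's side. The following is a minimal sketch of how a reader of `struct kvm_user_data` might decode it, based only on the comment added to `linux-headers/linux/kvm.h`. The names `user_data_sketch`, `boot_regions` and `decode_user_data` are hypothetical, and `numa_info` is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout as introduced by the patch; numa_info is omitted. */
struct user_data_sketch {
    uint64_t loader_start;
    uint64_t dtb_info;
    uint64_t data_start;
    uint64_t data_size;
    uint64_t ram_size;
};

/* Hypothetical decoded view; not part of QEMU or the kernel. */
struct boot_regions {
    bool direct_boot;
    uint64_t dtb_start;   /* filled in only for direct boot */
    uint64_t dtb_size;
    uint64_t mmio_start;  /* filled in only for UEFI boot */
    uint64_t mmio_size;
};

static struct boot_regions decode_user_data(const struct user_data_sketch *d)
{
    struct boot_regions r = {0};

    if (d->dtb_info & 1) {
        /* Lowest bit set: direct kernel boot, data_* describes the DTB. */
        r.direct_boot = true;
        r.dtb_start = d->data_start;
        r.dtb_size = d->data_size;
    } else {
        /* Lowest bit clear: dtb_info is the DTB size, data_* describes MMIO. */
        r.dtb_size = d->dtb_info;
        r.mmio_start = d->data_start;
        r.mmio_size = d->data_size;
    }
    return r;
}
```

The encoding relies on the DTB size passed in UEFI mode being even, so that its lowest bit cannot be mistaken for the direct-boot flag.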


@ -0,0 +1,100 @@
From 5bffeb311c969a0e05106e4bf54282431c5ba907 Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:42:43 -0400
Subject: [PATCH] arm: VirtCCA: qemu uefi boot support kae
This commit introduces modifications to enable KAE functionality
during UEFI boot in cVMs. Additionally, the ACPI feature must be
configured in the cVM.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
hw/arm/virt-acpi-build.c | 58 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 076781423b..f78331d69f 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -58,6 +58,7 @@
#include "migration/vmstate.h"
#include "hw/acpi/ghes.h"
#include "hw/acpi/viot.h"
+#include "kvm_arm.h"
#define ARM_SPI_BASE 32
@@ -405,6 +406,54 @@ static void acpi_dsdt_add_virtio(Aml *scope,
}
}
+static void acpi_dsdt_add_hisi_sec(Aml *scope,
+ const MemMapEntry *virtio_mmio_memmap,
+ int dev_id)
+{
+ hwaddr size = 0x10000;
+
+ /*
+ * Calculate the base address for the sec device node.
+ * Each device group contains one sec device and one hpre device,spaced by 2 * size.
+ */
+ hwaddr base = virtio_mmio_memmap->base + dev_id * 2 * size;
+
+ Aml *dev = aml_device("SE%02u", dev_id);
+ aml_append(dev, aml_name_decl("_HID", aml_string("SEC07")));
+ aml_append(dev, aml_name_decl("_UID", aml_int(dev_id)));
+ aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+
+ Aml *crs = aml_resource_template();
+
+ aml_append(crs, aml_memory32_fixed(base, size, AML_READ_WRITE));
+ aml_append(dev, aml_name_decl("_CRS", crs));
+ aml_append(scope, dev);
+}
+
+static void acpi_dsdt_add_hisi_hpre(Aml *scope,
+ const MemMapEntry *virtio_mmio_memmap,
+ int dev_id)
+{
+ hwaddr size = 0x10000;
+
+ /*
+ * Calculate the base address for the hpre device node.
+ * Each hpre device follows the corresponding sec device by an additional offset of size.
+ */
+ hwaddr base = virtio_mmio_memmap->base + dev_id * 2 * size + size;
+
+ Aml *dev = aml_device("HP%02u", dev_id);
+ aml_append(dev, aml_name_decl("_HID", aml_string("HPRE07")));
+ aml_append(dev, aml_name_decl("_UID", aml_int(dev_id)));
+ aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+
+ Aml *crs = aml_resource_template();
+
+ aml_append(crs, aml_memory32_fixed(base, size, AML_READ_WRITE));
+ aml_append(dev, aml_name_decl("_CRS", crs));
+ aml_append(scope, dev);
+}
+
static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
uint32_t irq, VirtMachineState *vms)
{
@@ -1201,6 +1250,15 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
(irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
acpi_dsdt_add_pci(scope, memmap, irqmap[VIRT_PCIE] + ARM_SPI_BASE, vms);
+
+ if (virtcca_cvm_enabled()) {
+ int kae_num = tmm_get_kae_num();
+ for (int i = 0; i < kae_num; i++) {
+ acpi_dsdt_add_hisi_sec(scope, &memmap[VIRT_KAE_DEVICE], i);
+ acpi_dsdt_add_hisi_hpre(scope, &memmap[VIRT_KAE_DEVICE], i);
+ }
+ }
+
if (vms->acpi_dev) {
build_ged_aml(scope, "\\_SB."GED_DEVICE,
HOTPLUG_HANDLER(vms->acpi_dev),
--
2.41.0.windows.1
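
The address layout used by the two new DSDT helpers can be checked with a few lines of arithmetic. This is a sketch under the patch's stated layout (one SEC and one HPRE window of `0x10000` bytes per device group); the function names `kae_sec_base` and `kae_hpre_base` are hypothetical and do not exist in QEMU.

```c
#include <stdint.h>

#define KAE_DEV_SIZE 0x10000ULL   /* per-device window size, from the patch */

/* SEC device i sits at the start of group i; each group spans two windows. */
static uint64_t kae_sec_base(uint64_t region_base, unsigned dev_id)
{
    return region_base + (uint64_t)dev_id * 2 * KAE_DEV_SIZE;
}

/* HPRE device i follows SEC device i by one window. */
static uint64_t kae_hpre_base(uint64_t region_base, unsigned dev_id)
{
    return kae_sec_base(region_base, dev_id) + KAE_DEV_SIZE;
}
```

With the `VIRT_KAE_DEVICE` entry `{ 0x3edf0000, 0x00200000 }` set in `virt_set_memmap()`, group 0 places SEC at `0x3edf0000` and HPRE at `0x3ee00000`, and the region has room for `0x200000 / 0x20000 = 16` groups.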


@ -0,0 +1,36 @@
From b60350d9f495f568aa1380f02a13b51e9619a7de Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 14:17:52 +0800
Subject: [PATCH] audio/audio.c: remove trailing newline in error_setg
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 09a36158c283f7448d1b00fdbb6634f05d27f922
error_setg() appends newline to the formatted message.
Fixes: cb94ff5f80c5 ("audio: propagate Error * out of audio_init")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
audio/audio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/audio/audio.c b/audio/audio.c
index 8d1e4ad922..7ac74f9e16 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -1744,7 +1744,7 @@ static AudioState *audio_init(Audiodev *dev, Error **errp)
if (driver) {
done = !audio_driver_init(s, driver, dev, errp);
} else {
- error_setg(errp, "Unknown audio driver `%s'\n", drvname);
+ error_setg(errp, "Unknown audio driver `%s'", drvname);
}
if (!done) {
goto out;
--
2.41.0.windows.1


@ -0,0 +1,150 @@
From 0978556247d968ffc83beff3b2611c93fd9b6b13 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu@intel.com>
Date: Thu, 12 Sep 2024 00:17:31 -0700
Subject: [PATCH] backend/iommufd: Report PASID capability
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
backends/iommufd.c | 4 +++-
hw/arm/smmu-common.c | 4 ++--
hw/arm/smmuv3.c | 4 +++-
hw/vfio/iommufd.c | 4 +++-
include/hw/arm/smmu-common.h | 2 +-
include/sysemu/host_iommu_device.h | 1 +
include/sysemu/iommufd.h | 3 ++-
7 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index e9ce82297b..4f5df63331 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -326,7 +326,8 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
uint32_t *type, void *data, uint32_t len,
- uint64_t *caps, Error **errp)
+ uint64_t *caps, uint8_t *max_pasid_log2,
+ Error **errp)
{
struct iommu_hw_info info = {
.size = sizeof(info),
@@ -344,6 +345,7 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
*type = info.out_data_type;
g_assert(caps);
*caps = info.out_capabilities;
+ *max_pasid_log2 = info.out_max_pasid_log2;
return true;
}
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index c382fa16e5..e7028bd4ec 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -853,7 +853,7 @@ SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid)
/* IOMMUFD helpers */
int smmu_dev_get_info(SMMUDevice *sdev, uint32_t *data_type,
- uint32_t data_len, void *data)
+ uint32_t data_len, uint8_t *pasid, void *data)
{
uint64_t caps;
@@ -863,7 +863,7 @@ int smmu_dev_get_info(SMMUDevice *sdev, uint32_t *data_type,
return !iommufd_backend_get_device_info(sdev->idev->iommufd,
sdev->idev->devid, data_type, data,
- data_len, &caps, NULL);
+ data_len, &caps, pasid, NULL);
}
void smmu_dev_uninstall_nested_ste(SMMUDevice *sdev, bool abort)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 30c0ae4c3b..0ca0e96fcc 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -264,6 +264,7 @@ static void smmuv3_nested_init_regs(SMMUv3State *s)
SMMUDevice *sdev;
uint32_t data_type;
uint32_t val;
+ uint8_t pasid;
int ret;
if (!bs->nested || !bs->viommu) {
@@ -280,7 +281,8 @@ static void smmuv3_nested_init_regs(SMMUv3State *s)
goto out;
}
- ret = smmu_dev_get_info(sdev, &data_type, sizeof(sdev->info), &sdev->info);
+ ret = smmu_dev_get_info(sdev, &data_type, sizeof(sdev->info), &pasid,
+ &sdev->info);
if (ret) {
error_report("failed to get SMMU device info");
return;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index c0eb87c78c..a108beda29 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -871,18 +871,20 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
struct iommu_hw_info_vtd vtd;
} data;
uint64_t hw_caps;
+ uint8_t pasids;
hiod->agent = opaque;
if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
&type, &data, sizeof(data),
- &hw_caps, errp)) {
+ &hw_caps, &pasids, errp)) {
return false;
}
hiod->name = g_strdup(vdev->name);
caps->type = type;
caps->hw_caps = hw_caps;
+ caps->max_pasid_log2 = pasids;
return true;
}
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 087a11efc7..8ae33c3753 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -276,7 +276,7 @@ void smmu_inv_notifiers_all(SMMUState *s);
/* IOMMUFD helpers */
int smmu_dev_get_info(SMMUDevice *sdev, uint32_t *data_type,
- uint32_t data_len, void *data);
+ uint32_t data_len, uint8_t *pasid, void *data);
void smmu_dev_uninstall_nested_ste(SMMUDevice *sdev, bool abort);
int smmu_dev_install_nested_ste(SMMUDevice *sdev, uint32_t data_type,
uint32_t data_len, void *data,
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index 84131f5495..22c76a37a7 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -26,6 +26,7 @@
typedef struct HostIOMMUDeviceCaps {
uint32_t type;
uint64_t hw_caps;
+ uint8_t max_pasid_log2;
} HostIOMMUDeviceCaps;
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index b279184974..29afaa429d 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -57,7 +57,8 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
uint32_t *type, void *data, uint32_t len,
- uint64_t *caps, Error **errp);
+ uint64_t *caps, uint8_t *max_pasid_log2,
+ Error **errp);
bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
uint32_t pt_id, uint32_t flags,
uint32_t data_type, uint32_t data_len,
--
2.41.0.windows.1


@ -0,0 +1,162 @@
From 626698a1e9edff6a1032f496858555e1a4614fbe Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:27 +0800
Subject: [PATCH] backends: Introduce HostIOMMUDevice abstract
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
A HostIOMMUDevice is an abstraction for an assigned device that is protected
by a physical IOMMU (aka host IOMMU). The userspace interaction with this
physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
backend or the new iommufd backend. The assigned device can be a VFIO device
or a VDPA device. The HostIOMMUDevice is needed to interact with the host
IOMMU that protects the assigned device. It is especially useful when the
device is also protected by a virtual IOMMU, as the latter uses the translation
services of the physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
capabilities such as the supported address width. In the future, the virtual
IOMMU will use the HostIOMMUDevice to program the guest page tables in the
first translation stage of the physical IOMMU.
Introduce .realize() to initialize HostIOMMUDevice further after instance init.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
MAINTAINERS | 2 ++
backends/host_iommu_device.c | 33 +++++++++++++++++++
backends/meson.build | 1 +
include/sysemu/host_iommu_device.h | 53 ++++++++++++++++++++++++++++++
4 files changed, 89 insertions(+)
create mode 100644 backends/host_iommu_device.c
create mode 100644 include/sysemu/host_iommu_device.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 0ddb20a35f..ada87bfa9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2174,6 +2174,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
S: Supported
F: backends/iommufd.c
F: include/sysemu/iommufd.h
+F: backends/host_iommu_device.c
+F: include/sysemu/host_iommu_device.h
F: include/qemu/chardev_open.h
F: util/chardev_open.c
F: docs/devel/vfio-iommufd.rst
diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
new file mode 100644
index 0000000000..8f2dda1beb
--- /dev/null
+++ b/backends/host_iommu_device.c
@@ -0,0 +1,33 @@
+/*
+ * Host IOMMU device abstract
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/host_iommu_device.h"
+
+OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
+ host_iommu_device,
+ HOST_IOMMU_DEVICE,
+ OBJECT)
+
+static void host_iommu_device_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void host_iommu_device_init(Object *obj)
+{
+}
+
+static void host_iommu_device_finalize(Object *obj)
+{
+ HostIOMMUDevice *hiod = HOST_IOMMU_DEVICE(obj);
+
+ g_free(hiod->name);
+}
diff --git a/backends/meson.build b/backends/meson.build
index 9a5cea480d..68b5e34e04 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -13,6 +13,7 @@ system_ss.add([files(
system_ss.add(when: 'CONFIG_POSIX', if_true: files('rng-random.c'))
system_ss.add(when: 'CONFIG_POSIX', if_true: files('hostmem-file.c'))
system_ss.add(when: 'CONFIG_LINUX', if_true: files('hostmem-memfd.c'))
+system_ss.add(when: 'CONFIG_LINUX', if_true: files('host_iommu_device.c'))
if keyutils.found()
system_ss.add(keyutils, files('cryptodev-lkcf.c'))
endif
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
new file mode 100644
index 0000000000..db47a16189
--- /dev/null
+++ b/include/sysemu/host_iommu_device.h
@@ -0,0 +1,53 @@
+/*
+ * Host IOMMU device abstract declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef HOST_IOMMU_DEVICE_H
+#define HOST_IOMMU_DEVICE_H
+
+#include "qom/object.h"
+#include "qapi/error.h"
+
+#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
+OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
+
+struct HostIOMMUDevice {
+ Object parent_obj;
+
+ char *name;
+};
+
+/**
+ * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
+ *
+ * Different types of host devices (e.g., VFIO or VDPA device) or devices
+ * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
+ * will have different implementations of the HostIOMMUDeviceClass.
+ */
+struct HostIOMMUDeviceClass {
+ ObjectClass parent_class;
+
+ /**
+ * @realize: initialize host IOMMU device instance further.
+ *
+ * Mandatory callback.
+ *
+ * @hiod: pointer to a host IOMMU device instance.
+ *
+ * @opaque: pointer to agent device of this host IOMMU device,
+ * e.g., VFIO base device or VDPA device.
+ *
+ * @errp: pass an Error out when realize fails.
+ *
+ * Returns: true on success, false on failure.
+ */
+ bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+};
+#endif
--
2.41.0.windows.1


@ -0,0 +1,113 @@
From bc08940ad3c75da49e05c596f79e9e0164573709 Mon Sep 17 00:00:00 2001
From: gongchangsui <gongchangsui@outlook.com>
Date: Mon, 17 Mar 2025 02:56:40 -0400
Subject: [PATCH] backends: VirtCCA: cvm_gpa_start supports both 1GB and 3GB
For TMM versions 2.1 and above, `cvm_gpa_start` is 1GB, while for
versions prior to 2.1, `cvm_gpa_start` is 3GB. Shared huge page memory
supports both `cvm_gpa_start` values.
Signed-off-by: gongchangsui <gongchangsui@outlook.com>
---
backends/hostmem-file.c | 17 ++++++++++++++---
hw/arm/virt.c | 1 +
hw/core/numa.c | 2 +-
include/exec/memory.h | 11 +++++++----
4 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 891fe4ac4a..ce63a372a3 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -27,6 +27,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE)
bool virtcca_shared_hugepage_mapped = false;
uint64_t virtcca_cvm_ram_size = 0;
+uint64_t virtcca_cvm_gpa_start = 0;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
@@ -101,8 +102,16 @@ virtcca_shared_backend_memory_alloc(char *mem_path, uint32_t ram_flags, Error **
error_report("parse virtcca share memory path failed");
exit(1);
}
- if (virtcca_cvm_ram_size >= VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE) {
- size = VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE;
+
+ /*
+ * 1) CVM_GPA_START = 3GB --> fix size = 1GB
+ * 2) CVM_GPA_START = 1GB && ram_size >= 3GB --> size = 3GB
+ * 3) CVM_GPA_START = 1GB && ram_size < 3GB --> size = ram_size
+ */
+ if (virtcca_cvm_gpa_start != DEFAULT_VM_GPA_START) {
+ size = VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - virtcca_cvm_gpa_start;
+ } else if (virtcca_cvm_ram_size >= VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - DEFAULT_VM_GPA_START) {
+ size = VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - DEFAULT_VM_GPA_START;
}
virtcca_shared_hugepage = g_new(MemoryRegion, 1);
@@ -172,7 +181,9 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
fb->mem_path, fb->offset, errp);
g_free(name);
- if (virtcca_cvm_enabled() && backend->share && !virtcca_shared_hugepage_mapped) {
+ if (virtcca_cvm_enabled() && backend->share &&
+ (strcmp(fb->mem_path, "/dev/shm") != 0) &&
+ !virtcca_shared_hugepage_mapped) {
virtcca_shared_backend_memory_alloc(fb->mem_path, ram_flags, errp);
}
#endif
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6c5611826c..3c31d3667e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2063,6 +2063,7 @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
if (tmi_version < MIN_TMI_VERSION_FOR_UEFI_BOOTED_CVM) {
vms->memmap[VIRT_MEM].base = 3 * GiB;
}
+ virtcca_cvm_gpa_start = vms->memmap[VIRT_MEM].base;
vms->memmap[VIRT_MEM].size = ms->ram_size;
info_report("[qemu] fix VIRT_MEM range 0x%llx - 0x%llx\n", (unsigned long long)(vms->memmap[VIRT_MEM].base),
(unsigned long long)(vms->memmap[VIRT_MEM].base + ms->ram_size));
diff --git a/hw/core/numa.c b/hw/core/numa.c
index c691578ef5..98d896e687 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -655,7 +655,7 @@ static void virtcca_shared_memory_configuration(MachineState *ms)
memory_region_init_alias(alias_mr, NULL, "alias-mr", virtcca_shared_hugepage,
0, int128_get64(virtcca_shared_hugepage->size));
memory_region_add_subregion(address_space_virtcca_shared_memory.root,
- VIRTCCA_GPA_START, alias_mr);
+ virtcca_cvm_gpa_start, alias_mr);
}
void numa_complete_configuration(MachineState *ms)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 33778f5c64..c14dc69d27 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -243,14 +243,17 @@ typedef struct IOMMUTLBEvent {
/* RAM FD is opened read-only */
#define RAM_READONLY_FD (1 << 11)
-/* The GPA range of the VirtCCA bounce buffer is from 1GB to 4GB. */
-#define VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE 0xc0000000ULL
+/* The address limit of the VirtCCA bounce buffer is 4GB. */
+#define VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT 0x100000000ULL
/* The VirtCCA shared hugepage memory granularity is 1GB */
#define VIRTCCA_SHARED_HUGEPAGE_ALIGN 0x40000000ULL
-/* The GPA starting address of the VirtCCA CVM is 1GB */
-#define VIRTCCA_GPA_START 0x40000000ULL
+/* The default GPA starting address of VM is 1GB */
+#define DEFAULT_VM_GPA_START 0x40000000ULL
+
+/* The GPA starting address of the VirtCCA CVM is 1GB or 3GB */
+extern uint64_t virtcca_cvm_gpa_start;
extern uint64_t virtcca_cvm_ram_size;
--
2.41.0.windows.1
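
The three sizing cases listed in the new comment in `backends/hostmem-file.c` can be written as a standalone function. This is a minimal sketch: `shared_hugepage_size` is a hypothetical name, and the initial value `size = ram_size` is an assumption taken from case 3 of the patch comment, since the hunk does not show how `size` is initialized.

```c
#include <stdint.h>

/* Constants copied from the patch (include/exec/memory.h). */
#define VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT 0x100000000ULL  /* 4 GiB */
#define DEFAULT_VM_GPA_START               0x40000000ULL   /* 1 GiB */

static uint64_t shared_hugepage_size(uint64_t gpa_start, uint64_t ram_size)
{
    /* Case 3 (assumed starting value): 1 GiB start, RAM below 3 GiB. */
    uint64_t size = ram_size;

    if (gpa_start != DEFAULT_VM_GPA_START) {
        /* Case 1: 3 GiB start, so only 1 GiB remains below the 4 GiB limit. */
        size = VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - gpa_start;
    } else if (ram_size >= VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - DEFAULT_VM_GPA_START) {
        /* Case 2: 1 GiB start, RAM of 3 GiB or more is capped at 3 GiB. */
        size = VIRTCCA_SHARED_HUGEPAGE_ADDR_LIMIT - DEFAULT_VM_GPA_START;
    }
    return size;
}
```

In every case the shared region ends at or below the 4 GiB address limit, which is why the old fixed maximum size was replaced by a limit minus the start address.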


@ -0,0 +1,71 @@
From 29080940b37ce7486a46ab5534383321319fe2c5 Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Sat, 22 Mar 2025 15:10:32 +0800
Subject: [PATCH] backends/cryptodev: Do not abort for invalid session ID
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from eaf2bd29538d039df80bb4b1584de33a61312bc6
Instead of aborting when a session ID is invalid,
return VIRTIO_CRYPTO_INVSESS ("Invalid session id").
Reproduced using:
$ cat << EOF | qemu-system-i386 -display none \
-machine q35,accel=qtest -m 512M -nodefaults \
-object cryptodev-backend-builtin,id=cryptodev0 \
-device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0 \
-qtest stdio
outl 0xcf8 0x80000804
outw 0xcfc 0x06
outl 0xcf8 0x80000820
outl 0xcfc 0xe0008000
write 0x10800e 0x1 0x01
write 0xe0008016 0x1 0x01
write 0xe0008020 0x4 0x00801000
write 0xe0008028 0x4 0x00c01000
write 0xe000801c 0x1 0x01
write 0x110000 0x1 0x05
write 0x110001 0x1 0x04
write 0x108002 0x1 0x11
write 0x108008 0x1 0x48
write 0x10800c 0x1 0x01
write 0x108018 0x1 0x10
write 0x10801c 0x1 0x02
write 0x10c002 0x1 0x01
write 0xe000b005 0x1 0x00
EOF
Assertion failed: (session_id < MAX_NUM_SESSIONS && builtin->sessions[session_id]),
function cryptodev_builtin_close_session, file cryptodev-builtin.c, line 430.
Cc: qemu-stable@nongnu.org
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2274
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20240409094757.9127-1-philmd@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
backends/cryptodev-builtin.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 0822f198d9..940104ee55 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -428,7 +428,9 @@ static int cryptodev_builtin_close_session(
CRYPTODEV_BACKEND_BUILTIN(backend);
CryptoDevBackendBuiltinSession *session;
- assert(session_id < MAX_NUM_SESSIONS && builtin->sessions[session_id]);
+ if (session_id >= MAX_NUM_SESSIONS || !builtin->sessions[session_id]) {
+ return -VIRTIO_CRYPTO_INVSESS;
+ }
session = builtin->sessions[session_id];
if (session->cipher) {
--
2.41.0.windows.1
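
The change from an assertion to an error return can be illustrated outside QEMU. The sketch below is a simplified stand-in, not the real backend: `backend_sketch`, `session` and `close_session` are hypothetical, the table size of 256 is illustrative, and the numeric value 4 for `VIRTIO_CRYPTO_INVSESS` is assumed from the virtio-crypto status codes rather than taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM_SESSIONS 256       /* illustrative table size */
#define VIRTIO_CRYPTO_INVSESS 4    /* "invalid session id"; value is an assumption */

struct session { int in_use; };

struct backend_sketch {
    struct session *sessions[MAX_NUM_SESSIONS];
};

/*
 * Guest-controlled input must not be able to trip an assert(), so an
 * out-of-range or unused session ID is reported as an error instead.
 */
static int close_session(struct backend_sketch *b, uint64_t session_id)
{
    if (session_id >= MAX_NUM_SESSIONS || !b->sessions[session_id]) {
        return -VIRTIO_CRYPTO_INVSESS;
    }
    b->sessions[session_id] = NULL;  /* real code would also free the session */
    return 0;
}
```

Closing the same session twice, or closing an ID beyond the table, now yields `-VIRTIO_CRYPTO_INVSESS` rather than aborting the process.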


@ -0,0 +1,65 @@
From 690812903469db798ebae012248b9231d5ce9f11 Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Sat, 22 Mar 2025 15:15:08 +0800
Subject: [PATCH] backends/cryptodev: Do not ignore throttle/backends Errors
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 484aecf2d3a75251b63481be2a0c3aef635002af
Both cryptodev_backend_set_throttle() and CryptoDevBackendClass::init()
can set their Error** argument. Do not ignore them, return early
on failure. Without that, running into another failure trips
error_setv()'s assertion. Use the ERRP_GUARD() macro as suggested
in commit ae7c80a7bd ("error: New macro ERRP_GUARD()").
Cc: qemu-stable@nongnu.org
Fixes: e7a775fd9f ("cryptodev: Account statistics")
Fixes: 2580b452ff ("cryptodev: support QoS")
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231120150418.93443-1-philmd@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
backends/cryptodev.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/backends/cryptodev.c b/backends/cryptodev.c
index e5006bd215..fff89fd62a 100644
--- a/backends/cryptodev.c
+++ b/backends/cryptodev.c
@@ -398,6 +398,7 @@ static void cryptodev_backend_set_ops(Object *obj, Visitor *v,
static void
cryptodev_backend_complete(UserCreatable *uc, Error **errp)
{
+ ERRP_GUARD();
CryptoDevBackend *backend = CRYPTODEV_BACKEND(uc);
CryptoDevBackendClass *bc = CRYPTODEV_BACKEND_GET_CLASS(uc);
uint32_t services;
@@ -406,11 +407,20 @@ cryptodev_backend_complete(UserCreatable *uc, Error **errp)
QTAILQ_INIT(&backend->opinfos);
value = backend->tc.buckets[THROTTLE_OPS_TOTAL].avg;
cryptodev_backend_set_throttle(backend, THROTTLE_OPS_TOTAL, value, errp);
+ if (*errp) {
+ return;
+ }
value = backend->tc.buckets[THROTTLE_BPS_TOTAL].avg;
cryptodev_backend_set_throttle(backend, THROTTLE_BPS_TOTAL, value, errp);
+ if (*errp) {
+ return;
+ }
if (bc->init) {
bc->init(backend, errp);
+ if (*errp) {
+ return;
+ }
}
services = backend->conf.crypto_services;
--
2.41.0.windows.1


@ -0,0 +1,41 @@
From c5a859ec02af99574dfac2e5cfab9570345eb2e4 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Wed, 5 Feb 2025 08:04:10 -0500
Subject: [PATCH] backends/cryptodev-vhost-user: Fix local_error leaks
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 78b0c15a563ac4be5afb0375602ca0a3adc6c442
Do not propagate the error to the upper layer; report it directly
to avoid leaks.
Fixes: 2fda101de07 ("virtio-crypto: Support asynchronous mode")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2714
Signed-off-by: Gabriel Barrantes <gabriel.barrantes.dev@outlook.com>
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <DM8PR13MB50781054A4FDACE6F4FB6469B30F2@DM8PR13MB5078.namprd13.prod.outlook.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
backends/cryptodev-vhost-user.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index c3283ba84a..b8e95ca8b4 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -281,8 +281,7 @@ static int cryptodev_vhost_user_create_session(
break;
default:
- error_setg(&local_error, "Unsupported opcode :%" PRIu32 "",
- sess_info->op_code);
+ error_report("Unsupported opcode :%" PRIu32 "", sess_info->op_code);
return -VIRTIO_CRYPTO_NOTSUPP;
}
--
2.41.0.windows.1


@ -0,0 +1,91 @@
From ca210a4a8fe97dd56baa184671bb48bff9a54ecb Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:28 +0800
Subject: [PATCH] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
Different platform IOMMUs can support different elements.
Currently there are only two elements, type and aw_bits: type hints the
host platform IOMMU type, e.g., Intel VT-d, ARM SMMU, etc.; aw_bits hints
the host IOMMU address width.
Introduce the .get_cap() handler to check whether
HOST_IOMMU_DEVICE_CAP_XXX is supported.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
include/sysemu/host_iommu_device.h | 38 ++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index db47a16189..a57873958b 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -15,6 +15,18 @@
#include "qom/object.h"
#include "qapi/error.h"
+/**
+ * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
+ *
+ * @type: host platform IOMMU type.
+ *
+ * @aw_bits: host IOMMU address width. 0xff if no limitation.
+ */
+typedef struct HostIOMMUDeviceCaps {
+ uint32_t type;
+ uint8_t aw_bits;
+} HostIOMMUDeviceCaps;
+
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
@@ -22,6 +34,7 @@ struct HostIOMMUDevice {
Object parent_obj;
char *name;
+ HostIOMMUDeviceCaps caps;
};
/**
@@ -49,5 +62,30 @@ struct HostIOMMUDeviceClass {
* Returns: true on success, false on failure.
*/
bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+ /**
+ * @get_cap: check if a host IOMMU device capability is supported.
+ *
+ * Optional callback, if not implemented, hint not supporting query
+ * of @cap.
+ *
+ * @hiod: pointer to a host IOMMU device instance.
+ *
+ * @cap: capability to check.
+ *
+ * @errp: pass an Error out when fails to query capability.
+ *
+ * Returns: <0 on failure, 0 if a @cap is unsupported, or else
+ * 1 or some positive value for some special @cap,
+ * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
+ */
+ int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
};
+
+/*
+ * Host IOMMU device capability list.
+ */
+#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE 0
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS 1
+
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX 64
#endif
--
2.41.0.windows.1
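
The return convention documented for .get_cap() in the patch above (negative on failure, 0 when a capability is unsupported, a positive value otherwise) can be illustrated with a minimal standalone sketch. The names DemoCaps, demo_get_cap() and demo_effective_aw_bits() are hypothetical stand-ins and are not part of QEMU; only the two capability numbers and the 0xff "no limitation" value are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Capability identifiers, mirroring the values defined in the patch. */
#define DEMO_CAP_IOMMU_TYPE 0
#define DEMO_CAP_AW_BITS    1

/* Hypothetical stand-in for HostIOMMUDeviceCaps. */
typedef struct DemoCaps {
    uint32_t type;
    uint8_t aw_bits;   /* 0xff means "no limitation" */
} DemoCaps;

/* Hypothetical stand-in for a .get_cap() implementation. */
static int demo_get_cap(const DemoCaps *caps, int cap)
{
    assert(caps);
    switch (cap) {
    case DEMO_CAP_IOMMU_TYPE:
        return (int)caps->type;
    case DEMO_CAP_AW_BITS:
        return caps->aw_bits;
    default:
        return -EINVAL;          /* <0: the query itself failed */
    }
}

/*
 * Caller side: limit a guest address width to what the host reports.
 * A negative result is passed through as an error; 0 is treated as
 * "nothing known", so the guest value is kept unchanged.
 */
static int demo_effective_aw_bits(const DemoCaps *caps, int guest_aw_bits)
{
    int host = demo_get_cap(caps, DEMO_CAP_AW_BITS);

    if (host < 0) {
        return host;
    }
    if (host == 0) {
        return guest_aw_bits;
    }
    return guest_aw_bits < host ? guest_aw_bits : host;
}
```

With a host width of 39 bits, a 48-bit guest request is reduced to 39; with 0xff the guest value passes through unchanged.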


@ -0,0 +1,81 @@
From cedca4d3635cde049151b5818df2cb66c2b1531f Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Fri, 3 Nov 2023 16:54:01 +0800
Subject: [PATCH] backends/iommufd: Add helpers for invalidating user-managed
HWPT
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
backends/iommufd.c | 30 ++++++++++++++++++++++++++++++
backends/trace-events | 1 +
include/sysemu/iommufd.h | 3 +++
3 files changed, 34 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index c1260766f0..cf24370385 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -330,6 +330,36 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
return true;
}
+int iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t hwpt_id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data_ptr)
+{
+ int ret, fd = be->fd;
+ struct iommu_hwpt_invalidate cache = {
+ .size = sizeof(cache),
+ .hwpt_id = hwpt_id,
+ .data_type = data_type,
+ .entry_len = entry_len,
+ .entry_num = *entry_num,
+ .data_uptr = (uintptr_t)data_ptr,
+ };
+
+ ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
+
+ trace_iommufd_backend_invalidate_cache(fd, hwpt_id, data_type, entry_len,
+ *entry_num, cache.entry_num,
+ (uintptr_t)data_ptr, ret);
+ if (ret) {
+ *entry_num = cache.entry_num;
+ error_report("IOMMU_HWPT_INVALIDATE failed: %s", strerror(errno));
+ ret = -errno;
+ } else {
+ g_assert(*entry_num == cache.entry_num);
+ }
+
+ return ret;
+}
+
static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
{
HostIOMMUDeviceCaps *caps = &hiod->caps;
diff --git a/backends/trace-events b/backends/trace-events
index b02433710a..ef0ff98921 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -18,3 +18,4 @@ iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_
iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
+iommufd_backend_invalidate_cache(int iommufd, uint32_t hwpt_id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d hwpt_id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 3b28c8a81c..f6596f6338 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -63,6 +63,9 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
uint64_t iova, ram_addr_t size,
uint64_t page_size, uint64_t *data,
Error **errp);
+int iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t hwpt_id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data_ptr);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
--
2.41.0.windows.1
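
In the helper above, entry_num is both an input (entries requested) and an output (entries the kernel reports as handled when the ioctl fails). A minimal sketch of how a caller might use that, assuming a hypothetical demo_invalidate() in place of the real ioctl-backed helper:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for iommufd_backend_invalidate_cache(): pretends
 * that at most `limit` entries can be handled in one call. On failure it
 * writes the number of handled entries back through entry_num, as the
 * helper in the patch does.
 */
static int demo_invalidate(uint32_t limit, uint32_t *entry_num)
{
    assert(entry_num);
    if (*entry_num > limit) {
        *entry_num = limit;      /* report partial progress */
        return -EIO;
    }
    return 0;                    /* success: *entry_num is unchanged */
}

/* Caller side: returns how many requested entries were NOT invalidated. */
static uint32_t demo_flush(uint32_t limit, uint32_t total)
{
    uint32_t done = total;
    int ret = demo_invalidate(limit, &done);

    return ret ? total - done : 0;
}
```

Here demo_flush(4, 10) reports 6 entries left over, which is the information a caller needs in order to report or retry the remainder.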


@ -0,0 +1,78 @@
From 7d53d0938921d0faa32e1fef4c7bcc45d21f9bfb Mon Sep 17 00:00:00 2001
From: Joao Martins <joao.m.martins@oracle.com>
Date: Fri, 19 Jul 2024 13:04:51 +0100
Subject: [PATCH] backends/iommufd: Extend iommufd_backend_get_device_info() to
fetch HW capabilities
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The helper will be able to fetch vendor agnostic IOMMU capabilities
supported both by hardware and software. Right now it is only iommu dirty
tracking.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
backends/iommufd.c | 4 +++-
hw/vfio/iommufd.c | 4 +++-
include/sysemu/iommufd.h | 2 +-
3 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 7e805bd664..1ce2a24226 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -225,7 +225,7 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
uint32_t *type, void *data, uint32_t len,
- Error **errp)
+ uint64_t *caps, Error **errp)
{
struct iommu_hw_info info = {
.size = sizeof(info),
@@ -241,6 +241,8 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
g_assert(type);
*type = info.out_data_type;
+ g_assert(caps);
+ *caps = info.out_capabilities;
return true;
}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 7cbf0e44f1..d5b923ca83 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -647,9 +647,11 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
union {
struct iommu_hw_info_vtd vtd;
} data;
+ uint64_t hw_caps;
if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
- &type, &data, sizeof(data), errp)) {
+ &type, &data, sizeof(data),
+ &hw_caps, errp)) {
return false;
}
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index dfade18e6d..a0a0143856 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -51,7 +51,7 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
uint32_t *type, void *data, uint32_t len,
- Error **errp);
+ uint64_t *caps, Error **errp);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
--
2.41.0.windows.1
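
A caller of the extended helper would typically test individual bits of the returned capability bitmap. The sketch below is standalone and uses hypothetical names; DEMO_HW_CAP_DIRTY_TRACKING is assumed to mirror the kernel's IOMMU_HW_CAP_DIRTY_TRACKING bit, and demo_get_device_info() only imitates the out-parameter handling shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same bit position as IOMMU_HW_CAP_DIRTY_TRACKING. */
#define DEMO_HW_CAP_DIRTY_TRACKING (1ULL << 0)

/*
 * Hypothetical stand-in for the extended helper: copies the capability
 * bitmap to the caller, asserting that the out pointer is valid in the
 * same way the patch uses g_assert(caps).
 */
static bool demo_get_device_info(uint64_t kernel_caps, uint64_t *caps)
{
    assert(caps);
    *caps = kernel_caps;
    return true;
}

/* Caller side: query the bitmap, then test the dirty-tracking bit. */
static bool demo_supports_dirty_tracking(uint64_t kernel_caps)
{
    uint64_t hw_caps;

    if (!demo_get_device_info(kernel_caps, &hw_caps)) {
        return false;
    }
    return (hw_caps & DEMO_HW_CAP_DIRTY_TRACKING) != 0;
}
```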


@ -0,0 +1,59 @@
From 88006385c8e58b2aa612bf5aa184263f0d4245de Mon Sep 17 00:00:00 2001
From: Zhao Liu <zhao1.liu@intel.com>
Date: Mon, 11 Mar 2024 11:37:55 +0800
Subject: [PATCH] backends/iommufd: Fix missing ERRP_GUARD() for
error_prepend()
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() avoids the case where, when @errp is &error_fatal, the user
cannot see this additional information because exit() happens in
error_setg() before the information is added [1].
The iommufd_backend_set_fd() passes @errp to error_prepend(), to avoid
the above issue, add missing ERRP_GUARD() at the beginning of this
function.
[1]: Issue description in the commit message of commit ae7c80a7bd73
("error: New macro ERRP_GUARD()").
Cc: Yi Liu <yi.l.liu@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-ID: <20240311033822.3142585-3-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
backends/iommufd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 3cbf11fc8b..f061b6869a 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -44,6 +44,7 @@ static void iommufd_backend_finalize(Object *obj)
static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
{
+ ERRP_GUARD();
IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
int fd = -1;
--
2.41.0.windows.1
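
The ordering problem described in the commit message above can be shown with a simplified standalone model. This is not the QEMU Error API: DemoError, demo_fatal and the demo_*() functions are hypothetical, and the "exit" is simulated by recording the message that would have been printed at that point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct DemoError {
    char msg[128];
} DemoError;

static DemoError demo_fatal;       /* plays the role of &error_fatal */
static char demo_exit_msg[128];    /* what the user would see at "exit" */

/* Set an error; a fatal destination reports and "exits" immediately. */
static void demo_set(DemoError *errp, const char *msg)
{
    assert(errp);
    snprintf(errp->msg, sizeof(errp->msg), "%s", msg);
    if (errp == &demo_fatal) {
        snprintf(demo_exit_msg, sizeof(demo_exit_msg), "%s", errp->msg);
    }
}

/* Prepend context to an error that has already been set. */
static void demo_prepend(DemoError *errp, const char *prefix)
{
    char tmp[128];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, errp->msg);
    snprintf(errp->msg, sizeof(errp->msg), "%s", tmp);
}

/* Without a guard: the prefix is added after the fatal report. */
static void demo_without_guard(DemoError *errp)
{
    demo_set(errp, "bad fd");
    demo_prepend(errp, "set_fd: ");
}

/* With a guard: work on a local error, then propagate the full text. */
static void demo_with_guard(DemoError *errp)
{
    DemoError local = { "" };

    demo_set(&local, "bad fd");
    demo_prepend(&local, "set_fd: ");
    demo_set(errp, local.msg);
}
```

In this model the unguarded variant leaves "bad fd" as the reported message, while the guarded variant reports "set_fd: bad fd", which is the effect the one-line fix is after.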


@ -0,0 +1,45 @@
From 959b91b9b45b3ec649c6de0e268a4dcd603ce8af Mon Sep 17 00:00:00 2001
From: Zhao Liu <zhao1.liu@intel.com>
Date: Mon, 15 Jul 2024 16:21:54 +0800
Subject: [PATCH] backends/iommufd: Get rid of qemu_open_old()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For qemu_open_old(), osdep.h said:
> Don't introduce new usage of this function, prefer the following
> qemu_open/qemu_create that take an "Error **errp".
So replace qemu_open_old() with qemu_open().
Cc: Yi Liu <yi.l.liu@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
backends/iommufd.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index fad580fdcb..62df6e41f0 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -79,9 +79,8 @@ bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
int fd;
if (be->owned && !be->users) {
- fd = qemu_open_old("/dev/iommu", O_RDWR);
+ fd = qemu_open("/dev/iommu", O_RDWR, errp);
if (fd < 0) {
- error_setg_errno(errp, errno, "/dev/iommu opening failed");
return false;
}
be->fd = fd;
--
2.41.0.windows.1


@ -0,0 +1,61 @@
From 2f1a2f4b320e70a85cef8392cd5f4b1e54afb9c9 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:36 +0800
Subject: [PATCH] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap()
handler
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
backends/iommufd.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 604a8f4e7d..7e805bd664 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -245,6 +245,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
return true;
}
+static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
+{
+ HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+ switch (cap) {
+ case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
+ return caps->type;
+ case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+ return caps->aw_bits;
+ default:
+ error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
+ return -EINVAL;
+ }
+}
+
+static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
+{
+ HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+ hioc->get_cap = hiod_iommufd_get_cap;
+};
+
static const TypeInfo types[] = {
{
.name = TYPE_IOMMUFD_BACKEND,
@@ -261,6 +283,7 @@ static const TypeInfo types[] = {
}, {
.name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
.parent = TYPE_HOST_IOMMU_DEVICE,
+ .class_init = hiod_iommufd_class_init,
.abstract = true,
}
};
--
2.41.0.windows.1


@ -0,0 +1,158 @@
From 50142057ec070a70f3f38ec272ec61cc3ae6e071 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:30 +0800
Subject: [PATCH] backends/iommufd: Introduce
TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under
iommufd backend. It is abstract, because it is going to be derived
into VFIO or VDPA type'd device.
It will have its own .get_cap() implementation.
TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO is a sub-class of
TYPE_HOST_IOMMU_DEVICE_IOMMUFD, represents a VFIO type'd host IOMMU
device under iommufd backend. It will be created during VFIO device
attaching and passed to vIOMMU.
It will have its own .realize() implementation.
Opportunistically, add the missing header to include/sysemu/iommufd.h.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
backends/iommufd.c | 36 +++++++++++++++++------------------
hw/vfio/iommufd.c | 5 ++++-
include/hw/vfio/vfio-common.h | 3 +++
include/sysemu/iommufd.h | 16 ++++++++++++++++
4 files changed, 41 insertions(+), 19 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index ba58a0eb0d..a2b7f5c3c4 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -223,23 +223,23 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
return ret;
}
-static const TypeInfo iommufd_backend_info = {
- .name = TYPE_IOMMUFD_BACKEND,
- .parent = TYPE_OBJECT,
- .instance_size = sizeof(IOMMUFDBackend),
- .instance_init = iommufd_backend_init,
- .instance_finalize = iommufd_backend_finalize,
- .class_size = sizeof(IOMMUFDBackendClass),
- .class_init = iommufd_backend_class_init,
- .interfaces = (InterfaceInfo[]) {
- { TYPE_USER_CREATABLE },
- { }
+static const TypeInfo types[] = {
+ {
+ .name = TYPE_IOMMUFD_BACKEND,
+ .parent = TYPE_OBJECT,
+ .instance_size = sizeof(IOMMUFDBackend),
+ .instance_init = iommufd_backend_init,
+ .instance_finalize = iommufd_backend_finalize,
+ .class_size = sizeof(IOMMUFDBackendClass),
+ .class_init = iommufd_backend_class_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_USER_CREATABLE },
+ { }
+ }
+ }, {
+ .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+ .parent = TYPE_HOST_IOMMU_DEVICE,
+ .abstract = true,
}
};
-
-static void register_types(void)
-{
- type_register_static(&iommufd_backend_info);
-}
-
-type_init(register_types);
+DEFINE_TYPES(types)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index d4c586e842..7a4b818830 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -641,7 +641,10 @@ static const TypeInfo types[] = {
.name = TYPE_VFIO_IOMMU_IOMMUFD,
.parent = TYPE_VFIO_IOMMU,
.class_init = vfio_iommu_iommufd_class_init,
- },
+ }, {
+ .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
+ .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+ }
};
DEFINE_TYPES(types)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 0c807c2806..2cfc8521cd 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -32,6 +32,7 @@
#include "sysemu/sysemu.h"
#include "hw/vfio/vfio-container-base.h"
#include "sysemu/host_iommu_device.h"
+#include "sysemu/iommufd.h"
#define VFIO_MSG_PREFIX "vfio %s: "
@@ -77,6 +78,8 @@ typedef struct VFIOMigration {
struct VFIOGroup;
#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
+ TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
typedef struct VFIODMARange {
QLIST_ENTRY(VFIODMARange) next;
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 9c5524b0ed..1a75e82f42 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -1,3 +1,16 @@
+/*
+ * iommufd container backend declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ * Copyright Red Hat, Inc. 2024
+ *
+ * Authors: Yi Liu <yi.l.liu@intel.com>
+ * Eric Auger <eric.auger@redhat.com>
+ * Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
#ifndef SYSEMU_IOMMUFD_H
#define SYSEMU_IOMMUFD_H
@@ -5,6 +18,7 @@
#include "qemu/thread.h"
#include "exec/hwaddr.h"
#include "exec/cpu-common.h"
+#include "sysemu/host_iommu_device.h"
#define TYPE_IOMMUFD_BACKEND "iommufd"
OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
@@ -35,4 +49,6 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly);
int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
+
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
--
2.41.0.windows.1


@ -0,0 +1,69 @@
From ccd8baf4648e6fd6b69e65ee249609904edc92e1 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:33 +0800
Subject: [PATCH] backends/iommufd: Introduce helper function
iommufd_backend_get_device_info()
Introduce a helper function iommufd_backend_get_device_info() to get
host IOMMU related information through iommufd uAPI.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
backends/iommufd.c | 22 ++++++++++++++++++++++
include/sysemu/iommufd.h | 3 +++
2 files changed, 25 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index a2b7f5c3c4..604a8f4e7d 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -223,6 +223,28 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
return ret;
}
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+ uint32_t *type, void *data, uint32_t len,
+ Error **errp)
+{
+ struct iommu_hw_info info = {
+ .size = sizeof(info),
+ .dev_id = devid,
+ .data_len = len,
+ .data_uptr = (uintptr_t)data,
+ };
+
+ if (ioctl(be->fd, IOMMU_GET_HW_INFO, &info)) {
+ error_setg_errno(errp, errno, "Failed to get hardware info");
+ return false;
+ }
+
+ g_assert(type);
+ *type = info.out_data_type;
+
+ return true;
+}
+
static const TypeInfo types[] = {
{
.name = TYPE_IOMMUFD_BACKEND,
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 1a75e82f42..dfade18e6d 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -49,6 +49,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly);
int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+ uint32_t *type, void *data, uint32_t len,
+ Error **errp);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
--
2.41.0.windows.1


@ -0,0 +1,100 @@
From 207259b8f08e87b4a741a8b7884e699c95641a2e Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Sat, 13 Apr 2024 00:15:17 +0000
Subject: [PATCH] backends/iommufd: Introduce iommufd_backend_alloc_viommu
Add a helper to allocate a viommu object.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
backends/iommufd.c | 35 +++++++++++++++++++++++++++++++++++
backends/trace-events | 1 +
include/sysemu/iommufd.h | 10 ++++++++++
3 files changed, 46 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index c10aa9b011..82368a3918 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -360,6 +360,41 @@ int iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t hwpt_id,
return ret;
}
+struct IOMMUFDViommu *iommufd_backend_alloc_viommu(IOMMUFDBackend *be,
+ uint32_t dev_id,
+ uint32_t viommu_type,
+ uint32_t hwpt_id)
+{
+ int ret, fd = be->fd;
+ struct IOMMUFDViommu *viommu = g_malloc(sizeof(*viommu));
+ struct iommu_viommu_alloc alloc_viommu = {
+ .size = sizeof(alloc_viommu),
+ .type = viommu_type,
+ .dev_id = dev_id,
+ .hwpt_id = hwpt_id,
+ };
+
+ if (!viommu) {
+ error_report("failed to allocate viommu object");
+ return NULL;
+ }
+
+ ret = ioctl(fd, IOMMU_VIOMMU_ALLOC, &alloc_viommu);
+
+ trace_iommufd_backend_alloc_viommu(fd, viommu_type, dev_id, hwpt_id,
+ alloc_viommu.out_viommu_id, ret);
+ if (ret) {
+ error_report("IOMMU_VIOMMU_ALLOC failed: %s", strerror(errno));
+ g_free(viommu);
+ return NULL;
+ }
+
+ viommu->viommu_id = alloc_viommu.out_viommu_id;
+ viommu->s2_hwpt_id = hwpt_id;
+ viommu->iommufd = be;
+ return viommu;
+}
+
bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
uint32_t hwpt_id, Error **errp)
{
diff --git a/backends/trace-events b/backends/trace-events
index ef0ff98921..c24cd378df 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -19,3 +19,4 @@ iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%
iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
iommufd_backend_invalidate_cache(int iommufd, uint32_t hwpt_id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d hwpt_id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
+iommufd_backend_alloc_viommu(int iommufd, uint32_t type, uint32_t dev_id, uint32_t hwpt_id, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u viommu_id=%u (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 3dc6934144..05a08c49c2 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -39,6 +39,12 @@ struct IOMMUFDBackend {
/*< public >*/
};
+typedef struct IOMMUFDViommu {
+ IOMMUFDBackend *iommufd;
+ uint32_t s2_hwpt_id;
+ uint32_t viommu_id;
+} IOMMUFDViommu;
+
int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
void iommufd_backend_disconnect(IOMMUFDBackend *be);
@@ -66,6 +72,10 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
int iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t hwpt_id,
uint32_t data_type, uint32_t entry_len,
uint32_t *entry_num, void *data_ptr);
+struct IOMMUFDViommu *iommufd_backend_alloc_viommu(IOMMUFDBackend *be,
+ uint32_t dev_id,
+ uint32_t viommu_type,
+ uint32_t hwpt_id);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
--
2.41.0.windows.1


@ -0,0 +1,89 @@
From 005b8f4b6cef11982abcc2c071cbe40b69fb22e7 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Sat, 13 Apr 2024 00:21:22 +0000
Subject: [PATCH] backends/iommufd: Introduce iommufd_vdev_alloc
Add a helper to allocate an iommufd device's virtual device (in the user
space) per a viommu instance.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
backends/iommufd.c | 31 +++++++++++++++++++++++++++++++
backends/trace-events | 1 +
include/sysemu/iommufd.h | 11 +++++++++++
3 files changed, 43 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 82368a3918..af3376d0bf 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -395,6 +395,37 @@ struct IOMMUFDViommu *iommufd_backend_alloc_viommu(IOMMUFDBackend *be,
return viommu;
}
+struct IOMMUFDVdev *iommufd_backend_alloc_vdev(HostIOMMUDeviceIOMMUFD *idev,
+ IOMMUFDViommu *viommu,
+ uint64_t virt_id)
+{
+ int ret, fd = viommu->iommufd->fd;
+ struct IOMMUFDVdev *vdev = g_malloc(sizeof(*vdev));
+ struct iommu_vdevice_alloc alloc_vdev = {
+ .size = sizeof(alloc_vdev),
+ .viommu_id = viommu->viommu_id,
+ .dev_id = idev->devid,
+ .virt_id = virt_id,
+ };
+
+ ret = ioctl(fd, IOMMU_VDEVICE_ALLOC, &alloc_vdev);
+
+ trace_iommufd_backend_alloc_vdev(fd, idev->devid, viommu->viommu_id, virt_id,
+ alloc_vdev.out_vdevice_id, ret);
+
+ if (ret) {
+ error_report("IOMMU_VDEVICE_ALLOC failed: %s", strerror(errno));
+ g_free(vdev);
+ return NULL;
+ }
+
+ vdev->idev = idev;
+ vdev->viommu = viommu;
+ vdev->virt_id = virt_id;
+ vdev->vdev_id = alloc_vdev.out_vdevice_id;
+ return vdev;
+}
+
bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
uint32_t hwpt_id, Error **errp)
{
diff --git a/backends/trace-events b/backends/trace-events
index c24cd378df..e150a37e9a 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -20,3 +20,4 @@ iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) "
iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
iommufd_backend_invalidate_cache(int iommufd, uint32_t hwpt_id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d hwpt_id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
iommufd_backend_alloc_viommu(int iommufd, uint32_t type, uint32_t dev_id, uint32_t hwpt_id, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u viommu_id=%u (%d)"
+iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 05a08c49c2..0284e95460 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -128,4 +128,15 @@ bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
uint32_t hwpt_id, Error **errp);
bool host_iommu_device_iommufd_detach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
Error **errp);
+
+typedef struct IOMMUFDVdev {
+ HostIOMMUDeviceIOMMUFD *idev;
+ IOMMUFDViommu *viommu;
+ uint32_t vdev_id;
+ uint64_t virt_id;
+} IOMMUFDVdev;
+
+struct IOMMUFDVdev *iommufd_backend_alloc_vdev(HostIOMMUDeviceIOMMUFD *idev,
+ IOMMUFDViommu *viommu,
+ uint64_t virt_id);
#endif
--
2.41.0.windows.1


@ -0,0 +1,84 @@
From 2be28f75e4ed2a0a35549dd1a545e0655e63973d Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Fri, 12 Apr 2024 23:27:54 +0000
Subject: [PATCH] backends/iommufd: Introduce iommufd_viommu_invalidate_cache
Similar to iommufd_backend_invalidate_cache for iotlb invalidation via
IOMMU_HWPT_INVALIDATE ioctl, add a new helper for viommu specific cache
invalidation via IOMMU_VIOMMU_INVALIDATE ioctl.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
backends/iommufd.c | 31 +++++++++++++++++++++++++++++++
backends/trace-events | 1 +
include/sysemu/iommufd.h | 3 +++
3 files changed, 35 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index af3376d0bf..ee6f5bcf65 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -426,6 +426,37 @@ struct IOMMUFDVdev *iommufd_backend_alloc_vdev(HostIOMMUDeviceIOMMUFD *idev,
return vdev;
}
+int iommufd_viommu_invalidate_cache(IOMMUFDBackend *be, uint32_t viommu_id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data_ptr)
+{
+ int ret, fd = be->fd;
+ struct iommu_hwpt_invalidate cache = {
+ .size = sizeof(cache),
+ .hwpt_id = viommu_id,
+ .data_type = data_type,
+ .entry_len = entry_len,
+ .entry_num = *entry_num,
+ .data_uptr = (uint64_t)data_ptr,
+ };
+
+ ret = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cache);
+
+ trace_iommufd_viommu_invalidate_cache(fd, viommu_id, data_type,
+ entry_len, *entry_num,
+ cache.entry_num,
+ (uint64_t)data_ptr, ret);
+ if (ret) {
+ *entry_num = cache.entry_num;
+ error_report("IOMMU_VIOMMU_INVALIDATE failed: %s", strerror(errno));
+ ret = -errno;
+ } else {
+ g_assert(*entry_num == cache.entry_num);
+ }
+
+ return ret;
+}
+
bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
uint32_t hwpt_id, Error **errp)
{
diff --git a/backends/trace-events b/backends/trace-events
index e150a37e9a..f8592a2711 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -21,3 +21,4 @@ iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, u
iommufd_backend_invalidate_cache(int iommufd, uint32_t hwpt_id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d hwpt_id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
iommufd_backend_alloc_viommu(int iommufd, uint32_t type, uint32_t dev_id, uint32_t hwpt_id, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u viommu_id=%u (%d)"
iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
+iommufd_viommu_invalidate_cache(int iommufd, uint32_t viommu_id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d viommu_id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 0284e95460..0f2c826036 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -76,6 +76,9 @@ struct IOMMUFDViommu *iommufd_backend_alloc_viommu(IOMMUFDBackend *be,
uint32_t dev_id,
uint32_t viommu_type,
uint32_t hwpt_id);
+int iommufd_viommu_invalidate_cache(IOMMUFDBackend *be, uint32_t viommu_id,
+ uint32_t data_type, uint32_t entry_len,
+ uint32_t *entry_num, void *data_ptr);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
OBJECT_DECLARE_TYPE(HostIOMMUDeviceIOMMUFD, HostIOMMUDeviceIOMMUFDClass,
--
2.41.0.windows.1


@ -0,0 +1,468 @@
From 6cb41a55992571dd215fee86ed910bb4d6688bf8 Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Sat, 11 Jan 2025 10:52:37 +0800
Subject: [PATCH] backends/iommufd: Introduce the iommufd object
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Introduce an iommufd object which allows the interaction
with the host /dev/iommu device.
The /dev/iommu device may already have been opened outside of QEMU,
in which case the fd can be passed directly along with the
iommufd object:
This allows the iommufd object to be shared across several
subsystems (VFIO, VDPA, ...). For example, libvirt would open
the /dev/iommu once.
If no fd is passed along with the iommufd object, the /dev/iommu
is opened by the qemu code.
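
The ownership rule described above can be sketched in isolation. This is a minimal, self-contained sketch and not the code from this patch: `backend_sketch`, `sketch_open()` and `sketch_close()` are hypothetical stand-ins (the real backend opens /dev/iommu and calls close()), and the failure path of the open is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IOMMUFDBackend: only the fields discussed here. */
struct backend_sketch {
    int fd;          /* -1 while no descriptor is held */
    bool owned;      /* true when the backend opens the device itself */
    uint32_t users;  /* number of connected frontends */
};

/* Hypothetical counters replacing real open()/close() calls. */
static int sketch_opens;
static int sketch_closes;

static int sketch_open(void)
{
    sketch_opens++;
    return 100 + sketch_opens;   /* fake descriptor number */
}

static void sketch_close(int fd)
{
    (void)fd;
    sketch_closes++;
}

static void backend_init(struct backend_sketch *be)
{
    be->fd = -1;
    be->users = 0;
    be->owned = true;            /* default: no fd was passed in */
}

/* Models the "fd" property: a pre-opened descriptor is not owned. */
static void backend_set_fd(struct backend_sketch *be, int fd)
{
    be->fd = fd;
    be->owned = false;
}

/* The first user of an owned backend triggers the open. */
static void backend_connect(struct backend_sketch *be)
{
    if (be->owned && !be->users) {
        be->fd = sketch_open();
    }
    be->users++;
}

/* The last user of an owned backend triggers the close. */
static void backend_disconnect(struct backend_sketch *be)
{
    if (!be->users) {
        return;
    }
    be->users--;
    if (!be->users && be->owned) {
        sketch_close(be->fd);
        be->fd = -1;
    }
}
```

In this sketch a descriptor handed in through `backend_set_fd()` is never opened or closed by the backend, which is what lets an external process keep sharing it.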
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
MAINTAINERS | 8 ++
backends/Kconfig | 4 +
backends/iommufd.c | 245 +++++++++++++++++++++++++++++++++++++++
backends/meson.build | 1 +
backends/trace-events | 10 ++
include/sysemu/iommufd.h | 38 ++++++
qapi/qom.json | 19 +++
qemu-options.hx | 12 ++
8 files changed, 337 insertions(+)
create mode 100644 backends/iommufd.c
create mode 100644 include/sysemu/iommufd.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 695e0bd34f..a5a446914a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2167,6 +2167,14 @@ F: hw/vfio/ap.c
F: docs/system/s390x/vfio-ap.rst
L: qemu-s390x@nongnu.org
+iommufd
+M: Yi Liu <yi.l.liu@intel.com>
+M: Eric Auger <eric.auger@redhat.com>
+M: Zhenzhong Duan <zhenzhong.duan@intel.com>
+S: Supported
+F: backends/iommufd.c
+F: include/sysemu/iommufd.h
+
vhost
M: Michael S. Tsirkin <mst@redhat.com>
S: Supported
diff --git a/backends/Kconfig b/backends/Kconfig
index f35abc1609..2cb23f62fa 100644
--- a/backends/Kconfig
+++ b/backends/Kconfig
@@ -1 +1,5 @@
source tpm/Kconfig
+
+config IOMMUFD
+ bool
+ depends on VFIO
diff --git a/backends/iommufd.c b/backends/iommufd.c
new file mode 100644
index 0000000000..ba58a0eb0d
--- /dev/null
+++ b/backends/iommufd.c
@@ -0,0 +1,245 @@
+/*
+ * iommufd container backend
+ *
+ * Copyright (C) 2023 Intel Corporation.
+ * Copyright Red Hat, Inc. 2023
+ *
+ * Authors: Yi Liu <yi.l.liu@intel.com>
+ * Eric Auger <eric.auger@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/iommufd.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/module.h"
+#include "qom/object_interfaces.h"
+#include "qemu/error-report.h"
+#include "monitor/monitor.h"
+#include "trace.h"
+#include <sys/ioctl.h>
+#include <linux/iommufd.h>
+
+static void iommufd_backend_init(Object *obj)
+{
+ IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
+
+ be->fd = -1;
+ be->users = 0;
+ be->owned = true;
+ qemu_mutex_init(&be->lock);
+}
+
+static void iommufd_backend_finalize(Object *obj)
+{
+ IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
+
+ if (be->owned) {
+ close(be->fd);
+ be->fd = -1;
+ }
+}
+
+static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
+{
+ IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
+ int fd = -1;
+
+ fd = monitor_fd_param(monitor_cur(), str, errp);
+ if (fd == -1) {
+ error_prepend(errp, "Could not parse remote object fd %s:", str);
+ return;
+ }
+ qemu_mutex_lock(&be->lock);
+ be->fd = fd;
+ be->owned = false;
+ qemu_mutex_unlock(&be->lock);
+ trace_iommu_backend_set_fd(be->fd);
+}
+
+static bool iommufd_backend_can_be_deleted(UserCreatable *uc)
+{
+ IOMMUFDBackend *be = IOMMUFD_BACKEND(uc);
+
+ return !be->users;
+}
+
+static void iommufd_backend_class_init(ObjectClass *oc, void *data)
+{
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+ ucc->can_be_deleted = iommufd_backend_can_be_deleted;
+
+ object_class_property_add_str(oc, "fd", NULL, iommufd_backend_set_fd);
+}
+
+int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
+{
+ int fd, ret = 0;
+
+ qemu_mutex_lock(&be->lock);
+ if (be->users == UINT32_MAX) {
+ error_setg(errp, "too many connections");
+ ret = -E2BIG;
+ goto out;
+ }
+ if (be->owned && !be->users) {
+ fd = qemu_open_old("/dev/iommu", O_RDWR);
+ if (fd < 0) {
+ error_setg_errno(errp, errno, "/dev/iommu opening failed");
+ ret = fd;
+ goto out;
+ }
+ be->fd = fd;
+ }
+ be->users++;
+out:
+ trace_iommufd_backend_connect(be->fd, be->owned,
+ be->users, ret);
+ qemu_mutex_unlock(&be->lock);
+ return ret;
+}
+
+void iommufd_backend_disconnect(IOMMUFDBackend *be)
+{
+ qemu_mutex_lock(&be->lock);
+ if (!be->users) {
+ goto out;
+ }
+ be->users--;
+ if (!be->users && be->owned) {
+ close(be->fd);
+ be->fd = -1;
+ }
+out:
+ trace_iommufd_backend_disconnect(be->fd, be->users);
+ qemu_mutex_unlock(&be->lock);
+}
+
+int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
+ Error **errp)
+{
+ int ret, fd = be->fd;
+ struct iommu_ioas_alloc alloc_data = {
+ .size = sizeof(alloc_data),
+ .flags = 0,
+ };
+
+ ret = ioctl(fd, IOMMU_IOAS_ALLOC, &alloc_data);
+ if (ret) {
+ error_setg_errno(errp, errno, "Failed to allocate ioas");
+ return ret;
+ }
+
+ *ioas_id = alloc_data.out_ioas_id;
+ trace_iommufd_backend_alloc_ioas(fd, *ioas_id, ret);
+
+ return ret;
+}
+
+void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id)
+{
+ int ret, fd = be->fd;
+ struct iommu_destroy des = {
+ .size = sizeof(des),
+ .id = id,
+ };
+
+ ret = ioctl(fd, IOMMU_DESTROY, &des);
+ trace_iommufd_backend_free_id(fd, id, ret);
+ if (ret) {
+ error_report("Failed to free id: %u %m", id);
+ }
+}
+
+int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
+ ram_addr_t size, void *vaddr, bool readonly)
+{
+ int ret, fd = be->fd;
+ struct iommu_ioas_map map = {
+ .size = sizeof(map),
+ .flags = IOMMU_IOAS_MAP_READABLE |
+ IOMMU_IOAS_MAP_FIXED_IOVA,
+ .ioas_id = ioas_id,
+ .__reserved = 0,
+ .user_va = (uintptr_t)vaddr,
+ .iova = iova,
+ .length = size,
+ };
+
+ if (!readonly) {
+ map.flags |= IOMMU_IOAS_MAP_WRITEABLE;
+ }
+
+ ret = ioctl(fd, IOMMU_IOAS_MAP, &map);
+ trace_iommufd_backend_map_dma(fd, ioas_id, iova, size,
+ vaddr, readonly, ret);
+ if (ret) {
+ ret = -errno;
+
+ /* TODO: Not support mapping hardware PCI BAR region for now. */
+ if (errno == EFAULT) {
+ warn_report("IOMMU_IOAS_MAP failed: %m, PCI BAR?");
+ } else {
+ error_report("IOMMU_IOAS_MAP failed: %m");
+ }
+ }
+ return ret;
+}
+
+int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
+ hwaddr iova, ram_addr_t size)
+{
+ int ret, fd = be->fd;
+ struct iommu_ioas_unmap unmap = {
+ .size = sizeof(unmap),
+ .ioas_id = ioas_id,
+ .iova = iova,
+ .length = size,
+ };
+
+ ret = ioctl(fd, IOMMU_IOAS_UNMAP, &unmap);
+ /*
+ * IOMMUFD treats a mapping as a kind of object, so unmapping a
+ * nonexistent mapping is treated as deleting a nonexistent
+ * object and returns ENOENT. This differs from the legacy
+ * backend, which allows it. A vIOMMU may trigger many
+ * redundant unmaps; to avoid flooding the log, treat them
+ * as success for IOMMUFD, just like the legacy backend.
+ */
+ if (ret && errno == ENOENT) {
+ trace_iommufd_backend_unmap_dma_non_exist(fd, ioas_id, iova, size, ret);
+ ret = 0;
+ } else {
+ trace_iommufd_backend_unmap_dma(fd, ioas_id, iova, size, ret);
+ }
+
+ if (ret) {
+ ret = -errno;
+ error_report("IOMMU_IOAS_UNMAP failed: %m");
+ }
+ return ret;
+}
+
+static const TypeInfo iommufd_backend_info = {
+ .name = TYPE_IOMMUFD_BACKEND,
+ .parent = TYPE_OBJECT,
+ .instance_size = sizeof(IOMMUFDBackend),
+ .instance_init = iommufd_backend_init,
+ .instance_finalize = iommufd_backend_finalize,
+ .class_size = sizeof(IOMMUFDBackendClass),
+ .class_init = iommufd_backend_class_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_USER_CREATABLE },
+ { }
+ }
+};
+
+static void register_types(void)
+{
+ type_register_static(&iommufd_backend_info);
+}
+
+type_init(register_types);
diff --git a/backends/meson.build b/backends/meson.build
index 914c7c4afb..9a5cea480d 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -20,6 +20,7 @@ if have_vhost_user
system_ss.add(when: 'CONFIG_VIRTIO', if_true: files('vhost-user.c'))
endif
system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
+system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
if have_vhost_user_crypto
system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
endif
diff --git a/backends/trace-events b/backends/trace-events
index 652eb76a57..d45c6e31a6 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -5,3 +5,13 @@ dbus_vmstate_pre_save(void)
dbus_vmstate_post_load(int version_id) "version_id: %d"
dbus_vmstate_loading(const char *id) "id: %s"
dbus_vmstate_saving(const char *id) "id: %s"
+
+# iommufd.c
+iommufd_backend_connect(int fd, bool owned, uint32_t users, int ret) "fd=%d owned=%d users=%d (%d)"
+iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d"
+iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d"
+iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, void *vaddr, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" addr=%p readonly=%d (%d)"
+iommufd_backend_unmap_dma_non_exist(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " Unmap nonexistent mapping: iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
+iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
+iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)"
+iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
new file mode 100644
index 0000000000..9c5524b0ed
--- /dev/null
+++ b/include/sysemu/iommufd.h
@@ -0,0 +1,38 @@
+#ifndef SYSEMU_IOMMUFD_H
+#define SYSEMU_IOMMUFD_H
+
+#include "qom/object.h"
+#include "qemu/thread.h"
+#include "exec/hwaddr.h"
+#include "exec/cpu-common.h"
+
+#define TYPE_IOMMUFD_BACKEND "iommufd"
+OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
+
+struct IOMMUFDBackendClass {
+ ObjectClass parent_class;
+};
+
+struct IOMMUFDBackend {
+ Object parent;
+
+ /*< protected >*/
+ int fd; /* /dev/iommu file descriptor */
+ bool owned; /* is the /dev/iommu opened internally */
+ QemuMutex lock;
+ uint32_t users;
+
+ /*< public >*/
+};
+
+int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
+void iommufd_backend_disconnect(IOMMUFDBackend *be);
+
+int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
+ Error **errp);
+void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id);
+int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
+ ram_addr_t size, void *vaddr, bool readonly);
+int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
+ hwaddr iova, ram_addr_t size);
+#endif
diff --git a/qapi/qom.json b/qapi/qom.json
index a74c7a91f9..a5336e6b11 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -794,6 +794,23 @@
{ 'struct': 'VfioUserServerProperties',
'data': { 'socket': 'SocketAddress', 'device': 'str' } }
+##
+# @IOMMUFDProperties:
+#
+# Properties for iommufd objects.
+#
+# @fd: file descriptor name previously passed via 'getfd' command,
+# which represents a pre-opened /dev/iommu. This allows the
+# iommufd object to be shared across several subsystems
+# (VFIO, VDPA, ...), and the file descriptor to be shared
+# with other processes, e.g. DPDK. (default: QEMU opens
+# /dev/iommu by itself)
+#
+# Since: 9.0
+##
+{ 'struct': 'IOMMUFDProperties',
+ 'data': { '*fd': 'str' } }
+
##
# @RngProperties:
#
@@ -969,6 +986,7 @@
'input-barrier',
{ 'name': 'input-linux',
'if': 'CONFIG_LINUX' },
+ 'iommufd',
'iothread',
'main-loop',
{ 'name': 'memory-backend-epc',
@@ -1039,6 +1057,7 @@
'input-barrier': 'InputBarrierProperties',
'input-linux': { 'type': 'InputLinuxProperties',
'if': 'CONFIG_LINUX' },
+ 'iommufd': 'IOMMUFDProperties',
'iothread': 'IothreadProperties',
'main-loop': 'MainLoopProperties',
'memory-backend-epc': { 'type': 'MemoryBackendEpcProperties',
diff --git a/qemu-options.hx b/qemu-options.hx
index 8516b73206..7fe76c4b1d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5224,6 +5224,18 @@ SRST
The ``share`` boolean option is on by default with memfd.
+ ``-object iommufd,id=id[,fd=fd]``
+ Creates an iommufd backend which allows control of DMA mapping
+ through the ``/dev/iommu`` device.
+
+ The ``id`` parameter is a unique ID which frontends (such as
+ vfio-pci or vdpa) will use to connect with the iommufd backend.
+
+ The ``fd`` parameter is an optional pre-opened file descriptor
+ resulting from ``/dev/iommu`` opening. Usually the iommufd is shared
+ across all subsystems, bringing the benefit of centralized
+ reference counting.
+
``-object rng-builtin,id=id``
Creates a random number generator backend which obtains entropy
from QEMU builtin functions. The ``id`` parameter is a unique ID
--
2.41.0.windows.1

@ -0,0 +1,140 @@
From c9a107b1f73bddb4c9844c12444e3802e5f576b4 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Tue, 7 May 2024 14:42:52 +0800
Subject: [PATCH] backends/iommufd: Make iommufd_backend_*() return bool
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This follows the coding standard of returning bool when 'Error **'
is used to pass errors.
The changed functions include:
iommufd_backend_connect
iommufd_backend_alloc_ioas
Take this opportunity to simplify the functions a bit by avoiding duplicate
reporting, e.g., log through either the error interface or a trace event, not
both.
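
The convention can be illustrated with a minimal, self-contained sketch. It does not use QEMU's real Error API: `SketchError`, `sketch_error_set()` and `sketch_ioctl()` are hypothetical stand-ins, and `sketch_alloc_ioas()` only imitates the shape of the changed functions (bool result, details through the error pointer, no separate return code).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for QEMU's Error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Record an error only if the caller asked for details (errp != NULL). */
static void sketch_error_set(SketchError **errp, const char *what, int err)
{
    if (!errp) {
        return;
    }
    *errp = malloc(sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s (errno %d)",
                 what, err);
    }
}

/* Hypothetical operation standing in for an ioctl() that may fail. */
static int sketch_ioctl(bool fail, uint32_t *out_id)
{
    if (fail) {
        errno = ENODEV;
        return -1;
    }
    *out_id = 42;
    return 0;
}

/*
 * Returns true on success and false on failure. On failure the reason is
 * reported through errp and the output argument is left untouched.
 */
static bool sketch_alloc_ioas(bool fail, uint32_t *ioas_id, SketchError **errp)
{
    uint32_t id;

    if (sketch_ioctl(fail, &id)) {
        sketch_error_set(errp, "Failed to allocate ioas", errno);
        return false;
    }
    *ioas_id = id;
    return true;
}
```

A caller then only needs `if (!sketch_alloc_ioas(...))` and can inspect the error object for the reason, instead of interpreting both an integer code and an error object.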
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
backends/iommufd.c | 29 +++++++++++++----------------
backends/trace-events | 4 ++--
include/sysemu/iommufd.h | 6 +++---
3 files changed, 18 insertions(+), 21 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index f061b6869a..fad580fdcb 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -74,24 +74,22 @@ static void iommufd_backend_class_init(ObjectClass *oc, void *data)
object_class_property_add_str(oc, "fd", NULL, iommufd_backend_set_fd);
}
-int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
+bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
{
- int fd, ret = 0;
+ int fd;
if (be->owned && !be->users) {
fd = qemu_open_old("/dev/iommu", O_RDWR);
if (fd < 0) {
error_setg_errno(errp, errno, "/dev/iommu opening failed");
- ret = fd;
- goto out;
+ return false;
}
be->fd = fd;
}
be->users++;
-out:
- trace_iommufd_backend_connect(be->fd, be->owned,
- be->users, ret);
- return ret;
+
+ trace_iommufd_backend_connect(be->fd, be->owned, be->users);
+ return true;
}
void iommufd_backend_disconnect(IOMMUFDBackend *be)
@@ -108,25 +106,24 @@ out:
trace_iommufd_backend_disconnect(be->fd, be->users);
}
-int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
- Error **errp)
+bool iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
+ Error **errp)
{
- int ret, fd = be->fd;
+ int fd = be->fd;
struct iommu_ioas_alloc alloc_data = {
.size = sizeof(alloc_data),
.flags = 0,
};
- ret = ioctl(fd, IOMMU_IOAS_ALLOC, &alloc_data);
- if (ret) {
+ if (ioctl(fd, IOMMU_IOAS_ALLOC, &alloc_data)) {
error_setg_errno(errp, errno, "Failed to allocate ioas");
- return ret;
+ return false;
}
*ioas_id = alloc_data.out_ioas_id;
- trace_iommufd_backend_alloc_ioas(fd, *ioas_id, ret);
+ trace_iommufd_backend_alloc_ioas(fd, *ioas_id);
- return ret;
+ return true;
}
void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id)
diff --git a/backends/trace-events b/backends/trace-events
index f8592a2711..8fe77149b2 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -7,13 +7,13 @@ dbus_vmstate_loading(const char *id) "id: %s"
dbus_vmstate_saving(const char *id) "id: %s"
# iommufd.c
-iommufd_backend_connect(int fd, bool owned, uint32_t users, int ret) "fd=%d owned=%d users=%d (%d)"
+iommufd_backend_connect(int fd, bool owned, uint32_t users) "fd=%d owned=%d users=%d"
iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d"
iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d"
iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, void *vaddr, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" addr=%p readonly=%d (%d)"
iommufd_backend_unmap_dma_non_exist(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " Unmap nonexistent mapping: iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
-iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)"
+iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 908c94d811..0531a4ad98 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -43,11 +43,11 @@ typedef struct IOMMUFDViommu {
uint32_t viommu_id;
} IOMMUFDViommu;
-int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
+bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
void iommufd_backend_disconnect(IOMMUFDBackend *be);
-int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
- Error **errp);
+bool iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
+ Error **errp);
void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id);
int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly);
--
2.41.0.windows.1

@ -0,0 +1,37 @@
From e2bc395c5db34111faf2adcecdb385e5a4e8d23d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Fri, 22 Dec 2023 08:55:23 +0100
Subject: [PATCH] backends/iommufd: Remove check on number of backend users
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
QOM already has a ref count on objects and it will assert much
earlier, when INT_MAX is reached.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
backends/iommufd.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 4f5df63331..f17a846aab 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -81,11 +81,6 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
int fd, ret = 0;
qemu_mutex_lock(&be->lock);
- if (be->users == UINT32_MAX) {
- error_setg(errp, "too many connections");
- ret = -E2BIG;
- goto out;
- }
if (be->owned && !be->users) {
fd = qemu_open_old("/dev/iommu", O_RDWR);
if (fd < 0) {
--
2.41.0.windows.1

@ -0,0 +1,103 @@
From 1e6734af14b3223a7d7e304262c96051ddf8637f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Thu, 21 Dec 2023 16:58:41 +0100
Subject: [PATCH] backends/iommufd: Remove mutex
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Coverity reports a concurrent data access violation because be->users
is being accessed in iommufd_backend_can_be_deleted() without holding
the mutex.
However, these routines are called from the QEMU main thread when a
device is created. In this case, the code paths should be protected by
the BQL lock and it should be safe to drop the IOMMUFD backend mutex.
Simply remove it.
Fixes: CID 1531550
Fixes: CID 1531549
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
backends/iommufd.c | 7 -------
include/sysemu/iommufd.h | 2 --
2 files changed, 9 deletions(-)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index f17a846aab..3cbf11fc8b 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -30,7 +30,6 @@ static void iommufd_backend_init(Object *obj)
be->fd = -1;
be->users = 0;
be->owned = true;
- qemu_mutex_init(&be->lock);
}
static void iommufd_backend_finalize(Object *obj)
@@ -53,10 +52,8 @@ static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
error_prepend(errp, "Could not parse remote object fd %s:", str);
return;
}
- qemu_mutex_lock(&be->lock);
be->fd = fd;
be->owned = false;
- qemu_mutex_unlock(&be->lock);
trace_iommu_backend_set_fd(be->fd);
}
@@ -80,7 +77,6 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
{
int fd, ret = 0;
- qemu_mutex_lock(&be->lock);
if (be->owned && !be->users) {
fd = qemu_open_old("/dev/iommu", O_RDWR);
if (fd < 0) {
@@ -94,13 +90,11 @@ int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp)
out:
trace_iommufd_backend_connect(be->fd, be->owned,
be->users, ret);
- qemu_mutex_unlock(&be->lock);
return ret;
}
void iommufd_backend_disconnect(IOMMUFDBackend *be)
{
- qemu_mutex_lock(&be->lock);
if (!be->users) {
goto out;
}
@@ -111,7 +105,6 @@ void iommufd_backend_disconnect(IOMMUFDBackend *be)
}
out:
trace_iommufd_backend_disconnect(be->fd, be->users);
- qemu_mutex_unlock(&be->lock);
}
int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 29afaa429d..908c94d811 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -15,7 +15,6 @@
#define SYSEMU_IOMMUFD_H
#include "qom/object.h"
-#include "qemu/thread.h"
#include "exec/hwaddr.h"
#include "exec/cpu-common.h"
#include "sysemu/host_iommu_device.h"
@@ -33,7 +32,6 @@ struct IOMMUFDBackend {
/*< protected >*/
int fd; /* /dev/iommu file descriptor */
bool owned; /* is the /dev/iommu opened internally */
- QemuMutex lock;
uint32_t users;
/*< public >*/
--
2.41.0.windows.1

@ -0,0 +1,320 @@
From da6ee14de85b4e619eedfbe3a6cac3f09d948589 Mon Sep 17 00:00:00 2001
From: nonce <2774337358@qq.com>
Date: Thu, 23 Jan 2025 21:03:10 +0800
Subject: [PATCH] backend: VirtCCA:resolve hugepage memory waste issue in
vhost-user scenario
VirtCCA implements virtio on top of SWIOTLB and only allocates bounce
buffers in the address range below 4GB. Therefore, backend hugepage
memory allocated above 4GB is never used, resulting in significant
waste.
A new address space and memory region are added to manage the backend
hugepage memory corresponding to GPAs below 4GB, and these are
shared with the vhostuser backend.
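
Two steps of this approach can be sketched in isolation: picking the second-to-last component of the backend file path, and capping the shared region at the bounce-buffer window. This is a minimal sketch under stated assumptions, not the patch's implementation: the `sketch_*` names are hypothetical, the walk uses strcspn() instead of the patch's strtok() on a copy, and the cap reuses the 0xc0000000 value that the patch defines as VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same value as VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE in the patch (3GB). */
#define SKETCH_MAX_SHARED 0xc0000000ULL

/*
 * Copy the second-to-last '/'-separated component of path into out.
 * Runs of '/' count as a single separator. Returns 0 on success, or -1
 * if the path has fewer than two components or out is too small.
 */
static int sketch_second_last_component(const char *path, char *out,
                                        size_t out_len)
{
    const char *prev = NULL, *last = NULL;
    size_t prev_len = 0, last_len = 0;
    const char *p = path;

    if (!path || !out || out_len == 0) {
        return -1;
    }

    while (*p) {
        size_t len;

        while (*p == '/') {          /* skip separators */
            p++;
        }
        if (!*p) {
            break;
        }
        len = strcspn(p, "/");       /* length of this component */
        prev = last;
        prev_len = last_len;
        last = p;
        last_len = len;
        p += len;
    }

    if (!prev || prev_len >= out_len) {
        return -1;
    }
    memcpy(out, prev, prev_len);
    out[prev_len] = '\0';
    return 0;
}

/* Cap the shared region at the size of the bounce-buffer window. */
static uint64_t sketch_shared_size(uint64_t ram_size)
{
    return ram_size >= SKETCH_MAX_SHARED ? SKETCH_MAX_SHARED : ram_size;
}
```

For a hypothetical path such as `/dev/hugepages/libvirt/qemu/1-guest/ram-node0`, the helper yields `1-guest`, and a guest with 8GB of RAM would get a 3GB shared region in this sketch.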
Signed-off-by: nonce0_0 <2774337358@qq.com>
---
backends/hostmem-file.c | 85 +++++++++++++++++++++++++++++++++++
hw/core/numa.c | 20 +++++++++
hw/virtio/vhost.c | 8 +++-
include/exec/address-spaces.h | 3 ++
include/exec/cpu-common.h | 1 +
include/exec/memory.h | 11 +++++
system/physmem.c | 17 +++++++
system/vl.c | 9 ++++
8 files changed, 153 insertions(+), 1 deletion(-)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 361d4a8103..891fe4ac4a 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -20,9 +20,13 @@
#include "qom/object.h"
#include "qapi/visitor.h"
#include "qapi/qapi-visit-common.h"
+#include "sysemu/kvm.h"
+#include "exec/address-spaces.h"
OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE)
+bool virtcca_shared_hugepage_mapped = false;
+uint64_t virtcca_cvm_ram_size = 0;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
@@ -36,6 +40,83 @@ struct HostMemoryBackendFile {
OnOffAuto rom;
};
+/* Parse the path of the hugepages memory file used for memory sharing */
+static int virtcca_parse_share_mem_path(char *src, char *dst)
+{
+ int ret = 0;
+ char src_copy[PATH_MAX];
+ char *token = NULL;
+ char *last_dir = NULL;
+ char *second_last_dir = NULL;
+ static const char delimiter[] = "/";
+
+ if (src == NULL || dst == NULL ||
+ strlen(src) == 0 || strlen(src) > PATH_MAX - 1) {
+ error_report("Invalid input: NULL pointer or invalid string length.");
+ return -1;
+ }
+
+ strcpy(src_copy, src);
+ token = strtok(src_copy, delimiter);
+
+ /* Iterate over the path segments to find the second-to-last directory */
+ while (token != NULL) {
+ second_last_dir = last_dir;
+ last_dir = token;
+ token = strtok(NULL, delimiter);
+ }
+
+ /* Check if the second-to-last directory is found */
+ if (second_last_dir == NULL) {
+ error_report("Invalid path: second-to-last directory not found.");
+ return -1;
+ }
+
+ /*
+ * Construct the share memory path by appending the extracted domain name
+ * to the hugepages memory filesystem prefix
+ */
+ ret = snprintf(dst, PATH_MAX, "/dev/hugepages/libvirt/qemu/%s",
+ second_last_dir);
+
+ if (ret < 0 || ret >= PATH_MAX) {
+ error_report("Error: snprintf failed to construct the share mem path");
+ return -1;
+ }
+
+ return 0;
+}
+
+/*
+ * Create a hugepage memory region in the virtcca scenario
+ * for sharing with process like vhost-user and others.
+ */
+static void
+virtcca_shared_backend_memory_alloc(char *mem_path, uint32_t ram_flags, Error **errp)
+{
+ char dst[PATH_MAX];
+ uint64_t size = virtcca_cvm_ram_size;
+
+ if (virtcca_parse_share_mem_path(mem_path, dst)) {
+ error_report("parse virtcca share memory path failed");
+ exit(1);
+ }
+ if (virtcca_cvm_ram_size >= VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE) {
+ size = VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE;
+ }
+
+ virtcca_shared_hugepage = g_new(MemoryRegion, 1);
+ memory_region_init_ram_from_file(virtcca_shared_hugepage, NULL,
+ "virtcca_shared_hugepage", size,
+ VIRTCCA_SHARED_HUGEPAGE_ALIGN,
+ ram_flags, dst, 0, errp);
+ if (*errp) {
+ error_reportf_err(*errp, "cannot init RamBlock for virtcca_shared_hugepage: ");
+ exit(1);
+ }
+ virtcca_shared_hugepage_mapped = true;
+}
+
static void
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
@@ -90,6 +171,10 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
backend->size, fb->align, ram_flags,
fb->mem_path, fb->offset, errp);
g_free(name);
+
+ if (virtcca_cvm_enabled() && backend->share && !virtcca_shared_hugepage_mapped) {
+ virtcca_shared_backend_memory_alloc(fb->mem_path, ram_flags, errp);
+ }
#endif
}
diff --git a/hw/core/numa.c b/hw/core/numa.c
index f08956ddb0..e7c48dab61 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -42,6 +42,8 @@
#include "qemu/option.h"
#include "qemu/config-file.h"
#include "qemu/cutils.h"
+#include "exec/address-spaces.h"
+#include "sysemu/kvm.h"
QemuOptsList qemu_numa_opts = {
.name = "numa",
@@ -641,6 +643,21 @@ static void numa_init_memdev_container(MachineState *ms, MemoryRegion *ram)
}
}
+/*
+ * Add virtcca_shared_hugepage as a sub-MR to the root MR of address space
+ * address_space_memory and address_space_virtcca_shared_memory.
+ */
+static void virtcca_shared_memory_configuration(MachineState *ms)
+{
+ MemoryRegion *alias_mr = g_new(MemoryRegion, 1);
+
+ memory_region_add_subregion_overlap(ms->ram, 0, virtcca_shared_hugepage, 1);
+ memory_region_init_alias(alias_mr, NULL, "alias-mr", virtcca_shared_hugepage,
+ 0, int128_get64(virtcca_shared_hugepage->size));
+ memory_region_add_subregion(address_space_virtcca_shared_memory.root,
+ VIRTCCA_GPA_START, alias_mr);
+}
+
void numa_complete_configuration(MachineState *ms)
{
int i;
@@ -711,6 +728,9 @@ void numa_complete_configuration(MachineState *ms)
memory_region_init(ms->ram, OBJECT(ms), mc->default_ram_id,
ms->ram_size);
numa_init_memdev_container(ms, ms->ram);
+ if (virtcca_cvm_enabled() && virtcca_shared_hugepage->ram_block) {
+ virtcca_shared_memory_configuration(ms);
+ }
}
/* QEMU needs at least all unique node pair distances to build
* the whole NUMA distance table. QEMU treats the distance table
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index d29075aa04..8b95558013 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -30,6 +30,7 @@
#include "sysemu/dma.h"
#include "trace.h"
#include "qapi/qapi-commands-migration.h"
+#include "sysemu/kvm.h"
/* enabled until disconnected backend stabilizes */
#define _VHOST_DEBUG 1
@@ -1616,7 +1617,12 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
hdev->log_size = 0;
hdev->log_enabled = false;
hdev->started = false;
- memory_listener_register(&hdev->memory_listener, &address_space_memory);
+ if (virtcca_cvm_enabled()) {
+ memory_listener_register(&hdev->memory_listener,
+ &address_space_virtcca_shared_memory);
+ } else {
+ memory_listener_register(&hdev->memory_listener, &address_space_memory);
+ }
QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
/*
diff --git a/include/exec/address-spaces.h b/include/exec/address-spaces.h
index 0d0aa61d68..4518b5da86 100644
--- a/include/exec/address-spaces.h
+++ b/include/exec/address-spaces.h
@@ -33,6 +33,9 @@ MemoryRegion *get_system_io(void);
extern AddressSpace address_space_memory;
extern AddressSpace address_space_io;
+extern AddressSpace address_space_virtcca_shared_memory;
+
+extern MemoryRegion *virtcca_shared_hugepage;
#endif
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index c7fd30d5b9..d21d9990ad 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -28,6 +28,7 @@ typedef uint64_t vaddr;
void cpu_exec_init_all(void);
void cpu_exec_step_atomic(CPUState *cpu);
+void virtcca_shared_memory_address_space_init(void);
/* Using intptr_t ensures that qemu_*_page_mask is sign-extended even
* when intptr_t is 32-bit and we are aligning a long long.
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 542c9da918..33778f5c64 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -243,6 +243,17 @@ typedef struct IOMMUTLBEvent {
/* RAM FD is opened read-only */
#define RAM_READONLY_FD (1 << 11)
+/* The GPA range of the VirtCCA bounce buffer is from 1GB to 4GB. */
+#define VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE 0xc0000000ULL
+
+/* The VirtCCA shared hugepage memory granularity is 1GB */
+#define VIRTCCA_SHARED_HUGEPAGE_ALIGN 0x40000000ULL
+
+/* The GPA starting address of the VirtCCA CVM is 1GB */
+#define VIRTCCA_GPA_START 0x40000000ULL
+
+extern uint64_t virtcca_cvm_ram_size;
+
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
diff --git a/system/physmem.c b/system/physmem.c
index 250f315bc8..8f4be2d131 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -89,9 +89,17 @@ RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks) };
static MemoryRegion *system_memory;
static MemoryRegion *system_io;
+static MemoryRegion *virtcca_shared_memory;
+
+/*
+ * Serves as the sub-MR of the root MR (virtcca_shared_memory)
+ * and is associated with the RAMBlock.
+ */
+MemoryRegion *virtcca_shared_hugepage;
AddressSpace address_space_io;
AddressSpace address_space_memory;
+AddressSpace address_space_virtcca_shared_memory;
static MemoryRegion io_mem_unassigned;
@@ -2586,6 +2594,15 @@ static void memory_map_init(void)
address_space_init(&address_space_io, system_io, "I/O");
}
+void virtcca_shared_memory_address_space_init(void)
+{
+ virtcca_shared_memory = g_malloc(sizeof(*virtcca_shared_memory));
+ memory_region_init(virtcca_shared_memory, NULL,
+ "virtcca_shared_memory", UINT64_MAX);
+ address_space_init(&address_space_virtcca_shared_memory,
+ virtcca_shared_memory, "virtcca_shared_memory");
+}
+
MemoryRegion *get_system_memory(void)
{
return system_memory;
diff --git a/system/vl.c b/system/vl.c
index a1e5e68773..7c10cd1337 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -3784,6 +3784,15 @@ void qemu_init(int argc, char **argv)
configure_accelerators(argv[0]);
phase_advance(PHASE_ACCEL_CREATED);
+ /*
+ * Must run after kvm_init completes, as virtcca_cvm_enabled()
+ * depends on initialization performed in kvm_init.
+ */
+ if (virtcca_cvm_enabled()) {
+ virtcca_cvm_ram_size = current_machine->ram_size;
+ virtcca_shared_memory_address_space_init();
+ }
+
/*
* Beware, QOM objects created before this point miss global and
* compat properties.
--
2.41.0.windows.1
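The three VirtCCA macros added at the top of this patch can be sanity-checked with a little arithmetic. The following is a minimal sketch, not code from the patch: the helper names `sketch_align_up_1g` and `sketch_gpa_in_bounce_window` are hypothetical, and it assumes the "1GB to 4GB" range in the comment is half-open, i.e. [1GB, 4GB).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied verbatim from the patch. */
#define VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE 0xc0000000ULL /* 3GB */
#define VIRTCCA_SHARED_HUGEPAGE_ALIGN    0x40000000ULL /* 1GB */
#define VIRTCCA_GPA_START                0x40000000ULL /* 1GB */

/* Hypothetical helper: round a size up to the 1GB hugepage granularity. */
static uint64_t sketch_align_up_1g(uint64_t size)
{
    return (size + VIRTCCA_SHARED_HUGEPAGE_ALIGN - 1) &
           ~(VIRTCCA_SHARED_HUGEPAGE_ALIGN - 1);
}

/*
 * Hypothetical helper: does gpa fall inside the bounce buffer window?
 * Assumes the window is [VIRTCCA_GPA_START, VIRTCCA_GPA_START + MAX_SIZE).
 */
static bool sketch_gpa_in_bounce_window(uint64_t gpa)
{
    return gpa >= VIRTCCA_GPA_START &&
           gpa - VIRTCCA_GPA_START < VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE;
}
```

Starting at 1GB and spanning at most 3GB, the window ends exactly at the 4GB boundary, which matches the comment on `VIRTCCA_SHARED_HUGEPAGE_MAX_SIZE`.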

@ -0,0 +1,62 @@
From 84321dcfb4ec3d08984e7680c8efad80907bde84 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
Date: Mon, 29 Jul 2024 15:44:13 +0100
Subject: [PATCH] contrib/plugins: add compat for g_memdup2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We were premature in bumping this because some of our builds are still
on older glibs. Just copy the compat handler for now and we can remove
it later.
Fixes: ee293103b0 (plugins: update lockstep to use g_memdup2)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2161
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240729144414.830369-14-alex.bennee@linaro.org>
(cherry picked from commit 44e794896759236885f6d30d1f6b9b8b76355d52)
Signed-off-by: zhujun2 <zhujun2_yewu@cmss.chinamobile.com>
---
contrib/plugins/lockstep.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/contrib/plugins/lockstep.c b/contrib/plugins/lockstep.c
index 237543b43a..0c6f060183 100644
--- a/contrib/plugins/lockstep.c
+++ b/contrib/plugins/lockstep.c
@@ -100,6 +100,31 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
plugin_cleanup(id);
}
+/*
+ * g_memdup has been deprecated in Glib since 2.68 and
+ * will complain about it if you try to use it. However until
+ * glib_req_ver for QEMU is bumped we make a copy of the glib-compat
+ * handler.
+ */
+static inline gpointer g_memdup2_qemu(gconstpointer mem, gsize byte_size)
+{
+#if GLIB_CHECK_VERSION(2, 68, 0)
+ return g_memdup2(mem, byte_size);
+#else
+ gpointer new_mem;
+
+ if (mem && byte_size != 0) {
+ new_mem = g_malloc(byte_size);
+ memcpy(new_mem, mem, byte_size);
+ } else {
+ new_mem = NULL;
+ }
+
+ return new_mem;
+#endif
+}
+#define g_memdup2(m, s) g_memdup2_qemu(m, s)
+
static void report_divergance(ExecState *us, ExecState *them)
{
DivergeState divrec = { log, 0 };
--
2.41.0.windows.1
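The fallback branch of the compat wrapper above is a plain duplicate-or-NULL routine. The following is a minimal sketch of that branch outside GLib: `sketch_memdup2` is a hypothetical name, and it uses `malloc` where the patch uses `g_malloc`, so unlike the original it can return NULL on allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the #else branch of g_memdup2_qemu():
 * copy byte_size bytes from mem, or return NULL when there is
 * nothing to copy. The size is a size_t, which is the point of
 * g_memdup2 over the older g_memdup and its unsigned int length.
 */
static void *sketch_memdup2(const void *mem, size_t byte_size)
{
    void *new_mem = NULL;

    if (mem && byte_size != 0) {
        new_mem = malloc(byte_size);
        if (new_mem) {
            memcpy(new_mem, mem, byte_size);
        }
    }

    return new_mem;
}
```

The caller owns the returned buffer and releases it with `free()` in this sketch (with `g_free()` in the GLib version).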

@ -0,0 +1,37 @@
From c5b349f9ff0792cce72cdd1ade2521c568058a25 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 14:20:56 -0500
Subject: [PATCH] cpu: ensure we don't call start_exclusive from cpu_exec
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 779f30a01af8566780cefc8639505b758950afb3
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241025175857.2554252-3-pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
cpu-common.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/cpu-common.c b/cpu-common.c
index 54e63b3f77..a949ad7ca3 100644
--- a/cpu-common.c
+++ b/cpu-common.c
@@ -234,6 +234,9 @@ void start_exclusive(void)
CPUState *other_cpu;
int running_cpus;
+ /* Ensure we are not running, or start_exclusive will be blocked. */
+ g_assert(!current_cpu->running);
+
if (current_cpu->exclusive_context_count) {
current_cpu->exclusive_context_count++;
return;
--
2.41.0.windows.1

@ -0,0 +1,44 @@
From 0029172c2c57c18d6aef61070c2471f40de6bb45 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Wed, 30 Oct 2024 10:08:12 +0000
Subject: [PATCH] crypto: fix error check on gcry_md_open
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Gcrypt does not return negative values on error, it returns non-zero
values. This caused QEMU not to detect failure to open an unsupported
hash, resulting in a later crash trying to use a NULL context.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: cheliequan <cheliequan@inspur.com>
---
crypto/hash-gcrypt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/crypto/hash-gcrypt.c b/crypto/hash-gcrypt.c
index d3bdfe5633..bf5d7ff9ba 100644
--- a/crypto/hash-gcrypt.c
+++ b/crypto/hash-gcrypt.c
@@ -56,7 +56,7 @@ qcrypto_gcrypt_hash_bytesv(QCryptoHashAlgorithm alg,
size_t *resultlen,
Error **errp)
{
- int i, ret;
+ gcry_error_t ret;
gcry_md_hd_t md;
unsigned char *digest;
@@ -69,7 +69,7 @@ qcrypto_gcrypt_hash_bytesv(QCryptoHashAlgorithm alg,
ret = gcry_md_open(&md, qcrypto_hash_alg_map[alg], 0);
- if (ret < 0) {
+ if (ret != 0) {
error_setg(errp,
"Unable to initialize hash algorithm: %s",
gcry_strerror(ret));
--
2.41.0.windows.1
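The bug fixed above comes down to testing a non-zero error code with `< 0`. The following is a minimal sketch of the two checks without libgcrypt: `sketch_error_t` is a hypothetical stand-in for `gcry_error_t` (assumed here to be an unsigned integer type), and the value 5 is an arbitrary non-zero code, not a real gcrypt error number.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for gcry_error_t; assumed to be unsigned. */
typedef unsigned int sketch_error_t;

/* Pretend "open" call that fails with a small non-zero error code. */
static sketch_error_t sketch_open_unsupported(void)
{
    return 5;
}

/* Old check: the result was stored in an int and tested for negative. */
static bool sketch_failed_old(sketch_error_t err)
{
    int ret = (int)err;
    return ret < 0;
}

/* New check: any non-zero value is a failure. */
static bool sketch_failed_new(sketch_error_t err)
{
    return err != 0;
}
```

With a positive error code the old check reports success, which is how the failed open went unnoticed until the NULL context was used.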

@ -0,0 +1,48 @@
From 17d589becc1a66934e55a4e2efffdd3876d56130 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= <berrange@redhat.com>
Date: Wed, 30 Oct 2024 10:09:30 +0000
Subject: [PATCH] crypto: perform runtime check for hash/hmac support in gcrypt
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
gcrypt has the ability to dynamically disable hash/hmac algorithms
at runtime, so QEMU must perform a runtime check.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: cheliequan <cheliequan@inspur.com>
---
crypto/hash-gcrypt.c | 2 +-
crypto/hmac-gcrypt.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/crypto/hash-gcrypt.c b/crypto/hash-gcrypt.c
index d3bdfe5633..2b6dbd97bb 100644
--- a/crypto/hash-gcrypt.c
+++ b/crypto/hash-gcrypt.c
@@ -42,7 +42,7 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg)
{
if (alg < G_N_ELEMENTS(qcrypto_hash_alg_map) &&
qcrypto_hash_alg_map[alg] != GCRY_MD_NONE) {
- return true;
+ return gcry_md_test_algo(qcrypto_hash_alg_map[alg]) == 0;
}
return false;
}
diff --git a/crypto/hmac-gcrypt.c b/crypto/hmac-gcrypt.c
index 888afb86ed..15926fccfa 100644
--- a/crypto/hmac-gcrypt.c
+++ b/crypto/hmac-gcrypt.c
@@ -40,7 +40,7 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg)
{
if (alg < G_N_ELEMENTS(qcrypto_hmac_alg_map) &&
qcrypto_hmac_alg_map[alg] != GCRY_MAC_NONE) {
- return true;
+ return gcry_mac_test_algo(qcrypto_hmac_alg_map[alg]) == 0;
}
return false;
--
2.41.0.windows.1

@ -0,0 +1,52 @@
From ca3f4fd234ea4b8f02a415b99b449e71d028c076 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Tue, 8 Apr 2025 07:27:47 -0400
Subject: [PATCH] cryptodev: Fix error handling in
cryptodev_lkcf_execute_task()
cherry-pick from 1c89dfefc4c33295126208225f202f39b5a234c3
When cryptodev_lkcf_set_op_desc() fails, we report an error, but
continue anyway. This is wrong. We then pass a non-null @local_error
to various functions, which could easily fail error_setv()'s assertion
on failure.
Fail the function instead.
When qcrypto_akcipher_new() fails, we fail the function without
reporting the error. This leaks the Error object.
Add the missing error reporting. This also frees the Error object.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250312101131.1615777-1-armbru@redhat.com>
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
backends/cryptodev-lkcf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/backends/cryptodev-lkcf.c b/backends/cryptodev-lkcf.c
index 45aba1ff67..45b287a953 100644
--- a/backends/cryptodev-lkcf.c
+++ b/backends/cryptodev-lkcf.c
@@ -330,6 +330,8 @@ static void cryptodev_lkcf_execute_task(CryptoDevLKCFTask *task)
cryptodev_lkcf_set_op_desc(&session->akcipher_opts, op_desc,
sizeof(op_desc), &local_error) != 0) {
error_report_err(local_error);
+ status = -VIRTIO_CRYPTO_ERR;
+ goto out;
} else {
key_id = add_key(KCTL_KEY_TYPE_PKEY, "lkcf-backend-priv-key",
p8info, p8info_len, KCTL_KEY_RING);
@@ -346,6 +348,7 @@ static void cryptodev_lkcf_execute_task(CryptoDevLKCFTask *task)
session->key, session->keylen,
&local_error);
if (!akcipher) {
+ error_report_err(local_error);
status = -VIRTIO_CRYPTO_ERR;
goto out;
}
--
2.41.0.windows.1

@ -0,0 +1,119 @@
From 2753607e8768002debb4608dacafe1309420a4dd Mon Sep 17 00:00:00 2001
From: Tao Su <tao1.su@linux.intel.com>
Date: Tue, 21 Jan 2025 10:06:50 +0800
Subject: [PATCH] docs: Add GNR, SRF and CWF CPU models
commit 0a6dec6d11e5e392dcd6299548bf1514f1201707 upstream.
Update GraniteRapids, SierraForest and ClearwaterForest CPU models in
section "Preferred CPU models for Intel x86 hosts".
Also introduce bhi-no, gds-no and rfds-no in doc.
Intel-SIG: commit 0a6dec6d11e5 docs: Add GNR, SRF and CWF CPU models.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121020650.1899618-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[ Quanxian Wang: amend commit log ]
Signed-off-by: Quanxian Wang <quanxian.wang@intel.com>
---
docs/system/cpu-models-x86.rst.inc | 50 +++++++++++++++++++++++++++---
1 file changed, 46 insertions(+), 4 deletions(-)
diff --git a/docs/system/cpu-models-x86.rst.inc b/docs/system/cpu-models-x86.rst.inc
index 7f6368f999..37fe1d0ac8 100644
--- a/docs/system/cpu-models-x86.rst.inc
+++ b/docs/system/cpu-models-x86.rst.inc
@@ -71,6 +71,16 @@ mixture of host CPU models between machines, if live migration
compatibility is required, use the newest CPU model that is compatible
across all desired hosts.
+``ClearwaterForest``
+ Intel Xeon Processor (ClearwaterForest, 2025)
+
+``SierraForest``, ``SierraForest-v2``
+ Intel Xeon Processor (SierraForest, 2024). ``SierraForest-v2`` mitigates
+ the GDS and RFDS vulnerabilities with stepping 3.
+
+``GraniteRapids``, ``GraniteRapids-v2``
+ Intel Xeon Processor (GraniteRapids, 2024)
+
``Cascadelake-Server``, ``Cascadelake-Server-noTSX``
Intel Xeon Processor (Cascade Lake, 2019), with "stepping" levels 6
or 7 only. (The Cascade Lake Xeon processor with *stepping 5 is
@@ -181,7 +191,7 @@ features are included if using "Host passthrough" or "Host model".
CVE-2018-12127, [MSBDS] CVE-2018-12126).
This is an MSR (Model-Specific Register) feature rather than a CPUID feature,
- so it will not appear in the Linux ``/proc/cpuinfo`` in the host or
+ therefore it will not appear in the Linux ``/proc/cpuinfo`` in the host or
guest. Instead, the host kernel uses it to populate the MDS
vulnerability file in ``sysfs``.
@@ -189,10 +199,10 @@ features are included if using "Host passthrough" or "Host model".
affected} in the ``/sys/devices/system/cpu/vulnerabilities/mds`` file.
``taa-no``
- Recommended to inform that the guest that the host is ``not``
+ Recommended to inform the guest that the host is ``not``
vulnerable to CVE-2019-11135, TSX Asynchronous Abort (TAA).
- This too is an MSR feature, so it does not show up in the Linux
+ This is also an MSR feature, therefore it does not show up in the Linux
``/proc/cpuinfo`` in the host or guest.
It should only be enabled for VMs if the host reports ``Not affected``
@@ -214,7 +224,7 @@ features are included if using "Host passthrough" or "Host model".
By disabling TSX, KVM-based guests can avoid paying the price of
mitigating TSX-based attacks.
- Note that ``tsx-ctrl`` too is an MSR feature, so it does not show
+ Note that ``tsx-ctrl`` is also an MSR feature, therefore it does not show
up in the Linux ``/proc/cpuinfo`` in the host or guest.
To validate that Intel TSX is indeed disabled for the guest, there are
@@ -223,6 +233,38 @@ features are included if using "Host passthrough" or "Host model".
``/sys/devices/system/cpu/vulnerabilities/tsx_async_abort`` file in
the guest should report ``Mitigation: TSX disabled``.
+``bhi-no``
+ Recommended to inform the guest that the host is ``not``
+ vulnerable to CVE-2022-0001, Branch History Injection (BHI).
+
+ This is also an MSR feature, therefore it does not show up in the Linux
+ ``/proc/cpuinfo`` in the host or guest.
+
+ It should only be enabled for VMs if the host reports
+ ``BHI: Not affected`` in the
+ ``/sys/devices/system/cpu/vulnerabilities/spectre_v2`` file.
+
+``gds-no``
+ Recommended to inform the guest that the host is ``not``
+ vulnerable to CVE-2022-40982, Gather Data Sampling (GDS).
+
+ This is also an MSR feature, therefore it does not show up in the Linux
+ ``/proc/cpuinfo`` in the host or guest.
+
+ It should only be enabled for VMs if the host reports ``Not affected``
+ in the ``/sys/devices/system/cpu/vulnerabilities/gather_data_sampling``
+ file.
+
+``rfds-no``
+ Recommended to inform the guest that the host is ``not``
+ vulnerable to CVE-2023-28746, Register File Data Sampling (RFDS).
+
+ This is also an MSR feature, therefore it does not show up in the Linux
+ ``/proc/cpuinfo`` in the host or guest.
+
+ It should only be enabled for VMs if the host reports ``Not affected``
+ in the ``/sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling``
+ file.
Preferred CPU models for AMD x86 hosts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
2.41.0.windows.1

@ -0,0 +1,220 @@
From fd1d6d64803a052adcab8c7993ca40cabc9c926d Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Sat, 11 Jan 2025 10:53:03 +0800
Subject: [PATCH] docs/devel: Add VFIO iommufd backend documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
MAINTAINERS | 1 +
docs/devel/index-internals.rst | 1 +
docs/devel/vfio-iommufd.rst | 166 +++++++++++++++++++++++++++++++++
3 files changed, 168 insertions(+)
create mode 100644 docs/devel/vfio-iommufd.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index ca70bb4e64..0ddb20a35f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
F: include/sysemu/iommufd.h
F: include/qemu/chardev_open.h
F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst
vhost
M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 6f81df92bc..3def4a138b 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add features to them.
s390-dasd-ipl
tracing
vfio-migration
+ vfio-iommufd
writing-monitor-commands
virtio-backends
diff --git a/docs/devel/vfio-iommufd.rst b/docs/devel/vfio-iommufd.rst
new file mode 100644
index 0000000000..3d1c11f175
--- /dev/null
+++ b/docs/devel/vfio-iommufd.rst
@@ -0,0 +1,166 @@
+===============================
+IOMMUFD BACKEND usage with VFIO
+===============================
+
+(Same meaning for backend/container/BE)
+
+With the introduction of iommufd, the Linux kernel provides a generic
+interface for user space drivers to propagate their DMA mappings to kernel
+for assigned devices. While the legacy kernel interface is group-centric,
+the new iommufd interface is device-centric, relying on device fd and iommufd.
+
+To support both interfaces in the QEMU VFIO device, a base container abstracts
+the common parts of the VFIO legacy and iommufd containers, so that the
+generic VFIO code can use either container.
+
+The base container implements generic functions such as memory_listener and
+address space management whereas the derived container implements callbacks
+specific to either legacy or iommufd. Each container has its own way to set up
+the secure context and the DMA management interface. The diagram below shows
+how the two containers fit together.
+
+::
+
+ VFIO AddressSpace/Memory
+ +-------+ +----------+ +-----+ +-----+
+ | pci | | platform | | ap | | ccw |
+ +---+---+ +----+-----+ +--+--+ +--+--+ +----------------------+
+ | | | | | AddressSpace |
+ | | | | +------------+---------+
+ +---V-----------V-----------V--------V----+ /
+ | VFIOAddressSpace | <------------+
+ | | | MemoryListener
+ | VFIOContainerBase list |
+ +-------+----------------------------+----+
+ | |
+ | |
+ +-------V------+ +--------V----------+
+ | iommufd | | vfio legacy |
+ | container | | container |
+ +-------+------+ +--------+----------+
+ | |
+ | /dev/iommu | /dev/vfio/vfio
+ | /dev/vfio/devices/vfioX | /dev/vfio/$group_id
+ Userspace | |
+ ============+============================+===========================
+ Kernel | device fd |
+ +---------------+ | group/container fd
+ | (BIND_IOMMUFD | | (SET_CONTAINER/SET_IOMMU)
+ | ATTACH_IOAS) | | device fd
+ | | |
+ | +-------V------------V-----------------+
+ iommufd | | vfio |
+ (map/unmap | +---------+--------------------+-------+
+ ioas_copy) | | | map/unmap
+ | | |
+ +------V------+ +-----V------+ +------V--------+
+ | iommfd core | | device | | vfio iommu |
+ +-------------+ +------------+ +---------------+
+
+* Secure Context setup
+
+ - iommufd BE: uses device fd and iommufd to set up the secure context
+ (bind_iommufd, attach_ioas)
+ - vfio legacy BE: uses group fd and container fd to set up the secure context
+ (set_container, set_iommu)
+
+* Device access
+
+ - iommufd BE: device fd is opened through ``/dev/vfio/devices/vfioX``
+ - vfio legacy BE: device fd is retrieved from group fd ioctl
+
+* DMA Mapping flow
+
+ 1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
+ 2. VFIO populates DMA map/unmap via the container BEs
+ * iommufd BE: uses iommufd
+ * vfio legacy BE: uses container fd
+
+Example configuration
+=====================
+
+Step 1: configure the host device
+---------------------------------
+
+This is exactly the same as for a VFIO device with the legacy VFIO container.
+
+Step 2: configure QEMU
+----------------------
+
+Interactions with ``/dev/iommu`` are abstracted by a new iommufd
+object (compiled in with the ``CONFIG_IOMMUFD`` option).
+
+Any QEMU device (e.g. a VFIO device) wishing to use ``/dev/iommu`` must
+be linked with an iommufd object. Such a device gets a new optional property
+named ``iommufd`` through which an iommufd object is passed. Take the
+``vfio-pci`` device for example:
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0
+ -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
+
+Note that ``/dev/iommu`` and the VFIO cdev can be opened externally by a
+management layer. In that case the fd is passed in; the ``fd`` property
+accepts either a string naming the fd or a number, for example:
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0,fd=22
+ -device vfio-pci,iommufd=iommufd0,fd=23
+
+If the ``fd`` property is not passed, the fd is opened by QEMU.
+
+If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
+is not used and the user gets the behavior based on the legacy VFIO
+container:
+
+.. code-block:: bash
+
+ -device vfio-pci,host=0000:02:00.0
+
+Supported platform
+==================
+
+x86, ARM and s390x are currently supported.
+
+Caveats
+=======
+
+Dirty page sync
+---------------
+
+Dirty page sync with the iommufd backend is not supported yet, so live
+migration is disabled by default. It can be force-enabled as shown below,
+although it is inefficient.
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0
+ -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on
+
+P2P DMA
+-------
+
+PCI p2p DMA is unsupported because IOMMUFD does not support mapping hardware
+PCI BAR regions yet. The warning below is shown for an assigned PCI device; it is not a bug.
+
+.. code-block:: none
+
+ qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
+ qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)
+
+FD passing with mdev
+--------------------
+
+The ``vfio-pci`` device checks the sysfsdev property to decide whether the
+backend is an mdev. If FD passing is used, there is no way to know that, and the
+mdev is treated like a real PCI device. The error below is reported if the user
+tries to enable RAM discarding for the mdev.
+
+.. code-block:: none
+
+ qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices
+
+``vfio-ap`` and ``vfio-ccw`` devices do not have this issue because their
+backend devices are always mdevs and RAM discarding is force-enabled.
--
2.41.0.windows.1

@ -0,0 +1,64 @@
From b93ac4e4fd07e36b95ce211faefd0c7912b6f62a Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Tue, 3 Dec 2024 13:18:06 +0000
Subject: [PATCH] fw_cfg: Don't set callback_opaque NULL in
fw_cfg_modify_bytes_read()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
On arm/virt platform, Chen Xiang reported a Guest crash while
attempting the below steps,
1. Launch the Guest with nvdimm=on
2. Hot-add a NVDIMM dev
3. Reboot
4. Guest boots fine.
5. Reboot again.
6. Guest boot fails.
QEMU_EFI reports the below error:
ProcessCmdAddPointer: invalid pointer value in "etc/acpi/tables"
OnRootBridgesConnected: InstallAcpiTables: Protocol Error
Debugging shows that on the first reboot (after hot-adding the NVDIMM),
QEMU updates the etc/table-loader len,
qemu_ram_resize()
  fw_cfg_modify_file()
     fw_cfg_modify_bytes_read()
And in fw_cfg_modify_bytes_read() we set the "callback_opaque" for
the key entry to NULL. Because of this, on the second reboot,
virt_acpi_build_update() is called with a NULL "build_state" and
returns without updating the ACPI tables. This seems to be
upsetting the firmware.
To fix this, don't change the callback_opaque in fw_cfg_modify_bytes_read().
Fixes: bdbb5b1706d165 ("fw_cfg: add fw_cfg_machine_reset function")
Reported-by: chenxiang <chenxiang66@hisilicon.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Message-ID: <20241203131806.37548-1-shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
hw/nvram/fw_cfg.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 4e4524673a..d32079ebdf 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -729,7 +729,6 @@ static void *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key,
ptr = s->entries[arch][key].data;
s->entries[arch][key].data = data;
s->entries[arch][key].len = len;
- s->entries[arch][key].callback_opaque = NULL;
s->entries[arch][key].allow_write = false;
return ptr;
--
2.41.0.windows.1
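The effect of the one-line removal above can be illustrated with a stripped-down entry structure. The following is a minimal sketch, not the fw_cfg code itself: `SketchEntry` and `sketch_modify_bytes_read` are hypothetical names that keep only the four fields touched in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced version of a fw_cfg entry. */
typedef struct {
    void *data;
    size_t len;
    void *callback_opaque; /* state handed to the read callback */
    bool allow_write;
} SketchEntry;

/*
 * Swap in new data for an entry and return the old data pointer.
 * callback_opaque is deliberately left untouched, so a later reset
 * can still reach the state it needs to rebuild the contents.
 */
static void *sketch_modify_bytes_read(SketchEntry *e, void *data, size_t len)
{
    void *old = e->data;

    e->data = data;
    e->len = len;
    e->allow_write = false;

    return old;
}
```

Before the fix, the equivalent of `e->callback_opaque = NULL` ran here, which is why the second reboot saw a NULL `build_state`.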

@ -0,0 +1,57 @@
From b1087bb8a4edbacc7240c0fcab63bc1cf2624627 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Tue, 21 Jan 2025 14:42:45 +0000
Subject: [PATCH] gpex-acpi: Remove duplicate DSM #5
It looks like acpi_dsdt_add_pci_osc() already builds the _DSM
for virt/gpex case, and we don't need to add duplicate DSM methods
for _DSM #5 case.
And the acpi_dsdt_add_pci_osc() already adds _DSM #5 when
preserve_config is true.
This is to get rid of the ACPI related error messages during boot:
ACPI BIOS Error (bug): Failure creating named object [\_SB.PC08._DSM], AE_ALREADY_EXISTS
ACPI BIOS Error (bug): \_SB.PC08.PCI0._DSM: Excess arguments - ASL declared 5, ACPI requires 4
ToDo: Only sanity tested.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/pci-host/gpex-acpi.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/hw/pci-host/gpex-acpi.c b/hw/pci-host/gpex-acpi.c
index ce424fc9da..162f6221ab 100644
--- a/hw/pci-host/gpex-acpi.c
+++ b/hw/pci-host/gpex-acpi.c
@@ -189,12 +189,6 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
}
- if (cfg->preserve_config) {
- method = aml_method("_DSM", 5, AML_SERIALIZED);
- aml_append(method, aml_return(aml_int(0)));
- aml_append(dev, method);
- }
-
acpi_dsdt_add_pci_route_table(dev, cfg->irq);
/*
@@ -226,12 +220,6 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
- if (cfg->preserve_config) {
- method = aml_method("_DSM", 5, AML_SERIALIZED);
- aml_append(method, aml_return(aml_int(0)));
- aml_append(dev, method);
- }
-
acpi_dsdt_add_pci_route_table(dev, cfg->irq);
method = aml_method("_CBA", 0, AML_NOTSERIALIZED);
--
2.41.0.windows.1

@ -0,0 +1,57 @@
From 885c1bf512582757f9d7e2e360701f72a9d6e95f Mon Sep 17 00:00:00 2001
From: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
Date: Thu, 12 Dec 2024 11:27:23 +0800
Subject: [PATCH] hvf: remove unused but set variable
cherry-pick from 19d542cc0bce0b3641e80444374f9ffd8294a15b
Fixes the associated warning when building on macOS.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Link: https://lore.kernel.org/r/20241023182922.1040964-1-pierrick.bouvier@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
---
target/i386/hvf/x86_task.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/target/i386/hvf/x86_task.c b/target/i386/hvf/x86_task.c
index f09bfbdda5..cdea2ea69d 100644
--- a/target/i386/hvf/x86_task.c
+++ b/target/i386/hvf/x86_task.c
@@ -122,7 +122,6 @@ void vmx_handle_task_switch(CPUState *cpu, x68_segment_selector tss_sel, int rea
load_regs(cpu);
struct x86_segment_descriptor curr_tss_desc, next_tss_desc;
- int ret;
x68_segment_selector old_tss_sel = vmx_read_segment_selector(cpu, R_TR);
uint64_t old_tss_base = vmx_read_segment_base(cpu, R_TR);
uint32_t desc_limit;
@@ -138,7 +137,7 @@ void vmx_handle_task_switch(CPUState *cpu, x68_segment_selector tss_sel, int rea
if (reason == TSR_IDT_GATE && gate_valid) {
int dpl;
- ret = x86_read_call_gate(cpu, &task_gate_desc, gate);
+ x86_read_call_gate(cpu, &task_gate_desc, gate);
dpl = task_gate_desc.dpl;
x68_segment_selector cs = vmx_read_segment_selector(cpu, R_CS);
@@ -167,11 +166,12 @@ void vmx_handle_task_switch(CPUState *cpu, x68_segment_selector tss_sel, int rea
x86_write_segment_descriptor(cpu, &next_tss_desc, tss_sel);
}
- if (next_tss_desc.type & 8)
- ret = task_switch_32(cpu, tss_sel, old_tss_sel, old_tss_base, &next_tss_desc);
- else
+ if (next_tss_desc.type & 8) {
+ task_switch_32(cpu, tss_sel, old_tss_sel, old_tss_base, &next_tss_desc);
+ } else {
//ret = task_switch_16(cpu, tss_sel, old_tss_sel, old_tss_base, &next_tss_desc);
VM_PANIC("task_switch_16");
+ }
macvm_set_cr0(cpu->accel->fd, rvmcs(cpu->accel->fd, VMCS_GUEST_CR0) |
CR0_TS_MASK);
--
2.41.0.windows.1

@ -0,0 +1,34 @@
From bcb031b40fe40d5b6347b2134fb039945b87e8a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Sat, 11 Jan 2025 10:52:55 +0800
Subject: [PATCH] hw/arm: Activate IOMMUFD for virt machines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
hw/arm/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index c0a7d0bd58..4a0ea0628f 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_VIRT
imply TPM_TIS_SYSBUS
imply TPM_TIS_I2C
imply NVDIMM
+ imply IOMMUFD
select ARM_GIC
select ACPI
select ARM_SMMUV3
--
2.41.0.windows.1

@ -0,0 +1,67 @@
From d589010512005bfc698f30417911e4b14478c81b Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 22 Jun 2022 01:30:39 -0700
Subject: [PATCH] hw/arm/smmu-common: Add a nested flag to SMMUState
Add a nested flag in the SMMUState, passed in from device property.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmu-common.c | 1 +
hw/arm/smmuv3.c | 5 +++++
include/hw/arm/smmu-common.h | 4 ++++
3 files changed, 10 insertions(+)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 9a8ac45431..c5f3e02065 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -683,6 +683,7 @@ static Property smmu_dev_properties[] = {
DEFINE_PROP_UINT8("bus_num", SMMUState, bus_num, 0),
DEFINE_PROP_LINK("primary-bus", SMMUState, primary_bus,
TYPE_PCI_BUS, PCIBus *),
+ DEFINE_PROP_BOOL("nested", SMMUState, nested, false),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c3871ae067..64ca4c5542 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1746,6 +1746,11 @@ static void smmu_realize(DeviceState *d, Error **errp)
SysBusDevice *dev = SYS_BUS_DEVICE(d);
Error *local_err = NULL;
+ if (s->stage && strcmp("1", s->stage)) {
+ /* Nested mode is only supported with a stage-1-only vSMMU */
+ sys->nested = false;
+ }
+
c->parent_realize(d, &local_err);
if (local_err) {
error_propagate(errp, local_err);
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index fd8d772da1..eae5d4d05b 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -22,6 +22,7 @@
#include "hw/sysbus.h"
#include "hw/pci/pci.h"
#include "qom/object.h"
+#include "sysemu/iommufd.h"
#define SMMU_PCI_BUS_MAX 256
#define SMMU_PCI_DEVFN_MAX 256
@@ -136,6 +137,9 @@ struct SMMUState {
const char *mrtypename;
MemoryRegion iomem;
+ /* Nested SMMU */
+ bool nested;
+
GHashTable *smmu_pcibus_by_busptr;
GHashTable *configs; /* cache for configuration data */
GHashTable *iotlb;
--
2.41.0.windows.1
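The check added to `smmu_realize()` above clears the nested flag whenever a stage other than "1" is configured. The following is a minimal sketch of that decision as a pure function: `sketch_effective_nested` is a hypothetical name, and the stage strings used in the usage note are illustrative rather than a list of accepted property values.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the realize-time check: the requested
 * nested flag survives only if no stage is set or the stage is "1".
 */
static bool sketch_effective_nested(bool nested, const char *stage)
{
    if (stage && strcmp("1", stage)) {
        /* Any explicit stage other than "1" disables nesting. */
        return false;
    }
    return nested;
}
```

For example, `sketch_effective_nested(true, "2")` yields false, while an unset stage (NULL) leaves the requested value unchanged.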

@ -0,0 +1,179 @@
From a2735cd15160a62065a0a0b39af405c7b0f3fae8 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 22 Jun 2022 14:41:27 -0700
Subject: [PATCH] hw/arm/smmu-common: Add iommufd helpers
Add a set of helper functions for IOMMUFD and new "struct SMMUS1Hwpt"
to store the nested hwpt information.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmu-common.c | 108 +++++++++++++++++++++++++++++++++++
include/hw/arm/smmu-common.h | 20 +++++++
2 files changed, 128 insertions(+)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 038ae857d8..a79eb34277 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -838,6 +838,114 @@ IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid)
return NULL;
}
+/* IOMMUFD helpers */
+int smmu_dev_get_info(SMMUDevice *sdev, uint32_t *data_type,
+ uint32_t data_len, void *data)
+{
+ uint64_t caps;
+
+ if (!sdev || !sdev->idev) {
+ return -ENOENT;
+ }
+
+ return !iommufd_backend_get_device_info(sdev->idev->iommufd,
+ sdev->idev->devid, data_type, data,
+ data_len, &caps, NULL);
+}
+
+void smmu_dev_uninstall_nested_ste(SMMUDevice *sdev, bool abort)
+{
+ HostIOMMUDeviceIOMMUFD *idev = sdev->idev;
+ SMMUS1Hwpt *s1_hwpt = sdev->s1_hwpt;
+ uint32_t hwpt_id;
+
+ if (!s1_hwpt || !sdev->viommu) {
+ return;
+ }
+
+ if (abort) {
+ hwpt_id = sdev->viommu->abort_hwpt_id;
+ } else {
+ hwpt_id = sdev->viommu->bypass_hwpt_id;
+ }
+
+ if (!host_iommu_device_iommufd_attach_hwpt(idev, hwpt_id, NULL)) {
+ return;
+ }
+
+ iommufd_backend_free_id(idev->iommufd, s1_hwpt->hwpt_id);
+ sdev->s1_hwpt = NULL;
+ g_free(s1_hwpt);
+}
+
+int smmu_dev_install_nested_ste(SMMUDevice *sdev, uint32_t data_type,
+ uint32_t data_len, void *data)
+{
+ SMMUViommu *viommu = sdev->viommu;
+ SMMUS1Hwpt *s1_hwpt = sdev->s1_hwpt;
+ HostIOMMUDeviceIOMMUFD *idev = sdev->idev;
+
+ if (!idev || !viommu) {
+ return -ENOENT;
+ }
+
+ if (s1_hwpt) {
+ smmu_dev_uninstall_nested_ste(sdev, false);
+ }
+
+ s1_hwpt = g_new0(SMMUS1Hwpt, 1);
+ if (!s1_hwpt) {
+ return -ENOMEM;
+ }
+
+ s1_hwpt->smmu = sdev->smmu;
+ s1_hwpt->viommu = viommu;
+ s1_hwpt->iommufd = idev->iommufd;
+
+ if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
+ viommu->core->viommu_id, 0, data_type,
+ data_len, data, &s1_hwpt->hwpt_id, NULL)) {
+ goto free;
+ }
+
+ if (!host_iommu_device_iommufd_attach_hwpt(idev, s1_hwpt->hwpt_id, NULL)) {
+ goto free_hwpt;
+ }
+
+ sdev->s1_hwpt = s1_hwpt;
+
+ return 0;
+free_hwpt:
+ iommufd_backend_free_id(idev->iommufd, s1_hwpt->hwpt_id);
+free:
+ sdev->s1_hwpt = NULL;
+ g_free(s1_hwpt);
+
+ return -EINVAL;
+}
+
+int smmu_hwpt_invalidate_cache(SMMUS1Hwpt *s1_hwpt, uint32_t type, uint32_t len,
+ uint32_t *num, void *reqs)
+{
+ if (!s1_hwpt) {
+ return -ENOENT;
+ }
+
+ return iommufd_backend_invalidate_cache(s1_hwpt->iommufd, s1_hwpt->hwpt_id,
+ type, len, num, reqs);
+}
+
+int smmu_viommu_invalidate_cache(IOMMUFDViommu *viommu, uint32_t type,
+ uint32_t len, uint32_t *num, void *reqs)
+{
+ if (!viommu) {
+ return -ENOENT;
+ }
+
+ return iommufd_viommu_invalidate_cache(viommu->iommufd, viommu->viommu_id,
+ type, len, num, reqs);
+}
+
/* Unmap all notifiers attached to @mr */
static void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
{
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 3bfb68cef6..66dc7206ea 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -125,6 +125,15 @@ typedef struct SMMUViommu {
QLIST_ENTRY(SMMUViommu) next;
} SMMUViommu;
+typedef struct SMMUS1Hwpt {
+ void *smmu;
+ IOMMUFDBackend *iommufd;
+ SMMUViommu *viommu;
+ uint32_t hwpt_id;
+ QLIST_HEAD(, SMMUDevice) device_list;
+ QLIST_ENTRY(SMMUViommu) next;
+} SMMUS1Hwpt;
+
typedef struct SMMUDevice {
void *smmu;
PCIBus *bus;
@@ -132,6 +141,7 @@ typedef struct SMMUDevice {
IOMMUMemoryRegion iommu;
HostIOMMUDeviceIOMMUFD *idev;
SMMUViommu *viommu;
+ SMMUS1Hwpt *s1_hwpt;
AddressSpace as;
uint32_t cfg_cache_hits;
uint32_t cfg_cache_misses;
@@ -225,4 +235,14 @@ void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova,
/* Unmap the range of all the notifiers registered to any IOMMU mr */
void smmu_inv_notifiers_all(SMMUState *s);
+/* IOMMUFD helpers */
+int smmu_dev_get_info(SMMUDevice *sdev, uint32_t *data_type,
+ uint32_t data_len, void *data);
+void smmu_dev_uninstall_nested_ste(SMMUDevice *sdev, bool abort);
+int smmu_dev_install_nested_ste(SMMUDevice *sdev, uint32_t data_type,
+ uint32_t data_len, void *data);
+int smmu_hwpt_invalidate_cache(SMMUS1Hwpt *s1_hwpt, uint32_t type, uint32_t len,
+ uint32_t *num, void *reqs);
+int smmu_viommu_invalidate_cache(IOMMUFDViommu *viommu, uint32_t type,
+ uint32_t len, uint32_t *num, void *reqs);
#endif /* HW_ARM_SMMU_COMMON_H */
--
2.41.0.windows.1

@ -0,0 +1,283 @@
From 539e12641dc2db30a6fea7a0f061e163bc245d79 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 22 Jun 2022 02:16:52 -0700
Subject: [PATCH] hw/arm/smmu-common: Add set/unset_iommu_device callback
Implement a set_iommu_device callback:
- Find an existing S2 hwpt to test attach() or allocate a new one
(Devices behind the same physical SMMU should share an S2 HWPT.)
- Attach the device to the S2 hwpt and add it to its device list
And add an unset_iommu_device doing the opposite cleanup routine.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
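Note: the bypass and abort HWPTs in this patch are allocated from raw STE constants (0x9 and 0x1). The sketch below shows how those values appear to decompose, assuming the SMMUv3 STE layout in which V is bit 0 and Config is bits [3:1] of the first doubleword. The macro, enum, and function names are illustrative and do not come from the patch.

```c
#include <stdint.h>

/* Assumed STE doubleword-0 layout: V at bit 0, Config at bits [3:1]. */
#define STE0_V            (1ULL << 0)
#define STE0_CONFIG_SHIFT 1

/* Illustrative names; the patch itself uses raw constants. */
enum ste_config {
    STE_CFG_ABORT  = 0x0,  /* 0b000: terminate all traffic */
    STE_CFG_BYPASS = 0x4,  /* 0b100: stage-1 and stage-2 bypass */
};

/* Build the first 64 bits of a valid STE with the given Config field. */
static uint64_t ste_word0(enum ste_config cfg)
{
    return STE0_V | ((uint64_t)cfg << STE0_CONFIG_SHIFT);
}
```

With these definitions, ste_word0(STE_CFG_BYPASS) yields 0x9 and ste_word0(STE_CFG_ABORT) yields 0x1, matching bypass_data and abort_data in the diff.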
hw/arm/smmu-common.c | 177 +++++++++++++++++++++++++++++++++++
hw/arm/trace-events | 2 +
include/hw/arm/smmu-common.h | 21 +++++
3 files changed, 200 insertions(+)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 03d9ff58d4..038ae857d8 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -20,6 +20,7 @@
#include "trace.h"
#include "exec/target_page.h"
#include "hw/core/cpu.h"
+#include "hw/pci/pci_device.h"
#include "hw/qdev-properties.h"
#include "qapi/error.h"
#include "qemu/jhash.h"
@@ -639,8 +640,184 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
return &sdev->as;
}
+static bool smmu_dev_attach_viommu(SMMUDevice *sdev,
+ HostIOMMUDeviceIOMMUFD *idev, Error **errp)
+{
+ struct iommu_hwpt_arm_smmuv3 bypass_data = {
+ .ste = { 0x9ULL, 0x0ULL }, //0x1ULL << (108 - 64) },
+ };
+ struct iommu_hwpt_arm_smmuv3 abort_data = {
+ .ste = { 0x1ULL, 0x0ULL },
+ };
+ SMMUState *s = sdev->smmu;
+ SMMUS2Hwpt *s2_hwpt;
+ SMMUViommu *viommu;
+ uint32_t s2_hwpt_id;
+
+ if (s->viommu) {
+ return host_iommu_device_iommufd_attach_hwpt(
+ idev, s->viommu->s2_hwpt->hwpt_id, errp);
+ }
+
+ if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid, idev->ioas_id,
+ IOMMU_HWPT_ALLOC_NEST_PARENT,
+ IOMMU_HWPT_DATA_NONE, 0, NULL,
+ &s2_hwpt_id, errp)) {
+ error_setg(errp, "failed to allocate an S2 hwpt");
+ return false;
+ }
+
+ /* Attach to S2 for MSI cookie */
+ if (!host_iommu_device_iommufd_attach_hwpt(idev, s2_hwpt_id, errp)) {
+ error_setg(errp, "failed to attach stage-2 HW pagetable");
+ goto free_s2_hwpt;
+ }
+
+ viommu = g_new0(SMMUViommu, 1);
+
+ viommu->core = iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
+ IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
+ s2_hwpt_id);
+ if (!viommu->core) {
+ error_setg(errp, "failed to allocate a viommu");
+ goto free_viommu;
+ }
+
+ if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
+ viommu->core->viommu_id, 0,
+ IOMMU_HWPT_DATA_ARM_SMMUV3,
+ sizeof(abort_data), &abort_data,
+ &viommu->abort_hwpt_id, errp)) {
+ error_setg(errp, "failed to allocate an abort pagetable");
+ goto free_viommu_core;
+ }
+
+ if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
+ viommu->core->viommu_id, 0,
+ IOMMU_HWPT_DATA_ARM_SMMUV3,
+ sizeof(bypass_data), &bypass_data,
+ &viommu->bypass_hwpt_id, errp)) {
+ error_setg(errp, "failed to allocate a bypass pagetable");
+ goto free_abort_hwpt;
+ }
+
+ if (!host_iommu_device_iommufd_attach_hwpt(
+ idev, viommu->bypass_hwpt_id, errp)) {
+ error_setg(errp, "failed to attach the bypass pagetable");
+ goto free_bypass_hwpt;
+ }
+
+ s2_hwpt = g_new0(SMMUS2Hwpt, 1);
+ s2_hwpt->iommufd = idev->iommufd;
+ s2_hwpt->hwpt_id = s2_hwpt_id;
+ s2_hwpt->ioas_id = idev->ioas_id;
+
+ viommu->iommufd = idev->iommufd;
+ viommu->s2_hwpt = s2_hwpt;
+
+ s->viommu = viommu;
+ return true;
+
+free_bypass_hwpt:
+ iommufd_backend_free_id(idev->iommufd, viommu->bypass_hwpt_id);
+free_abort_hwpt:
+ iommufd_backend_free_id(idev->iommufd, viommu->abort_hwpt_id);
+free_viommu_core:
+ iommufd_backend_free_id(idev->iommufd, viommu->core->viommu_id);
+ g_free(viommu->core);
+free_viommu:
+ g_free(viommu);
+ host_iommu_device_iommufd_attach_hwpt(idev, sdev->idev->ioas_id, errp);
+free_s2_hwpt:
+ iommufd_backend_free_id(idev->iommufd, s2_hwpt_id);
+ return false;
+}
+
+static bool smmu_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+ HostIOMMUDevice *hiod, Error **errp)
+{
+ HostIOMMUDeviceIOMMUFD *idev = HOST_IOMMU_DEVICE_IOMMUFD(hiod);
+ SMMUState *s = opaque;
+ SMMUPciBus *sbus = smmu_get_sbus(s, bus);
+ SMMUDevice *sdev = smmu_get_sdev(s, sbus, bus, devfn);
+
+ if (!s->nested) {
+ return true;
+ }
+
+ if (sdev->idev) {
+ if (sdev->idev != idev) {
+ return false;//-EEXIST;
+ } else {
+ return true;
+ }
+ }
+
+ if (!idev) {
+ return true;
+ }
+
+ if (!smmu_dev_attach_viommu(sdev, idev, errp)) {
+ error_report("Unable to attach viommu");
+ return false;
+ }
+
+ sdev->idev = idev;
+ sdev->viommu = s->viommu;
+ QLIST_INSERT_HEAD(&s->viommu->device_list, sdev, next);
+ trace_smmu_set_iommu_device(devfn, smmu_get_sid(sdev));
+
+ return true;
+}
+
+static void smmu_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+ SMMUDevice *sdev;
+ SMMUViommu *viommu;
+ SMMUState *s = opaque;
+ SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus);
+
+ if (!s->nested) {
+ return;
+ }
+
+ if (!sbus) {
+ return;
+ }
+
+ sdev = sbus->pbdev[devfn];
+ if (!sdev) {
+ return;
+ }
+
+ if (!host_iommu_device_iommufd_attach_hwpt(sdev->idev,
+ sdev->idev->ioas_id, NULL)) {
+ error_report("Unable to attach dev to the default HW pagetable");
+ }
+
+ viommu = sdev->viommu;
+
+ sdev->idev = NULL;
+ sdev->viommu = NULL;
+ QLIST_REMOVE(sdev, next);
+ trace_smmu_unset_iommu_device(devfn, smmu_get_sid(sdev));
+
+ if (QLIST_EMPTY(&viommu->device_list)) {
+ iommufd_backend_free_id(viommu->iommufd, viommu->bypass_hwpt_id);
+ iommufd_backend_free_id(viommu->iommufd, viommu->abort_hwpt_id);
+ iommufd_backend_free_id(viommu->iommufd, viommu->core->viommu_id);
+ g_free(viommu->core);
+ iommufd_backend_free_id(viommu->iommufd, viommu->s2_hwpt->hwpt_id);
+ g_free(viommu->s2_hwpt);
+ g_free(viommu);
+ s->viommu = NULL;
+ }
+}
+
static const PCIIOMMUOps smmu_ops = {
.get_address_space = smmu_find_add_as,
+ .set_iommu_device = smmu_dev_set_iommu_device,
+ .unset_iommu_device = smmu_dev_unset_iommu_device,
};
IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid)
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index cdc1ea06a8..58e0636e95 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -5,6 +5,8 @@ virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
# smmu-common.c
smmu_add_mr(const char *name) "%s"
+smmu_set_iommu_device(int devfn, uint32_t sid) "devfn=%d (sid=%d)"
+smmu_unset_iommu_device(int devfn, uint32_t sid) "devfn=%d (sid=%d)"
smmu_ptw_level(int stage, int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d iova=0x%"PRIx64" subpage_sz=0x%zx baseaddr=0x%"PRIx64" offset=%d => pte=0x%"PRIx64
smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
smmu_ptw_page_pte(int stage, int level, uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index eae5d4d05b..3bfb68cef6 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -23,6 +23,7 @@
#include "hw/pci/pci.h"
#include "qom/object.h"
#include "sysemu/iommufd.h"
+#include <linux/iommufd.h>
#define SMMU_PCI_BUS_MAX 256
#define SMMU_PCI_DEVFN_MAX 256
@@ -107,11 +108,30 @@ typedef struct SMMUTransCfg {
struct SMMUS2Cfg s2cfg;
} SMMUTransCfg;
+typedef struct SMMUS2Hwpt {
+ IOMMUFDBackend *iommufd;
+ uint32_t hwpt_id;
+ uint32_t ioas_id;
+} SMMUS2Hwpt;
+
+typedef struct SMMUViommu {
+ void *smmu;
+ IOMMUFDBackend *iommufd;
+ IOMMUFDViommu *core;
+ SMMUS2Hwpt *s2_hwpt;
+ uint32_t bypass_hwpt_id;
+ uint32_t abort_hwpt_id;
+ QLIST_HEAD(, SMMUDevice) device_list;
+ QLIST_ENTRY(SMMUViommu) next;
+} SMMUViommu;
+
typedef struct SMMUDevice {
void *smmu;
PCIBus *bus;
int devfn;
IOMMUMemoryRegion iommu;
+ HostIOMMUDeviceIOMMUFD *idev;
+ SMMUViommu *viommu;
AddressSpace as;
uint32_t cfg_cache_hits;
uint32_t cfg_cache_misses;
@@ -139,6 +159,7 @@ struct SMMUState {
/* Nested SMMU */
bool nested;
+ SMMUViommu *viommu;
GHashTable *smmu_pcibus_by_busptr;
GHashTable *configs; /* cache for configuration data */
--
2.41.0.windows.1

@ -0,0 +1,75 @@
From 6c330f39cc08e4c641a3567e2b6ad0ebcadf5165 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Fri, 21 Jun 2024 21:22:04 +0000
Subject: [PATCH] hw/arm/smmu-common: Bypass emulated IOTLB for a nested SMMU
If a vSMMU is configured as a nested one, HW IOTLB will be used and all
cache invalidation should be done to the HW IOTLB too, instead of the emulated
iotlb. In this case, an iommu notifier isn't registered, as the devices
behind a nested SMMU would stay in the system address space for stage-2
mappings.
However, the KVM code still requests an iommu address space to translate
an MSI doorbell gIOVA via get_msi_address_space() and translate().
Since a nested SMMU doesn't register an iommu notifier to flush emulated
iotlb, bypass the emulated IOTLB and always walk through the guest-level
IO page table.
Note that regular nested SMMU could still register an iommu notifier for
IOTLB invalidation, since QEMU traps the invalidation commands. But this
would result in invalidation inefficiency since each invalidation would
be doubled for both HW IOTLB and the emulated IOTLB. Also, with NVIDIA's
CMDQV feature on its Grace SoC, invalidation commands are issued to the
CMDQ HW directly, without any trapping. So, there is no way to maintain
the emulated IOTLB. Meanwhile, the stage-1 translation request from KVM
is only activated in case of an MSI table update, which does not happen
that often to impact performance if walking through the guest RAM every
time.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
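Note: a minimal model of the early-return behavior described above. When the SMMU is nested, both lookup and insert become no-ops, so every translation request falls through to a page-table walk and the emulated cache never needs invalidating. The ToySMMU type, the direct-mapped cache, and the 4 KiB page size are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLB_SLOTS 8

typedef struct ToyTLBEntry {
    bool valid;
    uint64_t iova;
    uint64_t pa;
} ToyTLBEntry;

typedef struct ToySMMU {
    bool nested;                     /* mirrors SMMUState.nested */
    ToyTLBEntry tlb[TOY_TLB_SLOTS];  /* toy direct-mapped cache */
} ToySMMU;

/* Pick a slot from the page number, assuming 4 KiB pages. */
static ToyTLBEntry *toy_slot(ToySMMU *s, uint64_t iova)
{
    return &s->tlb[(iova >> 12) % TOY_TLB_SLOTS];
}

/* Always miss for a nested SMMU so the caller walks the page table. */
static ToyTLBEntry *toy_iotlb_lookup(ToySMMU *s, uint64_t iova)
{
    ToyTLBEntry *e;

    if (s->nested) {
        return NULL;
    }
    e = toy_slot(s, iova);
    return (e->valid && e->iova == iova) ? e : NULL;
}

/* Never populate the cache for a nested SMMU. */
static void toy_iotlb_insert(ToySMMU *s, uint64_t iova, uint64_t pa)
{
    ToyTLBEntry *e;

    if (s->nested) {
        return;
    }
    e = toy_slot(s, iova);
    e->valid = true;
    e->iova = iova;
    e->pa = pa;
}
```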
hw/arm/smmu-common.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index c5f3e02065..016418a48c 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -75,6 +75,16 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
uint8_t level = 4 - (inputsize - 4) / stride;
SMMUTLBEntry *entry = NULL;
+ /*
+ * Stage-1 translation with a nested SMMU in general uses HW IOTLB. However,
+ * KVM still requests for an iommu address space for an MSI fixup by looking
+ * up stage-1 page table. Make sure we don't go through the emulated pathway
+ * so that the emulated iotlb will not need any invalidation.
+ */
+ if (bs->nested) {
+ return NULL;
+ }
+
while (level <= 3) {
uint64_t subpage_size = 1ULL << level_shift(level, tt->granule_sz);
uint64_t mask = subpage_size - 1;
@@ -110,6 +120,16 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, SMMUTLBEntry *new)
SMMUIOTLBKey *key = g_new0(SMMUIOTLBKey, 1);
uint8_t tg = (new->granule - 10) / 2;
+ /*
+ * Stage-1 translation with a nested SMMU in general uses HW IOTLB. However,
+ * KVM still requests for an iommu address space for an MSI fixup by looking
+ * up stage-1 page table. Make sure we don't go through the emulated pathway
+ * so that the emulated iotlb will not need any invalidation.
+ */
+ if (bs->nested) {
+ return;
+ }
+
if (g_hash_table_size(bs->iotlb) >= SMMU_IOTLB_MAX_SIZE) {
smmu_iotlb_inv_all(bs);
}
--
2.41.0.windows.1

@ -0,0 +1,68 @@
From 2fea4f93632679afcb15f0c35b3d9abeede37778 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 10 Apr 2024 16:37:25 +0000
Subject: [PATCH] hw/arm/smmu-common: Extract smmu_get_sbus and smmu_get_sdev
helpers
Add two helpers to get sbus and sdev respectively. These will be used
by the following patch adding set/unset_iommu_device ops.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmu-common.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 016418a48c..03d9ff58d4 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -589,12 +589,9 @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
return NULL;
}
-static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+static SMMUPciBus *smmu_get_sbus(SMMUState *s, PCIBus *bus)
{
- SMMUState *s = opaque;
SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus);
- SMMUDevice *sdev;
- static unsigned int index;
if (!sbus) {
sbus = g_malloc0(sizeof(SMMUPciBus) +
@@ -603,7 +600,15 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
g_hash_table_insert(s->smmu_pcibus_by_busptr, bus, sbus);
}
- sdev = sbus->pbdev[devfn];
+ return sbus;
+}
+
+static SMMUDevice *smmu_get_sdev(SMMUState *s, SMMUPciBus *sbus,
+ PCIBus *bus, int devfn)
+{
+ SMMUDevice *sdev = sbus->pbdev[devfn];
+ static unsigned int index;
+
if (!sdev) {
char *name = g_strdup_printf("%s-%d-%d", s->mrtypename, devfn, index++);
@@ -622,6 +627,15 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
g_free(name);
}
+ return sdev;
+}
+
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+{
+ SMMUState *s = opaque;
+ SMMUPciBus *sbus = smmu_get_sbus(s, bus);
+ SMMUDevice *sdev = smmu_get_sdev(s, sbus, bus, devfn);
+
return &sdev->as;
}
--
2.41.0.windows.1

@ -0,0 +1,114 @@
From d8d7f775b602a84c37b8aced11e00cb5b0521c4e Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Tue, 18 Jun 2024 17:22:18 -0700
Subject: [PATCH] hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev
The caller of smmu_iommu_mr wants to get sdev for smmuv3_flush_config().
Do it directly instead of bridging with an iommu mr pointer.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Message-id: 20240619002218.926674-1-nicolinc@nvidia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
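Note: smmu_find_sdev() resolves a stream ID to a device by splitting it into a bus number and a devfn. The sketch below illustrates that split, assuming the SID is a 16-bit PCI requester ID (bus in bits [15:8], devfn in bits [7:0]) as the PCI_BUS_NUM() and SMMU_PCI_DEVFN() macros in the diff suggest. The function names are hypothetical.

```c
#include <stdint.h>

/* Bus number: bits [15:8] of the stream ID. */
static uint8_t sid_bus_num(uint32_t sid)
{
    return (sid >> 8) & 0xff;
}

/* Device/function: bits [7:0] of the stream ID. */
static uint8_t sid_devfn(uint32_t sid)
{
    return sid & 0xff;
}

/* A devfn further splits into a 5-bit slot and a 3-bit function. */
static uint8_t devfn_slot(uint8_t devfn)
{
    return (devfn >> 3) & 0x1f;
}

static uint8_t devfn_func(uint8_t devfn)
{
    return devfn & 0x07;
}
```

For example, SID 0x0123 maps to bus 1, devfn 0x23, which is slot 4, function 3.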
hw/arm/smmu-common.c | 8 ++------
hw/arm/smmuv3.c | 12 ++++--------
include/hw/arm/smmu-common.h | 4 ++--
3 files changed, 8 insertions(+), 16 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 9e9af8f5c7..d0bc620606 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -837,20 +837,16 @@ static const PCIIOMMUOps smmu_ops = {
.unset_iommu_device = smmu_dev_unset_iommu_device,
};
-IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid)
+SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid)
{
uint8_t bus_n, devfn;
SMMUPciBus *smmu_bus;
- SMMUDevice *smmu;
bus_n = PCI_BUS_NUM(sid);
smmu_bus = smmu_find_smmu_pcibus(s, bus_n);
if (smmu_bus) {
devfn = SMMU_PCI_DEVFN(sid);
- smmu = smmu_bus->pbdev[devfn];
- if (smmu) {
- return &smmu->iommu;
- }
+ return smmu_bus->pbdev[devfn];
}
return NULL;
}
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 9d44bb19bc..b2ffe2d40b 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1407,20 +1407,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
case SMMU_CMD_CFGI_STE:
{
uint32_t sid = CMD_SID(&cmd);
- IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
- SMMUDevice *sdev;
+ SMMUDevice *sdev = smmu_find_sdev(bs, sid);
if (CMD_SSEC(&cmd)) {
cmd_error = SMMU_CERROR_ILL;
break;
}
- if (!mr) {
+ if (!sdev) {
break;
}
trace_smmuv3_cmdq_cfgi_ste(sid);
- sdev = container_of(mr, SMMUDevice, iommu);
smmuv3_flush_config(sdev);
smmuv3_install_nested_ste(sdev, sid);
@@ -1452,20 +1450,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
case SMMU_CMD_CFGI_CD_ALL:
{
uint32_t sid = CMD_SID(&cmd);
- IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
- SMMUDevice *sdev;
+ SMMUDevice *sdev = smmu_find_sdev(bs, sid);
if (CMD_SSEC(&cmd)) {
cmd_error = SMMU_CERROR_ILL;
break;
}
- if (!mr) {
+ if (!sdev) {
break;
}
trace_smmuv3_cmdq_cfgi_cd(sid);
- sdev = container_of(mr, SMMUDevice, iommu);
smmuv3_flush_config(sdev);
break;
}
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 955ca716a5..e30539a8d4 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -234,8 +234,8 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
*/
SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova);
-/* Return the iommu mr associated to @sid, or NULL if none */
-IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid);
+/* Return the SMMUDevice associated to @sid, or NULL if none */
+SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid);
#define SMMU_IOTLB_MAX_SIZE 256
--
2.41.0.windows.1

@ -0,0 +1,87 @@
From 3c6c29612d5ca0ff07bcb8a45735a3877c8fadd4 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Thu, 7 Dec 2023 20:04:47 +0000
Subject: [PATCH] hw/arm/smmu-common: Return sysmem if stage-1 is bypassed
When nested translation is enabled, two stages of translation occur in
two different address spaces: stage-1 in the iommu as, and stage-2 in
the system as.
If a device attached to the vSMMU doesn't enable stage-1 translation, e.g.
its vSTE is set to Config=Bypass, the system as should be returned, so QEMU
can set up system memory mappings onto the stage-2 page table.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
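Note: the address-space choice described above reduces to one predicate. The sketch below models it with hypothetical Toy* types. Only the condition (nested and no stage-1 HWPT installed) is taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyAddressSpace {
    const char *name;
} ToyAddressSpace;

typedef struct ToySMMUDevice {
    bool nested;                /* SMMUState.nested in the patch */
    void *s1_hwpt;              /* non-NULL once a stage-1 HWPT is installed */
    ToyAddressSpace as;         /* IOMMU address space (stage-1) */
    ToyAddressSpace as_sysmem;  /* system memory alias (stage-2 only) */
} ToySMMUDevice;

/* Return the system address space while the device uses stage-2 only. */
static ToyAddressSpace *toy_find_as(ToySMMUDevice *sdev)
{
    if (sdev->nested && !sdev->s1_hwpt) {
        return &sdev->as_sysmem;
    }
    return &sdev->as;
}
```

A non-nested SMMU always gets the IOMMU address space, so existing behavior is unchanged in that case.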
hw/arm/smmu-common.c | 18 +++++++++++++++++-
include/hw/arm/smmu-common.h | 3 +++
2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index a79eb34277..cc41bf3de8 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -622,6 +622,9 @@ static SMMUDevice *smmu_get_sdev(SMMUState *s, SMMUPciBus *sbus,
memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
s->mrtypename,
OBJECT(s), name, UINT64_MAX);
+ if (s->nested) {
+ address_space_init(&sdev->as_sysmem, &s->root, name);
+ }
address_space_init(&sdev->as,
MEMORY_REGION(&sdev->iommu), name);
trace_smmu_add_mr(name);
@@ -637,7 +640,12 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
SMMUPciBus *sbus = smmu_get_sbus(s, bus);
SMMUDevice *sdev = smmu_get_sdev(s, sbus, bus, devfn);
- return &sdev->as;
+ /* Return the system as if the device uses stage-2 only */
+ if (s->nested && !sdev->s1_hwpt) {
+ return &sdev->as_sysmem;
+ } else {
+ return &sdev->as;
+ }
}
static bool smmu_dev_attach_viommu(SMMUDevice *sdev,
@@ -983,6 +991,14 @@ static void smmu_base_realize(DeviceState *dev, Error **errp)
g_free, g_free);
s->smmu_pcibus_by_busptr = g_hash_table_new(NULL, NULL);
+ if (s->nested) {
+ memory_region_init(&s->root, OBJECT(s), "root", UINT64_MAX);
+ memory_region_init_alias(&s->sysmem, OBJECT(s),
+ "smmu-sysmem", get_system_memory(), 0,
+ memory_region_size(get_system_memory()));
+ memory_region_add_subregion(&s->root, 0, &s->sysmem);
+ }
+
if (s->primary_bus) {
pci_setup_iommu(s->primary_bus, &smmu_ops, s);
} else {
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 66dc7206ea..37dfeed026 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -143,6 +143,7 @@ typedef struct SMMUDevice {
SMMUViommu *viommu;
SMMUS1Hwpt *s1_hwpt;
AddressSpace as;
+ AddressSpace as_sysmem;
uint32_t cfg_cache_hits;
uint32_t cfg_cache_misses;
QLIST_ENTRY(SMMUDevice) next;
@@ -165,7 +166,9 @@ struct SMMUState {
/* <private> */
SysBusDevice dev;
const char *mrtypename;
+ MemoryRegion root;
MemoryRegion iomem;
+ MemoryRegion sysmem;
/* Nested SMMU */
bool nested;
--
2.41.0.windows.1

@ -0,0 +1,233 @@
From 9895192512af4b52aff88432618a474e69b44bdd Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Wed, 6 Nov 2024 14:47:27 +0000
Subject: [PATCH] hw/arm/smmuv3: Add initial support for SMMUv3 Nested device
Based on SMMUv3 as a parent device, add a user-creatable
smmuv3-nested device. Subsequent patches will add support to
specify a PCI bus for this device.
Currently only supported for "virt", so hook up the sysbus mem & irq
for that as well.
No FDT support is added for now.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
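Note: each smmuv3-nested instance gets its own MMIO window and IRQ block, carved out of the VIRT_SMMU_NESTED region by instance index. The sketch below reproduces that arithmetic using the constants from the diff. NUM_SMMU_IRQS is assumed to be 4, inferred from the four-iteration IRQ loop, and the NestedSmmuSlot type and nested_smmu_slot() function are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRT_SMMU_NESTED_BASE 0x0b010000ULL  /* from base_memmap[] */
#define VIRT_SMMU_NESTED_SIZE 0x00ff0000ULL
#define VIRT_SMMU_NESTED_IRQ  200            /* from a15irqmap[] */
#define SMMU_IO_LEN           0x20000ULL
#define MAX_SMMU_NESTED       64
#define NUM_SMMU_IRQS         4              /* assumption, see note */

typedef struct NestedSmmuSlot {
    uint64_t base;   /* MMIO base of this instance */
    int first_irq;   /* first of NUM_SMMU_IRQS consecutive IRQs */
} NestedSmmuSlot;

/* Compute the MMIO base and first IRQ of the index-th nested SMMU. */
static bool nested_smmu_slot(int index, NestedSmmuSlot *slot)
{
    if (index < 0 || index >= MAX_SMMU_NESTED) {
        return false;  /* "smmuv3-nested max count reached!" */
    }
    slot->base = VIRT_SMMU_NESTED_BASE + (uint64_t)index * SMMU_IO_LEN;
    slot->first_irq = VIRT_SMMU_NESTED_IRQ + index * NUM_SMMU_IRQS;
    return true;
}
```

Under these assumptions the last instance (index 63) sits at 0x0b7f0000 and still ends inside the reserved region, which stops at 0x0c000000 where VIRT_PLATFORM_BUS begins.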
hw/arm/smmuv3.c | 34 ++++++++++++++++++++++++++++++++++
hw/arm/virt.c | 31 +++++++++++++++++++++++++++++--
hw/core/sysbus-fdt.c | 1 +
include/hw/arm/smmuv3.h | 15 +++++++++++++++
include/hw/arm/virt.h | 6 ++++++
5 files changed, 85 insertions(+), 2 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index b860c8385f..3010471cdc 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2069,6 +2069,19 @@ static void smmu_realize(DeviceState *d, Error **errp)
smmu_init_irq(s, dev);
}
+static void smmu_nested_realize(DeviceState *d, Error **errp)
+{
+ SMMUv3NestedState *s_nested = ARM_SMMUV3_NESTED(d);
+ SMMUv3NestedClass *c = ARM_SMMUV3_NESTED_GET_CLASS(s_nested);
+ Error *local_err = NULL;
+
+ c->parent_realize(d, &local_err);
+ if (local_err) {
+ error_propagate(errp, local_err);
+ return;
+ }
+}
+
static const VMStateDescription vmstate_smmuv3_queue = {
.name = "smmuv3_queue",
.version_id = 1,
@@ -2167,6 +2180,18 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
device_class_set_props(dc, smmuv3_properties);
}
+static void smmuv3_nested_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ SMMUv3NestedClass *c = ARM_SMMUV3_NESTED_CLASS(klass);
+
+ dc->vmsd = &vmstate_smmuv3;
+ device_class_set_parent_realize(dc, smmu_nested_realize,
+ &c->parent_realize);
+ dc->user_creatable = true;
+ dc->hotpluggable = false;
+}
+
static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
IOMMUNotifierFlag old,
IOMMUNotifierFlag new,
@@ -2205,6 +2230,14 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
imrc->notify_flag_changed = smmuv3_notify_flag_changed;
}
+static const TypeInfo smmuv3_nested_type_info = {
+ .name = TYPE_ARM_SMMUV3_NESTED,
+ .parent = TYPE_ARM_SMMUV3,
+ .instance_size = sizeof(SMMUv3NestedState),
+ .class_size = sizeof(SMMUv3NestedClass),
+ .class_init = smmuv3_nested_class_init,
+};
+
static const TypeInfo smmuv3_type_info = {
.name = TYPE_ARM_SMMUV3,
.parent = TYPE_ARM_SMMU,
@@ -2223,6 +2256,7 @@ static const TypeInfo smmuv3_iommu_memory_region_info = {
static void smmuv3_register_types(void)
{
type_register(&smmuv3_type_info);
+ type_register(&smmuv3_nested_type_info);
type_register(&smmuv3_iommu_memory_region_info);
}
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 08c40c314b..a55f297af2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -166,6 +166,7 @@ static const MemMapEntry base_memmap[] = {
/* In the virtCCA scenario, this space is used for MSI interrupt mapping */
[VIRT_CVM_MSI] = { 0x0a001000, 0x00fff000 },
[VIRT_CPUFREQ] = { 0x0b000000, 0x00010000 },
+ [VIRT_SMMU_NESTED] = { 0x0b010000, 0x00ff0000},
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
[VIRT_PLATFORM_BUS] = { 0x0c000000, 0x02000000 },
[VIRT_SECURE_MEM] = { 0x0e000000, 0x01000000 },
@@ -211,6 +212,7 @@ static const int a15irqmap[] = {
[VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
[VIRT_SMMU] = 74, /* ...to 74 + NUM_SMMU_IRQS - 1 */
[VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
+ [VIRT_SMMU_NESTED] = 200,
};
static const char *valid_cpus[] = {
@@ -3613,10 +3615,34 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+ MachineClass *mc = MACHINE_GET_CLASS(vms);
- if (vms->platform_bus_dev) {
- MachineClass *mc = MACHINE_GET_CLASS(vms);
+ /* For smmuv3-nested devices we need to set the mem & irq */
+ if (device_is_dynamic_sysbus(mc, dev) &&
+ object_dynamic_cast(OBJECT(dev), TYPE_ARM_SMMUV3_NESTED)) {
+ hwaddr base = vms->memmap[VIRT_SMMU_NESTED].base;
+ int irq = vms->irqmap[VIRT_SMMU_NESTED];
+
+ if (vms->smmu_nested_count >= MAX_SMMU_NESTED) {
+ error_setg(errp, "smmuv3-nested max count reached!");
+ return;
+ }
+
+ base += (vms->smmu_nested_count * SMMU_IO_LEN);
+ irq += (vms->smmu_nested_count * NUM_SMMU_IRQS);
+ sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
+ for (int i = 0; i < 4; i++) {
+ sysbus_connect_irq(SYS_BUS_DEVICE(dev), i,
+ qdev_get_gpio_in(vms->gic, irq + i));
+ }
+ if (vms->iommu != VIRT_IOMMU_SMMUV3_NESTED) {
+ vms->iommu = VIRT_IOMMU_SMMUV3_NESTED;
+ }
+ vms->smmu_nested_count++;
+ }
+
+ if (vms->platform_bus_dev) {
if (device_is_dynamic_sysbus(mc, dev)) {
platform_bus_link_device(PLATFORM_BUS_DEVICE(vms->platform_bus_dev),
SYS_BUS_DEVICE(dev));
@@ -3789,6 +3815,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_AMD_XGBE);
machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_PLATFORM);
+ machine_class_allow_dynamic_sysbus_dev(mc, TYPE_ARM_SMMUV3_NESTED);
#ifdef CONFIG_TPM
machine_class_allow_dynamic_sysbus_dev(mc, TYPE_TPM_TIS_SYSBUS);
#endif
diff --git a/hw/core/sysbus-fdt.c b/hw/core/sysbus-fdt.c
index eebcd28f9a..0f0d0b3e58 100644
--- a/hw/core/sysbus-fdt.c
+++ b/hw/core/sysbus-fdt.c
@@ -489,6 +489,7 @@ static const BindingEntry bindings[] = {
#ifdef CONFIG_LINUX
TYPE_BINDING(TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node),
TYPE_BINDING(TYPE_VFIO_AMD_XGBE, add_amd_xgbe_fdt_node),
+ TYPE_BINDING("arm-smmuv3-nested", no_fdt_node),
VFIO_PLATFORM_BINDING("amd,xgbe-seattle-v1a", add_amd_xgbe_fdt_node),
#endif
#ifdef CONFIG_TPM
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index d183a62766..87e628be7a 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -84,6 +84,21 @@ struct SMMUv3Class {
#define TYPE_ARM_SMMUV3 "arm-smmuv3"
OBJECT_DECLARE_TYPE(SMMUv3State, SMMUv3Class, ARM_SMMUV3)
+#define TYPE_ARM_SMMUV3_NESTED "arm-smmuv3-nested"
+OBJECT_DECLARE_TYPE(SMMUv3NestedState, SMMUv3NestedClass, ARM_SMMUV3_NESTED)
+
+struct SMMUv3NestedState {
+ SMMUv3State smmuv3_state;
+};
+
+struct SMMUv3NestedClass {
+ /*< private >*/
+ SMMUv3Class smmuv3_class;
+ /*< public >*/
+
+ DeviceRealize parent_realize;
+};
+
#define STAGE1_SUPPORTED(s) FIELD_EX32(s->idr[0], IDR0, S1P)
#define STAGE2_SUPPORTED(s) FIELD_EX32(s->idr[0], IDR0, S2P)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e6a449becd..cd41e28202 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -109,6 +109,9 @@ typedef enum {
/* MMIO region size for SMMUv3 */
#define SMMU_IO_LEN 0x20000
+/* Max supported nested SMMUv3 */
+#define MAX_SMMU_NESTED 64
+
enum {
VIRT_FLASH,
VIRT_MEM,
@@ -121,6 +124,7 @@ enum {
VIRT_GIC_ITS,
VIRT_GIC_REDIST,
VIRT_SMMU,
+ VIRT_SMMU_NESTED,
VIRT_UART,
VIRT_CPUFREQ,
VIRT_MMIO,
@@ -155,6 +159,7 @@ enum {
typedef enum VirtIOMMUType {
VIRT_IOMMU_NONE,
VIRT_IOMMU_SMMUV3,
+ VIRT_IOMMU_SMMUV3_NESTED,
VIRT_IOMMU_VIRTIO,
} VirtIOMMUType;
@@ -222,6 +227,7 @@ struct VirtMachineState {
bool mte;
bool dtb_randomness;
bool pmu;
+ int smmu_nested_count;
OnOffAuto acpi;
VirtGICType gic_version;
VirtIOMMUType iommu;
--
2.41.0.windows.1

@ -0,0 +1,92 @@
From 707bd8198642549595f11ef34c80094fbf7d2de1 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Mon, 29 Apr 2024 21:26:41 +0000
Subject: [PATCH] hw/arm/smmuv3: Add missing STE invalidation
Multiple STEs can be invalidated in a range via SMMU_CMD_CFGI_STE_RANGE
or SMMU_CMD_CFGI_ALL command.
Add the missing STE invalidation in this pathway.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
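Note: the range invalidation relies on a power-of-two SID window derived from the command's Range field. The sketch below reproduces that computation as it appears in smmuv3_cmdq_consume(), together with the membership test used by the invalidate helper. The SidRange type and function names are illustrative, and Range is assumed to be a 5-bit field (0..31).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct SidRange {
    uint32_t start;
    uint32_t end;   /* inclusive */
} SidRange;

/* The command covers 2^(range + 1) stream IDs, aligned to that size. */
static SidRange sid_range_from_cmd(uint32_t sid, uint8_t range)
{
    /* Compute in 64 bits so range == 31 yields a full 32-bit mask. */
    uint32_t mask = (uint32_t)((1ULL << (range + 1)) - 1);
    SidRange r;

    r.start = sid & ~mask;
    r.end = r.start + mask;
    return r;
}

/* Same bounds check as the invalidate helper: skip SIDs outside the range. */
static bool sid_in_range(const SidRange *r, uint32_t sid)
{
    return sid >= r->start && sid <= r->end;
}
```

For example, SID 0x105 with Range 3 covers 0x100 through 0x10f, and Range 31 covers the whole 32-bit SID space.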
hw/arm/smmu-internal.h | 1 +
hw/arm/smmuv3.c | 28 +++++++++++++++++++++++++---
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
index 843bebb185..5a81dd1b82 100644
--- a/hw/arm/smmu-internal.h
+++ b/hw/arm/smmu-internal.h
@@ -142,6 +142,7 @@ typedef struct SMMUIOTLBPageInvInfo {
} SMMUIOTLBPageInvInfo;
typedef struct SMMUSIDRange {
+ SMMUState *state;
uint32_t start;
uint32_t end;
} SMMUSIDRange;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 540831ab8e..9d44bb19bc 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1322,11 +1322,9 @@ static void smmuv3_install_nested_ste(SMMUDevice *sdev, int sid)
}
static gboolean
-smmuv3_invalidate_ste(gpointer key, gpointer value, gpointer user_data)
+_smmuv3_invalidate_ste(SMMUDevice *sdev, SMMUSIDRange *sid_range)
{
- SMMUDevice *sdev = (SMMUDevice *)key;
uint32_t sid = smmu_get_sid(sdev);
- SMMUSIDRange *sid_range = (SMMUSIDRange *)user_data;
if (sid < sid_range->start || sid > sid_range->end) {
return false;
@@ -1337,6 +1335,28 @@ smmuv3_invalidate_ste(gpointer key, gpointer value, gpointer user_data)
return true;
}
+static gboolean
+smmuv3_invalidate_ste(gpointer key, gpointer value, gpointer user_data)
+{
+ return _smmuv3_invalidate_ste((SMMUDevice *)key, (SMMUSIDRange *)user_data);
+}
+
+static void smmuv3_invalidate_nested_ste(SMMUSIDRange *sid_range)
+{
+ SMMUState *bs = sid_range->state;
+ SMMUDevice *sdev;
+
+ if (!bs->viommu) {
+ return;
+ }
+
+ QLIST_FOREACH(sdev, &bs->viommu->device_list, next) {
+ if (smmu_get_sid(sdev)) {
+ _smmuv3_invalidate_ste(sdev, sid_range);
+ }
+ }
+}
+
static int smmuv3_cmdq_consume(SMMUv3State *s)
{
SMMUState *bs = ARM_SMMU(s);
@@ -1418,12 +1438,14 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
}
mask = (1ULL << (range + 1)) - 1;
+ sid_range.state = bs;
sid_range.start = sid & ~mask;
sid_range.end = sid_range.start + mask;
trace_smmuv3_cmdq_cfgi_ste_range(sid_range.start, sid_range.end);
g_hash_table_foreach_remove(bs->configs, smmuv3_invalidate_ste,
&sid_range);
+ smmuv3_invalidate_nested_ste(&sid_range);
break;
}
case SMMU_CMD_CFGI_CD:
--
2.41.0.windows.1
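The SID-range arithmetic used in `smmuv3_cmdq_consume()` above, together with the bounds test in `_smmuv3_invalidate_ste()`, can be sketched in isolation. This is a minimal sketch, not QEMU code: `SidRange`, `sid_range_from_cmd()` and `sid_in_range()` are hypothetical names, and the `SMMUState` pointer from the patch is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the patch's SMMUSIDRange (state pointer omitted). */
typedef struct {
    uint32_t start;
    uint32_t end;
} SidRange;

/*
 * Compute the inclusive StreamID span for a range invalidation, using the
 * same arithmetic as the patch: 2^(range + 1) StreamIDs, aligned down to
 * that size.
 */
static SidRange sid_range_from_cmd(uint32_t sid, uint8_t range)
{
    uint64_t mask = (1ULL << (range + 1)) - 1;
    SidRange r;

    r.start = (uint32_t)(sid & ~mask);
    r.end = (uint32_t)(r.start + mask);
    return r;
}

/* Same bounds test that _smmuv3_invalidate_ste() applies to each device. */
static int sid_in_range(uint32_t sid, const SidRange *r)
{
    return sid >= r->start && sid <= r->end;
}
```

With `sid = 0x123` and `range = 3`, the mask is `0xf`, so the span is `0x120` to `0x12f` inclusive.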


@ -0,0 +1,255 @@
From 13b84313c9f7ca4823abdbad92baf091c337861e Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Fri, 21 Apr 2023 15:13:53 -0700
Subject: [PATCH] hw/arm/smmuv3: Add smmu_dev_install_nested_ste() for CFGI_STE
Call smmu_dev_install_nested_ste(), which eventually reaches the
IOMMU_HWPT_ALLOC ioctl to allocate a nested HWPT.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmu-common.c | 9 ++++
hw/arm/smmuv3-internal.h | 1 +
hw/arm/smmuv3.c | 97 +++++++++++++++++++++++++++++++++++-
hw/arm/trace-events | 1 +
include/hw/arm/smmu-common.h | 14 ++++++
5 files changed, 120 insertions(+), 2 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index cc41bf3de8..9e9af8f5c7 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -780,6 +780,7 @@ static bool smmu_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
static void smmu_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
{
+ SMMUVdev *vdev;
SMMUDevice *sdev;
SMMUViommu *viommu;
SMMUState *s = opaque;
@@ -803,13 +804,21 @@ static void smmu_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
error_report("Unable to attach dev to the default HW pagetable");
}
+ vdev = sdev->vdev;
viommu = sdev->viommu;
sdev->idev = NULL;
sdev->viommu = NULL;
+ sdev->vdev = NULL;
QLIST_REMOVE(sdev, next);
trace_smmu_unset_iommu_device(devfn, smmu_get_sid(sdev));
+ if (vdev) {
+ iommufd_backend_free_id(viommu->iommufd, vdev->core->vdev_id);
+ g_free(vdev->core);
+ g_free(vdev);
+ }
+
if (QLIST_EMPTY(&viommu->device_list)) {
iommufd_backend_free_id(viommu->iommufd, viommu->bypass_hwpt_id);
iommufd_backend_free_id(viommu->iommufd, viommu->abort_hwpt_id);
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 6076025ad6..163459d450 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -552,6 +552,7 @@ typedef struct CD {
#define STE_S1FMT(x) extract32((x)->word[0], 4 , 2)
#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
+#define STE_S1DSS(x) extract32((x)->word[2], 0, 2)
#define STE_S1STALLD(x) extract32((x)->word[2], 27, 1)
#define STE_EATS(x) extract32((x)->word[2], 28, 2)
#define STE_STRW(x) extract32((x)->word[2], 30, 2)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 253d297eec..540831ab8e 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -563,6 +563,27 @@ bad_ste:
return -EINVAL;
}
+static void decode_ste_config(SMMUTransCfg *cfg, uint32_t config)
+{
+
+ if (STE_CFG_ABORT(config)) {
+ cfg->aborted = true;
+ return;
+ }
+ if (STE_CFG_BYPASS(config)) {
+ cfg->bypassed = true;
+ return;
+ }
+
+ if (STE_CFG_S1_ENABLED(config)) {
+ cfg->stage = SMMU_STAGE_1;
+ }
+
+ if (STE_CFG_S2_ENABLED(config)) {
+ cfg->stage |= SMMU_STAGE_2;
+ }
+}
+
/* Returns < 0 in case of invalid STE, 0 otherwise */
static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
STE *ste, SMMUEventInfo *event)
@@ -579,12 +600,19 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
config = STE_CONFIG(ste);
- if (STE_CFG_ABORT(config)) {
+ decode_ste_config(cfg, config);
+
+ /* S1DSS.Terminate is same as Config.abort for default stream */
+ if (STE_CFG_S1_ENABLED(config) && STE_S1DSS(ste) == 0) {
cfg->aborted = true;
+ }
+
+ if (cfg->aborted || cfg->bypassed) {
return 0;
}
- if (STE_CFG_BYPASS(config)) {
+ /* S1DSS.Bypass is same as Config.bypass for default stream */
+ if (STE_CFG_S1_ENABLED(config) && STE_S1DSS(ste) == 0x1) {
cfg->bypassed = true;
return 0;
}
@@ -1231,6 +1259,68 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd)
}
}
+static void smmuv3_install_nested_ste(SMMUDevice *sdev, int sid)
+{
+#ifdef __linux__
+ SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid,
+ .inval_ste_allowed = true};
+ struct iommu_hwpt_arm_smmuv3 nested_data = {};
+ SMMUv3State *s = sdev->smmu;
+ SMMUState *bs = &s->smmu_state;
+ uint32_t config;
+ STE ste;
+ int ret;
+
+ if (!sdev->viommu || !bs->nested) {
+ return;
+ }
+
+ if (!sdev->vdev && sdev->idev && sdev->viommu) {
+ SMMUVdev *vdev = g_new0(SMMUVdev, 1);
+ vdev->core = iommufd_backend_alloc_vdev(sdev->idev, sdev->viommu->core,
+ sid);
+ if (!vdev->core) {
+ error_report("failed to allocate a vDEVICE");
+ g_free(vdev);
+ return;
+ }
+ sdev->vdev = vdev;
+ }
+
+ ret = smmu_find_ste(sdev->smmu, sid, &ste, &event);
+ if (ret) {
+ /*
+ * For a 2-level Stream Table, the level-2 table might not be ready
+ * until the device gets inserted to the stream table. Ignore this.
+ */
+ return;
+ }
+
+ config = STE_CONFIG(&ste);
+ if (!STE_VALID(&ste) || !STE_CFG_S1_ENABLED(config)) {
+ smmu_dev_uninstall_nested_ste(sdev, STE_CFG_ABORT(config));
+ smmuv3_flush_config(sdev);
+ return;
+ }
+
+ nested_data.ste[0] = (uint64_t)ste.word[0] | (uint64_t)ste.word[1] << 32;
+ nested_data.ste[1] = (uint64_t)ste.word[2] | (uint64_t)ste.word[3] << 32;
+ /* V | CONFIG | S1FMT | S1CTXPTR | S1CDMAX */
+ nested_data.ste[0] &= 0xf80fffffffffffffULL;
+ /* S1DSS | S1CIR | S1COR | S1CSH | S1STALLD | EATS */
+ nested_data.ste[1] &= 0x380000ffULL;
+
+ ret = smmu_dev_install_nested_ste(sdev, IOMMU_HWPT_DATA_ARM_SMMUV3,
+ sizeof(nested_data), &nested_data);
+ if (ret) {
+ error_report("Unable to install nested STE=%16LX:%16LX, ret=%d",
+ nested_data.ste[1], nested_data.ste[0], ret);
+ }
+
+ trace_smmuv3_install_nested_ste(sid, nested_data.ste[1], nested_data.ste[0]);
+#endif
+}
+
static gboolean
smmuv3_invalidate_ste(gpointer key, gpointer value, gpointer user_data)
{
@@ -1241,6 +1331,8 @@ smmuv3_invalidate_ste(gpointer key, gpointer value, gpointer user_data)
if (sid < sid_range->start || sid > sid_range->end) {
return false;
}
+ smmuv3_flush_config(sdev);
+ smmuv3_install_nested_ste(sdev, sid);
trace_smmuv3_config_cache_inv(sid);
return true;
}
@@ -1310,6 +1402,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
trace_smmuv3_cmdq_cfgi_ste(sid);
sdev = container_of(mr, SMMUDevice, iommu);
smmuv3_flush_config(sdev);
+ smmuv3_install_nested_ste(sdev, sid);
break;
}
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 1e3d86382d..490da6349c 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -57,4 +57,5 @@ smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s
smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
smmuv3_get_device_info(uint32_t idr0, uint32_t idr1, uint32_t idr3, uint32_t idr5) "idr0=0x%x idr1=0x%x idr3=0x%x idr5=0x%x"
smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint16_t vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64
+smmuv3_install_nested_ste(uint32_t sid, uint64_t ste_1, uint64_t ste_0) "sid=%d ste=%"PRIx64":%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index d120c352cf..955ca716a5 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -51,6 +51,13 @@ typedef enum {
SMMU_PTW_ERR_PERMISSION, /* Permission fault */
} SMMUPTWEventType;
+/* SMMU Stage */
+typedef enum {
+ SMMU_STAGE_1 = 1,
+ SMMU_STAGE_2,
+ SMMU_NESTED,
+} SMMUStage;
+
typedef struct SMMUPTWEventInfo {
int stage;
SMMUPTWEventType type;
@@ -125,6 +132,12 @@ typedef struct SMMUViommu {
QLIST_ENTRY(SMMUViommu) next;
} SMMUViommu;
+typedef struct SMMUVdev {
+ SMMUViommu *vsmmu;
+ IOMMUFDVdev *core;
+ uint32_t sid;
+}SMMUVdev;
+
typedef struct SMMUS1Hwpt {
void *smmu;
IOMMUFDBackend *iommufd;
@@ -141,6 +154,7 @@ typedef struct SMMUDevice {
IOMMUMemoryRegion iommu;
HostIOMMUDeviceIOMMUFD *idev;
SMMUViommu *viommu;
+ SMMUVdev *vdev;
SMMUS1Hwpt *s1_hwpt;
AddressSpace as;
AddressSpace as_sysmem;
--
2.41.0.windows.1
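The way `smmuv3_install_nested_ste()` above packs four 32-bit STE words into two 64-bit values and masks them can be sketched on its own. This is a minimal sketch: `SteWords` and `pack_nested_ste()` are hypothetical names, and only the two mask constants are taken from the patch.

```c
#include <stdint.h>

/* Hypothetical 4-word slice of an STE; the real STE type is larger. */
typedef struct {
    uint32_t word[4];
} SteWords;

/* Masks copied from the patch. */
#define NESTED_STE0_MASK 0xf80fffffffffffffULL /* V, CONFIG, S1FMT, S1CTXPTR, S1CDMAX */
#define NESTED_STE1_MASK 0x380000ffULL         /* S1DSS, S1CIR, S1COR, S1CSH, S1STALLD, EATS */

/* Combine word pairs little-end first, then keep only the forwarded fields. */
static void pack_nested_ste(const SteWords *ste, uint64_t out[2])
{
    out[0] = ((uint64_t)ste->word[0] | ((uint64_t)ste->word[1] << 32))
             & NESTED_STE0_MASK;
    out[1] = ((uint64_t)ste->word[2] | ((uint64_t)ste->word[3] << 32))
             & NESTED_STE1_MASK;
}
```

Feeding an all-ones STE through the helper leaves exactly the two masks, which is a quick way to check which bits are forwarded to the host.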


@ -0,0 +1,95 @@
From afca50145f52601d912a805b65bd4530e9278388 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Wed, 6 Nov 2024 15:53:45 +0000
Subject: [PATCH] hw/arm/smmuv3: Associate a pci bus with a SMMUv3 Nested
device
Subsequent patches will add IORT modifications to get this working.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/smmuv3.c | 27 +++++++++++++++++++++++++++
include/hw/arm/smmuv3.h | 2 ++
2 files changed, 29 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 3010471cdc..66e4e1b57d 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -24,6 +24,7 @@
#include "hw/qdev-properties.h"
#include "hw/qdev-core.h"
#include "hw/pci/pci.h"
+#include "hw/pci/pci_bridge.h"
#include "cpu.h"
#include "trace.h"
#include "qemu/log.h"
@@ -2069,12 +2070,32 @@ static void smmu_realize(DeviceState *d, Error **errp)
smmu_init_irq(s, dev);
}
+static int smmuv3_nested_pci_host_bridge(Object *obj, void *opaque)
+{
+ DeviceState *d = opaque;
+ SMMUv3NestedState *s_nested = ARM_SMMUV3_NESTED(d);
+
+ if (object_dynamic_cast(obj, TYPE_PCI_HOST_BRIDGE)) {
+ PCIBus *bus = PCI_HOST_BRIDGE(obj)->bus;
+ if (s_nested->pci_bus && !strcmp(bus->qbus.name, s_nested->pci_bus)) {
+ object_property_set_link(OBJECT(d), "primary-bus", OBJECT(bus),
+ &error_abort);
+ }
+ }
+ return 0;
+}
+
static void smmu_nested_realize(DeviceState *d, Error **errp)
{
SMMUv3NestedState *s_nested = ARM_SMMUV3_NESTED(d);
SMMUv3NestedClass *c = ARM_SMMUV3_NESTED_GET_CLASS(s_nested);
+ SysBusDevice *dev = SYS_BUS_DEVICE(d);
Error *local_err = NULL;
+ object_child_foreach_recursive(object_get_root(),
+ smmuv3_nested_pci_host_bridge, d);
+ object_property_set_bool(OBJECT(dev), "nested", true, &error_abort);
+
c->parent_realize(d, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -2161,6 +2182,11 @@ static Property smmuv3_properties[] = {
DEFINE_PROP_END_OF_LIST()
};
+static Property smmuv3_nested_properties[] = {
+ DEFINE_PROP_STRING("pci-bus", SMMUv3NestedState, pci_bus),
+ DEFINE_PROP_END_OF_LIST()
+};
+
static void smmuv3_instance_init(Object *obj)
{
/* Nothing much to do here as of now */
@@ -2188,6 +2214,7 @@ static void smmuv3_nested_class_init(ObjectClass *klass, void *data)
dc->vmsd = &vmstate_smmuv3;
device_class_set_parent_realize(dc, smmu_nested_realize,
&c->parent_realize);
+ device_class_set_props(dc, smmuv3_nested_properties);
dc->user_creatable = true;
dc->hotpluggable = false;
}
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 87e628be7a..96513fce56 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -89,6 +89,8 @@ OBJECT_DECLARE_TYPE(SMMUv3NestedState, SMMUv3NestedClass, ARM_SMMUV3_NESTED)
struct SMMUv3NestedState {
SMMUv3State smmuv3_state;
+
+ char *pci_bus;
};
struct SMMUv3NestedClass {
--
2.41.0.windows.1


@ -0,0 +1,38 @@
From fac9784bbedb50dc964feb9cf70b6f37472fcf60 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Fri, 21 Apr 2023 22:10:44 -0700
Subject: [PATCH] hw/arm/smmuv3: Check idr registers for STE_S1CDMAX and
STE_S1STALLD
With nested translation, the underlying HW could support those two fields.
Allow them according to the updated idr registers after the hw_info ioctl.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmuv3.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 4208325ab3..253d297eec 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -622,13 +622,14 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
}
}
- if (STE_S1CDMAX(ste) != 0) {
+ if (!FIELD_EX32(s->idr[1], IDR1, SSIDSIZE) && STE_S1CDMAX(ste) != 0) {
qemu_log_mask(LOG_UNIMP,
"SMMUv3 does not support multiple context descriptors yet\n");
goto bad_ste;
}
- if (STE_S1STALLD(ste)) {
+ /* STALL_MODEL being 0b01 means "stall is not supported" */
+ if ((FIELD_EX32(s->idr[0], IDR0, STALL_MODEL) & 0x1) && STE_S1STALLD(ste)) {
qemu_log_mask(LOG_UNIMP,
"SMMUv3 S1 stalling fault model not allowed yet\n");
goto bad_ste;
--
2.41.0.windows.1
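The STALL_MODEL test added to `decode_ste()` above can be sketched without QEMU's `FIELD_EX32` macro. This is a minimal sketch under one assumption: that IDR0.STALL_MODEL occupies bits [25:24]. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IDR0.STALL_MODEL is the 2-bit field at bits [25:24]. */
#define IDR0_STALL_MODEL_SHIFT 24
#define IDR0_STALL_MODEL_MASK  0x3u

static uint32_t idr0_stall_model(uint32_t idr0)
{
    return (idr0 >> IDR0_STALL_MODEL_SHIFT) & IDR0_STALL_MODEL_MASK;
}

/*
 * Mirrors the patch's condition: an STE with S1STALLD set is rejected only
 * when bit 0 of STALL_MODEL is set (0b01, "stall is not supported").
 */
static bool s1stalld_rejected(uint32_t idr0, bool s1stalld)
{
    return (idr0_stall_model(idr0) & 0x1) && s1stalld;
}
```

An IDR0 value with STALL_MODEL = 0 therefore lets S1STALLD through, which is what the nested case relies on once the host's IDR values have been copied in.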


@ -0,0 +1,76 @@
From c8267f88b2af37779a597aac00aeaf06adc80ccc Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Mon, 11 Dec 2023 14:42:01 +0000
Subject: [PATCH] hw/arm/smmuv3: Enable sva/stall IDR features
Emulate the features that enable stall and SVA in the guest.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/smmuv3-internal.h | 3 ++-
hw/arm/smmuv3.c | 8 +++-----
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index a411fd4048..cfc04c563e 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -74,6 +74,7 @@ REG32(IDR1, 0x4)
FIELD(IDR1, ECMDQ, 31, 1)
#define SMMU_IDR1_SIDSIZE 16
+#define SMMU_IDR1_SSIDSIZE 16
#define SMMU_CMDQS 19
#define SMMU_EVENTQS 19
@@ -104,7 +105,7 @@ REG32(IDR5, 0x14)
FIELD(IDR5, VAX, 10, 2);
FIELD(IDR5, STALL_MAX, 16, 16);
-#define SMMU_IDR5_OAS 4
+#define SMMU_IDR5_OAS 5
REG32(IIDR, 0x18)
REG32(AIDR, 0x1c)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 66e4e1b57d..8d8dcccd48 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -343,13 +343,14 @@ static void smmuv3_init_regs(SMMUv3State *s)
s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, 1); /* 16-bit ASID */
s->idr[0] = FIELD_DP32(s->idr[0], IDR0, VMID16, 1); /* 16-bit VMID */
s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTENDIAN, 2); /* little endian */
- s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 1); /* No stall */
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 0); /* stall */
/* terminated transaction will always be aborted/error returned */
s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TERM_MODEL, 1);
/* 2-level stream table supported */
s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, 1);
s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SIDSIZE, SMMU_IDR1_SIDSIZE);
+ s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SSIDSIZE, SMMU_IDR1_SSIDSIZE);
s->idr[1] = FIELD_DP32(s->idr[1], IDR1, EVENTQS, SMMU_EVENTQS);
s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS, SMMU_CMDQS);
@@ -361,7 +362,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
s->idr[3] = FIELD_DP32(s->idr[3], IDR3, RIL, 1);
s->idr[3] = FIELD_DP32(s->idr[3], IDR3, BBML, 2);
- s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, SMMU_IDR5_OAS); /* 44 bits */
+ s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, SMMU_IDR5_OAS); /* 48 bits */
/* 4K, 16K and 64K granule support */
s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, 1);
s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, 1);
@@ -776,9 +777,6 @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event)
if (!CD_A(cd)) {
goto bad_cd; /* SMMU_IDR0.TERM_MODEL == 1 */
}
- if (CD_S(cd)) {
- goto bad_cd; /* !STE_SECURE && SMMU_IDR0.STALL_MODEL == 1 */
- }
if (CD_HA(cd) || CD_HD(cd)) {
goto bad_cd; /* HTTU = 0 */
}
--
2.41.0.windows.1


@ -0,0 +1,229 @@
From b331acc42fa54ca93496c32d92cdf5397927bff1 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Fri, 21 Apr 2023 15:18:56 -0700
Subject: [PATCH] hw/arm/smmuv3: Forward cache invalidate commands via iommufd
Introduce an SMMUCommandBatch and some helpers to batch the commands.
Rewind q->cons accordingly when a batch or command fails to execute.
For now, keep TLBI commands and device cache commands in separate batches to
avoid errata on certain versions of SMMUs. Later, the IIDR register should be
checked to detect whether the underlying SMMU hardware has such an erratum.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmuv3-internal.h | 13 +++++
hw/arm/smmuv3.c | 113 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 125 insertions(+), 1 deletion(-)
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 163459d450..a411fd4048 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -226,6 +226,19 @@ static inline bool smmuv3_gerror_irq_enabled(SMMUv3State *s)
#define Q_CONS_WRAP(q) (((q)->cons & WRAP_MASK(q)) >> (q)->log2size)
#define Q_PROD_WRAP(q) (((q)->prod & WRAP_MASK(q)) >> (q)->log2size)
+#define Q_IDX(llq, p) ((p) & ((1 << (llq)->max_n_shift) - 1))
+
+static inline int smmuv3_q_ncmds(SMMUQueue *q)
+{
+ uint32_t prod = Q_PROD(q);
+ uint32_t cons = Q_CONS(q);
+
+ if (Q_PROD_WRAP(q) == Q_CONS_WRAP(q))
+ return prod - cons;
+ else
+ return WRAP_MASK(q) - cons + prod;
+}
+
static inline bool smmuv3_q_full(SMMUQueue *q)
{
return ((q->cons ^ q->prod) & WRAP_INDEX_MASK(q)) == WRAP_MASK(q);
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index b2ffe2d40b..b860c8385f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1357,16 +1357,85 @@ static void smmuv3_invalidate_nested_ste(SMMUSIDRange *sid_range)
}
}
+/**
+ * SMMUCommandBatch - batch of commands to issue for nested SMMU invalidation
+ * @cmds: Pointer to list of commands
+ * @cons: Pointer to list of CONS corresponding to the commands
+ * @ncmds: Total ncmds in the batch
+ * @dev_cache: Issue to a device cache
+ */
+typedef struct SMMUCommandBatch {
+ Cmd *cmds;
+ uint32_t *cons;
+ uint32_t ncmds;
+ bool dev_cache;
+} SMMUCommandBatch;
+
+/* Update batch->ncmds to the number of executed cmds */
+static int smmuv3_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
+{
+ uint32_t total = batch->ncmds;
+ int ret;
+
+ ret = smmu_viommu_invalidate_cache(bs->viommu->core,
+ IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
+ sizeof(Cmd), &batch->ncmds, batch->cmds);
+ if (total != batch->ncmds) {
+ error_report("%s failed: ret=%d, total=%d, done=%d",
+ __func__, ret, total, batch->ncmds);
+ return ret;
+ }
+
+ batch->ncmds = 0;
+ batch->dev_cache = false;
+ return ret;
+}
+
+static int smmuv3_batch_cmds(SMMUState *bs, SMMUCommandBatch *batch,
+ Cmd *cmd, uint32_t *cons, bool dev_cache)
+{
+ int ret;
+
+ if (!bs->nested || !bs->viommu) {
+ return 0;
+ }
+
+ /*
+ * Currently separate dev_cache and hwpt for safety, which might not be
+ * necessary if underlying HW SMMU does not have the errata.
+ *
+ * TODO check IIDR register values read from hw_info.
+ */
+ if (batch->ncmds && (dev_cache != batch->dev_cache)) {
+ ret = smmuv3_issue_cmd_batch(bs, batch);
+ if (ret) {
+ *cons = batch->cons[batch->ncmds];
+ return ret;
+ }
+ }
+ batch->dev_cache = dev_cache;
+ batch->cmds[batch->ncmds] = *cmd;
+ batch->cons[batch->ncmds++] = *cons;
+ return 0;
+}
+
static int smmuv3_cmdq_consume(SMMUv3State *s)
{
SMMUState *bs = ARM_SMMU(s);
SMMUCmdError cmd_error = SMMU_CERROR_NONE;
SMMUQueue *q = &s->cmdq;
SMMUCommandType type = 0;
+ SMMUCommandBatch batch = {};
+ uint32_t ncmds = 0;
if (!smmuv3_cmdq_enabled(s)) {
return 0;
}
+
+ ncmds = smmuv3_q_ncmds(q);
+ batch.cmds = g_new0(Cmd, ncmds);
+ batch.cons = g_new0(uint32_t, ncmds);
+
/*
* some commands depend on register values, typically CR0. In case those
* register values change while handling the command, spec says it
@@ -1463,6 +1532,13 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
trace_smmuv3_cmdq_cfgi_cd(sid);
smmuv3_flush_config(sdev);
+
+ if (sdev->s1_hwpt) {
+ if (smmuv3_batch_cmds(sdev->smmu, &batch, &cmd, &q->cons, true)) {
+ cmd_error = SMMU_CERROR_ILL;
+ break;
+ }
+ }
break;
}
case SMMU_CMD_TLBI_NH_ASID:
@@ -1477,6 +1553,10 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
trace_smmuv3_cmdq_tlbi_nh_asid(asid);
smmu_inv_notifiers_all(&s->smmu_state);
smmu_iotlb_inv_asid(bs, asid);
+ if (smmuv3_batch_cmds(bs, &batch, &cmd, &q->cons, false)) {
+ cmd_error = SMMU_CERROR_ILL;
+ break;
+ }
break;
}
case SMMU_CMD_TLBI_NH_ALL:
@@ -1489,6 +1569,11 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
trace_smmuv3_cmdq_tlbi_nh();
smmu_inv_notifiers_all(&s->smmu_state);
smmu_iotlb_inv_all(bs);
+
+ if (smmuv3_batch_cmds(bs, &batch, &cmd, &q->cons, false)) {
+ cmd_error = SMMU_CERROR_ILL;
+ break;
+ }
break;
case SMMU_CMD_TLBI_NH_VAA:
case SMMU_CMD_TLBI_NH_VA:
@@ -1497,7 +1582,24 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
break;
}
smmuv3_range_inval(bs, &cmd);
+
+ if (smmuv3_batch_cmds(bs, &batch, &cmd, &q->cons, false)) {
+ cmd_error = SMMU_CERROR_ILL;
+ break;
+ }
break;
+ case SMMU_CMD_ATC_INV:
+ {
+ SMMUDevice *sdev = smmu_find_sdev(bs, CMD_SID(&cmd));
+
+ if (sdev->s1_hwpt) {
+ if (smmuv3_batch_cmds(sdev->smmu, &batch, &cmd, &q->cons, true)) {
+ cmd_error = SMMU_CERROR_ILL;
+ break;
+ }
+ }
+ break;
+ }
case SMMU_CMD_TLBI_S12_VMALL:
{
uint16_t vmid = CMD_VMID(&cmd);
@@ -1529,7 +1631,6 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
case SMMU_CMD_TLBI_EL2_ASID:
case SMMU_CMD_TLBI_EL2_VA:
case SMMU_CMD_TLBI_EL2_VAA:
- case SMMU_CMD_ATC_INV:
case SMMU_CMD_PRI_RESP:
case SMMU_CMD_RESUME:
case SMMU_CMD_STALL_TERM:
@@ -1554,12 +1655,22 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
*/
queue_cons_incr(q);
}
+ qemu_mutex_lock(&s->mutex);
+ if (!cmd_error && batch.ncmds && bs->viommu) {
+ if (smmuv3_issue_cmd_batch(bs, &batch)) {
+ q->cons = batch.cons[batch.ncmds];
+ cmd_error = SMMU_CERROR_ILL;
+ }
+ }
+ qemu_mutex_unlock(&s->mutex);
if (cmd_error) {
trace_smmuv3_cmdq_consume_error(smmu_cmd_string(type), cmd_error);
smmu_write_cmdq_err(s, cmd_error);
smmuv3_trigger_irq(s, SMMU_IRQ_GERROR, R_GERROR_CMDQ_ERR_MASK);
}
+ g_free(batch.cmds);
+ g_free(batch.cons);
trace_smmuv3_cmdq_consume_out(Q_PROD(q), Q_CONS(q),
Q_PROD_WRAP(q), Q_CONS_WRAP(q));
--
2.41.0.windows.1


@ -0,0 +1,43 @@
From 9f3b8c283d4c1014ff292faddb78bbbfd7ec22d3 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Tue, 9 Apr 2024 01:49:26 +0000
Subject: [PATCH] hw/arm/smmuv3: Ignore IOMMU_NOTIFIER_MAP for nested-smmuv3
If a device's MemoryRegion type is iommu, vfio core registers a listener,
passing the IOMMU_NOTIFIER_IOTLB_EVENTS flag (bundle of IOMMU_NOTIFIER_MAP
and IOMMU_NOTIFIER_UNMAP).
On the other hand, nested SMMUv3 does not use a map notifier. It only
inserts an IOTLB entry for the MSI doorbell page mapping, which can simply
be done by the mr->translate call.
Ignore the IOMMU_NOTIFIER_MAP flag and drop the error return.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmuv3.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 64ca4c5542..db111220c7 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1881,12 +1881,9 @@ static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
return -EINVAL;
}
- if (new & IOMMU_NOTIFIER_MAP) {
- error_setg(errp,
- "device %02x.%02x.%x requires iommu MAP notifier which is "
- "not currently supported", pci_bus_num(sdev->bus),
- PCI_SLOT(sdev->devfn), PCI_FUNC(sdev->devfn));
- return -EINVAL;
+ /* nested-smmuv3 does not need IOMMU_NOTIFIER_MAP. Ignore it. */
+ if (s->nested) {
+ new &= ~IOMMU_NOTIFIER_MAP;
}
if (old == IOMMU_NOTIFIER_NONE) {
--
2.41.0.windows.1


@ -0,0 +1,135 @@
From 03964c037862a594b4eb7d2e3754acd32c01c80b Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Thu, 22 Sep 2022 14:06:07 -0700
Subject: [PATCH] hw/arm/smmuv3: Read host SMMU device info
Read the underlying SMMU device info and set corresponding IDR bits.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
hw/arm/smmuv3.c | 77 ++++++++++++++++++++++++++++++++++++
hw/arm/trace-events | 1 +
include/hw/arm/smmu-common.h | 1 +
3 files changed, 79 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index db111220c7..4208325ab3 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -254,6 +254,80 @@ void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
info->recorded = true;
}
+static void smmuv3_nested_init_regs(SMMUv3State *s)
+{
+ SMMUState *bs = ARM_SMMU(s);
+ SMMUDevice *sdev;
+ uint32_t data_type;
+ uint32_t val;
+ int ret;
+
+ if (!bs->nested || !bs->viommu) {
+ return;
+ }
+
+ sdev = QLIST_FIRST(&bs->viommu->device_list);
+ if (!sdev) {
+ return;
+ }
+
+ if (sdev->info.idr[0]) {
+ error_report("reusing the previous hw_info");
+ goto out;
+ }
+
+ ret = smmu_dev_get_info(sdev, &data_type, sizeof(sdev->info), &sdev->info);
+ if (ret) {
+ error_report("failed to get SMMU device info");
+ return;
+ }
+
+ if (data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
+ error_report( "Wrong data type (%d)!", data_type);
+ return;
+ }
+
+out:
+ trace_smmuv3_get_device_info(sdev->info.idr[0], sdev->info.idr[1],
+ sdev->info.idr[3], sdev->info.idr[5]);
+
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, BTM);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, BTM, val);
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, ATS);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ATS, val);
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, ASID16);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, val);
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, TERM_MODEL);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TERM_MODEL, val);
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, STALL_MODEL);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, val);
+ val = FIELD_EX32(sdev->info.idr[0], IDR0, STLEVEL);
+ s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, val);
+
+ val = FIELD_EX32(sdev->info.idr[1], IDR1, SIDSIZE);
+ s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SIDSIZE, val);
+ val = FIELD_EX32(sdev->info.idr[1], IDR1, SSIDSIZE);
+ s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SSIDSIZE, val);
+
+ val = FIELD_EX32(sdev->info.idr[3], IDR3, HAD);
+ s->idr[3] = FIELD_DP32(s->idr[3], IDR3, HAD, val);
+ val = FIELD_EX32(sdev->info.idr[3], IDR3, RIL);
+ s->idr[3] = FIELD_DP32(s->idr[3], IDR3, RIL, val);
+ val = FIELD_EX32(sdev->info.idr[3], IDR3, BBML);
+ s->idr[3] = FIELD_DP32(s->idr[3], IDR3, BBML, val);
+
+ val = FIELD_EX32(sdev->info.idr[5], IDR5, GRAN4K);
+ s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
+ val = FIELD_EX32(sdev->info.idr[5], IDR5, GRAN16K);
+ s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
+ val = FIELD_EX32(sdev->info.idr[5], IDR5, GRAN64K);
+ s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
+ val = FIELD_EX32(sdev->info.idr[5], IDR5, OAS);
+ s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, val);
+
+ /* FIXME check iidr and aidr registers too */
+}
+
static void smmuv3_init_regs(SMMUv3State *s)
{
/* Based on sys property, the stages supported in smmu will be advertised.*/
@@ -292,6 +366,9 @@ static void smmuv3_init_regs(SMMUv3State *s)
s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, 1);
s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, 1);
+ /* Override IDR fields with HW caps */
+ smmuv3_nested_init_regs(s);
+
s->cmdq.base = deposit64(s->cmdq.base, 0, 5, SMMU_CMDQS);
s->cmdq.prod = 0;
s->cmdq.cons = 0;
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 58e0636e95..1e3d86382d 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -55,5 +55,6 @@ smmuv3_cmdq_tlbi_s12_vmid(uint16_t vmid) "vmid=%d"
smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x"
smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
+smmuv3_get_device_info(uint32_t idr0, uint32_t idr1, uint32_t idr3, uint32_t idr5) "idr0=0x%x idr1=0x%x idr3=0x%x idr5=0x%x"
smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint16_t vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 37dfeed026..d120c352cf 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -146,6 +146,7 @@ typedef struct SMMUDevice {
AddressSpace as_sysmem;
uint32_t cfg_cache_hits;
uint32_t cfg_cache_misses;
+ struct iommu_hw_info_arm_smmuv3 info;
QLIST_ENTRY(SMMUDevice) next;
} SMMUDevice;
--
2.41.0.windows.1
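Each `FIELD_EX32` / `FIELD_DP32` pair in `smmuv3_nested_init_regs()` above copies one bit-field from the host IDR value into the emulated register. The pattern can be sketched with plain shifts and masks. This is a minimal sketch: `field_extract()`, `field_deposit()` and `copy_field()` are hypothetical helpers, and the shift/length pair is passed explicitly instead of coming from QEMU's `FIELD()` declarations.

```c
#include <stdint.h>

/* Bit mask of `len` low bits (len in 1..32). */
static uint32_t low_mask(unsigned len)
{
    return len < 32 ? (1u << len) - 1 : 0xffffffffu;
}

/* Read the field at [shift + len - 1 : shift]. */
static uint32_t field_extract(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & low_mask(len);
}

/* Return `reg` with the field replaced by `val`; other bits are untouched. */
static uint32_t field_deposit(uint32_t reg, unsigned shift, unsigned len,
                              uint32_t val)
{
    uint32_t mask = low_mask(len) << shift;

    return (reg & ~mask) | ((val << shift) & mask);
}

/* Override one field of the guest-visible register with the host's value. */
static uint32_t copy_field(uint32_t guest, uint32_t host,
                           unsigned shift, unsigned len)
{
    return field_deposit(guest, shift, len, field_extract(host, shift, len));
}
```

Only the selected field changes, so defaults set earlier in `smmuv3_init_regs()` survive for every field that is not explicitly overridden.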


@ -0,0 +1,47 @@
From a6c7b16107b506f85e6643604c923291e41f70d1 Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 19 Jun 2024 04:42:33 +0000
Subject: [PATCH] hw/arm/virt: Add an SMMU_IO_LEN macro
A following patch will add a new MMIO region for nested SMMU instances.
This macro will be repeatedly used to set offsets and MMIO sizes in both
virt and virt-acpi-build.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/virt.c | 2 +-
include/hw/arm/virt.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 8823f2ed1c..08c40c314b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -155,7 +155,7 @@ static const MemMapEntry base_memmap[] = {
[VIRT_FW_CFG] = { 0x09020000, 0x00000018 },
[VIRT_GPIO] = { 0x09030000, 0x00001000 },
[VIRT_SECURE_UART] = { 0x09040000, 0x00001000 },
- [VIRT_SMMU] = { 0x09050000, 0x00020000 },
+ [VIRT_SMMU] = { 0x09050000, SMMU_IO_LEN },
[VIRT_PCDIMM_ACPI] = { 0x09070000, MEMORY_HOTPLUG_IO_LEN },
[VIRT_ACPI_GED] = { 0x09080000, ACPI_GED_EVT_SEL_LEN },
[VIRT_NVDIMM_ACPI] = { 0x09090000, NVDIMM_ACPI_IO_LEN},
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 345b2d5594..e6a449becd 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -106,6 +106,9 @@ typedef enum {
ARM_L3_CACHE
} ArmCacheType;
+/* MMIO region size for SMMUv3 */
+#define SMMU_IO_LEN 0x20000
+
enum {
VIRT_FLASH,
VIRT_MEM,
--
2.41.0.windows.1


@ -0,0 +1,187 @@
From 1746ba1aee671b9552540e36a629988b00846a82 Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Tue, 5 Oct 2021 10:53:13 +0200
Subject: [PATCH] hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI
nested binding
To handle SMMUv3 nested stage support, it is practical to
expose reserved memory regions (RMRs) to the guest,
covering the IOVAs used by the host kernel to map
physical MSI doorbells.
Those IOVAs belong to [0x8000000, 0x8100000], matching the
MSI_IOVA_BASE and MSI_IOVA_LENGTH definitions in the kernel
arm-smmu-v3 driver. This is the window used to allocate
IOVAs matching physical MSI doorbells.
With those RMRs, the guest is forced to use a flat mapping
for this range. Hence the assigned device is programmed
with one IOVA from this range. Stage 1, owned by the guest,
has a flat mapping for this IOVA. Stage 2, owned by the VMM,
then enforces a mapping from this IOVA to the physical
MSI doorbell.
The creation of those RMR nodes is only relevant if nested
stage SMMU is in use, along with VFIO. As VFIO devices can be
hotplugged, all RMRs need to be created in advance. Hence
the patch introduces a new arm virt "nested-smmuv3" iommu type.
The ARM DEN 0049E.b IORT specification also mandates that when
RMRs are present, the OS must preserve the PCIe configuration
performed by the boot FW. So along with the RMR IORT nodes,
a _DSM function #5, as defined by the PCI FIRMWARE SPECIFICATION
REVISION 3.3, chapter 4.6.5, is added to the PCIe host bridge
and PCIe expander bridge objects.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/virt-acpi-build.c | 71 +++++++++++++++++++++++++++++++++++-----
1 file changed, 63 insertions(+), 8 deletions(-)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1d7839e4a0..ad0f79e03d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -417,6 +417,14 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
.bus = vms->bus,
};
+ /*
+ * Nested SMMU requires RMRs for MSI 1-1 mapping, which
+ * require _DSM for Preserving PCI Boot Configurations
+ */
+ if (vms->iommu == VIRT_IOMMU_SMMUV3_NESTED) {
+ cfg.preserve_config = true;
+ }
+
if (vms->highmem_mmio) {
cfg.mmio64 = memmap[VIRT_HIGH_PCIE_MMIO];
}
@@ -495,7 +503,7 @@ static void acpi_dsdt_add_tpm(Aml *scope, VirtMachineState *vms)
#define IORT_NODE_OFFSET 48
static void build_iort_id_mapping(GArray *table_data, uint32_t input_base,
- uint32_t id_count, uint32_t out_ref)
+ uint32_t id_count, uint32_t out_ref, uint32_t flags)
{
/* Table 4 ID mapping format */
build_append_int_noprefix(table_data, input_base, 4); /* Input base */
@@ -503,7 +511,7 @@ static void build_iort_id_mapping(GArray *table_data, uint32_t input_base,
build_append_int_noprefix(table_data, input_base, 4); /* Output base */
build_append_int_noprefix(table_data, out_ref, 4); /* Output Reference */
/* Flags */
- build_append_int_noprefix(table_data, 0 /* Single mapping (disabled) */, 4);
+ build_append_int_noprefix(table_data, flags, 4); /* Flags */
}
struct AcpiIortIdMapping {
@@ -545,6 +553,50 @@ static int iort_idmap_compare(gconstpointer a, gconstpointer b)
return idmap_a->input_base - idmap_b->input_base;
}
+static void
+build_iort_rmr_nodes(GArray *table_data, GArray *smmu_idmaps,
+ size_t *smmu_offset, uint32_t *id)
+{
+ AcpiIortIdMapping *range;
+ int i;
+
+ for (i = 0; i < smmu_idmaps->len; i++) {
+ range = &g_array_index(smmu_idmaps, AcpiIortIdMapping, i);
+ int bdf = range->input_base;
+
+ /* Table 18 Reserved Memory Range Node */
+
+ build_append_int_noprefix(table_data, 6 /* RMR */, 1); /* Type */
+ /* Length */
+ build_append_int_noprefix(table_data, 28 + ID_MAPPING_ENTRY_SIZE + 20, 2);
+ build_append_int_noprefix(table_data, 3, 1); /* Revision */
+ build_append_int_noprefix(table_data, *id, 4); /* Identifier */
+ /* Number of ID mappings */
+ build_append_int_noprefix(table_data, 1, 4);
+ /* Reference to ID Array */
+ build_append_int_noprefix(table_data, 28, 4);
+
+ /* RMR specific data */
+
+ /* Flags */
+ build_append_int_noprefix(table_data, 0 /* Disallow remapping */, 4);
+ /* Number of Memory Range Descriptors */
+ build_append_int_noprefix(table_data, 1 , 4);
+ /* Reference to Memory Range Descriptors */
+ build_append_int_noprefix(table_data, 28 + ID_MAPPING_ENTRY_SIZE, 4);
+ build_iort_id_mapping(table_data, bdf, range->id_count, smmu_offset[i], 1);
+
+ /* Table 19 Memory Range Descriptor */
+
+ /* Physical Range offset */
+ build_append_int_noprefix(table_data, 0x8000000, 8);
+ /* Physical Range length */
+ build_append_int_noprefix(table_data, 0x100000, 8);
+ build_append_int_noprefix(table_data, 0, 4); /* Reserved */
+ *id += 1;
+ }
+}
+
/*
* Input Output Remapping Table (IORT)
* Conforms to "IO Remapping Table System Software on ARM Platforms",
@@ -554,7 +606,6 @@ static void
build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
{
int i, nb_nodes, rc_mapping_count;
- const uint32_t iort_node_offset = IORT_NODE_OFFSET;
size_t node_size, *smmu_offset;
AcpiIortIdMapping *idmap;
hwaddr base;
@@ -563,7 +614,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
GArray *smmu_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
GArray *its_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
- AcpiTable table = { .sig = "IORT", .rev = 3, .oem_id = vms->oem_id,
+ AcpiTable table = { .sig = "IORT", .rev = 5, .oem_id = vms->oem_id,
.oem_table_id = vms->oem_table_id };
/* Table 2 The IORT */
acpi_table_begin(&table, table_data);
@@ -668,7 +719,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
build_append_int_noprefix(table_data, 0, 4);
/* output IORT node is the ITS group node (the first node) */
- build_iort_id_mapping(table_data, 0, 0x10000, IORT_NODE_OFFSET);
+ build_iort_id_mapping(table_data, 0, 0x10000, IORT_NODE_OFFSET, 0);
}
/* Table 17 Root Complex Node */
@@ -709,7 +760,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
range = &g_array_index(smmu_idmaps, AcpiIortIdMapping, i);
/* output IORT node is the smmuv3 node */
build_iort_id_mapping(table_data, range->input_base,
- range->id_count, smmu_offset[i]);
+ range->id_count, smmu_offset[i], 0);
}
/* bypassed RIDs connect to ITS group node directly: RC -> ITS */
@@ -717,11 +768,15 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
range = &g_array_index(its_idmaps, AcpiIortIdMapping, i);
/* output IORT node is the ITS group node (the first node) */
build_iort_id_mapping(table_data, range->input_base,
- range->id_count, iort_node_offset);
+ range->id_count, IORT_NODE_OFFSET, 0);
}
} else {
/* output IORT node is the ITS group node (the first node) */
- build_iort_id_mapping(table_data, 0, 0xFFFF, IORT_NODE_OFFSET);
+ build_iort_id_mapping(table_data, 0, 0x10000, IORT_NODE_OFFSET, 0);
+ }
+
+ if (vms->iommu == VIRT_IOMMU_SMMUV3_NESTED) {
+ build_iort_rmr_nodes(table_data, smmu_idmaps, smmu_offset, &id);
}
acpi_table_end(linker, &table);
--
2.41.0.windows.1

@ -0,0 +1,155 @@
From a7ffb5856940a1515ef84a4d4644b7c7c07afb8f Mon Sep 17 00:00:00 2001
From: Nicolin Chen <nicolinc@nvidia.com>
Date: Wed, 6 Nov 2024 19:22:13 +0000
Subject: [PATCH] hw/arm/virt-acpi-build: Build IORT with multiple SMMU nodes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Now that we can have multiple user-creatable smmuv3-nested
devices, each associated with different pci buses, update
IORT ID mappings accordingly.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/virt-acpi-build.c | 43 ++++++++++++++++++++++++++++------------
include/hw/arm/virt.h | 6 ++++++
2 files changed, 36 insertions(+), 13 deletions(-)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 076781423b..1d7839e4a0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -555,8 +555,10 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
{
int i, nb_nodes, rc_mapping_count;
const uint32_t iort_node_offset = IORT_NODE_OFFSET;
- size_t node_size, smmu_offset = 0;
+ size_t node_size, *smmu_offset;
AcpiIortIdMapping *idmap;
+ hwaddr base;
+ int irq, num_smmus = 0;
uint32_t id = 0;
GArray *smmu_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
GArray *its_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
@@ -566,7 +568,21 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
/* Table 2 The IORT */
acpi_table_begin(&table, table_data);
- if (vms->iommu == VIRT_IOMMU_SMMUV3) {
+ if (vms->smmu_nested_count) {
+ irq = vms->irqmap[VIRT_SMMU_NESTED] + ARM_SPI_BASE;
+ base = vms->memmap[VIRT_SMMU_NESTED].base;
+ num_smmus = vms->smmu_nested_count;
+ } else if (virt_has_smmuv3(vms)) {
+ irq = vms->irqmap[VIRT_SMMU] + ARM_SPI_BASE;
+ base = vms->memmap[VIRT_SMMU].base;
+ num_smmus = 1;
+ }
+
+ smmu_offset = g_new0(size_t, num_smmus);
+ nb_nodes = 2; /* RC, ITS */
+ nb_nodes += num_smmus; /* SMMU nodes */
+
+ if (virt_has_smmuv3(vms)) {
AcpiIortIdMapping next_range = {0};
object_child_foreach_recursive(object_get_root(),
@@ -588,18 +604,19 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
}
next_range.input_base = idmap->input_base + idmap->id_count;
+ if (vms->iommu == VIRT_IOMMU_SMMUV3_NESTED) {
+ nb_nodes++; /* RMR node per SMMU */
+ }
}
/* Append the last RC -> ITS ID mapping */
- if (next_range.input_base < 0xFFFF) {
- next_range.id_count = 0xFFFF - next_range.input_base;
+ if (next_range.input_base < 0x10000) {
+ next_range.id_count = 0x10000 - next_range.input_base;
g_array_append_val(its_idmaps, next_range);
}
- nb_nodes = 3; /* RC, ITS, SMMUv3 */
rc_mapping_count = smmu_idmaps->len + its_idmaps->len;
} else {
- nb_nodes = 2; /* RC, ITS */
rc_mapping_count = 1;
}
/* Number of IORT Nodes */
@@ -621,10 +638,9 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
/* GIC ITS Identifier Array */
build_append_int_noprefix(table_data, 0 /* MADT translation_id */, 4);
- if (vms->iommu == VIRT_IOMMU_SMMUV3) {
- int irq = vms->irqmap[VIRT_SMMU] + ARM_SPI_BASE;
+ for (i = 0; i < num_smmus; i++) {
+ smmu_offset[i] = table_data->len - table.table_offset;
- smmu_offset = table_data->len - table.table_offset;
/* Table 9 SMMUv3 Format */
build_append_int_noprefix(table_data, 4 /* SMMUv3 */, 1); /* Type */
node_size = SMMU_V3_ENTRY_SIZE + ID_MAPPING_ENTRY_SIZE;
@@ -635,7 +651,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
/* Reference to ID Array */
build_append_int_noprefix(table_data, SMMU_V3_ENTRY_SIZE, 4);
/* Base address */
- build_append_int_noprefix(table_data, vms->memmap[VIRT_SMMU].base, 8);
+ build_append_int_noprefix(table_data, base + (i * SMMU_IO_LEN), 8);
/* Flags */
build_append_int_noprefix(table_data, 1 /* COHACC Override */, 4);
build_append_int_noprefix(table_data, 0, 4); /* Reserved */
@@ -646,12 +662,13 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
build_append_int_noprefix(table_data, irq + 1, 4); /* PRI */
build_append_int_noprefix(table_data, irq + 3, 4); /* GERR */
build_append_int_noprefix(table_data, irq + 2, 4); /* Sync */
+ irq += NUM_SMMU_IRQS;
build_append_int_noprefix(table_data, 0, 4); /* Proximity domain */
/* DeviceID mapping index (ignored since interrupts are GSIV based) */
build_append_int_noprefix(table_data, 0, 4);
/* output IORT node is the ITS group node (the first node) */
- build_iort_id_mapping(table_data, 0, 0xFFFF, IORT_NODE_OFFSET);
+ build_iort_id_mapping(table_data, 0, 0x10000, IORT_NODE_OFFSET);
}
/* Table 17 Root Complex Node */
@@ -684,7 +701,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
build_append_int_noprefix(table_data, 0, 3); /* Reserved */
/* Output Reference */
- if (vms->iommu == VIRT_IOMMU_SMMUV3) {
+ if (virt_has_smmuv3(vms)) {
AcpiIortIdMapping *range;
/* translated RIDs connect to SMMUv3 node: RC -> SMMUv3 -> ITS */
@@ -692,7 +709,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
range = &g_array_index(smmu_idmaps, AcpiIortIdMapping, i);
/* output IORT node is the smmuv3 node */
build_iort_id_mapping(table_data, range->input_base,
- range->id_count, smmu_offset);
+ range->id_count, smmu_offset[i]);
}
/* bypassed RIDs connect to ITS group node directly: RC -> ITS */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index cd41e28202..bc3c8b70da 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -295,4 +295,10 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
vms->highmem_redists) ? 2 : 1;
}
+static inline bool virt_has_smmuv3(const VirtMachineState *vms)
+{
+ return vms->iommu == VIRT_IOMMU_SMMUV3 ||
+ vms->iommu == VIRT_IOMMU_SMMUV3_NESTED;
+}
+
#endif /* QEMU_ARM_VIRT_H */
--
2.41.0.windows.1
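
The per-instance base address and interrupt assignment done by the loop in the patch above can be sketched in isolation. This is a minimal illustration and not QEMU code: the struct, the function name, and the two `EX_` constants are hypothetical stand-ins, and their values are assumptions chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the real constants live in QEMU. */
#define EX_SMMU_IO_LEN    0x20000u
#define EX_NUM_SMMU_IRQS  4u

/* Hypothetical record of what one SMMUv3 IORT node would advertise. */
struct ex_smmu_node {
    uint64_t base;      /* MMIO base of this SMMU instance */
    uint32_t first_irq; /* first interrupt line of this instance */
};

/*
 * Lay out 'count' SMMU instances back to back: instance i gets
 * base + i * EX_SMMU_IO_LEN and its own block of EX_NUM_SMMU_IRQS lines.
 */
static void ex_layout_smmu_nodes(struct ex_smmu_node *out, size_t count,
                                 uint64_t base, uint32_t first_irq)
{
    uint32_t irq = first_irq;

    assert(out != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        out[i].base = base + (uint64_t)i * EX_SMMU_IO_LEN;
        out[i].first_irq = irq;
        irq += EX_NUM_SMMU_IRQS; /* the next instance uses the next block */
    }
}
```

Keeping one offset per node, as the patch does with the `smmu_offset` array, is what lets each root-complex ID mapping point at its own SMMU node.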

@ -0,0 +1,85 @@
From ecca2052693cc2a91459ac418bface2f1e635c88 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 14 Nov 2024 13:53:18 +0100
Subject: [PATCH] hw/audio/hda: fix memory leak on audio setup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When SET_STREAM_FORMAT is called, the st->buft timer is overwritten, thus
causing a memory leak. This was originally fixed in commit 816139ae6a5
("hw/audio/hda: fix memory leak on audio setup", 2024-11-14) but that
caused the audio to break in SPICE.
Fortunately, a simpler fix is possible. The timer only needs to be
reset, because the callback is always the same (st->output is set at
realize time in hda_audio_init); a call to timer_new_ns is overkill. Replace
it with timer_del and only initialize the timer once; for simplicity,
do it even if use_timer is false.
An even simpler fix would be to free the old timer in hda_audio_setup().
However, it seems better to place the initialization of the timer close
to that of st->output.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20241114125318.1707590-3-pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 626b39006d2f9b1378a04cb88a2187bb852cb055)
Signed-off-by: zhujun2 <zhujun2_yewu@cmss.chinamobile.com>
---
hw/audio/hda-codec.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/hw/audio/hda-codec.c b/hw/audio/hda-codec.c
index 19f401cabe..ac908e56c6 100644
--- a/hw/audio/hda-codec.c
+++ b/hw/audio/hda-codec.c
@@ -487,8 +487,7 @@ static void hda_audio_setup(HDAAudioStream *st)
if (st->output) {
if (use_timer) {
cb = hda_audio_output_cb;
- st->buft = timer_new_ns(QEMU_CLOCK_VIRTUAL,
- hda_audio_output_timer, st);
+ timer_del(st->buft);
} else {
cb = hda_audio_compat_output_cb;
}
@@ -497,8 +496,7 @@ static void hda_audio_setup(HDAAudioStream *st)
} else {
if (use_timer) {
cb = hda_audio_input_cb;
- st->buft = timer_new_ns(QEMU_CLOCK_VIRTUAL,
- hda_audio_input_timer, st);
+ timer_del(st->buft);
} else {
cb = hda_audio_compat_input_cb;
}
@@ -726,8 +724,12 @@ static void hda_audio_init(HDACodecDevice *hda,
st->gain_right = QEMU_HDA_AMP_STEPS;
st->compat_bpos = sizeof(st->compat_buf);
st->output = true;
+ st->buft = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+ hda_audio_output_timer, st);
} else {
st->output = false;
+ st->buft = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+ hda_audio_input_timer, st);
}
st->format = AC_FMT_TYPE_PCM | AC_FMT_BITS_16 |
(1 << AC_FMT_CHAN_SHIFT);
@@ -750,9 +752,7 @@ static void hda_audio_exit(HDACodecDevice *hda)
if (st->node == NULL) {
continue;
}
- if (a->use_timer) {
- timer_free(st->buft);
- }
+ timer_free(st->buft);
if (st->output) {
AUD_close_out(&a->card, st->voice.out);
} else {
--
2.41.0.windows.1

@ -0,0 +1,42 @@
From 482808a35957c10d9eb4264492a8e11a2ba749c1 Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Fri, 22 Nov 2024 17:49:38 +0800
Subject: [PATCH] hw/audio/virtio-snd: Always use little endian audio format
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from a276ec8e2632c9015d0f9b4e47194e4e91dfa8bb
The VIRTIO Sound Device conforms with the Virtio spec v1.2,
and thus only uses little endianness.
Remove the suspicious target_words_bigendian() noticed during
code review.
Cc: qemu-stable@nongnu.org
Fixes: eb9ad377bb ("virtio-sound: handle control messages and streams")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240422211830.25606-1-philmd@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
hw/audio/virtio-snd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/audio/virtio-snd.c b/hw/audio/virtio-snd.c
index 817fdcd910..9f7a69e408 100644
--- a/hw/audio/virtio-snd.c
+++ b/hw/audio/virtio-snd.c
@@ -377,7 +377,7 @@ static void virtio_snd_get_qemu_audsettings(audsettings *as,
as->nchannels = MIN(AUDIO_MAX_CHANNELS, params->channels);
as->fmt = virtio_snd_get_qemu_format(params->format);
as->freq = virtio_snd_get_qemu_freq(params->rate);
- as->endianness = target_words_bigendian() ? 1 : 0;
+ as->endianness = 0; /* Conforming to VIRTIO 1.0: always little endian. */
}
/*
--
2.41.0.windows.1

@ -0,0 +1,33 @@
From 5405fa36c5f2784a9a6b19ee60d44b6cffb9f769 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Sat, 11 Jan 2025 10:52:57 +0800
Subject: [PATCH] hw/i386: Activate IOMMUFD for q35 machines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
hw/i386/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 682e324f1c..908f29e02b 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -105,6 +105,7 @@ config Q35
imply E1000E_PCI_EXPRESS
imply VMPORT
imply VMMOUSE
+ imply IOMMUFD
select PC_PCI
select PC_ACPI
select PCI_EXPRESS_Q35
--
2.41.0.windows.1

@ -0,0 +1,204 @@
From d6f75f9e532a4a4b6bb4610049f4fa7f26160733 Mon Sep 17 00:00:00 2001
From: Xianglai Li <lixianglai@loongson.cn>
Date: Thu, 20 Feb 2025 19:24:18 +0800
Subject: [PATCH] hw/intc: Add extioi ability of 256 vcpu interrupt routing
Add the feature field for the CPU-encoded interrupt
route to extioi and the corresponding mechanism for
backup recovery.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
---
hw/intc/loongarch_extioi_kvm.c | 65 ++++++++++++++++++++++++++++--
hw/loongarch/virt.c | 2 +
include/hw/intc/loongarch_extioi.h | 4 ++
linux-headers/asm-loongarch/kvm.h | 10 +++++
4 files changed, 77 insertions(+), 4 deletions(-)
diff --git a/hw/intc/loongarch_extioi_kvm.c b/hw/intc/loongarch_extioi_kvm.c
index f5bbc33255..2e7c764b7c 100644
--- a/hw/intc/loongarch_extioi_kvm.c
+++ b/hw/intc/loongarch_extioi_kvm.c
@@ -18,8 +18,32 @@
static void kvm_extioi_access_regs(int fd, uint64_t addr,
void *val, int is_write)
{
- kvm_device_access(fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS,
- addr, val, is_write, &error_abort);
+ kvm_device_access(fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS,
+ addr, val, is_write, &error_abort);
+}
+
+static void kvm_extioi_access_sw_status(int fd, uint64_t addr,
+ void *val, bool is_write)
+{
+ kvm_device_access(fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS,
+ addr, val, is_write, &error_abort);
+}
+
+static void kvm_extioi_save_load_sw_status(void *opaque, bool is_write)
+{
+ KVMLoongArchExtIOI *s = (KVMLoongArchExtIOI *)opaque;
+ KVMLoongArchExtIOIClass *class = KVM_LOONGARCH_EXTIOI_GET_CLASS(s);
+ int fd = class->dev_fd;
+ int addr;
+
+ addr = KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_NUM_CPU;
+ kvm_extioi_access_sw_status(fd, addr, (void *)&s->num_cpu, is_write);
+
+ addr = KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_FEATURE;
+ kvm_extioi_access_sw_status(fd, addr, (void *)&s->features, is_write);
+
+ addr = KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_STATE;
+ kvm_extioi_access_sw_status(fd, addr, (void *)&s->status, is_write);
}
static int kvm_loongarch_extioi_pre_save(void *opaque)
@@ -41,6 +65,8 @@ static int kvm_loongarch_extioi_pre_save(void *opaque)
kvm_extioi_access_regs(fd, EXTIOI_COREISR_START,
(void *)s->coreisr, false);
+ kvm_extioi_save_load_sw_status(opaque, false);
+
return 0;
}
@@ -61,12 +87,19 @@ static int kvm_loongarch_extioi_post_load(void *opaque, int version_id)
(void *)s->sw_coremap, true);
kvm_extioi_access_regs(fd, EXTIOI_COREISR_START, (void *)s->coreisr, true);
+ kvm_extioi_save_load_sw_status(opaque, true);
+
+ kvm_device_access(fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL,
+ KVM_DEV_LOONGARCH_EXTIOI_CTRL_LOAD_FINISHED,
+ NULL, true, &error_abort);
+
return 0;
}
static void kvm_loongarch_extioi_realize(DeviceState *dev, Error **errp)
{
KVMLoongArchExtIOIClass *extioi_class = KVM_LOONGARCH_EXTIOI_GET_CLASS(dev);
+ KVMLoongArchExtIOI *s = KVM_LOONGARCH_EXTIOI(dev);
struct kvm_create_device cd = {0};
Error *err = NULL;
int ret,i;
@@ -77,6 +110,10 @@ static void kvm_loongarch_extioi_realize(DeviceState *dev, Error **errp)
return;
}
+ if (s->features & BIT(EXTIOI_HAS_VIRT_EXTENSION)) {
+ s->features |= EXTIOI_VIRT_HAS_FEATURES;
+ }
+
if (!extioi_class->is_created) {
cd.type = KVM_DEV_TYPE_LA_EXTIOI;
ret = kvm_vm_ioctl(kvm_state, KVM_CREATE_DEVICE, &cd);
@@ -87,6 +124,15 @@ static void kvm_loongarch_extioi_realize(DeviceState *dev, Error **errp)
}
extioi_class->is_created = true;
extioi_class->dev_fd = cd.fd;
+
+ kvm_device_access(cd.fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL,
+ KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_NUM_CPU,
+ &s->num_cpu, true, NULL);
+
+ kvm_device_access(cd.fd, KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL,
+ KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_FEATURE,
+ &s->features, true, NULL);
+
fprintf(stdout, "Create LoongArch extioi irqchip in KVM done!\n");
}
@@ -102,8 +148,8 @@ static void kvm_loongarch_extioi_realize(DeviceState *dev, Error **errp)
static const VMStateDescription vmstate_kvm_extioi_core = {
.name = "kvm-extioi-single",
- .version_id = 1,
- .minimum_version_id = 1,
+ .version_id = 2,
+ .minimum_version_id = 2,
.pre_save = kvm_loongarch_extioi_pre_save,
.post_load = kvm_loongarch_extioi_post_load,
.fields = (VMStateField[]) {
@@ -119,10 +165,20 @@ static const VMStateDescription vmstate_kvm_extioi_core = {
EXTIOI_IRQS_IPMAP_SIZE / 4),
VMSTATE_UINT32_ARRAY(coremap, KVMLoongArchExtIOI, EXTIOI_IRQS / 4),
VMSTATE_UINT8_ARRAY(sw_coremap, KVMLoongArchExtIOI, EXTIOI_IRQS),
+ VMSTATE_UINT32(num_cpu, KVMLoongArchExtIOI),
+ VMSTATE_UINT32(features, KVMLoongArchExtIOI),
+ VMSTATE_UINT32(status, KVMLoongArchExtIOI),
VMSTATE_END_OF_LIST()
}
};
+static Property extioi_properties[] = {
+ DEFINE_PROP_UINT32("num-cpu", KVMLoongArchExtIOI, num_cpu, 1),
+ DEFINE_PROP_BIT("has-virtualization-extension", KVMLoongArchExtIOI,
+ features, EXTIOI_HAS_VIRT_EXTENSION, 0),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
static void kvm_loongarch_extioi_class_init(ObjectClass *oc, void *data)
{
DeviceClass *dc = DEVICE_CLASS(oc);
@@ -131,6 +187,7 @@ static void kvm_loongarch_extioi_class_init(ObjectClass *oc, void *data)
extioi_class->parent_realize = dc->realize;
dc->realize = kvm_loongarch_extioi_realize;
extioi_class->is_created = false;
+ device_class_set_props(dc, extioi_properties);
dc->vmsd = &vmstate_kvm_extioi_core;
}
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index ce026a4c3c..233297d78f 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -874,6 +874,8 @@ static void virt_irq_init(LoongArchVirtMachineState *lvms)
/* Create EXTIOI device */
if (kvm_enabled() && kvm_irqchip_in_kernel()) {
extioi = qdev_new(TYPE_KVM_LOONGARCH_EXTIOI);
+ qdev_prop_set_uint32(extioi, "num-cpu", ms->smp.max_cpus);
+ qdev_prop_set_bit(extioi, "has-virtualization-extension", true);
sysbus_realize_and_unref(SYS_BUS_DEVICE(extioi), &error_fatal);
} else {
extioi = qdev_new(TYPE_LOONGARCH_EXTIOI);
diff --git a/include/hw/intc/loongarch_extioi.h b/include/hw/intc/loongarch_extioi.h
index 9966cd98d3..92b38d5c38 100644
--- a/include/hw/intc/loongarch_extioi.h
+++ b/include/hw/intc/loongarch_extioi.h
@@ -94,6 +94,10 @@ struct LoongArchExtIOI {
struct KVMLoongArchExtIOI {
SysBusDevice parent_obj;
+ uint32_t num_cpu;
+ uint32_t features;
+ uint32_t status;
+
/* hardware state */
uint32_t nodetype[EXTIOI_IRQS_NODETYPE_COUNT / 2];
uint32_t bounce[EXTIOI_IRQS_GROUP_COUNT];
diff --git a/linux-headers/asm-loongarch/kvm.h b/linux-headers/asm-loongarch/kvm.h
index 13c1280662..34abd65939 100644
--- a/linux-headers/asm-loongarch/kvm.h
+++ b/linux-headers/asm-loongarch/kvm.h
@@ -141,6 +141,16 @@ struct kvm_iocsr_entry {
#define KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS 0x40000003
+#define KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS 0x40000006
+#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_NUM_CPU 0x0
+#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_FEATURE 0x1
+#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_STATE 0x2
+
+#define KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL 0x40000007
+#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_NUM_CPU 0x0
+#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_FEATURE 0x1
+#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_LOAD_FINISHED 0x3
+
#define KVM_DEV_LOONGARCH_PCH_PIC_GRP_CTRL 0x40000004
#define KVM_DEV_LOONGARCH_PCH_PIC_CTRL_INIT 0
--
2.41.0.windows.1

@ -0,0 +1,39 @@
From b44fc9f3fc91363c55f6ba739f6c09222f979d88 Mon Sep 17 00:00:00 2001
From: Sergey Makarov <s.makarov@syntacore.com>
Date: Wed, 18 Sep 2024 17:02:29 +0300
Subject: [PATCH] hw/intc: Don't clear pending bits on IRQ lowering
According to the PLIC specification (chapter 5), there is only
one case in which a pending bit is cleared: when the interrupt
is claimed. Fix the PLIC controller to match this behavior.
Signed-off-by: Sergey Makarov <s.makarov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240918140229.124329-3-s.makarov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit a84be2baa9eca8bc500f866ad943b8f63dc99adf)
Signed-off-by: zhujun2 <zhujun2_yewu@cmss.chinamobile.com>
---
hw/intc/sifive_plic.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c
index 5522ede2cf..e5de52bc44 100644
--- a/hw/intc/sifive_plic.c
+++ b/hw/intc/sifive_plic.c
@@ -349,8 +349,10 @@ static void sifive_plic_irq_request(void *opaque, int irq, int level)
{
SiFivePLICState *s = opaque;
- sifive_plic_set_pending(s, irq, level > 0);
- sifive_plic_update(s);
+ if (level > 0) {
+ sifive_plic_set_pending(s, irq, true);
+ sifive_plic_update(s);
+ }
}
static void sifive_plic_realize(DeviceState *dev, Error **errp)
--
2.41.0.windows.1
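
The behavior described in the patch above (a pending bit is latched when the line goes high and cleared only by a claim) can be sketched with a toy model. This is a simplified illustration under stated assumptions, not the sifive_plic implementation: `toy_plic` and both functions are hypothetical names, and a real claim returns the highest-priority pending source rather than taking a source number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_SOURCES 32u

/* Hypothetical, heavily simplified interrupt controller state. */
struct toy_plic {
    uint32_t pending; /* one bit per interrupt source */
};

/*
 * Input line changed: a high level latches the pending bit,
 * a low level leaves the pending bit untouched.
 */
static void toy_plic_irq_request(struct toy_plic *s, unsigned irq, int level)
{
    assert(irq < TOY_NUM_SOURCES);
    if (level > 0) {
        s->pending |= UINT32_C(1) << irq;
    }
}

/* Claim: the only place in this model where a pending bit is cleared. */
static bool toy_plic_claim(struct toy_plic *s, unsigned irq)
{
    uint32_t mask;
    bool was_pending;

    assert(irq < TOY_NUM_SOURCES);
    mask = UINT32_C(1) << irq;
    was_pending = (s->pending & mask) != 0;
    s->pending &= ~mask;
    return was_pending;
}
```

Raising and then lowering a line before the claim leaves the source pending in this model, which is the case the old code lost.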

@ -0,0 +1,95 @@
From 16670675cbf7fc4db147a698ba7787d2e2fa675b Mon Sep 17 00:00:00 2001
From: Xianglai Li <lixianglai@loongson.cn>
Date: Wed, 26 Mar 2025 17:02:37 +0800
Subject: [PATCH] hw/loongarch/boot: Adjust the loading position of the initrd
When only the -kernel parameter is used to load the ELF kernel,
the initrd is loaded into RAM. If the initrd is too large,
the load fails, resulting in a VM startup failure.
This patch first tries to load the initrd near the kernel.
When the memory near the kernel is insufficient,
it tries to load the initrd at the start of high memory.
If that is still not enough, QEMU reports an error
and asks the user to increase the memory available to the
virtual machine so that it can boot.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
---
hw/loongarch/boot.c | 53 +++++++++++++++++++++++++++++++++++++--------
1 file changed, 44 insertions(+), 9 deletions(-)
diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 53dcefbb55..39c4a6d8c6 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -171,6 +171,48 @@ static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr)
return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
}
+static void find_initrd_loadoffset(struct loongarch_boot_info *info,
+ uint64_t kernel_high, ssize_t kernel_size)
+{
+ hwaddr base, size, gap, low_end;
+ ram_addr_t initrd_end, initrd_start;
+
+ base = VIRT_LOWMEM_BASE;
+ gap = VIRT_LOWMEM_SIZE;
+ initrd_start = ROUND_UP(kernel_high + 4 * kernel_size, 64 * KiB);
+ initrd_end = initrd_start + initrd_size;
+
+ size = info->ram_size;
+ low_end = base + MIN(size, gap);
+ if (initrd_end <= low_end) {
+ initrd_offset = initrd_start;
+ return;
+ }
+
+ if (size <= gap) {
+ error_report("The low memory too small for initial ram disk '%s',"
+ "You need to expand the memory space",
+ info->initrd_filename);
+ exit(1);
+ }
+
+ /*
+ * Try to load initrd in the high memory
+ */
+ size -= gap;
+ base = VIRT_HIGHMEM_BASE;
+ initrd_start = ROUND_UP(base, 64 * KiB);
+ if (initrd_size <= size) {
+ initrd_offset = initrd_start;
+ return;
+ }
+
+ error_report("The high memory too small for initial ram disk '%s',"
+ "You need to expand the memory space",
+ info->initrd_filename);
+ exit(1);
+}
+
static int64_t load_kernel_info(struct loongarch_boot_info *info)
{
uint64_t kernel_entry, kernel_low, kernel_high;
@@ -192,16 +234,9 @@ static int64_t load_kernel_info(struct loongarch_boot_info *info)
if (info->initrd_filename) {
initrd_size = get_image_size(info->initrd_filename);
if (initrd_size > 0) {
- initrd_offset = ROUND_UP(kernel_high + 4 * kernel_size, 64 * KiB);
-
- if (initrd_offset + initrd_size > info->ram_size) {
- error_report("memory too small for initial ram disk '%s'",
- info->initrd_filename);
- exit(1);
- }
-
+ find_initrd_loadoffset(info, kernel_high, kernel_size);
initrd_size = load_image_targphys(info->initrd_filename, initrd_offset,
- info->ram_size - initrd_offset);
+ initrd_size);
}
if (initrd_size == (target_ulong)-1) {
--
2.41.0.windows.1
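
The placement order described in the patch above (near the kernel first, then the start of high memory, otherwise fail) can be sketched as a pure function. This is a minimal sketch, not the QEMU code: the function name is hypothetical, it returns false instead of exiting, and the three `EX_` memory-map constants are assumed values used only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_KIB           1024ull
#define EX_LOWMEM_BASE   0x0ull         /* assumed memory map */
#define EX_LOWMEM_SIZE   0x10000000ull  /* assumed: 256 MiB of low memory */
#define EX_HIGHMEM_BASE  0x80000000ull  /* assumed start of high memory */

static uint64_t ex_round_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) / align * align;
}

/*
 * Pick a load offset for the initrd. Returns true and stores the offset
 * on success, false when neither low nor high memory can hold it.
 */
static bool ex_find_initrd_offset(uint64_t ram_size, uint64_t kernel_high,
                                  uint64_t kernel_size, uint64_t initrd_size,
                                  uint64_t *offset)
{
    uint64_t start = ex_round_up(kernel_high + 4 * kernel_size, 64 * EX_KIB);
    uint64_t low_size = ram_size < EX_LOWMEM_SIZE ? ram_size : EX_LOWMEM_SIZE;
    uint64_t low_end = EX_LOWMEM_BASE + low_size;

    assert(offset != NULL);

    /* First choice: right after the kernel, inside low memory. */
    if (start + initrd_size <= low_end) {
        *offset = start;
        return true;
    }

    /* No high memory exists when all RAM fits in the low region. */
    if (ram_size <= EX_LOWMEM_SIZE) {
        return false;
    }

    /* Second choice: the start of high memory. */
    if (initrd_size <= ram_size - EX_LOWMEM_SIZE) {
        *offset = ex_round_up(EX_HIGHMEM_BASE, 64 * EX_KIB);
        return true;
    }

    return false;
}
```

With 1 GiB of RAM and the assumed map, a 16 MiB initrd stays next to the kernel, while a 256 MiB initrd is moved to the start of high memory.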

@ -0,0 +1,50 @@
From 7e1bd6e7e109c6228bc4c40ea6f2af2d7f281fca Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Tue, 8 Apr 2025 05:59:29 -0400
Subject: [PATCH] hw/misc/aspeed_hace: Fix buffer overflow in has_padding
function
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 78877b2e06464f49f777e086845e094ea7bc82ef
The maximum padding size is either 64 or 128 bytes and should always be smaller
than "req_len". If "padding_size" exceeds "req_len", then
"req_len - padding_size" underflows due to "uint32_t" data type, leading to a
large incorrect value (e.g., `0xFFXXXXXX`). This causes an out-of-bounds memory
access, potentially leading to a buffer overflow.
Added a check to ensure "padding_size" does not exceed "req_len" before
computing "pad_offset". This prevents "req_len - padding_size" from underflowing
and avoids accessing invalid memory.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Fixes: 5cd7d8564a8b563da724b9e6264c967f0a091afa ("aspeed/hace: Support AST2600 HACE ")
Link: https://lore.kernel.org/qemu-devel/20250321092623.2097234-3-jamin_lin@aspeedtech.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/misc/aspeed_hace.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
index b07506ec04..8706e3d376 100644
--- a/hw/misc/aspeed_hace.c
+++ b/hw/misc/aspeed_hace.c
@@ -123,6 +123,11 @@ static bool has_padding(AspeedHACEState *s, struct iovec *iov,
if (*total_msg_len <= s->total_req_len) {
uint32_t padding_size = s->total_req_len - *total_msg_len;
uint8_t *padding = iov->iov_base;
+
+ if (padding_size > req_len) {
+ return false;
+ }
+
*pad_offset = req_len - padding_size;
if (padding[*pad_offset] == 0x80) {
return true;
--
2.41.0.windows.1
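
The underflow that the patch above guards against can be reproduced with plain `uint32_t` arithmetic. The following is a minimal sketch with hypothetical helper names, not the aspeed_hace code; it only shows why the subtraction must be preceded by the bounds check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked form: wraps modulo 2^32 when padding_size > req_len. */
static uint32_t ex_unchecked_pad_offset(uint32_t req_len,
                                        uint32_t padding_size)
{
    return req_len - padding_size;
}

/*
 * Checked form: refuses the request instead of producing a wrapped
 * offset that would later index far outside the buffer.
 */
static bool ex_checked_pad_offset(uint32_t req_len, uint32_t padding_size,
                                  uint32_t *pad_offset)
{
    assert(pad_offset != NULL);
    if (padding_size > req_len) {
        return false;
    }
    *pad_offset = req_len - padding_size;
    return true;
}
```

For example, a request length of 16 with a padding size of 64 wraps to 0xFFFFFFD0 in the unchecked form, while the checked form rejects it.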

@ -0,0 +1,49 @@
From f0be5a2c99d2f893a27839cd5eb5fa74f3ff5564 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 18 Nov 2024 21:03:55 -0500
Subject: [PATCH] hw/misc/mos6522: Fix bad class definition of the MOS6522
device
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from c3d7c18b0d616cf7fb3c1f325503e1462307209d
When compiling QEMU with --enable-cfi, the "q800" m68k machine
currently crashes very early, when the q800_machine_init() function
tries to wire the interrupts of the "via1" device.
This happens because TYPE_MOS6522_Q800_VIA1 is supposed to be a
proper SysBus device, but its parent (TYPE_MOS6522) has a mistake
in its class definition where it is only derived from DeviceClass,
and not from SysBusDeviceClass, so we end up in funny memory access
issues here. Using the right class hierarchy for the MOS6522 device
fixes the problem.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2675
Signed-off-by: Thomas Huth <thuth@redhat.com>
Fixes: 51f233ec92 ("misc: introduce new mos6522 VIA device")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20241114104653.963812-1-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
include/hw/misc/mos6522.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index fba45668ab..920871a598 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -154,7 +154,7 @@ struct MOS6522State {
OBJECT_DECLARE_TYPE(MOS6522State, MOS6522DeviceClass, MOS6522)
struct MOS6522DeviceClass {
- DeviceClass parent_class;
+ SysBusDeviceClass parent_class;
ResettablePhases parent_phases;
void (*portB_write)(MOS6522State *dev);
--
2.41.0.windows.1

@ -0,0 +1,70 @@
From e6b4460566522f1a9d608217bcb1534bf6709cab Mon Sep 17 00:00:00 2001
From: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
Date: Thu, 12 Dec 2024 12:16:01 +0800
Subject: [PATCH] hw/misc/nrf51_rng: Don't use BIT_MASK() when we mean BIT()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from a29a9776407e68c5560687e07828925bda710150
The BIT_MASK() macro from bitops.h provides the mask of a bit
within a particular word of a multi-word bit array; it is intended
to be used with its counterpart BIT_WORD() that gives the index
of the word in the array.
In nrf51_rng we are using it for cases where we have a bit number
that we know is the index of a bit within a single word (in fact, it
happens that all the bit numbers we pass to it are zero). This
happens to give the right answer, but the macro that actually
does the job we want here is BIT().
Use BIT() instead of BIT_MASK().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241108135644.4007151-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
---
hw/misc/nrf51_rng.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/hw/misc/nrf51_rng.c b/hw/misc/nrf51_rng.c
index fc86e1b697..e911b3a3a3 100644
--- a/hw/misc/nrf51_rng.c
+++ b/hw/misc/nrf51_rng.c
@@ -107,25 +107,25 @@ static void rng_write(void *opaque, hwaddr offset,
break;
case NRF51_RNG_REG_SHORTS:
s->shortcut_stop_on_valrdy =
- (value & BIT_MASK(NRF51_RNG_REG_SHORTS_VALRDY_STOP)) ? 1 : 0;
+ (value & BIT(NRF51_RNG_REG_SHORTS_VALRDY_STOP)) ? 1 : 0;
break;
case NRF51_RNG_REG_INTEN:
s->interrupt_enabled =
- (value & BIT_MASK(NRF51_RNG_REG_INTEN_VALRDY)) ? 1 : 0;
+ (value & BIT(NRF51_RNG_REG_INTEN_VALRDY)) ? 1 : 0;
break;
case NRF51_RNG_REG_INTENSET:
- if (value & BIT_MASK(NRF51_RNG_REG_INTEN_VALRDY)) {
+ if (value & BIT(NRF51_RNG_REG_INTEN_VALRDY)) {
s->interrupt_enabled = 1;
}
break;
case NRF51_RNG_REG_INTENCLR:
- if (value & BIT_MASK(NRF51_RNG_REG_INTEN_VALRDY)) {
+ if (value & BIT(NRF51_RNG_REG_INTEN_VALRDY)) {
s->interrupt_enabled = 0;
}
break;
case NRF51_RNG_REG_CONFIG:
s->filter_enabled =
- (value & BIT_MASK(NRF51_RNG_REG_CONFIG_DECEN)) ? 1 : 0;
+ (value & BIT(NRF51_RNG_REG_CONFIG_DECEN)) ? 1 : 0;
break;
default:
--
2.41.0.windows.1
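
The distinction drawn in the commit message above can be sketched in isolation. This is a minimal sketch, assuming local stand-in macros modeled on the bitops.h ones; the DEMO_* names and both helper functions are hypothetical and are not part of QEMU.

```c
#include <limits.h>

/* Local stand-ins modeled on the bitops.h macros; all names are hypothetical. */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BIT(nr)       (1UL << (nr))
#define DEMO_BIT_MASK(nr)  (1UL << ((nr) % DEMO_BITS_PER_LONG))
#define DEMO_BIT_WORD(nr)  ((nr) / DEMO_BITS_PER_LONG)

/* Test bit 'nr' of a multi-word bitmap: the job BIT_MASK()/BIT_WORD() are for. */
static int demo_bitmap_test(const unsigned long *map, unsigned long nr)
{
    return (map[DEMO_BIT_WORD(nr)] & DEMO_BIT_MASK(nr)) != 0;
}

/* Test bit 'nr' of a single register value: the job BIT() is for. */
static int demo_reg_test(unsigned long value, unsigned int nr)
{
    return (value & DEMO_BIT(nr)) != 0;
}
```

For any bit number smaller than the word size the two macros produce the same mask, which is why the old code gave the right answer; they diverge only once the bit number reaches the word size.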

@ -0,0 +1,36 @@
From 43fdaaa492ea10ab0e90ec4cc68ec45aed1d415c Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Sat, 22 Mar 2025 15:20:27 +0800
Subject: [PATCH] hw/nvme: fix invalid check on mcl
cherry-pick from 8c78015a55d84c016da6d5e41b6b5f618ecb25ab
The number of logical blocks within a source range is converted into a
1s based number at the time of parsing. However, when verifying the copy
length we add one again, causing the check against MCL to fail in error.
Cc: qemu-stable@nongnu.org
Fixes: 381ab99d8587 ("hw/nvme: check maximum copy length (MCL) for COPY")
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
hw/nvme/ctrl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 29445938d5..407004b2f7 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -2863,7 +2863,7 @@ static inline uint16_t nvme_check_copy_mcl(NvmeNamespace *ns,
uint32_t nlb;
nvme_copy_source_range_parse(iocb->ranges, idx, iocb->format, NULL,
&nlb, NULL, NULL, NULL);
- copy_len += nlb + 1;
+ copy_len += nlb;
}
if (copy_len > ns->id_ns.mcl) {
--
2.41.0.windows.1
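
The off-by-one described above can be reproduced with a small model. This is a sketch under stated assumptions: the demo_* helpers are hypothetical and only imitate the "convert to 1-based while parsing, then sum" flow; they are not the hw/nvme functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser: the raw field is 0-based, so convert to 1-based here. */
static uint32_t demo_parse_nlb(uint16_t raw_nlb)
{
    return (uint32_t)raw_nlb + 1;
}

/* Sum the already-converted block counts; no second "+ 1" is applied. */
static uint64_t demo_copy_len(const uint16_t *raw_nlb, size_t nr_ranges)
{
    uint64_t copy_len = 0;

    for (size_t i = 0; i < nr_ranges; i++) {
        copy_len += demo_parse_nlb(raw_nlb[i]);
    }
    return copy_len;
}

/* Returns 1 when the total fits within the maximum copy length. */
static int demo_check_mcl(const uint16_t *raw_nlb, size_t nr_ranges,
                          uint32_t mcl)
{
    return demo_copy_len(raw_nlb, nr_ranges) <= mcl;
}
```

With two ranges of 8 blocks each and a limit of 16, the sum is exactly 16 and the check passes; adding one per range a second time would give 18 and reject a valid command.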

@ -0,0 +1,42 @@
From 6de964bac51139ef24f43bde56933cd8eafaf317 Mon Sep 17 00:00:00 2001
From: gubin <gubin_yewu@cmss.chinamobile.com>
Date: Sat, 22 Mar 2025 15:25:39 +0800
Subject: [PATCH] hw/nvme: fix invalid endian conversion
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from d2b5bb860e6c17442ad95cc275feb07c1665be5c
numcntl is one byte and so is max_vfs. Using cpu_to_le16 on big endian
hosts results in numcntl being set to 0.
Fix by dropping the endian conversion.
Fixes: 99f48ae7ae ("hw/nvme: Add support for Secondary Controller List")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Message-ID: <20240222-fix-sriov-numcntl-v1-1-d60bea5e72d0@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: gubin <gubin_yewu@cmss.chinamobile.com>
---
hw/nvme/ctrl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 29445938d5..9410344844 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -7928,7 +7928,7 @@ static void nvme_init_state(NvmeCtrl *n)
n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
QTAILQ_INIT(&n->aer_queue);
- list->numcntl = cpu_to_le16(max_vfs);
+ list->numcntl = max_vfs;
for (i = 0; i < max_vfs; i++) {
sctrl = &list->sec[i];
sctrl->pcid = cpu_to_le16(n->cntlid);
--
2.41.0.windows.1
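
Why a 16-bit conversion zeroes a one-byte field can be shown without a big-endian machine. This is a minimal sketch, assuming that on a big-endian host the little-endian conversion amounts to a byte swap; the demo_* functions are hypothetical and model the swap explicitly.

```c
#include <stdint.h>

/* What a 16-bit little-endian conversion amounts to on a big-endian host. */
static uint16_t demo_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Buggy pattern: swap first, then store the result in a one-byte field. */
static uint8_t demo_store_swapped(uint8_t count)
{
    return (uint8_t)demo_bswap16(count);
}

/* Fixed pattern: a single byte has no byte order, so store it unchanged. */
static uint8_t demo_store_plain(uint8_t count)
{
    return count;
}
```

The swap moves the count into the high byte, and the truncating store then keeps only the low byte, which is zero.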

@ -0,0 +1,95 @@
From 03f9b12e33238587da36be24523911fd1b003324 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:38 +0800
Subject: [PATCH] hw/pci: Introduce helper function
pci_device_get_iommu_bus_devfn()
Extract out pci_device_get_iommu_bus_devfn() from
pci_device_iommu_address_space() to facilitate
implementation of pci_device_[set|unset]_iommu_device()
in following patch.
No functional change intended.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/pci/pci.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 45 insertions(+), 3 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 7467a2a9de..0884fbb760 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2681,11 +2681,27 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
}
}
-AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+/*
+ * Get IOMMU root bus, aliased bus and devfn of a PCI device
+ *
+ * IOMMU root bus is needed by all call sites to call into iommu_ops.
+ * For call sites which don't need aliased BDF, passing NULL to
+ * aliased_[bus|devfn] is allowed.
+ *
+ * @piommu_bus: return root #PCIBus backed by an IOMMU for the PCI device.
+ *
+ * @aliased_bus: return aliased #PCIBus of the PCI device, optional.
+ *
+ * @aliased_devfn: return aliased devfn of the PCI device, optional.
+ */
+static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
+ PCIBus **piommu_bus,
+ PCIBus **aliased_bus,
+ int *aliased_devfn)
{
PCIBus *bus = pci_get_bus(dev);
PCIBus *iommu_bus = bus;
- uint8_t devfn = dev->devfn;
+ int devfn = dev->devfn;
while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus->parent_dev) {
PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
@@ -2726,7 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
iommu_bus = parent_bus;
}
- if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
+
+ assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+ assert(iommu_bus);
+
+ if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
+ iommu_bus = NULL;
+ }
+
+ *piommu_bus = iommu_bus;
+
+ if (aliased_bus) {
+ *aliased_bus = bus;
+ }
+
+ if (aliased_devfn) {
+ *aliased_devfn = devfn;
+ }
+}
+
+AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+{
+ PCIBus *bus;
+ PCIBus *iommu_bus;
+ int devfn;
+
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
+ if (iommu_bus) {
return iommu_bus->iommu_ops->get_address_space(bus,
iommu_bus->iommu_opaque, devfn);
}
--
2.41.0.windows.1

@ -0,0 +1,120 @@
From 7bc73d38984460315df315d007789f87f4d11994 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu@intel.com>
Date: Wed, 5 Jun 2024 16:30:39 +0800
Subject: [PATCH] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/pci/pci.c | 27 +++++++++++++++++++++++++++
include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
2 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 0884fbb760..d6f627aa51 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2775,6 +2775,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
return &address_space_memory;
}
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+ Error **errp)
+{
+ PCIBus *iommu_bus;
+
+ /* set_iommu_device requires device's direct BDF instead of aliased BDF */
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+ if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
+ return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
+ iommu_bus->iommu_opaque,
+ dev->devfn, hiod, errp);
+ }
+ return true;
+}
+
+void pci_device_unset_iommu_device(PCIDevice *dev)
+{
+ PCIBus *iommu_bus;
+
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+ if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
+ return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
+ iommu_bus->iommu_opaque,
+ dev->devfn);
+ }
+}
+
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
{
/*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index cee0cf7460..8d1af44249 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -3,6 +3,7 @@
#include "exec/memory.h"
#include "sysemu/dma.h"
+#include "sysemu/host_iommu_device.h"
/* PCI includes legacy ISA access. */
#include "hw/isa/isa.h"
@@ -384,10 +385,45 @@ typedef struct PCIIOMMUOps {
*
* @devfn: device and function number
*/
- AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ /**
+ * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
+ *
+ * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
+ * retrieve host information from the associated HostIOMMUDevice.
+ *
+ * @bus: the #PCIBus of the PCI device.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number of the PCI device.
+ *
+ * @dev: the #HostIOMMUDevice to attach.
+ *
+ * @errp: pass an Error out only when return false
+ *
+ * Returns: true if HostIOMMUDevice is attached or else false with errp set.
+ */
+ bool (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
+ HostIOMMUDevice *dev, Error **errp);
+ /**
+ * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
+ *
+ * Optional callback.
+ *
+ * @bus: the #PCIBus of the PCI device.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number of the PCI device.
+ */
+ void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
} PCIIOMMUOps;
AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+ Error **errp);
+void pci_device_unset_iommu_device(PCIDevice *dev);
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
--
2.41.0.windows.1

@ -0,0 +1,46 @@
From d1b98e84eeec0b94403fb716bef41080f6bee3b3 Mon Sep 17 00:00:00 2001
From: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
Date: Thu, 12 Dec 2024 10:31:47 +0800
Subject: [PATCH] hw/pci: Remove unused pci_irq_pulse() method
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from ef45f46f382a5e2c41c39c71fd3364cff4f41bf5
Last use of pci_irq_pulse() was removed 7 years ago in commit
5e9aa92eb1 ("hw/block: Fix pin-based interrupt behaviour of NVMe").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241122103418.539-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Zhang Jiao <zhangjiao2_yewu@cmss.chinamobile.com>
---
include/hw/pci/pci.h | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 7cf7b5619a..cee0cf7460 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -632,16 +632,6 @@ static inline void pci_irq_deassert(PCIDevice *pci_dev)
pci_set_irq(pci_dev, 0);
}
-/*
- * FIXME: PCI does not work this way.
- * All the callers to this method should be fixed.
- */
-static inline void pci_irq_pulse(PCIDevice *pci_dev)
-{
- pci_irq_assert(pci_dev);
- pci_irq_deassert(pci_dev);
-}
-
MSIMessage pci_get_msi_message(PCIDevice *dev, int vector);
void pci_set_power(PCIDevice *pci_dev, bool state);
--
2.41.0.windows.1

@ -0,0 +1,41 @@
From c1f1346eea8da6552e085aa13630bbf5227db00f Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 7 Apr 2025 12:54:10 -0400
Subject: [PATCH] hw/pci-host/designware: Fix ATU_UPPER_TARGET register access
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 04e99f9eb7920b0f0fcce65686c3bedf5e32a1f9
Fix a copy/paste error when writing to the ATU_UPPER_TARGET
register: we want to update the upper 32 bits.
Cc: qemu-stable@nongnu.org
Reported-by: Joey <jeundery@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2861
Fixes: d64e5eabc4c ("pci: Add support for Designware IP block")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-Id: <20250331152041.74533-2-philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/pci-host/designware.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index f477f97847..004142709c 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -360,7 +360,7 @@ static void designware_pcie_root_config_write(PCIDevice *d, uint32_t address,
case DESIGNWARE_PCIE_ATU_UPPER_TARGET:
viewport->target &= 0x00000000FFFFFFFFULL;
- viewport->target |= val;
+ viewport->target |= (uint64_t)val << 32;
break;
case DESIGNWARE_PCIE_ATU_LIMIT:
--
2.41.0.windows.1
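
The register split fixed above can be sketched as a pair of pure functions. The demo_* names are hypothetical; the masks and the shift mirror the hunk, and the lower-half function is an assumed counterpart added only for contrast.

```c
#include <stdint.h>

/* Assumed counterpart: replace the low 32 bits of a 64-bit target address. */
static uint64_t demo_set_lower_target(uint64_t target, uint32_t val)
{
    target &= 0xFFFFFFFF00000000ULL;
    target |= val;
    return target;
}

/* Replace the high 32 bits; the cast must happen before the shift. */
static uint64_t demo_set_upper_target(uint64_t target, uint32_t val)
{
    target &= 0x00000000FFFFFFFFULL;
    target |= (uint64_t)val << 32;
    return target;
}
```

Without the shift, a write to the upper register would be OR-ed into the low half and leave the high half cleared.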

@ -0,0 +1,119 @@
From 37308e60d43323c0ea65d734487ce6542f8a9d3b Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Tue, 5 Oct 2021 10:53:12 +0200
Subject: [PATCH] hw/pci-host/gpex: [needs kernel fix] Allow to generate
preserve boot config DSM #5
Add a 'preserve_config' field in struct GPEXConfig and
if set, generate the DSM #5 for preserving PCI boot configurations.
The DSM presence is needed to expose RMRs.
At the moment the DSM generation is not yet enabled.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/pci-host/gpex-acpi.c | 35 +++++++++++++++++++++++++++++++----
include/hw/pci-host/gpex.h | 1 +
2 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/hw/pci-host/gpex-acpi.c b/hw/pci-host/gpex-acpi.c
index ac5d229757..ce424fc9da 100644
--- a/hw/pci-host/gpex-acpi.c
+++ b/hw/pci-host/gpex-acpi.c
@@ -49,9 +49,10 @@ static void acpi_dsdt_add_pci_route_table(Aml *dev, uint32_t irq)
}
}
-static void acpi_dsdt_add_pci_osc(Aml *dev)
+static void acpi_dsdt_add_pci_osc(Aml *dev, bool preserve_config)
{
Aml *method, *UUID, *ifctx, *ifctx1, *elsectx, *buf;
+ uint8_t byte_list[1] = {0};
/* Declare an _OSC (OS Control Handoff) method */
aml_append(dev, aml_name_decl("SUPP", aml_int(0)));
@@ -113,10 +114,24 @@ static void acpi_dsdt_add_pci_osc(Aml *dev)
UUID = aml_touuid("E5C937D0-3553-4D7A-9117-EA4D19C3434D");
ifctx = aml_if(aml_equal(aml_arg(0), UUID));
ifctx1 = aml_if(aml_equal(aml_arg(2), aml_int(0)));
- uint8_t byte_list[1] = {0};
+ if (preserve_config) {
+ /* support for functions other than function 0 and function 5 */
+ byte_list[0] = 0x21;
+ }
buf = aml_buffer(1, byte_list);
aml_append(ifctx1, aml_return(buf));
aml_append(ifctx, ifctx1);
+
+ if (preserve_config) {
+ Aml *ifctx2 = aml_if(aml_equal(aml_arg(2), aml_int(5)));
+ /*
+ * 0 - The operating system must not ignore the PCI configuration that
+ * firmware has done at boot time.
+ */
+ aml_append(ifctx2, aml_return(aml_int(0)));
+ aml_append(ifctx, ifctx2);
+ }
+
aml_append(method, ifctx);
byte_list[0] = 0;
@@ -174,6 +189,12 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
}
+ if (cfg->preserve_config) {
+ method = aml_method("_DSM", 5, AML_SERIALIZED);
+ aml_append(method, aml_return(aml_int(0)));
+ aml_append(dev, method);
+ }
+
acpi_dsdt_add_pci_route_table(dev, cfg->irq);
/*
@@ -188,7 +209,7 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
if (is_cxl) {
build_cxl_osc_method(dev);
} else {
- acpi_dsdt_add_pci_osc(dev);
+ acpi_dsdt_add_pci_osc(dev, cfg->preserve_config);
}
aml_append(scope, dev);
@@ -205,6 +226,12 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+ if (cfg->preserve_config) {
+ method = aml_method("_DSM", 5, AML_SERIALIZED);
+ aml_append(method, aml_return(aml_int(0)));
+ aml_append(dev, method);
+ }
+
acpi_dsdt_add_pci_route_table(dev, cfg->irq);
method = aml_method("_CBA", 0, AML_NOTSERIALIZED);
@@ -263,7 +290,7 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
}
aml_append(dev, aml_name_decl("_CRS", rbuf));
- acpi_dsdt_add_pci_osc(dev);
+ acpi_dsdt_add_pci_osc(dev, cfg->preserve_config);
Aml *dev_res0 = aml_device("%s", "RES0");
aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
diff --git a/include/hw/pci-host/gpex.h b/include/hw/pci-host/gpex.h
index b0240bd768..65475f7f9d 100644
--- a/include/hw/pci-host/gpex.h
+++ b/include/hw/pci-host/gpex.h
@@ -64,6 +64,7 @@ struct GPEXConfig {
MemMapEntry pio;
int irq;
PCIBus *bus;
+ bool preserve_config;
};
int gpex_set_irq_num(GPEXHost *s, int index, int gsi);
--
2.41.0.windows.1

@ -0,0 +1,137 @@
From 4044284b230182cbaeb401bdb1b65dcbd11c7550 Mon Sep 17 00:00:00 2001
From: Xianglai Li <lixianglai@loongson.cn>
Date: Mon, 7 Apr 2025 18:59:42 +0800
Subject: [PATCH] hw/rtc: Fixed loongson rtc emulation errors
Arm the timer only when the expire time is greater than 0 (RTC
match) or later than now (TOY match). Otherwise, the timer
expires immediately and triggers interrupts continuously.
Send timer interrupts with qemu_irq_pulse() instead of
qemu_irq_raise().
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
---
hw/loongarch/virt.c | 9 +++++++--
hw/rtc/ls7a_rtc.c | 22 +++++++++++++---------
2 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 0c24e632bb..ce026a4c3c 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -51,6 +51,11 @@
#include "qemu/error-report.h"
#include "qemu/guest-random.h"
+#define FDT_IRQ_FLAGS_EDGE_LO_HI 1
+#define FDT_IRQ_FLAGS_EDGE_HI_LO 2
+#define FDT_IRQ_FLAGS_LEVEL_HI 4
+#define FDT_IRQ_FLAGS_LEVEL_LO 8
+
static bool virt_is_veiointc_enabled(LoongArchVirtMachineState *lvms)
{
if (lvms->veiointc == ON_OFF_AUTO_OFF) {
@@ -275,7 +280,7 @@ static void fdt_add_rtc_node(LoongArchVirtMachineState *lvms,
"loongson,ls7a-rtc");
qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg", 2, base, 2, size);
qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts",
- VIRT_RTC_IRQ - VIRT_GSI_BASE , 0x4);
+ VIRT_RTC_IRQ - VIRT_GSI_BASE , FDT_IRQ_FLAGS_EDGE_LO_HI);
qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent",
*pch_pic_phandle);
g_free(nodename);
@@ -334,7 +339,7 @@ static void fdt_add_uart_node(LoongArchVirtMachineState *lvms,
qemu_fdt_setprop_cell(ms->fdt, nodename, "clock-frequency", 100000000);
if (chosen)
qemu_fdt_setprop_string(ms->fdt, "/chosen", "stdout-path", nodename);
- qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts", irq, 0x4);
+ qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts", irq, FDT_IRQ_FLAGS_LEVEL_HI);
qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent",
*pch_pic_phandle);
g_free(nodename);
diff --git a/hw/rtc/ls7a_rtc.c b/hw/rtc/ls7a_rtc.c
index 1f9e38a735..be9546c850 100644
--- a/hw/rtc/ls7a_rtc.c
+++ b/hw/rtc/ls7a_rtc.c
@@ -145,20 +145,22 @@ static void toymatch_write(LS7ARtcState *s, uint64_t val, int num)
now = qemu_clock_get_ms(rtc_clock);
toymatch_val_to_time(s, val, &tm);
expire_time = now + (qemu_timedate_diff(&tm) - s->offset_toy) * 1000;
- timer_mod(s->toy_timer[num], expire_time);
+ if (expire_time > now)
+ timer_mod(s->toy_timer[num], expire_time);
}
}
static void rtcmatch_write(LS7ARtcState *s, uint64_t val, int num)
{
- uint64_t expire_ns;
+ int64_t expire_ns;
/* it do not support write when toy disabled */
if (rtc_enabled(s)) {
s->rtcmatch[num] = val;
/* calculate expire time */
expire_ns = ticks_to_ns(val) - ticks_to_ns(s->offset_rtc);
- timer_mod_ns(s->rtc_timer[num], expire_ns);
+ if (expire_ns > 0)
+ timer_mod_ns(s->rtc_timer[num], expire_ns);
}
}
@@ -185,7 +187,7 @@ static void ls7a_rtc_stop(LS7ARtcState *s)
static void ls7a_toy_start(LS7ARtcState *s)
{
int i;
- uint64_t expire_time, now;
+ int64_t expire_time, now;
struct tm tm = {};
now = qemu_clock_get_ms(rtc_clock);
@@ -194,19 +196,21 @@ static void ls7a_toy_start(LS7ARtcState *s)
for (i = 0; i < TIMER_NUMS; i++) {
toymatch_val_to_time(s, s->toymatch[i], &tm);
expire_time = now + (qemu_timedate_diff(&tm) - s->offset_toy) * 1000;
- timer_mod(s->toy_timer[i], expire_time);
+ if (expire_time > now)
+ timer_mod(s->toy_timer[i], expire_time);
}
}
static void ls7a_rtc_start(LS7ARtcState *s)
{
int i;
- uint64_t expire_time;
+ int64_t expire_time;
/* recalculate expire time and enable timer */
for (i = 0; i < TIMER_NUMS; i++) {
expire_time = ticks_to_ns(s->rtcmatch[i]) - ticks_to_ns(s->offset_rtc);
- timer_mod_ns(s->rtc_timer[i], expire_time);
+ if (expire_time > 0)
+ timer_mod_ns(s->rtc_timer[i], expire_time);
}
}
@@ -370,7 +374,7 @@ static void toy_timer_cb(void *opaque)
LS7ARtcState *s = opaque;
if (toy_enabled(s)) {
- qemu_irq_raise(s->irq);
+ qemu_irq_pulse(s->irq);
}
}
@@ -379,7 +383,7 @@ static void rtc_timer_cb(void *opaque)
LS7ARtcState *s = opaque;
if (rtc_enabled(s)) {
- qemu_irq_raise(s->irq);
+ qemu_irq_pulse(s->irq);
}
}
--
2.41.0.windows.1
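
The switch from uint64_t to int64_t matters for the "greater than 0" test. This is a minimal sketch, assuming two's-complement conversion (as on all common compilers); the demo_* helpers are hypothetical and only model the subtraction and the comparison.

```c
#include <stdint.h>

/* Signed view: a match value behind the offset yields a negative result. */
static int demo_should_arm(uint64_t match_ns, uint64_t offset_ns)
{
    int64_t expire_ns = (int64_t)(match_ns - offset_ns);

    return expire_ns > 0;
}

/* Unsigned view: the same difference wraps to a huge positive value. */
static int demo_should_arm_unsigned(uint64_t match_ns, uint64_t offset_ns)
{
    uint64_t expire_ns = match_ns - offset_ns;

    return expire_ns > 0;
}
```

Only the signed variant rejects a match value that lies behind the offset, so the guard in the patch depends on the type change made in the same hunk.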

@ -0,0 +1,46 @@
From 3746a434596b9bc20994c869c79fb9db24227418 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 7 Apr 2025 13:56:18 -0400
Subject: [PATCH] hw/sd/sdhci: free irq on exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from 1c2d03bb0889b7a9a677d53126fb035190683af4
Fix a memory leak bug in sdhci_pci_realize() due to s->irq
not being freed in sdhci_pci_exit().
Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <09ddf42b-a6db-42d5-954b-148d09d8d6cc@gmail.com>
[PMD: Moved qemu_free_irq() call before sdhci_common_unrealize()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/sd/sdhci-pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/sd/sdhci-pci.c b/hw/sd/sdhci-pci.c
index 9b7bee8b3f..c1eb67cf29 100644
--- a/hw/sd/sdhci-pci.c
+++ b/hw/sd/sdhci-pci.c
@@ -18,6 +18,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/module.h"
+#include "hw/irq.h"
#include "hw/qdev-properties.h"
#include "hw/sd/sdhci.h"
#include "sdhci-internal.h"
@@ -49,6 +50,7 @@ static void sdhci_pci_exit(PCIDevice *dev)
{
SDHCIState *s = PCI_SDHCI(dev);
+ qemu_free_irq(s->irq);
sdhci_common_unrealize(s);
sdhci_uninitfn(s);
}
--
2.41.0.windows.1

@ -0,0 +1,36 @@
From d0076c906a96019c0fe12be78e5ab21eaf15e69e Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 25 Nov 2024 04:48:16 -0500
Subject: [PATCH] hw/timer/exynos4210_mct: fix possible int overflow
cherry-pick from c5d36da7ec62e4c72a72a437057fb6072cf0d6ab
The product "icnto * s->tcntb" may overflow uint32_t.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Message-id: 20241106083801.219578-2-frolov@swemel.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/timer/exynos4210_mct.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/timer/exynos4210_mct.c b/hw/timer/exynos4210_mct.c
index 446bbd2b96..6f47bfe2c2 100644
--- a/hw/timer/exynos4210_mct.c
+++ b/hw/timer/exynos4210_mct.c
@@ -815,7 +815,7 @@ static uint32_t exynos4210_ltick_cnt_get_cnto(struct tick_timer *s)
/* Both are counting */
icnto = remain / s->tcntb;
if (icnto) {
- tcnto = remain % (icnto * s->tcntb);
+ tcnto = remain % ((uint64_t)icnto * s->tcntb);
} else {
tcnto = remain % s->tcntb;
}
--
2.41.0.windows.1
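
The overflow named in the commit message is easy to see with concrete operands. This is a sketch with hypothetical demo_* helpers; the operand values used in the note are chosen for illustration and do not come from the device model.

```c
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32 when the true result does not fit. */
static uint32_t demo_product32(uint32_t icnto, uint32_t tcntb)
{
    return icnto * tcntb;
}

/* Widen one operand first so the multiplication is done in 64 bits. */
static uint64_t demo_product64(uint32_t icnto, uint32_t tcntb)
{
    return (uint64_t)icnto * tcntb;
}
```

With both operands equal to 0x10000 the 32-bit product wraps to 0, which as a modulo divisor would be undefined behaviour, while the 64-bit product is 0x100000000.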

@ -0,0 +1,46 @@
From 068fef175047c18f60900dacd54c7a436114c164 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Mon, 7 Apr 2025 13:18:47 -0400
Subject: [PATCH] hw/ufs: free irq on exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from c458f9474d6574505ce9144ab1a90b951e69c1bd
Fix a memory leak bug in ufs_init_pci() due to u->irq
not being freed in ufs_exit().
Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <43ceb427-87aa-44ee-9007-dbaecc499bba@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/ufs/ufs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index 068895b27b..f57d33e771 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -25,6 +25,7 @@
#include "qapi/error.h"
#include "migration/vmstate.h"
#include "scsi/constants.h"
+#include "hw/irq.h"
#include "trace.h"
#include "ufs.h"
@@ -1286,6 +1287,8 @@ static void ufs_exit(PCIDevice *pci_dev)
{
UfsHc *u = UFS(pci_dev);
+ qemu_free_irq(u->irq);
+
qemu_bh_delete(u->doorbell_bh);
qemu_bh_delete(u->complete_bh);
--
2.41.0.windows.1

@ -0,0 +1,40 @@
From 4ca8ac93bd2c328c80841540b3b5e297ff24d3c9 Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Wed, 5 Feb 2025 06:02:50 -0500
Subject: [PATCH] hw/usb/hcd-ehci: Fix debug printf format string
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
cherry-pick from a40b5f32867294b7c855d2e4b98a4c2d32b3be28
The variable is uint64_t so needs %PRIu64 instead of %d.
Fixes: 3ae7eb88c47 ("ehci: fix overflow in frame timer code")
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250124124713.64F8C4E6031@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/usb/hcd-ehci.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 7b093acd98..fa8c7af5c8 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2287,7 +2287,8 @@ static void ehci_work_bh(void *opaque)
ehci_update_frindex(ehci, skipped_uframes);
ehci->last_run_ns += UFRAME_TIMER_NS * skipped_uframes;
uframes -= skipped_uframes;
- DPRINTF("WARNING - EHCI skipped %d uframes\n", skipped_uframes);
+ DPRINTF("WARNING - EHCI skipped %"PRIu64" uframes\n",
+ skipped_uframes);
}
for (i = 0; i < uframes; i++) {
--
2.41.0.windows.1
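
The format fix above follows the usual inttypes.h pattern. This is a minimal sketch using snprintf() rather than the DPRINTF() macro; demo_format_skipped() is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit counter portably: PRIu64 expands to the right specifier. */
static int demo_format_skipped(char *buf, size_t len, uint64_t skipped_uframes)
{
    return snprintf(buf, len, "skipped %" PRIu64 " uframes", skipped_uframes);
}
```

Passing a uint64_t to %d is undefined behaviour in a variadic call, so the specifier and the argument type have to agree even in debug-only output.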

@ -0,0 +1,43 @@
From 5eb0bb1f8ce9835b368e78d414ff6136c77ef94b Mon Sep 17 00:00:00 2001
From: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
Date: Tue, 8 Apr 2025 06:51:26 -0400
Subject: [PATCH] hw/xen: Fix xen_bus_realize() error handling
cherry-pick from de7b18083bfed4e1a01bb40b4ad050c47d2011fa
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.
xen_bus_realize() is wrong that way: it passes &local_err to
xs_node_watch() in a loop. If this fails in more than one iteration,
it can trip error_setv()'s assertion.
Fix by clearing @local_err.
Fixes: c4583c8c394e (xen-bus: reduce scope of backend watch)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250314143500.2449658-2-armbru@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: qihao_yewu <qihao_yewu@cmss.chinamobile.com>
---
hw/xen/xen-bus.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 4973e7d9c9..c10b089914 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -352,6 +352,7 @@ static void xen_bus_realize(BusState *bus, Error **errp)
error_reportf_err(local_err,
"failed to set up '%s' enumeration watch: ",
type[i]);
+ local_err = NULL;
}
g_free(node);
--
2.41.0.windows.1
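
The rule stated in the commit message can be modeled without the QEMU error API. This is a sketch under stated assumptions: DemoError and the demo_* functions are hypothetical stand-ins, not QEMU's Error type or helpers; the setter asserts on a non-NULL destination to imitate the assertion mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    const char *msg;
} DemoError;

/* Like an error setter: the destination must not already hold an error. */
static void demo_error_set(DemoError **errp, const char *msg)
{
    DemoError *err;

    assert(*errp == NULL);
    err = malloc(sizeof(*err));
    if (!err) {
        abort();
    }
    err->msg = msg;
    *errp = err;
}

/* Report (here: just count) and free the error object. */
static void demo_error_report(DemoError *err, int *reported)
{
    (*reported)++;
    free(err);
}

/* Every iteration fails; resetting local_err keeps the setter's assert quiet. */
static int demo_setup_watches(int nr_types)
{
    DemoError *local_err = NULL;
    int reported = 0;

    for (int i = 0; i < nr_types; i++) {
        demo_error_set(&local_err, "watch failed");
        if (local_err) {
            demo_error_report(local_err, &reported);
            local_err = NULL; /* without this, the next set would assert */
        }
    }
    return reported;
}
```

Reporting frees the object but leaves the local pointer dangling, so the pointer has to be cleared by hand before the variable is reused in the next iteration.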

@ -0,0 +1,38 @@
From 0d5ac4f36208eadbb922f552ba1b762f5bd0c3a6 Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 24 Jan 2024 21:40:15 -0500
Subject: [PATCH] i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
commit a3b5376521a0de898440e8d0942b54e628f0949f upstream.
There is no constraint that the subleaf index must be less than 64.
Intel-SIG: commit a3b5376521a0 i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>
Message-ID: <20240125024016.2521244-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Zeng <jason.zeng@intel.com>
---
target/i386/kvm/kvm.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ce96ed9158..850104f6b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1928,10 +1928,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
break;
}
- if (i == 0x1f && j == 64) {
- break;
- }
-
c->function = i;
c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
c->index = j;
--
2.41.0.windows.1

@ -0,0 +1,70 @@
From 4ef1b086272552378c09356b0e9fd2548a27a621 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:43 +0800
Subject: [PATCH] intel_iommu: Check compatibility with host IOMMU capabilities
If check fails, host device (either VFIO or VDPA device) is not
compatible with current vIOMMU config and should not be passed to
guest.
Only aw_bits is checked for now, we don't care about other caps
before scalable modern mode is introduced.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/i386/intel_iommu.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index bdc14f8438..60d86e0cb6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3838,6 +3838,30 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
return vtd_dev_as;
}
+static bool vtd_check_hiod(IntelIOMMUState *s, HostIOMMUDevice *hiod,
+ Error **errp)
+{
+ HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+ int ret;
+
+ if (!hiodc->get_cap) {
+ error_setg(errp, ".get_cap() not implemented");
+ return false;
+ }
+
+ /* Common checks */
+ ret = hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS, errp);
+ if (ret < 0) {
+ return false;
+ }
+ if (s->aw_bits > ret) {
+ error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
+ return false;
+ }
+
+ return true;
+}
+
static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
HostIOMMUDevice *hiod, Error **errp)
{
@@ -3858,6 +3882,11 @@ static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
return false;
}
+ if (!vtd_check_hiod(s, hiod, errp)) {
+ vtd_iommu_unlock(s);
+ return false;
+ }
+
new_key = g_malloc(sizeof(*new_key));
new_key->bus = bus;
new_key->devfn = devfn;
--
2.41.0.windows.1

@ -0,0 +1,142 @@
From a051e4349316d7065c9418de691787edae8e7f4e Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Wed, 5 Jun 2024 16:30:41 +0800
Subject: [PATCH] intel_iommu: Extract out vtd_cap_init() to initialize
cap/ecap
Extract cap/ecap initialization in vtd_cap_init() to make code
cleaner.
No functional change intended.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/i386/intel_iommu.c | 93 ++++++++++++++++++++++++-------------------
1 file changed, 51 insertions(+), 42 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3da56e439e..6716407b7a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3935,30 +3935,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
return;
}
-/* Do the initialization. It will also be called when reset, so pay
- * attention when adding new initialization stuff.
- */
-static void vtd_init(IntelIOMMUState *s)
+static void vtd_cap_init(IntelIOMMUState *s)
{
X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
- memset(s->csr, 0, DMAR_REG_SIZE);
- memset(s->wmask, 0, DMAR_REG_SIZE);
- memset(s->w1cmask, 0, DMAR_REG_SIZE);
- memset(s->womask, 0, DMAR_REG_SIZE);
-
- s->root = 0;
- s->root_scalable = false;
- s->dmar_enabled = false;
- s->intr_enabled = false;
- s->iq_head = 0;
- s->iq_tail = 0;
- s->iq = 0;
- s->iq_size = 0;
- s->qi_enabled = false;
- s->iq_last_desc_type = VTD_INV_DESC_NONE;
- s->iq_dw = false;
- s->next_frcd_reg = 0;
s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
VTD_CAP_MGAW(s->aw_bits);
@@ -3975,27 +3955,6 @@ static void vtd_init(IntelIOMMUState *s)
}
s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
- /*
- * Rsvd field masks for spte
- */
- vtd_spte_rsvd[0] = ~0ULL;
- vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
- vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
- vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
- vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
-
- vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
- vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
-
- if (s->scalable_mode || s->snoop_control) {
- vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
- vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
- vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
- }
-
if (x86_iommu_ir_supported(x86_iommu)) {
s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
if (s->intr_eim == ON_OFF_AUTO_ON) {
@@ -4028,6 +3987,56 @@ static void vtd_init(IntelIOMMUState *s)
if (s->pasid) {
s->ecap |= VTD_ECAP_PASID;
}
+}
+
+/*
+ * Do the initialization. It will also be called when reset, so pay
+ * attention when adding new initialization stuff.
+ */
+static void vtd_init(IntelIOMMUState *s)
+{
+ X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
+ memset(s->csr, 0, DMAR_REG_SIZE);
+ memset(s->wmask, 0, DMAR_REG_SIZE);
+ memset(s->w1cmask, 0, DMAR_REG_SIZE);
+ memset(s->womask, 0, DMAR_REG_SIZE);
+
+ s->root = 0;
+ s->root_scalable = false;
+ s->dmar_enabled = false;
+ s->intr_enabled = false;
+ s->iq_head = 0;
+ s->iq_tail = 0;
+ s->iq = 0;
+ s->iq_size = 0;
+ s->qi_enabled = false;
+ s->iq_last_desc_type = VTD_INV_DESC_NONE;
+ s->iq_dw = false;
+ s->next_frcd_reg = 0;
+
+ vtd_cap_init(s);
+
+ /*
+ * Rsvd field masks for spte
+ */
+ vtd_spte_rsvd[0] = ~0ULL;
+ vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+ vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
+ vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
+ vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
+
+ vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+ vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+
+ if (s->scalable_mode || s->snoop_control) {
+ vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
+ vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
+ vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
+ }
vtd_reset_caches(s);
--
2.41.0.windows.1

@ -0,0 +1,160 @@
From 5834bb1ccce592380a91a5cf127f90a031cd7cf2 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu@intel.com>
Date: Wed, 5 Jun 2024 16:30:42 +0800
Subject: [PATCH] intel_iommu: Implement [set|unset]_iommu_device() callbacks
Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
In the set call, we take a reference to the HostIOMMUDevice and store it
in a hash table indexed by PCI BDF.
Note that this index is the device's real BDF, not the aliased one, so it
differs from the index of VTDAddressSpace. Multiple assigned devices can
sit under the same virtual IOMMU group and share the same
VTDAddressSpace, but each has its own HostIOMMUDevice.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/i386/intel_iommu.c | 81 +++++++++++++++++++++++++++++++++++
include/hw/i386/intel_iommu.h | 2 +
2 files changed, 83 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6716407b7a..bdc14f8438 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -61,6 +61,12 @@ struct vtd_as_key {
uint32_t pasid;
};
+/* bus/devfn is PCI device's real BDF not the aliased one */
+struct vtd_hiod_key {
+ PCIBus *bus;
+ uint8_t devfn;
+};
+
struct vtd_iotlb_key {
uint64_t gfn;
uint32_t pasid;
@@ -250,6 +256,25 @@ static guint vtd_as_hash(gconstpointer v)
return (guint)(value << 8 | key->devfn);
}
+/* Same implementation as vtd_as_hash() */
+static guint vtd_hiod_hash(gconstpointer v)
+{
+ return vtd_as_hash(v);
+}
+
+static gboolean vtd_hiod_equal(gconstpointer v1, gconstpointer v2)
+{
+ const struct vtd_hiod_key *key1 = v1;
+ const struct vtd_hiod_key *key2 = v2;
+
+ return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
+
+static void vtd_hiod_destroy(gpointer v)
+{
+ object_unref(v);
+}
+
static gboolean vtd_hash_remove_by_domain(gpointer key, gpointer value,
gpointer user_data)
{
@@ -3813,6 +3838,58 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
return vtd_dev_as;
}
+static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+ HostIOMMUDevice *hiod, Error **errp)
+{
+ IntelIOMMUState *s = opaque;
+ struct vtd_as_key key = {
+ .bus = bus,
+ .devfn = devfn,
+ };
+ struct vtd_as_key *new_key;
+
+ assert(hiod);
+
+ vtd_iommu_lock(s);
+
+ if (g_hash_table_lookup(s->vtd_host_iommu_dev, &key)) {
+ error_setg(errp, "Host IOMMU device already exist");
+ vtd_iommu_unlock(s);
+ return false;
+ }
+
+ new_key = g_malloc(sizeof(*new_key));
+ new_key->bus = bus;
+ new_key->devfn = devfn;
+
+ object_ref(hiod);
+ g_hash_table_insert(s->vtd_host_iommu_dev, new_key, hiod);
+
+ vtd_iommu_unlock(s);
+
+ return true;
+}
+
+static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+ IntelIOMMUState *s = opaque;
+ struct vtd_as_key key = {
+ .bus = bus,
+ .devfn = devfn,
+ };
+
+ vtd_iommu_lock(s);
+
+ if (!g_hash_table_lookup(s->vtd_host_iommu_dev, &key)) {
+ vtd_iommu_unlock(s);
+ return;
+ }
+
+ g_hash_table_remove(s->vtd_host_iommu_dev, &key);
+
+ vtd_iommu_unlock(s);
+}
+
/* Unmap the whole range in the notifier's scope. */
static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
{
@@ -4117,6 +4194,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
+ .set_iommu_device = vtd_dev_set_iommu_device,
+ .unset_iommu_device = vtd_dev_unset_iommu_device,
};
static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
@@ -4240,6 +4319,8 @@ static void vtd_realize(DeviceState *dev, Error **errp)
g_free, g_free);
s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
g_free, g_free);
+ s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_hiod_hash, vtd_hiod_equal,
+ g_free, vtd_hiod_destroy);
vtd_init(s);
pci_setup_iommu(bus, &vtd_iommu_ops, dev);
/* Pseudo address space under root PCI bus. */
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7fa0a695c8..1eb05c29fc 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -292,6 +292,8 @@ struct IntelIOMMUState {
/* list of registered notifiers */
QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
+ GHashTable *vtd_host_iommu_dev; /* HostIOMMUDevice */
+
/* interrupt remapping */
bool intr_enabled; /* Whether guest enabled IR */
dma_addr_t intr_root; /* Interrupt remapping table pointer */
--
2.41.0.windows.1
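
The per-device keying described in the commit message above can be illustrated with a small standalone sketch. It assumes nothing from QEMU or GLib: `struct hiod_key`, `make_devfn()`, `hiod_key_hash()` and `hiod_key_equal()` are hypothetical names, the bus is an opaque pointer, and the hash only imitates the return expression of vtd_as_hash() visible in the diff context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the patch's key: a bus pointer plus devfn. */
struct hiod_key {
    const void *bus;   /* stands in for PCIBus * */
    uint8_t devfn;     /* device number (5 bits) and function number (3 bits) */
};

/* Pack PCI device and function numbers into one devfn byte. */
static uint8_t make_devfn(unsigned dev, unsigned fn)
{
    assert(dev < 32 && fn < 8);
    return (uint8_t)((dev << 3) | fn);
}

/* Bus pointer shifted left by 8, devfn in the low byte. */
static unsigned hiod_key_hash(const struct hiod_key *key)
{
    uint64_t value = (uint64_t)(uintptr_t)key->bus;

    assert(key);
    return (unsigned)(value << 8 | key->devfn);
}

/* Two keys match only when both the bus and the real devfn match. */
static bool hiod_key_equal(const struct hiod_key *a, const struct hiod_key *b)
{
    return a->bus == b->bus && a->devfn == b->devfn;
}
```

Because the key uses the real devfn, two functions of the same device (for example 03.0 and 03.1 on one bus) get distinct entries even if they would share one aliased address space.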

@ -0,0 +1,90 @@
From 8414bc02f988ecca7dda5325227ff5ffbe45150c Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Date: Wed, 15 Jan 2025 10:02:58 +0000
Subject: [PATCH] iommufd.h: Updated to openeuler olk-6.6 kernel
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
linux-headers/linux/iommufd.h | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/linux-headers/linux/iommufd.h b/linux-headers/linux/iommufd.h
index 41559c6064..3e57fee01c 100644
--- a/linux-headers/linux/iommufd.h
+++ b/linux-headers/linux/iommufd.h
@@ -51,8 +51,8 @@ enum {
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
- IOMMUFD_CMD_VIOMMU_ALLOC = 0x8f,
- IOMMUFD_CMD_VDEVICE_ALLOC = 0x90,
+ IOMMUFD_CMD_VIOMMU_ALLOC = 0x90,
+ IOMMUFD_CMD_VDEVICE_ALLOC = 0x91,
};
/**
@@ -397,18 +397,20 @@ struct iommu_hwpt_vtd_s1 {
};
/**
- * struct iommu_hwpt_arm_smmuv3 - ARM SMMUv3 Context Descriptor Table info
+ * struct iommu_hwpt_arm_smmuv3 - ARM SMMUv3 nested STE
* (IOMMU_HWPT_DATA_ARM_SMMUV3)
*
* @ste: The first two double words of the user space Stream Table Entry for
- * a user stage-1 Context Descriptor Table. Must be little-endian.
+ * the translation. Must be little-endian.
* Allowed fields: (Refer to "5.2 Stream Table Entry" in SMMUv3 HW Spec)
* - word-0: V, Cfg, S1Fmt, S1ContextPtr, S1CDMax
* - word-1: EATS, S1DSS, S1CIR, S1COR, S1CSH, S1STALLD
*
* -EIO will be returned if @ste is not legal or contains any non-allowed field.
* Cfg can be used to select a S1, Bypass or Abort configuration. A Bypass
- * nested domain will translate the same as the nesting parent.
+ * nested domain will translate the same as the nesting parent. The S1 will
+ * install a Context Descriptor Table pointing at userspace memory translated
+ * by the nesting parent.
*/
struct iommu_hwpt_arm_smmuv3 {
__aligned_le64 ste[2];
@@ -920,8 +922,8 @@ enum iommu_viommu_type {
* that is unique to a specific VM. Operations global to the IOMMU are connected
* to the vIOMMU, such as:
* - Security namespace for guest owned ID, e.g. guest-controlled cache tags
+ * - Non-device-affiliated event reporting, e.g. invalidation queue errors
* - Access to a sharable nesting parent pagetable across physical IOMMUs
- * - Non-affiliated event reporting (e.g. an invalidation queue error)
* - Virtualization of various platforms IDs, e.g. RIDs and others
* - Delivery of paravirtualized invalidation
* - Direct assigned invalidation queues
@@ -941,12 +943,10 @@ struct iommu_viommu_alloc {
* struct iommu_vdevice_alloc - ioctl(IOMMU_VDEVICE_ALLOC)
* @size: sizeof(struct iommu_vdevice_alloc)
* @viommu_id: vIOMMU ID to associate with the virtual device
- * @dev_id: The pyhsical device to allocate a virtual instance on the vIOMMU
- * @__reserved: Must be 0
+ * @dev_id: The physical device to allocate a virtual instance on the vIOMMU
+ * @out_vdevice_id: Object handle for the vDevice. Pass to IOMMU_DESTORY
* @virt_id: Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID
- * of AMD IOMMU, and vID of a nested Intel VT-d to a Context Table.
- * @out_vdevice_id: Output virtual instance ID for the allocated object
- * @__reserved2: Must be 0
+ * of AMD IOMMU, and vRID of a nested Intel VT-d to a Context Table
*
* Allocate a virtual device instance (for a physical device) against a vIOMMU.
* This instance holds the device's information (related to its vIOMMU) in a VM.
@@ -955,10 +955,8 @@ struct iommu_vdevice_alloc {
__u32 size;
__u32 viommu_id;
__u32 dev_id;
- __u32 __reserved;
- __aligned_u64 virt_id;
__u32 out_vdevice_id;
- __u32 __reserved2;
+ __aligned_u64 virt_id;
};
#define IOMMU_VDEVICE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VDEVICE_ALLOC)
#endif
--
2.41.0.windows.1
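
The effect of the struct iommu_vdevice_alloc change on the binary layout can be checked with a standalone sketch. The struct names `vdevice_alloc_old` and `vdevice_alloc_new` and the `aligned_u64` typedef are hypothetical stand-ins for the kernel definitions, written with GCC/Clang attribute syntax; the field order is copied from the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's __aligned_u64 (GCC/Clang attribute syntax). */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

/* Field order before the patch (hypothetical struct name). */
struct vdevice_alloc_old {
    uint32_t size;
    uint32_t viommu_id;
    uint32_t dev_id;
    uint32_t reserved;
    aligned_u64 virt_id;
    uint32_t out_vdevice_id;
    uint32_t reserved2;
};

/* Field order after the patch (hypothetical struct name). */
struct vdevice_alloc_new {
    uint32_t size;
    uint32_t viommu_id;
    uint32_t dev_id;
    uint32_t out_vdevice_id;
    aligned_u64 virt_id;
};

/* virt_id stays at offset 16 in both layouts. */
static_assert(offsetof(struct vdevice_alloc_old, virt_id) == 16, "old virt_id");
static_assert(offsetof(struct vdevice_alloc_new, virt_id) == 16, "new virt_id");

/* out_vdevice_id moves from offset 24 to offset 12. */
static_assert(offsetof(struct vdevice_alloc_old, out_vdevice_id) == 24, "old out id");
static_assert(offsetof(struct vdevice_alloc_new, out_vdevice_id) == 12, "new out id");

/* Dropping the two reserved words shrinks the struct from 32 to 24 bytes. */
static_assert(sizeof(struct vdevice_alloc_old) == 32, "old size");
static_assert(sizeof(struct vdevice_alloc_new) == 24, "new size");
```

Since the output field moves and the total size changes, userspace built against one layout presumably cannot be mixed with a kernel that expects the other, which is consistent with the header being synced to the target kernel here.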

@ -0,0 +1,34 @@
From 3dfc0dd0b59925d1b73ca1a0db6d307ae597f76e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Sat, 11 Jan 2025 10:52:56 +0800
Subject: [PATCH] kconfig: Activate IOMMUFD for s390x machines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
hw/s390x/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/s390x/Kconfig b/hw/s390x/Kconfig
index 4c068d7960..26ad104485 100644
--- a/hw/s390x/Kconfig
+++ b/hw/s390x/Kconfig
@@ -6,6 +6,7 @@ config S390_CCW_VIRTIO
imply VFIO_CCW
imply WDT_DIAG288
imply PCIE_DEVICES
+ imply IOMMUFD
select PCI_EXPRESS
select S390_FLIC
select S390_FLIC_KVM if KVM
--
2.41.0.windows.1

Some files were not shown because too many files have changed in this diff.