In the Linux kernel, the following vulnerability has been resolved:
wifi: ath10k: Delay the unmapping of the buffer
On WCN3990, we are seeing a rare scenario where copy engine hardware is sending a copy complete interrupt to the host driver while still processing the buffer that the driver has sent, this is leading into an SMMU fault triggering kernel panic. This is happening on copy engine channel 3 (CE3) where the driver normally enqueues WMI commands to the firmware. Upon receiving a copy complete interrupt, host driver will immediately unmap and frees the buffer presuming that hardware has processed the buffer. In the issue case, upon receiving copy complete interrupt, host driver will unmap and free the buffer but since hardware is still accessing the buffer (which in this case got unmapped in parallel), SMMU hardware will trigger an SMMU fault resulting in a kernel panic.
In order to avoid this, as a work around, add a delay before unmapping the copy engine source DMA buffer. This is conditionally done for WCN3990 and only for the CE3 channel where issue is seen.
Below is the crash signature:
wifi smmu error: kernel: [ 10.120965] arm-smmu 15000000.iommu: Unhandled context fault: fsr=0x402, iova=0x7fdfd8ac0, fsynr=0x500003,cbfrsynra=0xc1, cb=6 arm-smmu 15000000.iommu: Unhandled context fault:fsr=0x402, iova=0x7fe06fdc0, fsynr=0x710003, cbfrsynra=0xc1, cb=6 qcom-q6v5-mss 4080000.remoteproc: fatal error received: err_qdi.c:1040:EF:wlan_process:0x1:WLAN RT:0x2091: cmnos_thread.c:3998:Asserted in copy_engine.c:AXI_ERROR_DETECTED:2149 remoteproc remoteproc0: crash detected in 4080000.remoteproc: type fatal error <3> remoteproc remoteproc0: handling crash #1 in 4080000.remoteproc
pc : __arm_lpae_unmap+0x500/0x514 lr : __arm_lpae_unmap+0x4bc/0x514 sp : ffffffc011ffb530 x29: ffffffc011ffb590 x28: 0000000000000000 x27: 0000000000000000 x26: 0000000000000004 x25: 0000000000000003 x24: ffffffc011ffb890 x23: ffffffa762ef9be0 x22: ffffffa77244ef00 x21:...