commit ab435ce49bd1d02e33dfec24f76955dc1196970b Author: Greg Kroah-Hartman Date: Sun Nov 1 12:45:43 2020 +0100 Linux 5.8.18 Tested-by: Guenter Roeck Tested-by: Linux Kernel Functional Testing Link: https://lore.kernel.org/r/20201031113459.481803250@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 4a5649e0d3796c1612cbcac4b0c30f2476f5a886 Author: Pali Rohár Date: Wed Sep 2 16:43:43 2020 +0200 phy: marvell: comphy: Convert internal SMCC firmware return codes to errno commit ea17a0f153af2cd890e4ce517130dcccaa428c13 upstream. Driver ->power_on and ->power_off callbacks leaks internal SMCC firmware return codes to phy caller. This patch converts SMCC error codes to standard linux errno codes. Include file linux/arm-smccc.h already provides defines for SMCC error codes, so use them instead of custom driver defines. Note that return value is signed 32bit, but stored in unsigned long type with zero padding. Tested-by: Tomasz Maciej Nowak Link: https://lore.kernel.org/r/20200902144344.16684-2-pali@kernel.org Signed-off-by: Pali Rohár Signed-off-by: Lorenzo Pieralisi Reviewed-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit b8049438969ba6e429a0aa3d1e3347a49b553295 Author: Ricky Wu Date: Mon Aug 24 11:00:06 2020 +0800 misc: rtsx: do not setting OC_POWER_DOWN reg in rtsx_pci_init_ocp() commit 551b6729578a8981c46af964c10bf7d5d9ddca83 upstream. this power saving action in rtsx_pci_init_ocp() cause INTEL-NUC6 platform missing card reader Signed-off-by: Ricky Wu Link: https://lore.kernel.org/r/20200824030006.30033-1-ricky_wu@realtek.com Cc: Chris Clayton Signed-off-by: Greg Kroah-Hartman commit ad9ee9ce9d682b3c98611acdd82217c47c6e2388 Author: Stafford Horne Date: Thu Sep 3 05:54:40 2020 +0900 openrisc: Fix issue with get_user for 64-bit values commit d877322bc1adcab9850732275670409e8bcca4c4 upstream. A build failure was raised by kbuild with the following error. drivers/android/binder.c: Assembler messages: drivers/android/binder.c:3861: Error: unrecognized keyword/register name `l.lwz ?ap,4(r24)' drivers/android/binder.c:3866: Error: unrecognized keyword/register name `l.addi ?ap,r0,0' The issue is with 64-bit get_user() calls on openrisc. I traced this to a problem where in the internally in the get_user macros there is a cast to long __gu_val this causes GCC to think the get_user call is 32-bit. This binder code is really long and GCC allocates register r30, which triggers the issue. The 64-bit get_user asm tries to get the 64-bit pair register, which for r30 overflows the general register names and returns the dummy register ?ap. The fix here is to move the temporary variables into the asm macros. We use a 32-bit __gu_tmp for 32-bit and smaller macro and a 64-bit tmp in the 64-bit macro. The cast in the 64-bit macro has a trick of casting through __typeof__((x)-(x)) which avoids the below warning. This was barrowed from riscv. arch/openrisc/include/asm/uaccess.h:240:8: warning: cast to pointer from integer of different size I tested this in a small unit test to check reading between 64-bit and 32-bit pointers to 64-bit and 32-bit values in all combinations. Also I ran make C=1 to confirm no new sparse warnings came up. It all looks clean to me. Link: https://lore.kernel.org/lkml/202008200453.ohnhqkjQ%25lkp@intel.com/ Signed-off-by: Stafford Horne Reviewed-by: Luc Van Oostenryck Signed-off-by: Greg Kroah-Hartman commit f594998331bc478f21de1c47b8ea9f10221e1f08 Author: Souptick Joarder Date: Sun Sep 6 12:21:53 2020 +0530 xen/gntdev.c: Mark pages as dirty commit 779055842da5b2e508f3ccf9a8153cb1f704f566 upstream. There seems to be a bug in the original code when gntdev_get_page() is called with writeable=true then the page needs to be marked dirty before being put. To address this, a bool writeable is added in gnt_dev_copy_batch, set it in gntdev_grant_copy_seg() (and drop `writeable` argument to gntdev_get_page()) and then, based on batch->writeable, use set_page_dirty_lock(). Fixes: a4cdb556cae0 (xen/gntdev: add ioctl for grant copy) Suggested-by: Boris Ostrovsky Signed-off-by: Souptick Joarder Cc: John Hubbard Cc: Boris Ostrovsky Cc: Juergen Gross Cc: David Vrabel Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1599375114-32360-1-git-send-email-jrdr.linux@gmail.com Reviewed-by: Boris Ostrovsky Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit 67e326e4f5df1a32cef6c35b20f0e8cba2be3a06 Author: Geert Uytterhoeven Date: Thu Sep 17 15:09:20 2020 +0200 ata: sata_rcar: Fix DMA boundary mask commit df9c590986fdb6db9d5636d6cd93bc919c01b451 upstream. Before commit 9495b7e92f716ab2 ("driver core: platform: Initialize dma_parms for platform devices"), the R-Car SATA device didn't have DMA parameters. Hence the DMA boundary mask supplied by its driver was silently ignored, as __scsi_init_queue() doesn't check the return value of dma_set_seg_boundary(), and the default value of 0xffffffff was used. Now the device has gained DMA parameters, the driver-supplied value is used, and the following warning is printed on Salvator-XS: DMA-API: sata_rcar ee300000.sata: mapping sg segment across boundary [start=0x00000000ffffe000] [end=0x00000000ffffefff] [boundary=0x000000001ffffffe] WARNING: CPU: 5 PID: 38 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x298/0x300 (the range of start/end values depend on whether IOMMU support is enabled or not) The issue here is that SATA_RCAR_DMA_BOUNDARY doesn't have bit 0 set, so any typical end value, which is odd, will trigger the check. Fix this by increasing the DMA boundary value by 1. This also fixes the following WRITE DMA EXT timeout issue: # dd if=/dev/urandom of=/mnt/de1/file1-1024M bs=1M count=1024 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen ata1.00: failed command: WRITE DMA EXT ata1.00: cmd 35/00:00:00:e6:0c/00:0a:00:00:00/e0 tag 0 dma 1310720 out res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) ata1.00: status: { DRDY } as seen by Shimoda-san since commit 429120f3df2dba2b ("block: fix splitting segments on boundary masks"). Fixes: 8bfbeed58665dbbf ("sata_rcar: correct 'sata_rcar_sht'") Fixes: 9495b7e92f716ab2 ("driver core: platform: Initialize dma_parms for platform devices") Fixes: 429120f3df2dba2b ("block: fix splitting segments on boundary masks") Signed-off-by: Geert Uytterhoeven Tested-by: Lad Prabhakar Tested-by: Yoshihiro Shimoda Reviewed-by: Christoph Hellwig Reviewed-by: Greg Kroah-Hartman Reviewed-by: Sergei Shtylyov Reviewed-by: Ulf Hansson Cc: stable Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f6b94060a123b3e59f90147a26e727af38e85bd1 Author: Grygorii Strashko Date: Fri Sep 18 19:55:18 2020 +0300 PM: runtime: Fix timer_expires data type on 32-bit arches commit 6b61d49a55796dbbc479eeb4465e59fd656c719c upstream. Commit 8234f6734c5d ("PM-runtime: Switch autosuspend over to using hrtimers") switched PM runtime autosuspend to use hrtimers and all related time accounting in ns, but missed to update the timer_expires data type in struct dev_pm_info to u64. This causes the timer_expires value to be truncated on 32-bit architectures when assignment is done from u64 values: rpm_suspend() |- dev->power.timer_expires = expires; Fix it by changing the timer_expires type to u64. Fixes: 8234f6734c5d ("PM-runtime: Switch autosuspend over to using hrtimers") Signed-off-by: Grygorii Strashko Acked-by: Pavel Machek Acked-by: Vincent Guittot Cc: 5.0+ # 5.0+ [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 53faca2f4ca3443aa00f7f46c4f327bf4a98728e Author: Peter Zijlstra Date: Wed Sep 30 13:04:32 2020 +0100 serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt commit 534cf755d9df99e214ddbe26b91cd4d81d2603e2 upstream. Issuing a magic-sysrq via the PL011 causes the following lockdep splat, which is easily reproducible under QEMU: | sysrq: Changing Loglevel | sysrq: Loglevel set to 9 | | ====================================================== | WARNING: possible circular locking dependency detected | 5.9.0-rc7 #1 Not tainted | ------------------------------------------------------ | systemd-journal/138 is trying to acquire lock: | ffffab133ad950c0 (console_owner){-.-.}-{0:0}, at: console_lock_spinning_enable+0x34/0x70 | | but task is already holding lock: | ffff0001fd47b098 (&port_lock_key){-.-.}-{2:2}, at: pl011_int+0x40/0x488 | | which lock already depends on the new lock. [...] | Possible unsafe locking scenario: | | CPU0 CPU1 | ---- ---- | lock(&port_lock_key); | lock(console_owner); | lock(&port_lock_key); | lock(console_owner); | | *** DEADLOCK *** The issue being that CPU0 takes 'port_lock' on the irq path in pl011_int() before taking 'console_owner' on the printk() path, whereas CPU1 takes the two locks in the opposite order on the printk() path due to setting the "console_owner" prior to calling into into the actual console driver. Fix this in the same way as the msm-serial driver by dropping 'port_lock' before handling the sysrq. Cc: # 4.19+ Cc: Russell King Cc: Greg Kroah-Hartman Cc: Jiri Slaby Link: https://lore.kernel.org/r/20200811101313.GA6970@willie-the-truck Signed-off-by: Peter Zijlstra Tested-by: Will Deacon Signed-off-by: Will Deacon Link: https://lore.kernel.org/r/20200930120432.16551-1-will@kernel.org Signed-off-by: Greg Kroah-Hartman commit e3f6c126a3f72b9dbe53bf3e96c27b4dc346e07c Author: Paras Sharma Date: Wed Sep 30 11:35:26 2020 +0530 serial: qcom_geni_serial: To correct QUP Version detection logic commit c9ca43d42ed8d5fd635d327a664ed1d8579eb2af upstream. For QUP IP versions 2.5 and above the oversampling rate is halved from 32 to 16. Commit ce734600545f ("tty: serial: qcom_geni_serial: Update the oversampling rate") is pushed to handle this scenario. But the existing logic is failing to classify QUP Version 3.0 into the correct group ( 2.5 and above). As result Serial Engine clocks are not configured properly for baud rate and garbage data is sampled to FIFOs from the line. So, fix the logic to detect QUP with versions 2.5 and above. Fixes: ce734600545f ("tty: serial: qcom_geni_serial: Update the oversampling rate") Cc: stable Signed-off-by: Paras Sharma Reviewed-by: Akash Asthana Link: https://lore.kernel.org/r/1601445926-23673-1-git-send-email-parashar@codeaurora.org Signed-off-by: Greg Kroah-Hartman commit 8f924c0a5665a74fc74ea50f71999914095aaf88 Author: Chris Wilson Date: Thu Jul 23 18:21:19 2020 +0100 drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutex commit 4fe9af8e881d946bf60790eeb37a7c4f96e28382 upstream. Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables. Signed-off-by: Chris Wilson Cc: CQ Tang Cc: Daniel Vetter Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk (cherry picked from commit 102f5aa491f262c818e607fc4fee08a724a76c69) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 241bd102e33761412746cbf416d36ffe37bb28f5 Author: Gustavo A. R. Silva Date: Mon Apr 27 14:50:37 2020 -0500 mtd: lpddr: Fix bad logic in print_drs_error commit 1c9c02bb22684f6949d2e7ddc0a3ff364fd5a6fc upstream. Update logic for broken test. Use a more common logging style. It appears the logic in this function is broken for the consecutive tests of if (prog_status & 0x3) ... else if (prog_status & 0x2) ... else (prog_status & 0x1) ... Likely the first test should be if ((prog_status & 0x3) == 0x3) Found by inspection of include files using printk. Fixes: eb3db27507f7 ("[MTD] LPDDR PFOW definition") Cc: stable@vger.kernel.org Reported-by: Joe Perches Signed-off-by: Gustavo A. R. Silva Acked-by: Miquel Raynal Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/3fb0e29f5b601db8be2938a01d974b00c8788501.1588016644.git.gustavo@embeddedor.com Signed-off-by: Greg Kroah-Hartman commit 5868beda60c864705a1613433d4bb060f3f00881 Author: Jason Gunthorpe Date: Wed Sep 30 10:20:07 2020 +0300 RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel() commit 2ee9bf346fbfd1dad0933b9eb3a4c2c0979b633e upstream. This three thread race can result in the work being run once the callback becomes NULL: CPU1 CPU2 CPU3 netevent_callback() process_one_req() rdma_addr_cancel() [..] spin_lock_bh() set_timeout() spin_unlock_bh() spin_lock_bh() list_del_init(&req->list); spin_unlock_bh() req->callback = NULL spin_lock_bh() if (!list_empty(&req->list)) // Skipped! // cancel_delayed_work(&req->work); spin_unlock_bh() process_one_req() // again req->callback() // BOOM cancel_delayed_work_sync() The solution is to always cancel the work once it is completed so any in between set_timeout() does not result in it running again. Cc: stable@vger.kernel.org Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence") Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org Reported-by: Dan Aloni Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit a8069b80a1fb99312ffb5699dc148253cd2bcc45 Author: Frederic Barrat Date: Tue Apr 7 13:56:01 2020 +0200 cxl: Rework error message for incompatible slots commit 40ac790d99c6dd16b367d5c2339e446a5f1b0593 upstream. Improve the error message shown if a capi adapter is plugged on a capi-incompatible slot directly under the PHB (no intermediate switch). Fixes: 5632874311db ("cxl: Add support for POWER9 DD2") Cc: stable@vger.kernel.org # 4.14+ Signed-off-by: Frederic Barrat Reviewed-by: Andrew Donnellan Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200407115601.25453-1-fbarrat@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit 9f9dc704c8cd442b2c93d330b9ae0e3f9c0efddc Author: Jia-Ju Bai Date: Sun Aug 2 21:29:49 2020 +0800 p54: avoid accessing the data mapped to streaming DMA commit 478762855b5ae9f68fa6ead1edf7abada70fcd5f upstream. In p54p_tx(), skb->data is mapped to streaming DMA on line 337: mapping = pci_map_single(..., skb->data, ...); Then skb->data is accessed on line 349: desc->device_addr = ((struct p54_hdr *)skb->data)->req_id; This access may cause data inconsistency between CPU cache and hardware. To fix this problem, ((struct p54_hdr *)skb->data)->req_id is stored in a local variable before DMA mapping, and then the driver accesses this local variable instead of skb->data. Cc: Signed-off-by: Jia-Ju Bai Acked-by: Christian Lamparter Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20200802132949.26788-1-baijiaju@tsinghua.edu.cn Signed-off-by: Greg Kroah-Hartman commit 9f4ef6a90c1b2a97e13d599d0afd5518cd19541b Author: Roberto Sassu Date: Fri Sep 4 11:23:30 2020 +0200 evm: Check size of security.evm before using it commit 455b6c9112eff8d249e32ba165742085678a80a4 upstream. This patch checks the size for the EVM_IMA_XATTR_DIGSIG and EVM_XATTR_PORTABLE_DIGSIG types to ensure that the algorithm is read from the buffer returned by vfs_getxattr_alloc(). Cc: stable@vger.kernel.org # 4.19.x Fixes: 5feeb61183dde ("evm: Allow non-SHA1 digital signatures") Signed-off-by: Roberto Sassu Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit a42b1273af7390001f843285004909527f5ab3f2 Author: Song Liu Date: Thu Sep 10 13:33:14 2020 -0700 bpf: Fix comment for helper bpf_current_task_under_cgroup() commit 1aef5b4391f0c75c0a1523706a7b0311846ee12f upstream. This should be "current" not "skb". Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: Song Liu Signed-off-by: Alexei Starovoitov Cc: Link: https://lore.kernel.org/bpf/20200910203314.70018-1-songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman commit 07d54b8dc56e513fc1175b4e6882cebfffba8ee5 Author: Miklos Szeredi Date: Fri Sep 18 10:36:50 2020 +0200 fuse: fix page dereference after free commit d78092e4937de9ce55edcb4ee4c5e3c707be0190 upstream. After unlock_request() pages from the ap->pages[] array may be put (e.g. by aborting the connection) and the pages can be freed. Prevent use after free by grabbing a reference to the page before calling unlock_request(). The original patch was created by Pradeep P V K. Reported-by: Pradeep P V K Cc: Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 78453a7dbb1a431ff96afb41a23c4439215d1c3f Author: Pali Rohár Date: Fri Oct 9 10:42:44 2020 +0200 ata: ahci: mvebu: Make SATA PHY optional for Armada 3720 commit 45aefe3d2251e4e229d7662052739f96ad1d08d9 upstream. Older ATF does not provide SMC call for SATA phy power on functionality and therefore initialization of ahci_mvebu is failing when older version of ATF is using. In this case phy_power_on() function returns -EOPNOTSUPP. This patch adds a new hflag AHCI_HFLAG_IGN_NOTSUPP_POWER_ON which cause that ahci_platform_enable_phys() would ignore -EOPNOTSUPP errors from phy_power_on() call. It fixes initialization of ahci_mvebu on Espressobin boards where is older Marvell's Arm Trusted Firmware without SMC call for SATA phy power. This is regression introduced in commit 8e18c8e58da64 ("arm64: dts: marvell: armada-3720-espressobin: declare SATA PHY property") where SATA phy was defined and therefore ahci_platform_enable_phys() on Espressobin started failing. Fixes: 8e18c8e58da64 ("arm64: dts: marvell: armada-3720-espressobin: declare SATA PHY property") Signed-off-by: Pali Rohár Tested-by: Tomasz Maciej Nowak Cc: # 5.1+: ea17a0f153af: phy: marvell: comphy: Convert internal SMCC firmware return codes to errno Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 4752a1313463cb03592706baf44f2ba80d56da59 Author: Pali Rohár Date: Wed Sep 2 16:43:44 2020 +0200 PCI: aardvark: Fix initialization with old Marvell's Arm Trusted Firmware commit b0c6ae0f8948a2be6bf4e8b4bbab9ca1343289b6 upstream. Old ATF automatically power on pcie phy and does not provide SMC call for phy power on functionality which leads to aardvark initialization failure: [ 0.330134] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try updating your firmware [ 0.338846] phy phy-d0018300.phy.1: phy poweron failed --> -95 [ 0.344753] advk-pcie d0070000.pcie: Failed to initialize PHY (-95) [ 0.351160] advk-pcie: probe of d0070000.pcie failed with error -95 This patch fixes above failure by ignoring 'not supported' error in aardvark driver. In this case it is expected that phy is already power on. Tested-by: Tomasz Maciej Nowak Link: https://lore.kernel.org/r/20200902144344.16684-3-pali@kernel.org Fixes: 366697018c9a ("PCI: aardvark: Add PHY support") Signed-off-by: Pali Rohár Signed-off-by: Lorenzo Pieralisi Reviewed-by: Rob Herring Cc: # 5.8+: ea17a0f153af: phy: marvell: comphy: Convert internal SMCC firmware return codes to errno Signed-off-by: Greg Kroah-Hartman commit b9cc04b049d85bc1597bfe4512937424b1e60401 Author: Juergen Gross Date: Fri Sep 25 16:07:51 2020 +0200 x86/xen: disable Firmware First mode for correctable memory errors commit d759af38572f97321112a0852353613d18126038 upstream. When running as Xen dom0 the kernel isn't responsible for selecting the error handling mode, this should be handled by the hypervisor. So disable setting FF mode when running as Xen pv guest. Not doing so might result in boot splats like: [ 7.509696] HEST: Enabling Firmware First mode for corrected errors. [ 7.510382] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 2. [ 7.510383] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 3. [ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 4. [ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 5. [ 7.510385] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 6. [ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 7. [ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 8. Reason is that the HEST ACPI table contains the real number of MCA banks, while the hypervisor is emulating only 2 banks for guests. Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20200925140751.31381-1-jgross@suse.com Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit ea4e8cf5072ef8e0d549f2f22c6936249aefa0b3 Author: Thomas Gleixner Date: Mon Oct 12 15:11:47 2020 +0200 x86/traps: Fix #DE Oops message regression commit 5f1ec1fd32252af5130dac23b5542e8e66fe0bcb upstream. The conversion of #DE to the idtentry mechanism introduced a change in the Ooops message which confuses tools which parse crash information in dmesg. Remove the underscore from 'divide_error' to restore previous behaviour. Fixes: 9d06c4027f21 ("x86/entry: Convert Divide Error to IDTENTRY") Reported-by: Dmitry Vyukov Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CACT4Y+bTZFkuZd7+bPArowOv-7Die+WZpfOWnEO_Wgs3U59+oA@mail.gmail.com Signed-off-by: Greg Kroah-Hartman commit 085f6be2fe8896d46b8e3c64d64e003dd5f22bdd Author: Kim Phillips Date: Tue Sep 8 16:47:36 2020 -0500 arch/x86/amd/ibs: Fix re-arming IBS Fetch commit 221bfce5ebbdf72ff08b3bf2510ae81058ee568b upstream. Stephane Eranian found a bug in that IBS' current Fetch counter was not being reset when the driver would write the new value to clear it along with the enable bit set, and found that adding an MSR write that would first disable IBS Fetch would make IBS Fetch reset its current count. Indeed, the PPR for AMD Family 17h Model 31h B0 55803 Rev 0.54 - Sep 12, 2019 states "The periodic fetch counter is set to IbsFetchCnt [...] when IbsFetchEn is changed from 0 to 1." Explicitly set IbsFetchEn to 0 and then to 1 when re-enabling IBS Fetch, so the driver properly resets the internal counter to 0 and IBS Fetch starts counting again. A family 15h machine tested does not have this problem, and the extra wrmsr is also not needed on Family 19h, so only do the extra wrmsr on families 16h through 18h. Reported-by: Stephane Eranian Signed-off-by: Kim Phillips [peterz: optimized] Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Greg Kroah-Hartman commit b4818cfc3f9c69142477242179de35b062d3f33e Author: Gao Xiang Date: Tue Aug 11 15:00:20 2020 +0800 erofs: avoid duplicated permission check for "trusted." xattrs commit d578b46db69d125a654f509bdc9091d84e924dc8 upstream. Don't recheck it since xattr_permission() already checks CAP_SYS_ADMIN capability. Just follow 5d3ce4f70172 ("f2fs: avoid duplicated permission check for "trusted." xattrs") Reported-by: Hongyu Jin [ Gao Xiang: since it could cause some complex Android overlay permission issue as well on android-5.4+, it'd be better to backport to 5.4+ rather than pure cleanup on mainline. ] Cc: # 5.4+ Link: https://lore.kernel.org/r/20200811070020.6339-1-hsiangkao@redhat.com Reviewed-by: Chao Yu Signed-off-by: Gao Xiang Signed-off-by: Greg Kroah-Hartman commit 3a9e7db9a40e57eb3610d956bc330b151c68f62c Author: Leon Romanovsky Date: Mon Oct 26 14:33:27 2020 +0200 net: protect tcf_block_unbind with block lock [ Upstream commit d6535dca28859d8d9ef80894eb287b2ac35a32e8 ] The tcf_block_unbind() expects that the caller will take block->cb_lock before calling it, however the code took RTNL lock and dropped cb_lock instead. This causes to the following kernel panic. WARNING: CPU: 1 PID: 13524 at net/sched/cls_api.c:1488 tcf_block_unbind+0x2db/0x420 Modules linked in: mlx5_ib mlx5_core mlxfw ptp pps_core act_mirred act_tunnel_key cls_flower vxlan ip6_udp_tunnel udp_tunnel dummy sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad ib_ipoib rdma_cm iw_cm ib_cm ib_uverbs ib_core overlay [last unloaded: mlxfw] CPU: 1 PID: 13524 Comm: test-ecmp-add-v Tainted: G W 5.9.0+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:tcf_block_unbind+0x2db/0x420 Code: ff 48 83 c4 40 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8d bc 24 30 01 00 00 be ff ff ff ff e8 7d 7f 70 00 85 c0 0f 85 7b fd ff ff <0f> 0b e9 74 fd ff ff 48 c7 c7 dc 6a 24 84 e8 02 ec fe fe e9 55 fd RSP: 0018:ffff888117d17968 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88812f713c00 RCX: 1ffffffff0848d5b RDX: 0000000000000001 RSI: ffff88814fbc8130 RDI: ffff888107f2b878 RBP: 1ffff11022fa2f3f R08: 0000000000000000 R09: ffffffff84115a87 R10: fffffbfff0822b50 R11: ffff888107f2b898 R12: ffff88814fbc8000 R13: ffff88812f713c10 R14: ffff888117d17a38 R15: ffff88814fbc80c0 FS: 00007f6593d36740(0000) GS:ffff8882a4f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005607a00758f8 CR3: 0000000131aea006 CR4: 0000000000170ea0 Call Trace: tc_block_indr_cleanup+0x3e0/0x5a0 ? tcf_block_unbind+0x420/0x420 ? __mutex_unlock_slowpath+0xe7/0x610 flow_indr_dev_unregister+0x5e2/0x930 ? mlx5e_restore_tunnel+0xdf0/0xdf0 [mlx5_core] ? mlx5e_restore_tunnel+0xdf0/0xdf0 [mlx5_core] ? flow_indr_block_cb_alloc+0x3c0/0x3c0 ? mlx5_db_free+0x37c/0x4b0 [mlx5_core] mlx5e_cleanup_rep_tx+0x8b/0xc0 [mlx5_core] mlx5e_detach_netdev+0xe5/0x120 [mlx5_core] mlx5e_vport_rep_unload+0x155/0x260 [mlx5_core] esw_offloads_disable+0x227/0x2b0 [mlx5_core] mlx5_eswitch_disable_locked.cold+0x38e/0x699 [mlx5_core] mlx5_eswitch_disable+0x94/0xf0 [mlx5_core] mlx5_device_disable_sriov+0x183/0x1f0 [mlx5_core] mlx5_core_sriov_configure+0xfd/0x230 [mlx5_core] sriov_numvfs_store+0x261/0x2f0 ? sriov_drivers_autoprobe_store+0x110/0x110 ? sysfs_file_ops+0x170/0x170 ? sysfs_file_ops+0x117/0x170 ? sysfs_file_ops+0x170/0x170 kernfs_fop_write+0x1ff/0x3f0 ? rcu_read_lock_any_held+0x6e/0x90 vfs_write+0x1f3/0x620 ksys_write+0xf9/0x1d0 ? __x64_sys_read+0xb0/0xb0 ? lockdep_hardirqs_on_prepare+0x273/0x3f0 ? syscall_enter_from_user_mode+0x1d/0x50 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 <...> ---[ end trace bfdd028ada702879 ]--- Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()") Signed-off-by: Leon Romanovsky Link: https://lore.kernel.org/r/20201026123327.1141066-1-leon@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit af5d5b8afd1235517b2ea1590dbf145a458839f4 Author: Tung Nguyen Date: Tue Oct 27 10:24:03 2020 +0700 tipc: fix memory leak caused by tipc_buf_append() [ Upstream commit ceb1eb2fb609c88363e06618b8d4bbf7815a4e03 ] Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()") replaced skb_unshare() with skb_copy() to not reduce the data reference counter of the original skb intentionally. This is not the correct way to handle the cloned skb because it causes memory leak in 2 following cases: 1/ Sending multicast messages via broadcast link The original skb list is cloned to the local skb list for local destination. After that, the data reference counter of each skb in the original list has the value of 2. This causes each skb not to be freed after receiving ACK: tipc_link_advance_transmq() { ... /* release skb */ __skb_unlink(skb, &l->transmq); kfree_skb(skb); <-- memory exists after being freed } 2/ Sending multicast messages via replicast link Similar to the above case, each skb cannot be freed after purging the skb list: tipc_mcast_xmit() { ... __skb_queue_purge(pkts); <-- memory exists after being freed } This commit fixes this issue by using skb_unshare() instead. Besides, to avoid use-after-free error reported by KASAN, the pointer to the fragment is set to NULL before calling skb_unshare() to make sure that the original skb is not freed after freeing the fragment 2 times in case skb_unshare() returns NULL. Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()") Acked-by: Jon Maloy Reported-by: Thang Hoang Ngo Signed-off-by: Tung Nguyen Reviewed-by: Xin Long Acked-by: Cong Wang Link: https://lore.kernel.org/r/20201027032403.1823-1-tung.q.nguyen@dektech.com.au Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 519366f64c272746dd339574e4be6eb5e29d8303 Author: Arjun Roy Date: Fri Oct 23 11:47:09 2020 -0700 tcp: Prevent low rmem stalls with SO_RCVLOWAT. [ Upstream commit 435ccfa894e35e3d4a1799e6ac030e48a7b69ef5 ] With SO_RCVLOWAT, under memory pressure, it is possible to enter a state where: 1. We have not received enough bytes to satisfy SO_RCVLOWAT. 2. We have not entered buffer pressure (see tcp_rmem_pressure()). 3. But, we do not have enough buffer space to accept more packets. In this case, we advertise 0 rwnd (due to #3) but the application does not drain the receive queue (no wakeup because of #1 and #2) so the flow stalls. Modify the heuristic for SO_RCVLOWAT so that, if we are advertising rwnd<=rcv_mss, force a wakeup to prevent a stall. Without this patch, setting tcp_rmem to 6143 and disabling TCP autotune causes a stalled flow. With this patch, no stall occurs. This is with RPC-style traffic with large messages. Fixes: 03f45c883c6f ("tcp: avoid extra wakeups for SO_RCVLOWAT users") Signed-off-by: Arjun Roy Acked-by: Soheil Hassas Yeganeh Acked-by: Neal Cardwell Signed-off-by: Eric Dumazet Link: https://lore.kernel.org/r/20201023184709.217614-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 9ceecfdba70119b77d77f787c80b356f37988da7 Author: Andrew Gabbasov Date: Mon Oct 26 05:21:30 2020 -0500 ravb: Fix bit fields checking in ravb_hwtstamp_get() [ Upstream commit 68b9f0865b1ef545da180c57d54b82c94cb464a4 ] In the function ravb_hwtstamp_get() in ravb_main.c with the existing values for RAVB_RXTSTAMP_TYPE_V2_L2_EVENT (0x2) and RAVB_RXTSTAMP_TYPE_ALL (0x6) if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_V2_L2_EVENT) config.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT; else if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_ALL) config.rx_filter = HWTSTAMP_FILTER_ALL; if the test on RAVB_RXTSTAMP_TYPE_ALL should be true, it will never be reached. This issue can be verified with 'hwtstamp_config' testing program (tools/testing/selftests/net/hwtstamp_config.c). Setting filter type to ALL and subsequent retrieving it gives incorrect value: $ hwtstamp_config eth0 OFF ALL flags = 0 tx_type = OFF rx_filter = ALL $ hwtstamp_config eth0 flags = 0 tx_type = OFF rx_filter = PTP_V2_L2_EVENT Correct this by converting if-else's to switch. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reported-by: Julia Lawall Signed-off-by: Andrew Gabbasov Reviewed-by: Sergei Shtylyov Link: https://lore.kernel.org/r/20201026102130.29368-1-andrew_gabbasov@mentor.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit fa67cc69a8c858c7527aa024f2e2b3fb1460a0c9 Author: Heiner Kallweit Date: Thu Oct 29 10:18:53 2020 +0100 r8169: fix issue with forced threading in combination with shared interrupts [ Upstream commit 2734a24e6e5d18522fbf599135c59b82ec9b2c9e ] As reported by Serge flag IRQF_NO_THREAD causes an error if the interrupt is actually shared and the other driver(s) don't have this flag set. This situation can occur if a PCI(e) legacy interrupt is used in combination with forced threading. There's no good way to deal with this properly, therefore we have to remove flag IRQF_NO_THREAD. For fixing the original forced threading issue switch to napi_schedule(). Fixes: 424a646e072a ("r8169: fix operation under forced interrupt threading") Link: https://www.spinics.net/lists/netdev/msg694960.html Reported-by: Serge Belyshev Signed-off-by: Heiner Kallweit Tested-by: Serge Belyshev Link: https://lore.kernel.org/r/b5b53bfe-35ac-3768-85bf-74d1290cf394@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 62d9cec6f928f984c143f2a5ba8d6b37203e1bd0 Author: Guillaume Nault Date: Mon Oct 26 11:29:45 2020 +0100 net/sched: act_mpls: Add softdep on mpls_gso.ko TCA_MPLS_ACT_PUSH and TCA_MPLS_ACT_MAC_PUSH might be used on gso packets. Such packets will thus require mpls_gso.ko for segmentation. v2: Drop dependency on CONFIG_NET_MPLS_GSO in Kconfig (from Jakub and David). Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC") Signed-off-by: Guillaume Nault Link: https://lore.kernel.org/r/1f6cab15bbd15666795061c55563aaf6a386e90e.1603708007.git.gnault@redhat.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 2bc5d5c373efbe9b6f45793d6b96bbd4687c3d01 Author: Alex Elder Date: Wed Oct 21 20:00:29 2020 -0500 net: ipa: command payloads already mapped [ Upstream commit df833050cced27e1b343cc8bc41f90191b289334 ] IPA transactions describe actions to be performed by the IPA hardware. Three cases use IPA transactions: transmitting a socket buffer; providing a page to receive packet data; and issuing an IPA immediate command. An IPA transaction contains a scatter/gather list (SGL) to hold the set of actions to be performed. We map buffers in the SGL for DMA at the time they are added to the transaction. For skb TX transactions, we fill the SGL with a call to skb_to_sgvec(). Page RX transactions involve a single page pointer, and that is recorded in the SGL with sg_set_page(). In both of these cases we then map the SGL for DMA with a call to dma_map_sg(). Immediate commands are different. The payload for an immediate command comes from a region of coherent DMA memory, which must *not* be mapped for DMA. For that reason, gsi_trans_cmd_add() sort of hand-crafts each SGL entry added to a command transaction. This patch fixes a problem with the code that crafts the SGL entry for an immediate command. Previously a portion of the SGL entry was updated using sg_set_buf(). However this is not valid because it includes a call to virt_to_page() on the buffer, but the command buffer pointer is not a linear address. Since we never actually map the SGL for command transactions, there are very few fields in the SGL we need to fill. Specifically, we only need to record the DMA address and the length, so they can be used by __gsi_trans_commit() to fill a TRE. We additionally need to preserve the SGL flags so for_each_sg() still works. For that we can simply assign a null page pointer for command SGL entries. Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions") Reported-by: Stephen Boyd Tested-by: Stephen Boyd Signed-off-by: Alex Elder Link: https://lore.kernel.org/r/20201022010029.11877-1-elder@linaro.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 1336d288b3538961f074fadc62de75f042cf1a34 Author: Zenghui Yu Date: Fri Oct 23 13:15:50 2020 +0800 net: hns3: Clear the CMDQ registers before unmapping BAR region [ Upstream commit e3364c5ff3ff975b943a7bf47e21a2a4bf20f3fe ] When unbinding the hns3 driver with the HNS3 VF, I got the following kernel panic: [ 265.709989] Unable to handle kernel paging request at virtual address ffff800054627000 [ 265.717928] Mem abort info: [ 265.720740] ESR = 0x96000047 [ 265.723810] EC = 0x25: DABT (current EL), IL = 32 bits [ 265.729126] SET = 0, FnV = 0 [ 265.732195] EA = 0, S1PTW = 0 [ 265.735351] Data abort info: [ 265.738227] ISV = 0, ISS = 0x00000047 [ 265.742071] CM = 0, WnR = 1 [ 265.745055] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000009b54000 [ 265.751753] [ffff800054627000] pgd=0000202ffffff003, p4d=0000202ffffff003, pud=00002020020eb003, pmd=00000020a0dfc003, pte=0000000000000000 [ 265.764314] Internal error: Oops: 96000047 [#1] SMP [ 265.830357] CPU: 61 PID: 20319 Comm: bash Not tainted 5.9.0+ #206 [ 265.836423] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.05 09/18/2019 [ 265.843873] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 265.843890] pc : hclgevf_cmd_uninit+0xbc/0x300 [ 265.861988] lr : hclgevf_cmd_uninit+0xb0/0x300 [ 265.861992] sp : ffff80004c983b50 [ 265.881411] pmr_save: 000000e0 [ 265.884453] x29: ffff80004c983b50 x28: ffff20280bbce500 [ 265.889744] x27: 0000000000000000 x26: 0000000000000000 [ 265.895034] x25: ffff800011a1f000 x24: ffff800011a1fe90 [ 265.900325] x23: ffff0020ce9b00d8 x22: ffff0020ce9b0150 [ 265.905616] x21: ffff800010d70e90 x20: ffff800010d70e90 [ 265.910906] x19: ffff0020ce9b0080 x18: 0000000000000004 [ 265.916198] x17: 0000000000000000 x16: ffff800011ae32e8 [ 265.916201] x15: 0000000000000028 x14: 0000000000000002 [ 265.916204] x13: ffff800011ae32e8 x12: 0000000000012ad8 [ 265.946619] x11: ffff80004c983b50 x10: 0000000000000000 [ 265.951911] x9 : ffff8000115d0888 x8 : 0000000000000000 [ 265.951914] x7 : ffff800011890b20 x6 : c0000000ffff7fff [ 265.951917] x5 : ffff80004c983930 x4 : 0000000000000001 [ 265.951919] x3 : ffffa027eec1b000 x2 : 2b78ccbbff369100 [ 265.964487] x1 : 0000000000000000 x0 : ffff800054627000 [ 265.964491] Call trace: [ 265.964494] hclgevf_cmd_uninit+0xbc/0x300 [ 265.964496] hclgevf_uninit_ae_dev+0x9c/0xe8 [ 265.964501] hnae3_unregister_ae_dev+0xb0/0x130 [ 265.964516] hns3_remove+0x34/0x88 [hns3] [ 266.009683] pci_device_remove+0x48/0xf0 [ 266.009692] device_release_driver_internal+0x114/0x1e8 [ 266.030058] device_driver_detach+0x28/0x38 [ 266.034224] unbind_store+0xd4/0x108 [ 266.037784] drv_attr_store+0x40/0x58 [ 266.041435] sysfs_kf_write+0x54/0x80 [ 266.045081] kernfs_fop_write+0x12c/0x250 [ 266.049076] vfs_write+0xc4/0x248 [ 266.052378] ksys_write+0x74/0xf8 [ 266.055677] __arm64_sys_write+0x24/0x30 [ 266.059584] el0_svc_common.constprop.3+0x84/0x270 [ 266.064354] do_el0_svc+0x34/0xa0 [ 266.067658] el0_svc+0x38/0x40 [ 266.070700] el0_sync_handler+0x8c/0xb0 [ 266.074519] el0_sync+0x140/0x180 It looks like the BAR memory region had already been unmapped before we start clearing CMDQ registers in it, which is pretty bad and the kernel happily kills itself because of a Current EL Data Abort (on arm64). Moving the CMDQ uninitialization a bit early fixes the issue for me. Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR") Signed-off-by: Zenghui Yu Link: https://lore.kernel.org/r/20201023051550.793-1-yuzenghui@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 7fb8fbceb0e396cf51623d2f5d2ef789b0ff1430 Author: Aleksandr Nogikh Date: Wed Oct 28 17:07:31 2020 +0000 netem: fix zero division in tabledist [ Upstream commit eadd1befdd778a1eca57fad058782bd22b4db804 ] Currently it is possible to craft a special netlink RTM_NEWQDISC command that can result in jitter being equal to 0x80000000. It is enough to set the 32 bit jitter to 0x02000000 (it will later be multiplied by 2^6) or just set the 64 bit jitter via TCA_NETEM_JITTER64. This causes an overflow during the generation of uniformly distributed numbers in tabledist(), which in turn leads to division by zero (sigma != 0, but sigma * 2 is 0). The related fragment of code needs 32-bit division - see commit 9b0ed89 ("netem: remove unnecessary 64 bit modulus"), so switching to 64 bit is not an option. Fix the issue by keeping the value of jitter within the range that can be adequately handled by tabledist() - [0;INT_MAX]. As negative std deviation makes no sense, take the absolute value of the passed value and cap it at INT_MAX. Inside tabledist(), switch to unsigned 32 bit arithmetic in order to prevent overflows. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Aleksandr Nogikh Reported-by: syzbot+ec762a6342ad0d3c0d8f@syzkaller.appspotmail.com Acked-by: Stephen Hemminger Link: https://lore.kernel.org/r/20201028170731.1383332-1-aleksandrnogikh@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 25259932e1bb8fe34e047cf087544aeb853652af Author: Ido Schimmel Date: Sat Oct 24 16:37:32 2020 +0300 mlxsw: core: Fix memory leak on module removal [ Upstream commit adc80b6cfedff6dad8b93d46a5ea2775fd5af9ec ] Free the devlink instance during the teardown sequence in the non-reload case to avoid the following memory leak. unreferenced object 0xffff888232895000 (size 2048): comm "modprobe", pid 1073, jiffies 4295568857 (age 164.871s) hex dump (first 32 bytes): 00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de ........"....... 10 50 89 32 82 88 ff ff 10 50 89 32 82 88 ff ff .P.2.....P.2.... backtrace: [<00000000c704e9a6>] __kmalloc+0x13a/0x2a0 [<00000000ee30129d>] devlink_alloc+0xff/0x760 [<0000000092ab3e5d>] 0xffffffffa042e5b0 [<000000004f3f8a31>] 0xffffffffa042f6ad [<0000000092800b4b>] 0xffffffffa0491df3 [<00000000c4843903>] local_pci_probe+0xcb/0x170 [<000000006993ded7>] pci_device_probe+0x2c2/0x4e0 [<00000000a8e0de75>] really_probe+0x2c5/0xf90 [<00000000d42ba75d>] driver_probe_device+0x1eb/0x340 [<00000000bcc95e05>] device_driver_attach+0x294/0x300 [<000000000e2bc177>] __driver_attach+0x167/0x2f0 [<000000007d44cd6e>] bus_for_each_dev+0x148/0x1f0 [<000000003cd5a91e>] driver_attach+0x45/0x60 [<000000000041ce51>] bus_add_driver+0x3b8/0x720 [<00000000f5215476>] driver_register+0x230/0x4e0 [<00000000d79356f5>] __pci_register_driver+0x190/0x200 Fixes: a22712a96291 ("mlxsw: core: Fix devlink unregister flow") Signed-off-by: Ido Schimmel Reported-by: Vadim Pasternak Tested-by: Oleksandr Shamray Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit d6f6e3f978853a180631b541072051509665694d Author: Lijun Pan Date: Tue Oct 27 17:04:56 2020 -0500 ibmvnic: fix ibmvnic_set_mac [ Upstream commit 8fc3672a8ad3e782bac80e979bc2a2c10960cbe9 ] Jakub Kicinski brought up a concern in ibmvnic_set_mac(). ibmvnic_set_mac() does this: ether_addr_copy(adapter->mac_addr, addr->sa_data); if (adapter->state != VNIC_PROBED) rc = __ibmvnic_set_mac(netdev, addr->sa_data); So if state == VNIC_PROBED, the user can assign an invalid address to adapter->mac_addr, and ibmvnic_set_mac() will still return 0. The fix is to validate ethernet address at the beginning of ibmvnic_set_mac(), and move the ether_addr_copy to the case of "adapter->state != VNIC_PROBED". Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Lijun Pan Link: https://lore.kernel.org/r/20201027220456.71450-1-ljp@linux.ibm.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4606d351204369e96de2632d0e9fb7505e5bddb7 Author: Thomas Bogendoerfer Date: Mon Oct 26 11:42:21 2020 +0100 ibmveth: Fix use of ibmveth in a bridge. [ Upstream commit 2ac8af0967aaa2b67cb382727e784900d2f4d0da ] The check for src mac address in ibmveth_is_packet_unsupported is wrong. Commit 6f2275433a2f wanted to shut down messages for loopback packets, but now suppresses bridged frames, which are accepted by the hypervisor otherwise bridging won't work at all. Fixes: 6f2275433a2f ("ibmveth: Detect unsupported packets before sending to the hypervisor") Signed-off-by: Michal Suchanek Signed-off-by: Thomas Bogendoerfer Link: https://lore.kernel.org/r/20201026104221.26570-1-msuchanek@suse.de Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b520e574fdbffdef4bfb166515febfff167eaf24 Author: Masahiro Fujiwara Date: Tue Oct 27 20:48:46 2020 +0900 gtp: fix an use-before-init in gtp_newlink() [ Upstream commit 51467431200b91682b89d31317e35dcbca1469ce ] *_pdp_find() from gtp_encap_recv() would trigger a crash when a peer sends GTP packets while creating new GTP device. RIP: 0010:gtp1_pdp_find.isra.0+0x68/0x90 [gtp] Call Trace: gtp_encap_recv+0xc2/0x2e0 [gtp] ? gtp1_pdp_find.isra.0+0x90/0x90 [gtp] udp_queue_rcv_one_skb+0x1fe/0x530 udp_queue_rcv_skb+0x40/0x1b0 udp_unicast_rcv_skb.isra.0+0x78/0x90 __udp4_lib_rcv+0x5af/0xc70 udp_rcv+0x1a/0x20 ip_protocol_deliver_rcu+0xc5/0x1b0 ip_local_deliver_finish+0x48/0x50 ip_local_deliver+0xe5/0xf0 ? ip_protocol_deliver_rcu+0x1b0/0x1b0 gtp_encap_enable() should be called after gtp_hastable_new() otherwise *_pdp_find() will access the uninitialized hash table. Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Signed-off-by: Masahiro Fujiwara Link: https://lore.kernel.org/r/20201027114846.3924-1-fujiwara.masahiro@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 9921e777a3476be6d8cbd47cec1bb26219732283 Author: Raju Rangoju Date: Fri Oct 23 17:28:52 2020 +0530 cxgb4: set up filter action after rewrites [ Upstream commit 937d8420588421eaa5c7aa5c79b26b42abb288ef ] The current code sets up the filter action field before rewrites are set up. When the action 'switch' is used with rewrites, this may result in initial few packets that get switched out don't have rewrites applied on them. So, make sure filter action is set up along with rewrites or only after everything else is set up for rewrites. Fixes: 12b276fbf6e0 ("cxgb4: add support to create hash filters") Signed-off-by: Raju Rangoju Link: https://lore.kernel.org/r/20201023115852.18262-1-rajur@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b97638e0f3be40e7729780ab6ff109e9d21c41d9 Author: Vinay Kumar Yadav Date: Fri Oct 23 00:35:57 2020 +0530 chelsio/chtls: fix tls record info to user [ Upstream commit 4f3391ce8f5a69e7e6d66d0a3fc654eb6dbdc919 ] chtls_pt_recvmsg() receives a skb with tls header and subsequent skb with data, need to finalize the data copy whenever next skb with tls header is available. but here current tls header is overwritten by next available tls header, ends up corrupting user buffer data. fixing it by finalizing current record whenever next skb contains tls header. v1->v2: - Improved commit message. Fixes: 17a7d24aa89d ("crypto: chtls - generic handling of data and hdr") Signed-off-by: Vinay Kumar Yadav Link: https://lore.kernel.org/r/20201022190556.21308-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit eb592f2ae4786230611360000cd6844f741f8de9 Author: Vinay Kumar Yadav Date: Mon Oct 26 01:12:29 2020 +0530 chelsio/chtls: fix memory leaks in CPL handlers [ Upstream commit 6daa1da4e262b0cd52ef0acc1989ff22b5540264 ] CPL handler functions chtls_pass_open_rpl() and chtls_close_listsrv_rpl() should return CPL_RET_BUF_DONE so that caller function will do skb free to avoid leak. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav Link: https://lore.kernel.org/r/20201025194228.31271-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c3208dec446ac89598e27a1dd4c5443c3417f42b Author: Vinay Kumar Yadav Date: Mon Oct 26 01:05:39 2020 +0530 chelsio/chtls: fix deadlock issue [ Upstream commit 28e9dcd9172028263c8225c15c4e329e08475e89 ] In chtls_pass_establish() we hold child socket lock using bh_lock_sock and we are again trying bh_lock_sock in add_to_reap_list, causing deadlock. Remove bh_lock_sock in add_to_reap_list() as lock is already held. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav Link: https://lore.kernel.org/r/20201025193538.31112-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b334112f20b7f98252936cc89cebbeacbeb575db Author: Vasundhara Volam Date: Mon Oct 26 00:18:21 2020 -0400 bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally. [ Upstream commit 825741b071722f1c8ad692cead562c4b5f5eaa93 ] In the AER or firmware reset flow, if we are in fatal error state or if pci_channel_offline() is true, we don't send any commands to the firmware because the commands will likely not reach the firmware and most commands don't matter much because the firmware is likely to be reset imminently. However, the HWRM_FUNC_RESET command is different and we should always attempt to send it. In the AER flow for example, the .slot_reset() call will trigger this fw command and we need to try to send it to effect the proper reset. Fixes: b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.") Reviewed-by: Edwin Peer Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f739fc7e10725363a5281e9d576647024e3d839e Author: Vasundhara Volam Date: Mon Oct 26 00:18:19 2020 -0400 bnxt_en: Re-write PCI BARs after PCI fatal error. [ Upstream commit f75d9a0aa96721d20011cd5f8c7a24eb32728589 ] When a PCIe fatal error occurs, the internal latched BAR addresses in the chip get reset even though the BAR register values in config space are retained. pci_restore_state() will not rewrite the BAR addresses if the BAR address values are valid, causing the chip's internal BAR addresses to stay invalid. So we need to zero the BAR registers during PCIe fatal error to force pci_restore_state() to restore the BAR addresses. These write cycles to the BAR registers will cause the proper BAR addresses to latch internally. Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.") Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 7fe9514cfe68e75700798a254bb62084fa20e1d9 Author: Vasundhara Volam Date: Mon Oct 26 00:18:18 2020 -0400 bnxt_en: Invoke cancel_delayed_work_sync() for PFs also. [ Upstream commit 631ce27a3006fc0b732bfd589c6df505f62eadd9 ] As part of the commit b148bb238c02 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task()."), cancel_delayed_work_sync() is called only for VFs to fix a possible crash by cancelling any pending delayed work items. It was assumed by mistake that the flush_workqueue() call on the PF would flush delayed work items as well. As flush_workqueue() does not cancel the delayed workqueue, extend the fix for PFs. This fix will avoid the system crash, if there are any pending delayed work items in fw_reset_task() during driver's .remove() call. Unify the workqueue cleanup logic for both PF and VF by calling cancel_work_sync() and cancel_delayed_work_sync() directly in bnxt_remove_one(). Fixes: b148bb238c02 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().") Reviewed-by: Pavan Chebbi Reviewed-by: Andy Gospodarek Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit bfbbfb501e7455a37dc54eb5b02ee2009947dbc9 Author: Vasundhara Volam Date: Mon Oct 26 00:18:17 2020 -0400 bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one(). [ Upstream commit 21d6a11e2cadfb8446265a3efff0e2aad206e15e ] A recent patch has moved the workqueue cleanup logic before calling unregister_netdev() in bnxt_remove_one(). This caused a regression because the workqueue can be restarted if the device is still open. Workqueue cleanup must be done after unregister_netdev(). The workqueue will not restart itself after the device is closed. Call bnxt_cancel_sp_work() after unregister_netdev() and call bnxt_dl_fw_reporters_destroy() after that. This fixes the regession and the original NULL ptr dereference issue. Fixes: b16939b59cc0 ("bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()") Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 0b17de4d67bf6b3a753c66805bfaa4af52e3b931 Author: Michael Chan Date: Mon Oct 26 00:18:20 2020 -0400 bnxt_en: Check abort error state in bnxt_open_nic(). [ Upstream commit a1301f08c5acf992d9c1fafddc84c3a822844b04 ] bnxt_open_nic() is called during configuration changes that require the NIC to be closed and then opened. This call is protected by rtnl_lock. Firmware reset can be happening at the same time. Only critical portions of the entire firmware reset sequence are protected by the rtnl_lock. It is possible that bnxt_open_nic() can be called when the firmware reset sequence is aborting. In that case, bnxt_open_nic() needs to check if the ABORT_ERR flag is set and abort if it is. The configuration change that resulted in the bnxt_open_nic() call will fail but the NIC will be brought to a consistent IF_DOWN state. Without this patch, if bnxt_open_nic() were to continue in this error state, it may crash like this: [ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1648.659768] IP: [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en] [ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0 [ 1648.659813] Oops: 0000 [#1] SMP [ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common [ 1648.660063] tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse [ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G OE ------------ 3.10.0-1152.el7.x86_64 #1 [ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020 [ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000 [ 1648.662409] RIP: 0010:[] [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en] [ 1648.663171] RSP: 0018:ffff94f55df1fba8 EFLAGS: 00010202 [ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000 [ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0 [ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008 [ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0 [ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40 [ 1648.667695] FS: 00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000 [ 1648.668447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0 [ 1648.669966] Call Trace: [ 1648.670730] [] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en] [ 1648.671496] [] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en] [ 1648.672263] [] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en] [ 1648.673031] [] bnxt_open_nic+0x1b/0x50 [bnxt_en] [ 1648.673793] [] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en] [ 1648.674550] [] dev_ethtool+0x1334/0x21a0 [ 1648.675306] [] dev_ioctl+0x1ef/0x5f0 [ 1648.676061] [] sock_do_ioctl+0x4d/0x60 [ 1648.676810] [] sock_ioctl+0x1eb/0x2d0 [ 1648.677548] [] do_vfs_ioctl+0x3a0/0x5b0 [ 1648.678282] [] ? __do_page_fault+0x238/0x500 [ 1648.679016] [] SyS_ioctl+0xa1/0xc0 [ 1648.679745] [] system_call_fastpath+0x25/0x2a [ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7 [ 1648.681986] RIP [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en] [ 1648.682724] RSP [ 1648.683451] CR2: 0000000000000000 Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.") Reviewed-by: Vasundhara Volam Reviewed-by: Pavan Chebbi Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c328793e21fb5ff72775476fdf184931a2671e56 Author: Michael Schaller Date: Fri Sep 25 09:45:02 2020 +0200 efivarfs: Replace invalid slashes with exclamation marks in dentries. commit 336af6a4686d885a067ecea8c3c3dd129ba4fc75 upstream. Without this patch efivarfs_alloc_dentry creates dentries with slashes in their name if the respective EFI variable has slashes in its name. This in turn causes EIO on getdents64, which prevents a complete directory listing of /sys/firmware/efi/efivars/. This patch replaces the invalid shlashes with exclamation marks like kobject_set_name_vargs does for /sys/firmware/efi/vars/ to have consistently named dentries under /sys/firmware/efi/vars/ and /sys/firmware/efi/efivars/. Signed-off-by: Michael Schaller Link: https://lore.kernel.org/r/20200925074502.150448-1-misch@google.com Signed-off-by: Ard Biesheuvel Signed-off-by: dann frazier Signed-off-by: Greg Kroah-Hartman commit 61ececc852746242042da22059e8720317ad1424 Author: Dan Williams Date: Mon Oct 5 20:40:25 2020 -0700 x86/copy_mc: Introduce copy_mc_enhanced_fast_string() commit 5da8e4a658109e3b7e1f45ae672b7c06ac3e7158 upstream. The motivations to go rework memcpy_mcsafe() are that the benefit of doing slow and careful copies is obviated on newer CPUs, and that the current opt-in list of CPUs to instrument recovery is broken relative to those CPUs. There is no need to keep an opt-in list up to date on an ongoing basis if pmem/dax operations are instrumented for recovery by default. With recovery enabled by default the old "mcsafe_key" opt-in to careful copying can be made a "fragile" opt-out. Where the "fragile" list takes steps to not consume poison across cachelines. The discussion with Linus made clear that the current "_mcsafe" suffix was imprecise to a fault. The operations that are needed by pmem/dax are to copy from a source address that might throw #MC to a destination that may write-fault, if it is a user page. So copy_to_user_mcsafe() becomes copy_mc_to_user() to indicate the separate precautions taken on source and destination. copy_mc_to_kernel() is introduced as a non-SMAP version that does not expect write-faults on the destination, but is still prepared to abort with an error code upon taking #MC. The original copy_mc_fragile() implementation had negative performance implications since it did not use the fast-string instruction sequence to perform copies. For this reason copy_mc_to_kernel() fell back to plain memcpy() to preserve performance on platforms that did not indicate the capability to recover from machine check exceptions. However, that capability detection was not architectural and now that some platforms can recover from fast-string consumption of memory errors the memcpy() fallback now causes these more capable platforms to fail. Introduce copy_mc_enhanced_fast_string() as the fast default implementation of copy_mc_to_kernel() and finalize the transition of copy_mc_fragile() to be a platform quirk to indicate 'copy-carefully'. With this in place, copy_mc_to_kernel() is fast and recovery-ready by default regardless of hardware capability. Thanks to Vivek for identifying that copy_user_generic() is not suitable as the copy_mc_to_user() backend since the #MC handler explicitly checks ex_has_fault_handler(). Thanks to the 0day robot for catching a performance bug in the x86/copy_mc_to_user implementation. [ bp: Add the "why" for this change from the 0/2th message, massage. ] Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()") Reported-by: Erwin Tsaur Reported-by: 0day robot Signed-off-by: Dan Williams Signed-off-by: Borislav Petkov Reviewed-by: Tony Luck Tested-by: Erwin Tsaur Cc: Link: https://lkml.kernel.org/r/160195562556.2163339.18063423034951948973.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman commit a092869e0351eae21cbe3f62a3cf254c6f9c000a Author: Dan Williams Date: Mon Oct 5 20:40:16 2020 -0700 x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() commit ec6347bb43395cb92126788a1a5b25302543f815 upstream. In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams Signed-off-by: Borislav Petkov Reviewed-by: Tony Luck Acked-by: Michael Ellerman Cc: Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman commit 18703f749e99352ce5a53781d70b9b5e606b060f Author: Randy Dunlap Date: Fri Aug 21 17:10:27 2020 -0700 x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled commit 035fff1f7aab43e420e0098f0854470a5286fb83 upstream. Fix build error when CONFIG_ACPI is not set/enabled by adding the header file which contains a stub for the function in the build error. ../arch/x86/pci/intel_mid_pci.c: In function ‘intel_mid_pci_init’: ../arch/x86/pci/intel_mid_pci.c:303:2: error: implicit declaration of function ‘acpi_noirq_set’; did you mean ‘acpi_irq_get’? [-Werror=implicit-function-declaration] acpi_noirq_set(); Fixes: a912a7584ec3 ("x86/platform/intel-mid: Move PCI initialization to arch_init()") Link: https://lore.kernel.org/r/ea903917-e51b-4cc9-2680-bc1e36efa026@infradead.org Signed-off-by: Randy Dunlap Signed-off-by: Bjorn Helgaas Reviewed-by: Andy Shevchenko Reviewed-by: Jesse Barnes Acked-by: Thomas Gleixner Cc: stable@vger.kernel.org # v4.16+ Cc: Jacob Pan Cc: Len Brown Cc: Jesse Barnes Cc: Arjan van de Ven Signed-off-by: Greg Kroah-Hartman commit 4b0a9591dd78125550a158c300da0e470e41629f Author: Nick Desaulniers Date: Fri Oct 16 10:53:39 2020 -0700 arm64: link with -z norelro regardless of CONFIG_RELOCATABLE commit 3b92fa7485eba16b05166fddf38ab42f2ff6ab95 upstream. With CONFIG_EXPERT=y, CONFIG_KASAN=y, CONFIG_RANDOMIZE_BASE=n, CONFIG_RELOCATABLE=n, we observe the following failure when trying to link the kernel image with LD=ld.lld: error: section: .exit.data is not contiguous with other relro sections ld.lld defaults to -z relro while ld.bfd defaults to -z norelro. This was previously fixed, but only for CONFIG_RELOCATABLE=y. Fixes: 3bbd3db86470 ("arm64: relocatable: fix inconsistencies in linker script and options") Signed-off-by: Nick Desaulniers Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201016175339.2429280-1-ndesaulniers@google.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit dfaa0f7d0832fdfd915f1bab5887323401d9b021 Author: Marc Zyngier Date: Thu Jul 16 17:11:10 2020 +0100 arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs commit 39533e12063be7f55e3d6ae21ffe067799d542a4 upstream. Commit 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") changed the way we deal with per-CPU errata by only calling the .matches() callback until one CPU is found to be affected. At this point, .matches() stop being called, and .cpu_enable() will be called on all CPUs. This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be mitigated. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") Cc: Reviewed-by: Suzuki K Poulose Signed-off-by: Marc Zyngier Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 0ccd5c2c60e0eba3f9cf624fb6b4d28c4aa8c84a Author: Marc Zyngier Date: Thu Jul 16 17:11:09 2020 +0100 arm64: Run ARCH_WORKAROUND_1 enabling code on all CPUs commit 18fce56134c987e5b4eceddafdbe4b00c07e2ae1 upstream. Commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack thereof") changed the way we deal with ARCH_WORKAROUND_1, by moving most of the enabling code to the .matches() callback. This has the unfortunate effect that the workaround gets only enabled on the first affected CPU, and no other. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack thereof") Cc: Reviewed-by: Suzuki K Poulose Signed-off-by: Marc Zyngier Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 4720b25e4ca38bfd05e4dbf541d5358accf9adf8 Author: Kees Cook Date: Fri Oct 2 10:38:14 2020 -0700 fs/kernel_read_file: Remove FIRMWARE_EFI_EMBEDDED enum commit 06e67b849ab910a49a629445f43edb074153d0eb upstream. The "FIRMWARE_EFI_EMBEDDED" enum is a "where", not a "what". It should not be distinguished separately from just "FIRMWARE", as this confuses the LSMs about what is being loaded. Additionally, there was no actual validation of the firmware contents happening. Fixes: e4c2c0ff00ec ("firmware: Add new platform fallback mechanism and firmware_request_platform()") Signed-off-by: Kees Cook Reviewed-by: Luis Chamberlain Acked-by: Scott Branden Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201002173828.2099543-3-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman commit 8b23af0ef2f7ad76a3b0b734e06244e64f9bd829 Author: Ard Biesheuvel Date: Sat Sep 26 10:52:42 2020 +0200 efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure commit d32de9130f6c79533508e2c7879f18997bfbe2a0 upstream. Currently, on arm64, we abort on any failure from efi_get_random_bytes() other than EFI_NOT_FOUND when it comes to setting the physical seed for KASLR, but ignore such failures when obtaining the seed for virtual KASLR or for early seeding of the kernel's entropy pool via the config table. This is inconsistent, and may lead to unexpected boot failures. So let's permit any failure for the physical seed, and simply report the error code if it does not equal EFI_NOT_FOUND. Cc: # v5.8+ Reported-by: Heinrich Schuchardt Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit 865013fcf4c3c672ee463c7ead62e9f3b89e87d7 Author: Rasmus Villemoes Date: Thu Sep 17 08:56:11 2020 +0200 scripts/setlocalversion: make git describe output more reliable commit 548b8b5168c90c42e88f70fcf041b4ce0b8e7aa8 upstream. When building for an embedded target using Yocto, we're sometimes observing that the version string that gets built into vmlinux (and thus what uname -a reports) differs from the path under /lib/modules/ where modules get installed in the rootfs, but only in the length of the -gabc123def suffix. Hence modprobe always fails. The problem is that Yocto has the concept of "sstate" (shared state), which allows different developers/buildbots/etc. to share build artifacts, based on a hash of all the metadata that went into building that artifact - and that metadata includes all dependencies (e.g. the compiler used etc.). That normally works quite well; usually a clean build (without using any sstate cache) done by one developer ends up being binary identical to a build done on another host. However, one thing that can cause two developers to end up with different builds [and thus make one's vmlinux package incompatible with the other's kernel-dev package], which is not captured by the metadata hashing, is this `git describe`: The output of that can be affected by (1) git version: before 2.11 git defaulted to a minimum of 7, since 2.11 (git.git commit e6c587) the default is dynamic based on the number of objects in the repo (2) hence even if both run the same git version, the output can differ based on how many remotes are being tracked (or just lots of local development branches or plain old garbage) (3) and of course somebody could have a core.abbrev config setting in ~/.gitconfig So in order to avoid `uname -a` output relying on such random details of the build environment which are rather hard to ensure are consistent between developers and buildbots, make sure the abbreviated sha1 always consists of exactly 12 hex characters. That is consistent with the current rule for -stable patches, and is almost always enough to identify the head commit unambigously - in the few cases where it does not, the v5.4.3-00021- prefix would certainly nail it down. Signed-off-by: Rasmus Villemoes Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman commit 6f4c9772e195792501481f87ee48d1edee7b521b Author: Matthew Wilcox (Oracle) Date: Fri Oct 9 13:49:53 2020 +0100 io_uring: Convert advanced XArray uses to the normal API commit 5e2ed8c4f45093698855b1f45cdf43efbf6dd498 upstream. There are no bugs here that I've spotted, it's just easier to use the normal API and there are no performance advantages to using the more verbose advanced API. Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f7b24bee5e6ef939d8c312483cd4ea2648df81d0 Author: Matthew Wilcox (Oracle) Date: Fri Oct 9 13:49:52 2020 +0100 io_uring: Fix XArray usage in io_uring_add_task_file commit 236434c3438c4da3dfbd6aeeab807577b85e951a upstream. The xas_store() wasn't paired with an xas_nomem() loop, so if it couldn't allocate memory using GFP_NOWAIT, it would leak the reference to the file descriptor. Also the node pointed to by the xas could be freed between the call to xas_load() under the rcu_read_lock() and the acquisition of the xa_lock. It's easier to just use the normal xa_load/xa_store interface here. Signed-off-by: Matthew Wilcox (Oracle) [axboe: fix missing assign after alloc, cur_uring -> tctx rename] Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit efce965a49f118521da9a0e60e7c3a91bdce4cd9 Author: Matthew Wilcox (Oracle) Date: Fri Oct 9 13:49:51 2020 +0100 io_uring: Fix use of XArray in __io_uring_files_cancel commit ce765372bc443573d1d339a2bf4995de385dea3a upstream. We have to drop the lock during each iteration, so there's no advantage to using the advanced API. Convert this to a standard xa_for_each() loop. Reported-by: syzbot+27c12725d8ff0bfe1a13@syzkaller.appspotmail.com Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 5ee3fea0c227a89f6423a89e41331485c0a91755 Author: Jens Axboe Date: Thu Oct 8 07:46:52 2020 -0600 io_uring: no need to call xa_destroy() on empty xarray commit ca6484cd308a671811bf39f3119e81966eb476e3 upstream. The kernel test robot reports this lockdep issue: [child1:659] mbind (274) returned ENOSYS, marking as inactive. [child1:659] mq_timedsend (279) returned ENOSYS, marking as inactive. [main] 10175 iterations. [F:7781 S:2344 HI:2397] [ 24.610601] [ 24.610743] ================================ [ 24.611083] WARNING: inconsistent lock state [ 24.611437] 5.9.0-rc7-00017-g0f2122045b9462 #5 Not tainted [ 24.611861] -------------------------------- [ 24.612193] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 24.612660] ksoftirqd/0/7 [HC0[0]:SC1[3]:HE0:SE0] takes: [ 24.613086] f00ed998 (&xa->xa_lock#4){+.?.}-{2:2}, at: xa_destroy+0x43/0xc1 [ 24.613642] {SOFTIRQ-ON-W} state was registered at: [ 24.614024] lock_acquire+0x20c/0x29b [ 24.614341] _raw_spin_lock+0x21/0x30 [ 24.614636] io_uring_add_task_file+0xe8/0x13a [ 24.614987] io_uring_create+0x535/0x6bd [ 24.615297] io_uring_setup+0x11d/0x136 [ 24.615606] __ia32_sys_io_uring_setup+0xd/0xf [ 24.615977] do_int80_syscall_32+0x53/0x6c [ 24.616306] restore_all_switch_stack+0x0/0xb1 [ 24.616677] irq event stamp: 939881 [ 24.616968] hardirqs last enabled at (939880): [<8105592d>] __local_bh_enable_ip+0x13c/0x145 [ 24.617642] hardirqs last disabled at (939881): [<81b6ace3>] _raw_spin_lock_irqsave+0x1b/0x4e [ 24.618321] softirqs last enabled at (939738): [<81b6c7c8>] __do_softirq+0x3f0/0x45a [ 24.618924] softirqs last disabled at (939743): [<81055741>] run_ksoftirqd+0x35/0x61 [ 24.619521] [ 24.619521] other info that might help us debug this: [ 24.620028] Possible unsafe locking scenario: [ 24.620028] [ 24.620492] CPU0 [ 24.620685] ---- [ 24.620894] lock(&xa->xa_lock#4); [ 24.621168] [ 24.621381] lock(&xa->xa_lock#4); [ 24.621695] [ 24.621695] *** DEADLOCK *** [ 24.621695] [ 24.622154] 1 lock held by ksoftirqd/0/7: [ 24.622468] #0: 823bfb94 (rcu_callback){....}-{0:0}, at: rcu_process_callbacks+0xc0/0x155 [ 24.623106] [ 24.623106] stack backtrace: [ 24.623454] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.9.0-rc7-00017-g0f2122045b9462 #5 [ 24.624090] Call Trace: [ 24.624284] ? show_stack+0x40/0x46 [ 24.624551] dump_stack+0x1b/0x1d [ 24.624809] print_usage_bug+0x17a/0x185 [ 24.625142] mark_lock+0x11d/0x1db [ 24.625474] ? print_shortest_lock_dependencies+0x121/0x121 [ 24.625905] __lock_acquire+0x41e/0x7bf [ 24.626206] lock_acquire+0x20c/0x29b [ 24.626517] ? xa_destroy+0x43/0xc1 [ 24.626810] ? lock_acquire+0x20c/0x29b [ 24.627110] _raw_spin_lock_irqsave+0x3e/0x4e [ 24.627450] ? xa_destroy+0x43/0xc1 [ 24.627725] xa_destroy+0x43/0xc1 [ 24.627989] __io_uring_free+0x57/0x71 [ 24.628286] ? get_pid+0x22/0x22 [ 24.628544] __put_task_struct+0xf2/0x163 [ 24.628865] put_task_struct+0x1f/0x2a [ 24.629161] delayed_put_task_struct+0xe2/0xe9 [ 24.629509] rcu_process_callbacks+0x128/0x155 [ 24.629860] __do_softirq+0x1a3/0x45a [ 24.630151] run_ksoftirqd+0x35/0x61 [ 24.630443] smpboot_thread_fn+0x304/0x31a [ 24.630763] kthread+0x124/0x139 [ 24.631016] ? sort_range+0x18/0x18 [ 24.631290] ? kthread_create_worker_on_cpu+0x17/0x17 [ 24.631682] ret_from_fork+0x1c/0x28 which is complaining about xa_destroy() grabbing the xa lock in an IRQ disabling fashion, whereas the io_uring uses cases aren't interrupt safe. This is really an xarray issue, since it should not assume the lock type. But for our use case, since we know the xarray is empty at this point, there's no need to actually call xa_destroy(). So just get rid of it. Fixes: 0f2122045b94 ("io_uring: don't rely on weak ->files references") Reported-by: kernel test robot Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 0ca6ce23f4f6f4d2169fcb47edd641e5c2bfd7f3 Author: Hillf Danton Date: Sat Sep 26 21:26:55 2020 +0800 io-wq: fix use-after-free in io_wq_worker_running commit c4068bf898ddaef791049a366828d9b84b467bda upstream. The smart syzbot has found a reproducer for the following issue: ================================================================== BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline] BUG: KASAN: use-after-free in atomic_inc include/asm-generic/atomic-instrumented.h:240 [inline] BUG: KASAN: use-after-free in io_wqe_inc_running fs/io-wq.c:301 [inline] BUG: KASAN: use-after-free in io_wq_worker_running+0xde/0x110 fs/io-wq.c:613 Write of size 4 at addr ffff8882183db08c by task io_wqe_worker-0/7771 CPU: 0 PID: 7771 Comm: io_wqe_worker-0 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 check_memory_region_inline mm/kasan/generic.c:186 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:192 instrument_atomic_write include/linux/instrumented.h:71 [inline] atomic_inc include/asm-generic/atomic-instrumented.h:240 [inline] io_wqe_inc_running fs/io-wq.c:301 [inline] io_wq_worker_running+0xde/0x110 fs/io-wq.c:613 schedule_timeout+0x148/0x250 kernel/time/timer.c:1879 io_wqe_worker+0x517/0x10e0 fs/io-wq.c:580 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Allocated by task 7768: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461 kmem_cache_alloc_node_trace+0x17b/0x3f0 mm/slab.c:3594 kmalloc_node include/linux/slab.h:572 [inline] kzalloc_node include/linux/slab.h:677 [inline] io_wq_create+0x57b/0xa10 fs/io-wq.c:1064 io_init_wq_offload fs/io_uring.c:7432 [inline] io_sq_offload_start fs/io_uring.c:7504 [inline] io_uring_create fs/io_uring.c:8625 [inline] io_uring_setup+0x1836/0x28e0 fs/io_uring.c:8694 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 21: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kfree+0x10e/0x2b0 mm/slab.c:3756 __io_wq_destroy fs/io-wq.c:1138 [inline] io_wq_destroy+0x2af/0x460 fs/io-wq.c:1146 io_finish_async fs/io_uring.c:6836 [inline] io_ring_ctx_free fs/io_uring.c:7870 [inline] io_ring_exit_work+0x1e4/0x6d0 fs/io_uring.c:7954 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 The buggy address belongs to the object at ffff8882183db000 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 140 bytes inside of 1024-byte region [ffff8882183db000, ffff8882183db400) The buggy address belongs to the page: page:000000009bada22b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2183db flags: 0x57ffe0000000200(slab) raw: 057ffe0000000200 ffffea0008604c48 ffffea00086a8648 ffff8880aa040700 raw: 0000000000000000 ffff8882183db000 0000000100000002 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8882183daf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8882183db000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8882183db080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8882183db100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8882183db180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== which is down to the comment below, /* all workers gone, wq exit can proceed */ if (!nr_workers && refcount_dec_and_test(&wqe->wq->refs)) complete(&wqe->wq->done); because there might be multiple cases of wqe in a wq and we would wait for every worker in every wqe to go home before releasing wq's resources on destroying. To that end, rework wq's refcount by making it independent of the tracking of workers because after all they are two different things, and keeping it balanced when workers come and go. Note the manager kthread, like other workers, now holds a grab to wq during its lifetime. Finally to help destroy wq, check IO_WQ_BIT_EXIT upon creating worker and do nothing for exiting wq. Cc: stable@vger.kernel.org # v5.5+ Reported-by: syzbot+45fa0a195b941764e0f0@syzkaller.appspotmail.com Reported-by: syzbot+9af99580130003da82b1@syzkaller.appspotmail.com Cc: Pavel Begunkov Signed-off-by: Hillf Danton Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 4863be6534250e0adb3236af65f1a7ec92c6b6bd Author: Sebastian Andrzej Siewior Date: Tue Sep 1 10:41:46 2020 +0200 io_wq: Make io_wqe::lock a raw_spinlock_t commit 95da84659226d75698a1ab958be0af21d9cc2a9c upstream. During a context switch the scheduler invokes wq_worker_sleeping() with disabled preemption. Disabling preemption is needed because it protects access to `worker->sleeping'. As an optimisation it avoids invoking schedule() within the schedule path as part of possible wake up (thus preempt_enable_no_resched() afterwards). The io-wq has been added to the mix in the same section with disabled preemption. This breaks on PREEMPT_RT because io_wq_worker_sleeping() acquires a spinlock_t. Also within the schedule() the spinlock_t must be acquired after tsk_is_pi_blocked() otherwise it will block on the sleeping lock again while scheduling out. While playing with `io_uring-bench' I didn't notice a significant latency spike after converting io_wqe::lock to a raw_spinlock_t. The latency was more or less the same. In order to keep the spinlock_t it would have to be moved after the tsk_is_pi_blocked() check which would introduce a branch instruction into the hot path. The lock is used to maintain the `work_list' and wakes one task up at most. Should io_wqe_cancel_pending_work() cause latency spikes, while searching for a specific item, then it would need to drop the lock during iterations. revert_creds() is also invoked under the lock. According to debug cred::non_rcu is 0. Otherwise it should be moved outside of the locked section because put_cred_rcu()->free_uid() acquires a sleeping lock. Convert io_wqe::lock to a raw_spinlock_t.c Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b6a6d1df552bbca5216595ad79f245484b98c5e7 Author: Jens Axboe Date: Fri Sep 18 20:13:06 2020 -0600 io_uring: reference ->nsproxy for file table commands commit 9b8284921513fc1ea57d87777283a59b05862f03 upstream. If we don't get and assign the namespace for the async work, then certain paths just don't work properly (like /dev/stdin, /proc/mounts, etc). Anything that references the current namespace of the given task should be assigned for async work on behalf of that task. Cc: stable@vger.kernel.org # v5.5+ Reported-by: Al Viro Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 511abceaf0a00cb75f13bdc78f210a7b015e0478 Author: Jens Axboe Date: Sun Sep 13 13:09:39 2020 -0600 io_uring: don't rely on weak ->files references commit 0f2122045b946241a9e549c2a76cea54fa58a7ff upstream. Grab actual references to the files_struct. To avoid circular references issues due to this, we add a per-task note that keeps track of what io_uring contexts a task has used. When the tasks execs or exits its assigned files, we cancel requests based on this tracking. With that, we can grab proper references to the files table, and no longer need to rely on stashing away ring_fd and ring_file to check if the ring_fd may have been closed. Cc: stable@vger.kernel.org # v5.5+ Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit fdc84c9bf1316fb427b4d3c6478766b67370628f Author: Jens Axboe Date: Mon Sep 28 13:10:13 2020 -0600 io_uring: enable task/files specific overflow flushing commit e6c8aa9ac33bd7c968af7816240fc081401fddcd upstream. This allows us to selectively flush out pending overflows, depending on the task and/or files_struct being passed in. No intended functional changes in this patch. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 3de61f9bcc1c1c8a272e0ed92c2bc3923f68ce1e Author: Jens Axboe Date: Sat Sep 26 15:05:03 2020 -0600 io_uring: return cancelation status from poll/timeout/files handlers commit 76e1b6427fd8246376a97e3227049d49188dfb9c upstream. Return whether we found and canceled requests or not. This is in preparation for using this information, no functional changes in this patch. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f34e674fbe6dbb1aceeb2c43d917f00b0f7b5002 Author: Jens Axboe Date: Mon Oct 12 11:25:39 2020 -0600 io_uring: unconditionally grab req->task commit e3bc8e9dad7f2f83cc807111d4472164c9210153 upstream. Sometimes we assign a weak reference to it, sometimes we grab a reference to it. Clean this up and make it unconditional, and drop the flag related to tracking this state. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit bf03059892415cd40eeabc0fed13bbcd05ba2602 Author: Jens Axboe Date: Mon Oct 12 11:15:07 2020 -0600 io_uring: stash ctx task reference for SQPOLL commit 2aede0e417db846793c276c7a1bbf7262c8349b0 upstream. We can grab a reference to the task instead of stashing away the task files_struct. This is doable without creating a circular reference between the ring fd and the task itself. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit dd1acc182c85633bebc222d14acaf3f816503f0d Author: Jens Axboe Date: Mon Oct 12 11:03:18 2020 -0600 io_uring: move dropping of files into separate helper commit f573d384456b3025d3f8e58b3eafaeeb0f510784 upstream. No functional changes in this patch, prep patch for grabbing references to the files_struct. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit cecf78cc08906890fe03e2517254c32a2d6a7d33 Author: Jens Axboe Date: Tue Sep 22 08:18:24 2020 -0600 io_uring: allow timeout/poll/files killing to take task into account commit f3606e3a92ddd36299642c78592fc87609abb1f6 upstream. We currently cancel these when the ring exits, and we cancel all of them. This is in preparation for killing only the ones associated with a given task. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 07463d7da99985d8254f6e5c66cd9843037a9bbd Author: Jens Axboe Date: Mon Oct 12 11:53:29 2020 -0600 io_uring: don't run task work on an exiting task commit 6200b0ae4ea28a4bfd8eb434e33e6201b7a6a282 upstream. This isn't safe, and isn't needed either. We are guaranteed that any work we queue is on a live task (and will be run), or it goes to our backup io-wq threads if the task is exiting. Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 6e1f770fbc0aea308be1d5b6f3dffee89d0717ce Author: Saeed Mirzamohammadi Date: Tue Oct 20 13:41:36 2020 +0200 netfilter: nftables_offload: KASAN slab-out-of-bounds Read in nft_flow_rule_create commit 31cc578ae2de19c748af06d859019dced68e325d upstream. This patch fixes the issue due to: BUG: KASAN: slab-out-of-bounds in nft_flow_rule_create+0x622/0x6a2 net/netfilter/nf_tables_offload.c:40 Read of size 8 at addr ffff888103910b58 by task syz-executor227/16244 The error happens when expr->ops is accessed early on before performing the boundary check and after nft_expr_next() moves the expr to go out-of-bounds. This patch checks the boundary condition before expr->ops that fixes the slab-out-of-bounds Read issue. Add nft_expr_more() and use it to fix this problem. Signed-off-by: Saeed Mirzamohammadi Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman