--- linux-riscv-5.4.0.orig/Documentation/ABI/stable/sysfs-driver-mlxreg-io +++ linux-riscv-5.4.0/Documentation/ABI/stable/sysfs-driver-mlxreg-io @@ -29,13 +29,13 @@ The files are read only. -What: /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/jtag_enable +What: /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/cpld3_version Date: November 2018 KernelVersion: 5.0 Contact: Vadim Pasternak Description: These files show with which CPLD versions have been burned - on LED board. + on LED or Gearbox board. The files are read only. @@ -121,6 +121,15 @@ The files are read only. +What: /sys/devices/platform/mlxplat/mlxreg-io/hwmon/hwmon*/cpld4_version +Date: November 2018 +KernelVersion: 5.0 +Contact: Vadim Pasternak +Description: These files show with which CPLD versions have been burned + on LED board. + + The files are read only. + Date: June 2019 KernelVersion: 5.3 Contact: Vadim Pasternak --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/debugfs-aufs +++ linux-riscv-5.4.0/Documentation/ABI/testing/debugfs-aufs @@ -0,0 +1,55 @@ +What: /debug/aufs/si_/ +Date: March 2009 +Contact: J. R. Okajima +Description: + Under /debug/aufs, a directory named si_ is created + per aufs mount, where is a unique id generated + internally. + +What: /debug/aufs/si_/plink +Date: Apr 2013 +Contact: J. R. Okajima +Description: + It has three lines and shows the information about the + pseudo-link. The first line is a single number + representing a number of buckets. The second line is a + number of pseudo-links per buckets (separated by a + blank). The last line is a single number representing a + total number of psedo-links. + When the aufs mount option 'noplink' is specified, it + will show "1\n0\n0\n". + +What: /debug/aufs/si_/xib +Date: March 2009 +Contact: J. R. Okajima +Description: + It shows the consumed blocks by xib (External Inode Number + Bitmap), its block size and file size. + When the aufs mount option 'noxino' is specified, it + will be empty. About XINO files, see the aufs manual. + +What: /debug/aufs/si_/xi0, xi1 ... xiN and xiN-N +Date: March 2009 +Contact: J. R. Okajima +Description: + It shows the consumed blocks by xino (External Inode Number + Translation Table), its link count, block size and file + size. + Due to the file size limit, there may exist multiple + xino files per branch. In this case, "-N" is added to + the filename and it corresponds to the index of the + internal xino array. "-0" is omitted. + When the aufs mount option 'noxino' is specified, Those + entries won't exist. About XINO files, see the aufs + manual. + +What: /debug/aufs/si_/xigen +Date: March 2009 +Contact: J. R. Okajima +Description: + It shows the consumed blocks by xigen (External Inode + Generation Table), its block size and file size. + If CONFIG_AUFS_EXPORT is disabled, this entry will not + be created. + When the aufs mount option 'noxino' is specified, it + will be empty. About XINO files, see the aufs manual. --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/debugfs-hisi-hpre +++ linux-riscv-5.4.0/Documentation/ABI/testing/debugfs-hisi-hpre @@ -0,0 +1,57 @@ +What: /sys/kernel/debug/hisi_hpre//cluster[0-3]/regs +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: Dump debug registers from the HPRE cluster. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//cluster[0-3]/cluster_ctrl +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: Write the HPRE core selection in the cluster into this file, + and then we can read the debug information of the core. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//rdclr_en +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: HPRE cores debug registers read clear control. 1 means enable + register read clear, otherwise 0. Writing to this file has no + functional effect, only enable or disable counters clear after + reading of these registers. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//current_qm +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: One HPRE controller has one PF and multiple VFs, each function + has a QM. Select the QM which below qm refers to. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//regs +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: Dump debug registers from the HPRE. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//qm/qm_regs +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: Dump debug registers from the QM. + Available for PF and VF in host. VF in guest currently only + has one debug register. + +What: /sys/kernel/debug/hisi_hpre//qm/current_q +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: One QM may contain multiple queues. Select specific queue to + show its debug registers in above qm_regs. + Only available for PF. + +What: /sys/kernel/debug/hisi_hpre//qm/clear_enable +Date: Sep 2019 +Contact: linux-crypto@vger.kernel.org +Description: QM debug registers(qm_regs) read clear control. 1 means enable + register read clear, otherwise 0. + Writing to this file has no functional effect, only enable or + disable counters clear after reading of these registers. + Only available for PF. --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/debugfs-hisi-sec +++ linux-riscv-5.4.0/Documentation/ABI/testing/debugfs-hisi-sec @@ -0,0 +1,43 @@ +What: /sys/kernel/debug/hisi_sec//sec_dfx +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: Dump the debug registers of SEC cores. + Only available for PF. + +What: /sys/kernel/debug/hisi_sec//clear_enable +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: Enabling/disabling of clear action after reading + the SEC debug registers. + 0: disable, 1: enable. + Only available for PF, and take no other effect on SEC. + +What: /sys/kernel/debug/hisi_sec//current_qm +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: One SEC controller has one PF and multiple VFs, each function + has a QM. This file can be used to select the QM which below + qm refers to. + Only available for PF. + +What: /sys/kernel/debug/hisi_sec//qm/qm_regs +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: Dump of QM related debug registers. + Available for PF and VF in host. VF in guest currently only + has one debug register. + +What: /sys/kernel/debug/hisi_sec//qm/current_q +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: One QM of SEC may contain multiple queues. Select specific + queue to show its debug registers in above 'qm_regs'. + Only available for PF. + +What: /sys/kernel/debug/hisi_sec//qm/clear_enable +Date: Oct 2019 +Contact: linux-crypto@vger.kernel.org +Description: Enabling/disabling of clear action after reading + the SEC's QM debug registers. + 0: disable, 1: enable. + Only available for PF, and take no other effect on SEC. --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/ima_policy +++ linux-riscv-5.4.0/Documentation/ABI/testing/ima_policy @@ -25,6 +25,7 @@ lsm: [[subj_user=] [subj_role=] [subj_type=] [obj_user=] [obj_role=] [obj_type=]] option: [[appraise_type=]] [template=] [permit_directio] + [appraise_flag=] base: func:= [BPRM_CHECK][MMAP_CHECK][CREDS_CHECK][FILE_CHECK][MODULE_CHECK] [FIRMWARE_CHECK] [KEXEC_KERNEL_CHECK] [KEXEC_INITRAMFS_CHECK] @@ -38,6 +39,9 @@ fowner:= decimal value lsm: are LSM specific option: appraise_type:= [imasig] [imasig|modsig] + appraise_flag:= [check_blacklist] + Currently, blacklist check is only for files signed with appended + signature. template:= name of a defined IMA template type (eg, ima-ng). Only valid when action is "measure". pcr:= decimal value --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/sysfs-aufs +++ linux-riscv-5.4.0/Documentation/ABI/testing/sysfs-aufs @@ -0,0 +1,31 @@ +What: /sys/fs/aufs/si_/ +Date: March 2009 +Contact: J. R. Okajima +Description: + Under /sys/fs/aufs, a directory named si_ is created + per aufs mount, where is a unique id generated + internally. + +What: /sys/fs/aufs/si_/br0, br1 ... brN +Date: March 2009 +Contact: J. R. Okajima +Description: + It shows the abolute path of a member directory (which + is called branch) in aufs, and its permission. + +What: /sys/fs/aufs/si_/brid0, brid1 ... bridN +Date: July 2013 +Contact: J. R. Okajima +Description: + It shows the id of a member directory (which is called + branch) in aufs. + +What: /sys/fs/aufs/si_/xi_path +Date: March 2009 +Contact: J. R. Okajima +Description: + It shows the abolute path of XINO (External Inode Number + Bitmap, Translation Table and Generation Table) file + even if it is the default path. + When the aufs mount option 'noxino' is specified, it + will be empty. About XINO files, see the aufs manual. --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/sysfs-bus-mei +++ linux-riscv-5.4.0/Documentation/ABI/testing/sysfs-bus-mei @@ -4,7 +4,7 @@ Contact: Samuel Ortiz linux-mei@linux.intel.com Description: Stores the same MODALIAS value emitted by uevent - Format: mei::: + Format: mei::: What: /sys/bus/mei/devices/.../name Date: May 2015 --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/sysfs-bus-pci +++ linux-riscv-5.4.0/Documentation/ABI/testing/sysfs-bus-pci @@ -347,3 +347,16 @@ If the device has any Peer-to-Peer memory registered, this file contains a '1' if the memory has been published for use outside the driver that owns the device. + +What: /sys/bus/pci/devices/.../link/clkpm + /sys/bus/pci/devices/.../link/l0s_aspm + /sys/bus/pci/devices/.../link/l1_aspm + /sys/bus/pci/devices/.../link/l1_1_aspm + /sys/bus/pci/devices/.../link/l1_2_aspm + /sys/bus/pci/devices/.../link/l1_1_pcipm + /sys/bus/pci/devices/.../link/l1_2_pcipm +Date: October 2019 +Contact: Heiner Kallweit +Description: If ASPM is supported for an endpoint, these files can be + used to disable or enable the individual power management + states. Write y/1/on to enable, n/0/off to disable. --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/sysfs-class-devfreq +++ linux-riscv-5.4.0/Documentation/ABI/testing/sysfs-class-devfreq @@ -7,6 +7,13 @@ The name of devfreq object denoted as ... is same as the name of device using devfreq. +What: /sys/class/devfreq/.../name +Date: November 2019 +Contact: Chanwoo Choi +Description: + The /sys/class/devfreq/.../name shows the name of device + of the corresponding devfreq object. + What: /sys/class/devfreq/.../governor Date: September 2011 Contact: MyungJoo Ham --- linux-riscv-5.4.0.orig/Documentation/ABI/testing/sysfs-secvar +++ linux-riscv-5.4.0/Documentation/ABI/testing/sysfs-secvar @@ -0,0 +1,46 @@ +What: /sys/firmware/secvar +Date: August 2019 +Contact: Nayna Jain +Description: This directory is created if the POWER firmware supports OS + secureboot, thereby secure variables. It exposes interface + for reading/writing the secure variables + +What: /sys/firmware/secvar/vars +Date: August 2019 +Contact: Nayna Jain +Description: This directory lists all the secure variables that are supported + by the firmware. + +What: /sys/firmware/secvar/format +Date: August 2019 +Contact: Nayna Jain +Description: A string indicating which backend is in use by the firmware. + This determines the format of the variable and the accepted + format of variable updates. + +What: /sys/firmware/secvar/vars/ +Date: August 2019 +Contact: Nayna Jain +Description: Each secure variable is represented as a directory named as + . The variable name is unique and is in ASCII + representation. The data and size can be determined by reading + their respective attribute files. + +What: /sys/firmware/secvar/vars//size +Date: August 2019 +Contact: Nayna Jain +Description: An integer representation of the size of the content of the + variable. In other words, it represents the size of the data. + +What: /sys/firmware/secvar/vars//data +Date: August 2019 +Contact: Nayna Jain h +Description: A read-only file containing the value of the variable. The size + of the file represents the maximum size of the variable data. + +What: /sys/firmware/secvar/vars//update +Date: August 2019 +Contact: Nayna Jain +Description: A write-only file that is used to submit the new value for the + variable. The size of the file represents the maximum size of + the variable data that can be written. --- linux-riscv-5.4.0.orig/Documentation/DMA-attributes.txt +++ linux-riscv-5.4.0/Documentation/DMA-attributes.txt @@ -5,24 +5,6 @@ This document describes the semantics of the DMA attributes that are defined in linux/dma-mapping.h. -DMA_ATTR_WRITE_BARRIER ----------------------- - -DMA_ATTR_WRITE_BARRIER is a (write) barrier attribute for DMA. DMA -to a memory region with the DMA_ATTR_WRITE_BARRIER attribute forces -all pending DMA writes to complete, and thus provides a mechanism to -strictly order DMA from a device across all intervening busses and -bridges. This barrier is not specific to a particular type of -interconnect, it applies to the system as a whole, and so its -implementation must account for the idiosyncrasies of the system all -the way from the DMA device to memory. - -As an example of a situation where DMA_ATTR_WRITE_BARRIER would be -useful, suppose that a device does a DMA write to indicate that data is -ready and available in memory. The DMA of the "completion indication" -could race with data DMA. Mapping the memory used for completion -indications with DMA_ATTR_WRITE_BARRIER would prevent the race. - DMA_ATTR_WEAK_ORDERING ---------------------- --- linux-riscv-5.4.0.orig/Documentation/admin-guide/device-mapper/index.rst +++ linux-riscv-5.4.0/Documentation/admin-guide/device-mapper/index.rst @@ -8,6 +8,7 @@ cache-policies cache delay + dm-clone dm-crypt dm-flakey dm-init --- linux-riscv-5.4.0.orig/Documentation/admin-guide/hw-vuln/mds.rst +++ linux-riscv-5.4.0/Documentation/admin-guide/hw-vuln/mds.rst @@ -265,8 +265,11 @@ ============ ============================================================= -Not specifying this option is equivalent to "mds=full". - +Not specifying this option is equivalent to "mds=full". For processors +that are affected by both TAA (TSX Asynchronous Abort) and MDS, +specifying just "mds=off" without an accompanying "tsx_async_abort=off" +will have no effect as the same mitigation is used for both +vulnerabilities. Mitigation selection guide -------------------------- --- linux-riscv-5.4.0.orig/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst +++ linux-riscv-5.4.0/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst @@ -174,7 +174,10 @@ CPU is not vulnerable to cross-thread TAA attacks. ============ ============================================================= -Not specifying this option is equivalent to "tsx_async_abort=full". +Not specifying this option is equivalent to "tsx_async_abort=full". For +processors that are affected by both TAA and MDS, specifying just +"tsx_async_abort=off" without an accompanying "mds=off" will have no +effect as the same mitigation is used for both vulnerabilities. The kernel command line also allows to control the TSX feature using the parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used --- linux-riscv-5.4.0.orig/Documentation/admin-guide/kernel-parameters.txt +++ linux-riscv-5.4.0/Documentation/admin-guide/kernel-parameters.txt @@ -113,7 +113,7 @@ the GPE dispatcher. This facility can be used to prevent such uncontrolled GPE floodings. - Format: + Format: acpi_no_auto_serialize [HW,ACPI] Disable auto-serialization of AML methods @@ -136,6 +136,10 @@ dynamic table installation which will install SSDT tables to /sys/firmware/acpi/tables/dynamic. + acpi_no_watchdog [HW,ACPI,WDT] + Ignore the ACPI-based watchdog interface (WDAT) and let + a native driver control the watchdog device instead. + acpi_rsdp= [ACPI,EFI,KEXEC] Pass the RSDP address to the kernel, mostly used on machines running EFI runtime service to boot the @@ -680,6 +684,10 @@ 0: default value, disable debugging 1: enable debugging at boot time + cpufreq_driver= [X86] Allow only the named cpu frequency scaling driver + to register. Example: cpufreq_driver=powernow-k8 + Format: { none | STRING } + cpuidle.off=1 [CPU_IDLE] disable the cpuidle sub-system @@ -836,6 +844,18 @@ dump out devices still on the deferred probe list after retrying. + dfltcc= [HW,S390] + Format: { on | off | def_only | inf_only | always } + on: s390 zlib hardware support for compression on + level 1 and decompression (default) + off: No s390 zlib hardware support + def_only: s390 zlib hardware support for deflate + only (compression on level 1) + inf_only: s390 zlib hardware support for inflate + only (decompression) + always: Same as 'on' but ignores the selected compression + level always using hardware support (used for debugging) + dhash_entries= [KNL] Set number of hash buckets for dentry cache. @@ -2473,6 +2493,12 @@ SMT on vulnerable CPUs off - Unconditionally disable MDS mitigation + On TAA-affected machines, mds=off can be prevented by + an active TAA mitigation as both vulnerabilities are + mitigated with the same mechanism so in order to disable + this mitigation, you need to specify tsx_async_abort=off + too. + Not specifying this option is equivalent to mds=full. @@ -3402,6 +3428,12 @@ nomsi [MSI] If the PCI_MSI kernel config parameter is enabled, this kernel boot option can be used to disable the use of MSI interrupts system-wide. + clearmsi [X86] Clears MSI/MSI-X enable bits early in boot + time in order to avoid issues like adapters + screaming irqs and preventing boot progress. + Also, it enforces the PCI Local Bus spec + rule that those bits should be 0 in system reset + events (useful for kexec/kdump cases). noioapicquirk [APIC] Disable all boot interrupt quirks. Safety option to keep boot IRQs enabled. This should never be necessary. @@ -3567,6 +3599,8 @@ even if the platform doesn't give the OS permission to use them. This may cause conflicts if the platform also tries to use these services. + dpc-native Use native PCIe service for DPC only. May + cause conflicts if firmware uses AER or DPC. compat Disable native PCIe services (PME, AER, DPC, PCIe hotplug). @@ -3720,6 +3754,11 @@ before loading. See Documentation/admin-guide/blockdev/ramdisk.rst. + prot_virt= [S390] enable hosting protected virtual machines + isolated from the hypervisor (if hardware supports + that). + Format: + psi= [KNL] Enable or disable pressure stall information tracking. Format: @@ -4931,6 +4970,11 @@ vulnerable to cross-thread TAA attacks. off - Unconditionally disable TAA mitigation + On MDS-affected machines, tsx_async_abort=off can be + prevented by an active MDS mitigation as both vulnerabilities + are mitigated with the same mechanism so in order to disable + this mitigation, you need to specify mds=off too. + Not specifying this option is equivalent to tsx_async_abort=full. On CPUs which are MDS affected and deploy MDS mitigation, TAA mitigation is not @@ -5090,13 +5134,13 @@ Flags is a set of characters, each corresponding to a common usb-storage quirk flag as follows: a = SANE_SENSE (collect more than 18 bytes - of sense data); + of sense data, not on uas); b = BAD_SENSE (don't collect more than 18 - bytes of sense data); + bytes of sense data, not on uas); c = FIX_CAPACITY (decrease the reported device capacity by one sector); d = NO_READ_DISC_INFO (don't use - READ_DISC_INFO command); + READ_DISC_INFO command, not on uas); e = NO_READ_CAPACITY_16 (don't use READ_CAPACITY_16 command); f = NO_REPORT_OPCODES (don't use report opcodes @@ -5111,17 +5155,18 @@ j = NO_REPORT_LUNS (don't use report luns command, uas only); l = NOT_LOCKABLE (don't try to lock and - unlock ejectable media); + unlock ejectable media, not on uas); m = MAX_SECTORS_64 (don't transfer more - than 64 sectors = 32 KB at a time); + than 64 sectors = 32 KB at a time, + not on uas); n = INITIAL_READ10 (force a retry of the - initial READ(10) command); + initial READ(10) command, not on uas); o = CAPACITY_OK (accept the capacity - reported by the device); + reported by the device, not on uas); p = WRITE_CACHE (the device cache is ON - by default); + by default, not on uas); r = IGNORE_RESIDUE (the device reports - bogus residue values); + bogus residue values, not on uas); s = SINGLE_LUN (the device has only one Logical Unit); t = NO_ATA_1X (don't allow ATA(12) and ATA(16) @@ -5130,7 +5175,8 @@ w = NO_WP_DETECT (don't test whether the medium is write-protected). y = ALWAYS_SYNC (issue a SYNCHRONIZE_CACHE - even if the device claims no cache) + even if the device claims no cache, + not on uas) Example: quirks=0419:aaf5:rl,0421:0433:rc user_debug= [KNL,ARM] --- linux-riscv-5.4.0.orig/Documentation/arm64/tagged-address-abi.rst +++ linux-riscv-5.4.0/Documentation/arm64/tagged-address-abi.rst @@ -44,8 +44,15 @@ how the user addresses are used by the kernel: 1. User addresses not accessed by the kernel but used for address space - management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use - of valid tagged pointers in this context is always allowed. + management (e.g. ``mprotect()``, ``madvise()``). The use of valid + tagged pointers in this context is allowed with the exception of + ``brk()``, ``mmap()`` and the ``new_address`` argument to + ``mremap()`` as these have the potential to alias with existing + user addresses. + + NOTE: This behaviour changed in v5.6 and so some earlier kernels may + incorrectly accept valid tagged pointers for the ``brk()``, + ``mmap()`` and ``mremap()`` system calls. 2. User addresses accessed by the kernel (e.g. ``write()``). This ABI relaxation is disabled by default and the application thread needs to --- linux-riscv-5.4.0.orig/Documentation/cgroups/namespace.txt +++ linux-riscv-5.4.0/Documentation/cgroups/namespace.txt @@ -0,0 +1,142 @@ + CGroup Namespaces + +CGroup Namespace provides a mechanism to virtualize the view of the +/proc//cgroup file. The CLONE_NEWCGROUP clone-flag can be used with +clone() and unshare() syscalls to create a new cgroup namespace. +The process running inside the cgroup namespace will have its /proc//cgroup +output restricted to cgroupns-root. cgroupns-root is the cgroup of the process +at the time of creation of the cgroup namespace. + +Prior to CGroup Namespace, the /proc//cgroup file used to show complete +path of the cgroup of a process. In a container setup (where a set of cgroups +and namespaces are intended to isolate processes), the /proc//cgroup file +may leak potential system level information to the isolated processes. + +For Example: + $ cat /proc/self/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1 + +The path '/batchjobs/container_id1' can generally be considered as system-data +and its desirable to not expose it to the isolated process. + +CGroup Namespaces can be used to restrict visibility of this path. +For Example: + # Before creating cgroup namespace + $ ls -l /proc/self/ns/cgroup + lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835] + $ cat /proc/self/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1 + + # unshare(CLONE_NEWCGROUP) and exec /bin/bash + $ ~/unshare -c + [ns]$ ls -l /proc/self/ns/cgroup + lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183] + # From within new cgroupns, process sees that its in the root cgroup + [ns]$ cat /proc/self/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/ + + # From global cgroupns: + $ cat /proc//cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1 + + # Unshare cgroupns along with userns and mountns + # Following calls unshare(CLONE_NEWCGROUP|CLONE_NEWUSER|CLONE_NEWNS), then + # sets up uid/gid map and execs /bin/bash + $ ~/unshare -c -u -m + # Originally, we were in /batchjobs/container_id1 cgroup. Mount our own cgroup + # hierarchy. + [ns]$ mount -t cgroup cgroup /tmp/cgroup + [ns]$ ls -l /tmp/cgroup + total 0 + -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.controllers + -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.populated + -rw-r--r-- 1 root root 0 2014-10-13 09:25 cgroup.procs + -rw-r--r-- 1 root root 0 2014-10-13 09:32 cgroup.subtree_control + +The cgroupns-root (/batchjobs/container_id1 in above example) becomes the +filesystem root for the namespace specific cgroupfs mount. + +The virtualization of /proc/self/cgroup file combined with restricting +the view of cgroup hierarchy by namespace-private cgroupfs mount +should provide a completely isolated cgroup view inside the container. + +In its current form, the cgroup namespaces patcheset provides following +behavior: + +(1) The 'cgroupns-root' for a cgroup namespace is the cgroup in which + the process calling unshare is running. + For ex. if a process in /batchjobs/container_id1 cgroup calls unshare, + cgroup /batchjobs/container_id1 becomes the cgroupns-root. + For the init_cgroup_ns, this is the real root ('/') cgroup + (identified in code as cgrp_dfl_root.cgrp). + +(2) The cgroupns-root cgroup does not change even if the namespace + creator process later moves to a different cgroup. + $ ~/unshare -c # unshare cgroupns in some cgroup + [ns]$ cat /proc/self/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/ + [ns]$ mkdir sub_cgrp_1 + [ns]$ echo 0 > sub_cgrp_1/cgroup.procs + [ns]$ cat /proc/self/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1 + +(3) Each process gets its CGROUPNS specific view of /proc//cgroup +(a) Processes running inside the cgroup namespace will be able to see + cgroup paths (in /proc/self/cgroup) only inside their root cgroup + [ns]$ sleep 100000 & # From within unshared cgroupns + [1] 7353 + [ns]$ echo 7353 > sub_cgrp_1/cgroup.procs + [ns]$ cat /proc/7353/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1 + +(b) From global cgroupns, the real cgroup path will be visible: + $ cat /proc/7353/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1/sub_cgrp_1 + +(c) From a sibling cgroupns (cgroupns root-ed at a different cgroup), cgroup + path relative to its own cgroupns-root will be shown: + # ns2's cgroupns-root is at '/batchjobs/container_id2' + [ns2]$ cat /proc/7353/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/../container_id2/sub_cgrp_1 + + Note that the relative path always starts with '/' to indicate that its + relative to the cgroupns-root of the caller. + +(4) Processes inside a cgroupns can move in-and-out of the cgroupns-root + (if they have proper access to external cgroups). + # From inside cgroupns (with cgroupns-root at /batchjobs/container_id1), and + # assuming that the global hierarchy is still accessible inside cgroupns: + $ cat /proc/7353/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1 + $ echo 7353 > batchjobs/container_id2/cgroup.procs + $ cat /proc/7353/cgroup + 0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/../container_id2 + + Note that this kind of setup is not encouraged. A task inside cgroupns + should only be exposed to its own cgroupns hierarchy. Otherwise it makes + the virtualization of /proc//cgroup less useful. + +(5) Setns to another cgroup namespace is allowed when: + (a) the process has CAP_SYS_ADMIN in its current userns + (b) the process has CAP_SYS_ADMIN in the target cgroupns' userns + No implicit cgroup changes happen with attaching to another cgroupns. It + is expected that the somone moves the attaching process under the target + cgroupns-root. + +(6) When some thread from a multi-threaded process unshares its + cgroup-namespace, the new cgroupns gets applied to the entire process (all + the threads). For the unified-hierarchy this is expected as it only allows + process-level containerization. For the legacy hierarchies this may be + unexpected. So all the threads in the process will have the same cgroup. + +(7) The cgroup namespace is alive as long as there is atleast 1 + process inside it. When the last process exits, the cgroup + namespace is destroyed. The cgroupns-root and the actual cgroups + remain though. + +(8) Namespace specific cgroup hierarchy can be mounted by a process running + inside cgroupns: + $ mount -t cgroup -o __DEVEL__sane_behavior cgroup $MOUNT_POINT + + This will mount the unified cgroup hierarchy with cgroupns-root as the + filesystem root. The process needs CAP_SYS_ADMIN in its userns and mntns. --- linux-riscv-5.4.0.orig/Documentation/core-api/printk-formats.rst +++ linux-riscv-5.4.0/Documentation/core-api/printk-formats.rst @@ -508,6 +508,12 @@ Thanks ====== +Kernel messages: + + %pj 123456 + + For generating the jhash of a string truncated to six digits + If you add other %p extensions, please extend with one or more test cases, if at all feasible. --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/Makefile +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/Makefile @@ -12,7 +12,6 @@ $(call if_changed,chk_binding) DT_TMP_SCHEMA := processed-schema.yaml -extra-y += $(DT_TMP_SCHEMA) quiet_cmd_mk_schema = SCHEMA $@ cmd_mk_schema = $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ $(real-prereqs) @@ -26,8 +25,12 @@ DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) +ifeq ($(CHECK_DTBS),) extra-y += $(patsubst $(src)/%.yaml,%.example.dts, $(DT_SCHEMA_FILES)) extra-y += $(patsubst $(src)/%.yaml,%.example.dt.yaml, $(DT_SCHEMA_FILES)) +endif $(obj)/$(DT_TMP_SCHEMA): $(DT_SCHEMA_FILES) FORCE $(call if_changed,mk_schema) + +extra-y += $(DT_TMP_SCHEMA) --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/clock/renesas,rcar-usb2-clock-sel.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/clock/renesas,rcar-usb2-clock-sel.txt @@ -46,7 +46,7 @@ Example (R-Car H3): usb2_clksel: clock-controller@e6590630 { - compatible = "renesas,r8a77950-rcar-usb2-clock-sel", + compatible = "renesas,r8a7795-rcar-usb2-clock-sel", "renesas,rcar-gen3-usb2-clock-sel"; reg = <0 0xe6590630 0 0x02>; clocks = <&cpg CPG_MOD 703>, <&usb_extal>, <&usb_xtal>; --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/iio/adc/adi,ad7606.yaml +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/iio/adc/adi,ad7606.yaml @@ -85,7 +85,7 @@ Must be the device tree identifier of the over-sampling mode pins. As the line is active high, it should be marked GPIO_ACTIVE_HIGH. - maxItems: 1 + maxItems: 3 adi,sw-mode: description: @@ -128,9 +128,9 @@ adi,conversion-start-gpios = <&gpio 17 GPIO_ACTIVE_HIGH>; reset-gpios = <&gpio 27 GPIO_ACTIVE_HIGH>; adi,first-data-gpios = <&gpio 22 GPIO_ACTIVE_HIGH>; - adi,oversampling-ratio-gpios = <&gpio 18 GPIO_ACTIVE_HIGH - &gpio 23 GPIO_ACTIVE_HIGH - &gpio 26 GPIO_ACTIVE_HIGH>; + adi,oversampling-ratio-gpios = <&gpio 18 GPIO_ACTIVE_HIGH>, + <&gpio 23 GPIO_ACTIVE_HIGH>, + <&gpio 26 GPIO_ACTIVE_HIGH>; standby-gpios = <&gpio 24 GPIO_ACTIVE_LOW>; adi,sw-mode; }; --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/net/fsl-fman.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/net/fsl-fman.txt @@ -110,6 +110,13 @@ Usage: required Definition: See soc/fsl/qman.txt and soc/fsl/bman.txt +- fsl,erratum-a050385 + Usage: optional + Value type: boolean + Definition: A boolean property. Indicates the presence of the + erratum A050385 which indicates that DMA transactions that are + split can result in a FMan lock. + ============================================================================= FMan MURAM Node --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -347,6 +347,7 @@ - st,spear600-gmac then: + properties: snps,tso: $ref: /schemas/types.yaml#definitions/flag description: --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt @@ -81,6 +81,12 @@ Definition: Name of external front end module used. Some valid FEM names for example: "microsemi-lx5586", "sky85703-11" and "sky85803" etc. +- qcom,snoc-host-cap-8bit-quirk: + Usage: Optional + Value type: + Definition: Quirk specifying that the firmware expects the 8bit version + of the host capability QMI request + Example (to supply PCI based wifi block details): --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/reset/brcm,brcmstb-reset.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/reset/brcm,brcmstb-reset.txt @@ -22,6 +22,6 @@ }; ðernet_switch { - resets = <&reset>; + resets = <&reset 26>; reset-names = "switch"; }; --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/rng/omap3_rom_rng.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/rng/omap3_rom_rng.txt @@ -0,0 +1,27 @@ +OMAP ROM RNG driver binding + +Secure SoCs may provide RNG via secure ROM calls like Nokia N900 does. The +implementation can depend on the SoC secure ROM used. + +- compatible: + Usage: required + Value type: + Definition: must be "nokia,n900-rom-rng" + +- clocks: + Usage: required + Value type: + Definition: reference to the the RNG interface clock + +- clock-names: + Usage: required + Value type: + Definition: must be "ick" + +Example: + + rom_rng: rng { + compatible = "nokia,n900-rom-rng"; + clocks = <&rng_ick>; + clock-names = "ick"; + }; --- linux-riscv-5.4.0.orig/Documentation/devicetree/bindings/sound/mt8183-mt6358-ts3a227-max98357.txt +++ linux-riscv-5.4.0/Documentation/devicetree/bindings/sound/mt8183-mt6358-ts3a227-max98357.txt @@ -2,9 +2,11 @@ Required properties: - compatible : "mediatek,mt8183_mt6358_ts3a227_max98357" -- mediatek,headset-codec: the phandles of ts3a227 codecs - mediatek,platform: the phandle of MT8183 ASoC platform +Optional properties: +- mediatek,headset-codec: the phandles of ts3a227 codecs + Example: sound { --- linux-riscv-5.4.0.orig/Documentation/devicetree/writing-schema.rst +++ linux-riscv-5.4.0/Documentation/devicetree/writing-schema.rst @@ -130,11 +130,13 @@ make dt_binding_check -In order to perform validation of DT source files, use the `dtbs_check` target:: +In order to perform validation of DT source files, use the ``dtbs_check`` target:: make dtbs_check -This will first run the `dt_binding_check` which generates the processed schema. +Note that ``dtbs_check`` will skip any binding schema files with errors. It is +necessary to use ``dt_binding_check`` to get all the validation errors in the +binding schema files. It is also possible to run checks with a single schema file by setting the ``DT_SCHEMA_FILES`` variable to a specific schema file. --- linux-riscv-5.4.0.orig/Documentation/fb/fbcon.rst +++ linux-riscv-5.4.0/Documentation/fb/fbcon.rst @@ -127,7 +127,7 @@ is typically located on the same video card. Thus, the consoles that are controlled by the VGA console will be garbled. -4. fbcon=rotate: +5. fbcon=rotate: This option changes the orientation angle of the console display. The value 'n' accepts the following: @@ -152,21 +152,21 @@ Actually, the underlying fb driver is totally ignorant of console rotation. -5. fbcon=margin: +6. fbcon=margin: This option specifies the color of the margins. The margins are the leftover area at the right and the bottom of the screen that are not used by text. By default, this area will be black. The 'color' value is an integer number that depends on the framebuffer driver being used. -6. fbcon=nodefer +7. fbcon=nodefer If the kernel is compiled with deferred fbcon takeover support, normally the framebuffer contents, left in place by the firmware/bootloader, will be preserved until there actually is some text is output to the console. This option causes fbcon to bind immediately to the fbdev device. -7. fbcon=logo-pos: +8. fbcon=logo-pos: The only possible 'location' is 'center' (without quotes), and when given, the bootup logo is moved from the default top-left corner --- linux-riscv-5.4.0.orig/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt +++ linux-riscv-5.4.0/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt @@ -24,7 +24,7 @@ | parisc: | ok | | powerpc: | ok | | riscv: | TODO | - | s390: | TODO | + | s390: | ok | | sh: | TODO | | sparc: | TODO | | um: | TODO | --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/README +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/README @@ -0,0 +1,401 @@ + +Aufs5 -- advanced multi layered unification filesystem version 5.x +http://aufs.sf.net +Junjiro R. Okajima + + +0. Introduction +---------------------------------------- +In the early days, aufs was entirely re-designed and re-implemented +Unionfs Version 1.x series. Adding many original ideas, approaches, +improvements and implementations, it became totally different from +Unionfs while keeping the basic features. +Later, Unionfs Version 2.x series began taking some of the same +approaches to aufs1's. +Unionfs was being developed by Professor Erez Zadok at Stony Brook +University and his team. + +Aufs5 supports linux-v5.0 and later, If you want older kernel version +support, +- for linux-v4.x series, try aufs4-linux.git or aufs4-standalone.git +- for linux-v3.x series, try aufs3-linux.git or aufs3-standalone.git +- for linux-v2.6.16 and later, try aufs2-2.6.git, aufs2-standalone.git + or aufs1 from CVS on SourceForge. + +Note: it becomes clear that "Aufs was rejected. Let's give it up." + According to Christoph Hellwig, linux rejects all union-type + filesystems but UnionMount. + + +PS. Al Viro seems have a plan to merge aufs as well as overlayfs and + UnionMount, and he pointed out an issue around a directory mutex + lock and aufs addressed it. But it is still unsure whether aufs will + be merged (or any other union solution). + + + +1. Features +---------------------------------------- +- unite several directories into a single virtual filesystem. The member + directory is called as a branch. +- you can specify the permission flags to the branch, which are 'readonly', + 'readwrite' and 'whiteout-able.' +- by upper writable branch, internal copyup and whiteout, files/dirs on + readonly branch are modifiable logically. +- dynamic branch manipulation, add, del. +- etc... + +Also there are many enhancements in aufs, such as: +- test only the highest one for the directory permission (dirperm1) +- copyup on open (coo=) +- 'move' policy for copy-up between two writable branches, after + checking free space. +- xattr, acl +- readdir(3) in userspace. +- keep inode number by external inode number table +- keep the timestamps of file/dir in internal copyup operation +- seekable directory, supporting NFS readdir. +- whiteout is hardlinked in order to reduce the consumption of inodes + on branch +- do not copyup, nor create a whiteout when it is unnecessary +- revert a single systemcall when an error occurs in aufs +- remount interface instead of ioctl +- maintain /etc/mtab by an external command, /sbin/mount.aufs. +- loopback mounted filesystem as a branch +- kernel thread for removing the dir who has a plenty of whiteouts +- support copyup sparse file (a file which has a 'hole' in it) +- default permission flags for branches +- selectable permission flags for ro branch, whether whiteout can + exist or not +- export via NFS. +- support /fs/aufs and /aufs. +- support multiple writable branches, some policies to select one + among multiple writable branches. +- a new semantics for link(2) and rename(2) to support multiple + writable branches. +- no glibc changes are required. +- pseudo hardlink (hardlink over branches) +- allow a direct access manually to a file on branch, e.g. bypassing aufs. + including NFS or remote filesystem branch. +- userspace wrapper for pathconf(3)/fpathconf(3) with _PC_LINK_MAX. +- and more... + +Currently these features are dropped temporary from aufs5. +See design/08plan.txt in detail. +- nested mount, i.e. aufs as readonly no-whiteout branch of another aufs + (robr) +- statistics of aufs thread (/sys/fs/aufs/stat) + +Features or just an idea in the future (see also design/*.txt), +- reorder the branch index without del/re-add. +- permanent xino files for NFSD +- an option for refreshing the opened files after add/del branches +- light version, without branch manipulation. (unnecessary?) +- copyup in userspace +- inotify in userspace +- readv/writev + + +2. Download +---------------------------------------- +There are three GIT trees for aufs5, aufs5-linux.git, +aufs5-standalone.git, and aufs-util.git. Note that there is no "5" in +"aufs-util.git." +While the aufs-util is always necessary, you need either of aufs5-linux +or aufs5-standalone. + +The aufs5-linux tree includes the whole linux mainline GIT tree, +git://git.kernel.org/.../torvalds/linux.git. +And you cannot select CONFIG_AUFS_FS=m for this version, eg. you cannot +build aufs5 as an external kernel module. +Several extra patches are not included in this tree. Only +aufs5-standalone tree contains them. They are described in the later +section "Configuration and Compilation." + +On the other hand, the aufs5-standalone tree has only aufs source files +and necessary patches, and you can select CONFIG_AUFS_FS=m. +But you need to apply all aufs patches manually. + +You will find GIT branches whose name is in form of "aufs5.x" where "x" +represents the linux kernel version, "linux-5.x". For instance, +"aufs5.0" is for linux-5.0. For latest "linux-5.x-rcN", use +"aufs5.x-rcN" branch. + +o aufs5-linux tree +$ git clone --reference /your/linux/git/tree \ + git://github.com/sfjro/aufs5-linux.git aufs5-linux.git +- if you don't have linux GIT tree, then remove "--reference ..." +$ cd aufs5-linux.git +$ git checkout origin/aufs5.0 + +Or You may want to directly git-pull aufs into your linux GIT tree, and +leave the patch-work to GIT. +$ cd /your/linux/git/tree +$ git remote add aufs5 git://github.com/sfjro/aufs5-linux.git +$ git fetch aufs5 +$ git checkout -b my5.0 v5.0 +$ (add your local change...) +$ git pull aufs5 aufs5.0 +- now you have v5.0 + your_changes + aufs5.0 in you my5.0 branch. +- you may need to solve some conflicts between your_changes and + aufs5.0. in this case, git-rerere is recommended so that you can + solve the similar conflicts automatically when you upgrade to 5.1 or + later in the future. + +o aufs5-standalone tree +$ git clone git://github.com/sfjro/aufs5-standalone.git aufs5-standalone.git +$ cd aufs5-standalone.git +$ git checkout origin/aufs5.0 + +o aufs-util tree +$ git clone git://git.code.sf.net/p/aufs/aufs-util aufs-util.git +- note that the public aufs-util.git is on SourceForge instead of + GitHUB. +$ cd aufs-util.git +$ git checkout origin/aufs5.0 + +Note: The 5.x-rcN branch is to be used with `rc' kernel versions ONLY. +The minor version number, 'x' in '5.x', of aufs may not always +follow the minor version number of the kernel. +Because changes in the kernel that cause the use of a new +minor version number do not always require changes to aufs-util. + +Since aufs-util has its own minor version number, you may not be +able to find a GIT branch in aufs-util for your kernel's +exact minor version number. +In this case, you should git-checkout the branch for the +nearest lower number. + +For (an unreleased) example: +If you are using "linux-5.10" and the "aufs5.10" branch +does not exist in aufs-util repository, then "aufs5.9", "aufs5.8" +or something numerically smaller is the branch for your kernel. + +Also you can view all branches by + $ git branch -a + + +3. Configuration and Compilation +---------------------------------------- +Make sure you have git-checkout'ed the correct branch. + +For aufs5-linux tree, +- enable CONFIG_AUFS_FS. +- set other aufs configurations if necessary. + +For aufs5-standalone tree, +There are several ways to build. + +1. +- apply ./aufs5-kbuild.patch to your kernel source files. +- apply ./aufs5-base.patch too. +- apply ./aufs5-mmap.patch too. +- apply ./aufs5-standalone.patch too, if you have a plan to set + CONFIG_AUFS_FS=m. otherwise you don't need ./aufs5-standalone.patch. +- copy ./{Documentation,fs,include/uapi/linux/aufs_type.h} files to your + kernel source tree. Never copy $PWD/include/uapi/linux/Kbuild. +- enable CONFIG_AUFS_FS, you can select either + =m or =y. +- and build your kernel as usual. +- install the built kernel. +- install the header files too by "make headers_install" to the + directory where you specify. By default, it is $PWD/usr. + "make help" shows a brief note for headers_install. +- and reboot your system. + +2. +- module only (CONFIG_AUFS_FS=m). +- apply ./aufs5-base.patch to your kernel source files. +- apply ./aufs5-mmap.patch too. +- apply ./aufs5-standalone.patch too. +- build your kernel, don't forget "make headers_install", and reboot. +- edit ./config.mk and set other aufs configurations if necessary. + Note: You should read $PWD/fs/aufs/Kconfig carefully which describes + every aufs configurations. +- build the module by simple "make". +- you can specify ${KDIR} make variable which points to your kernel + source tree. +- install the files + + run "make install" to install the aufs module, or copy the built + $PWD/aufs.ko to /lib/modules/... and run depmod -a (or reboot simply). + + run "make install_headers" (instead of headers_install) to install + the modified aufs header file (you can specify DESTDIR which is + available in aufs standalone version's Makefile only), or copy + $PWD/usr/include/linux/aufs_type.h to /usr/include/linux or wherever + you like manually. By default, the target directory is $PWD/usr. +- no need to apply aufs5-kbuild.patch, nor copying source files to your + kernel source tree. + +Note: The header file aufs_type.h is necessary to build aufs-util + as well as "make headers_install" in the kernel source tree. + headers_install is subject to be forgotten, but it is essentially + necessary, not only for building aufs-util. + You may not meet problems without headers_install in some older + version though. + +And then, +- read README in aufs-util, build and install it +- note that your distribution may contain an obsoleted version of + aufs_type.h in /usr/include/linux or something. When you build aufs + utilities, make sure that your compiler refers the correct aufs header + file which is built by "make headers_install." +- if you want to use readdir(3) in userspace or pathconf(3) wrapper, + then run "make install_ulib" too. And refer to the aufs manual in + detail. + +There several other patches in aufs5-standalone.git. They are all +optional. When you meet some problems, they will help you. +- aufs5-loopback.patch + Supports a nested loopback mount in a branch-fs. This patch is + unnecessary until aufs produces a message like "you may want to try + another patch for loopback file". +- proc_mounts.patch + When there are many mountpoints and many mount(2)/umount(2) are + running, then /proc/mounts may not show the all mountpoints. This + patch makes /proc/mounts always show the full mountpoints list. + If you don't want to apply this patch and meet such problem, then you + need to increase the value of 'ProcMounts_Times' make-variable in + aufs-util.git as a second best solution. +- vfs-ino.patch + Modifies a system global kernel internal function get_next_ino() in + order to stop assigning 0 for an inode-number. Not directly related to + aufs, but recommended generally. +- tmpfs-idr.patch + Keeps the tmpfs inode number as the lowest value. Effective to reduce + the size of aufs XINO files for tmpfs branch. Also it prevents the + duplication of inode number, which is important for backup tools and + other utilities. When you find aufs XINO files for tmpfs branch + growing too much, try this patch. +- lockdep-debug.patch + Because aufs is not only an ordinary filesystem (callee of VFS), but + also a caller of VFS functions for branch filesystems, subclassing of + the internal locks for LOCKDEP is necessary. LOCKDEP is a debugging + feature of linux kernel. If you enable CONFIG_LOCKDEP, then you will + need to apply this debug patch to expand several constant values. + If you don't know what LOCKDEP is, then you don't have apply this + patch. + + +4. Usage +---------------------------------------- +At first, make sure aufs-util are installed, and please read the aufs +manual, aufs.5 in aufs-util.git tree. +$ man -l aufs.5 + +And then, +$ mkdir /tmp/rw /tmp/aufs +# mount -t aufs -o br=/tmp/rw:${HOME} none /tmp/aufs + +Here is another example. The result is equivalent. +# mount -t aufs -o br=/tmp/rw=rw:${HOME}=ro none /tmp/aufs + Or +# mount -t aufs -o br:/tmp/rw none /tmp/aufs +# mount -o remount,append:${HOME} /tmp/aufs + +Then, you can see whole tree of your home dir through /tmp/aufs. If +you modify a file under /tmp/aufs, the one on your home directory is +not affected, instead the same named file will be newly created under +/tmp/rw. And all of your modification to a file will be applied to +the one under /tmp/rw. This is called the file based Copy on Write +(COW) method. +Aufs mount options are described in aufs.5. +If you run chroot or something and make your aufs as a root directory, +then you need to customize the shutdown script. See the aufs manual in +detail. + +Additionally, there are some sample usages of aufs which are a +diskless system with network booting, and LiveCD over NFS. +See sample dir in CVS tree on SourceForge. + + +5. Contact +---------------------------------------- +When you have any problems or strange behaviour in aufs, please let me +know with: +- /proc/mounts (instead of the output of mount(8)) +- /sys/module/aufs/* +- /sys/fs/aufs/* (if you have them) +- /debug/aufs/* (if you have them) +- linux kernel version + if your kernel is not plain, for example modified by distributor, + the url where i can download its source is necessary too. +- aufs version which was printed at loading the module or booting the + system, instead of the date you downloaded. +- configuration (define/undefine CONFIG_AUFS_xxx) +- kernel configuration or /proc/config.gz (if you have it) +- LSM (linux security module, if you are using) +- behaviour which you think to be incorrect +- actual operation, reproducible one is better +- mailto: aufs-users at lists.sourceforge.net + +Usually, I don't watch the Public Areas(Bugs, Support Requests, Patches, +and Feature Requests) on SourceForge. Please join and write to +aufs-users ML. + + +6. Acknowledgements +---------------------------------------- +Thanks to everyone who have tried and are using aufs, whoever +have reported a bug or any feedback. + +Especially donators: +Tomas Matejicek(slax.org) made a donation (much more than once). + Since Apr 2010, Tomas M (the author of Slax and Linux Live + scripts) is making "doubling" donations. + Unfortunately I cannot list all of the donators, but I really + appreciate. + It ends Aug 2010, but the ordinary donation URL is still available. + +Dai Itasaka made a donation (2007/8). +Chuck Smith made a donation (2008/4, 10 and 12). +Henk Schoneveld made a donation (2008/9). +Chih-Wei Huang, ASUS, CTC donated Eee PC 4G (2008/10). +Francois Dupoux made a donation (2008/11). +Bruno Cesar Ribas and Luis Carlos Erpen de Bona, C3SL serves public + aufs2 GIT tree (2009/2). +William Grant made a donation (2009/3). +Patrick Lane made a donation (2009/4). +The Mail Archive (mail-archive.com) made donations (2009/5). +Nippy Networks (Ed Wildgoose) made a donation (2009/7). +New Dream Network, LLC (www.dreamhost.com) made a donation (2009/11). +Pavel Pronskiy made a donation (2011/2). +Iridium and Inmarsat satellite phone retailer (www.mailasail.com), Nippy + Networks (Ed Wildgoose) made a donation for hardware (2011/3). +Max Lekomcev (DOM-TV project) made a donation (2011/7, 12, 2012/3, 6 and +11). +Sam Liddicott made a donation (2011/9). +Era Scarecrow made a donation (2013/4). +Bor Ratajc made a donation (2013/4). +Alessandro Gorreta made a donation (2013/4). +POIRETTE Marc made a donation (2013/4). +Alessandro Gorreta made a donation (2013/4). +lauri kasvandik made a donation (2013/5). +"pemasu from Finland" made a donation (2013/7). +The Parted Magic Project made a donation (2013/9 and 11). +Pavel Barta made a donation (2013/10). +Nikolay Pertsev made a donation (2014/5). +James B made a donation (2014/7 and 2015/7). +Stefano Di Biase made a donation (2014/8). +Daniel Epellei made a donation (2015/1). +OmegaPhil made a donation (2016/1, 2018/4). +Tomasz Szewczyk made a donation (2016/4). +James Burry made a donation (2016/12). +Carsten Rose made a donation (2018/9). +Porteus Kiosk made a donation (2018/10). + +Thank you very much. +Donations are always, including future donations, very important and +helpful for me to keep on developing aufs. + + +7. +---------------------------------------- +If you are an experienced user, no explanation is needed. Aufs is +just a linux filesystem. + + +Enjoy! + +# Local variables: ; +# mode: text; +# End: ; --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/01intro.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/01intro.txt @@ -0,0 +1,171 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Introduction +---------------------------------------- + +aufs [ei ju: ef es] | /ey-yoo-ef-es/ | [a u f s] +1. abbrev. for "advanced multi-layered unification filesystem". +2. abbrev. for "another unionfs". +3. abbrev. for "auf das" in German which means "on the" in English. + Ex. "Butter aufs Brot"(G) means "butter onto bread"(E). + But "Filesystem aufs Filesystem" is hard to understand. +4. abbrev. for "African Urban Fashion Show". + +AUFS is a filesystem with features: +- multi layered stackable unification filesystem, the member directory + is called as a branch. +- branch permission and attribute, 'readonly', 'real-readonly', + 'readwrite', 'whiteout-able', 'link-able whiteout', etc. and their + combination. +- internal "file copy-on-write". +- logical deletion, whiteout. +- dynamic branch manipulation, adding, deleting and changing permission. +- allow bypassing aufs, user's direct branch access. +- external inode number translation table and bitmap which maintains the + persistent aufs inode number. +- seekable directory, including NFS readdir. +- file mapping, mmap and sharing pages. +- pseudo-link, hardlink over branches. +- loopback mounted filesystem as a branch. +- several policies to select one among multiple writable branches. +- revert a single systemcall when an error occurs in aufs. +- and more... + + +Multi Layered Stackable Unification Filesystem +---------------------------------------------------------------------- +Most people already knows what it is. +It is a filesystem which unifies several directories and provides a +merged single directory. When users access a file, the access will be +passed/re-directed/converted (sorry, I am not sure which English word is +correct) to the real file on the member filesystem. The member +filesystem is called 'lower filesystem' or 'branch' and has a mode +'readonly' and 'readwrite.' And the deletion for a file on the lower +readonly branch is handled by creating 'whiteout' on the upper writable +branch. + +On LKML, there have been discussions about UnionMount (Jan Blunck, +Bharata B Rao and Valerie Aurora) and Unionfs (Erez Zadok). They took +different approaches to implement the merged-view. +The former tries putting it into VFS, and the latter implements as a +separate filesystem. +(If I misunderstand about these implementations, please let me know and +I shall correct it. Because it is a long time ago when I read their +source files last time). + +UnionMount's approach will be able to small, but may be hard to share +branches between several UnionMount since the whiteout in it is +implemented in the inode on branch filesystem and always +shared. According to Bharata's post, readdir does not seems to be +finished yet. +There are several missing features known in this implementations such as +- for users, the inode number may change silently. eg. copy-up. +- link(2) may break by copy-up. +- read(2) may get an obsoleted filedata (fstat(2) too). +- fcntl(F_SETLK) may be broken by copy-up. +- unnecessary copy-up may happen, for example mmap(MAP_PRIVATE) after + open(O_RDWR). + +In linux-3.18, "overlay" filesystem (formerly known as "overlayfs") was +merged into mainline. This is another implementation of UnionMount as a +separated filesystem. All the limitations and known problems which +UnionMount are equally inherited to "overlay" filesystem. + +Unionfs has a longer history. When I started implementing a stackable +filesystem (Aug 2005), it already existed. It has virtual super_block, +inode, dentry and file objects and they have an array pointing lower +same kind objects. After contributing many patches for Unionfs, I +re-started my project AUFS (Jun 2006). + +In AUFS, the structure of filesystem resembles to Unionfs, but I +implemented my own ideas, approaches and enhancements and it became +totally different one. + +Comparing DM snapshot and fs based implementation +- the number of bytes to be copied between devices is much smaller. +- the type of filesystem must be one and only. +- the fs must be writable, no readonly fs, even for the lower original + device. so the compression fs will not be usable. but if we use + loopback mount, we may address this issue. + for instance, + mount /cdrom/squashfs.img /sq + losetup /sq/ext2.img + losetup /somewhere/cow + dmsetup "snapshot /dev/loop0 /dev/loop1 ..." +- it will be difficult (or needs more operations) to extract the + difference between the original device and COW. +- DM snapshot-merge may help a lot when users try merging. in the + fs-layer union, users will use rsync(1). + +You may want to read my old paper "Filesystems in LiveCD" +(http://aufs.sourceforge.net/aufs2/report/sq/sq.pdf). + + +Several characters/aspects/persona of aufs +---------------------------------------------------------------------- + +Aufs has several characters, aspects or persona. +1. a filesystem, callee of VFS helper +2. sub-VFS, caller of VFS helper for branches +3. a virtual filesystem which maintains persistent inode number +4. reader/writer of files on branches such like an application + +1. Callee of VFS Helper +As an ordinary linux filesystem, aufs is a callee of VFS. For instance, +unlink(2) from an application reaches sys_unlink() kernel function and +then vfs_unlink() is called. vfs_unlink() is one of VFS helper and it +calls filesystem specific unlink operation. Actually aufs implements the +unlink operation but it behaves like a redirector. + +2. Caller of VFS Helper for Branches +aufs_unlink() passes the unlink request to the branch filesystem as if +it were called from VFS. So the called unlink operation of the branch +filesystem acts as usual. As a caller of VFS helper, aufs should handle +every necessary pre/post operation for the branch filesystem. +- acquire the lock for the parent dir on a branch +- lookup in a branch +- revalidate dentry on a branch +- mnt_want_write() for a branch +- vfs_unlink() for a branch +- mnt_drop_write() for a branch +- release the lock on a branch + +3. Persistent Inode Number +One of the most important issue for a filesystem is to maintain inode +numbers. This is particularly important to support exporting a +filesystem via NFS. Aufs is a virtual filesystem which doesn't have a +backend block device for its own. But some storage is necessary to +keep and maintain the inode numbers. It may be a large space and may not +suit to keep in memory. Aufs rents some space from its first writable +branch filesystem (by default) and creates file(s) on it. These files +are created by aufs internally and removed soon (currently) keeping +opened. +Note: Because these files are removed, they are totally gone after + unmounting aufs. It means the inode numbers are not persistent + across unmount or reboot. I have a plan to make them really + persistent which will be important for aufs on NFS server. + +4. Read/Write Files Internally (copy-on-write) +Because a branch can be readonly, when you write a file on it, aufs will +"copy-up" it to the upper writable branch internally. And then write the +originally requested thing to the file. Generally kernel doesn't +open/read/write file actively. In aufs, even a single write may cause a +internal "file copy". This behaviour is very similar to cp(1) command. + +Some people may think it is better to pass such work to user space +helper, instead of doing in kernel space. Actually I am still thinking +about it. But currently I have implemented it in kernel space. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/02struct.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/02struct.txt @@ -0,0 +1,258 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Basic Aufs Internal Structure + +Superblock/Inode/Dentry/File Objects +---------------------------------------------------------------------- +As like an ordinary filesystem, aufs has its own +superblock/inode/dentry/file objects. All these objects have a +dynamically allocated array and store the same kind of pointers to the +lower filesystem, branch. +For example, when you build a union with one readwrite branch and one +readonly, mounted /au, /rw and /ro respectively. +- /au = /rw + /ro +- /ro/fileA exists but /rw/fileA + +Aufs lookup operation finds /ro/fileA and gets dentry for that. These +pointers are stored in a aufs dentry. The array in aufs dentry will be, +- [0] = NULL (because /rw/fileA doesn't exist) +- [1] = /ro/fileA + +This style of an array is essentially same to the aufs +superblock/inode/dentry/file objects. + +Because aufs supports manipulating branches, ie. add/delete/change +branches dynamically, these objects has its own generation. When +branches are changed, the generation in aufs superblock is +incremented. And a generation in other object are compared when it is +accessed. When a generation in other objects are obsoleted, aufs +refreshes the internal array. + + +Superblock +---------------------------------------------------------------------- +Additionally aufs superblock has some data for policies to select one +among multiple writable branches, XIB files, pseudo-links and kobject. +See below in detail. +About the policies which supports copy-down a directory, see +wbr_policy.txt too. + + +Branch and XINO(External Inode Number Translation Table) +---------------------------------------------------------------------- +Every branch has its own xino (external inode number translation table) +file. The xino file is created and unlinked by aufs internally. When two +members of a union exist on the same filesystem, they share the single +xino file. +The struct of a xino file is simple, just a sequence of aufs inode +numbers which is indexed by the lower inode number. +In the above sample, assume the inode number of /ro/fileA is i111 and +aufs assigns the inode number i999 for fileA. Then aufs writes 999 as +4(8) bytes at 111 * 4(8) bytes offset in the xino file. + +When the inode numbers are not contiguous, the xino file will be sparse +which has a hole in it and doesn't consume as much disk space as it +might appear. If your branch filesystem consumes disk space for such +holes, then you should specify 'xino=' option at mounting aufs. + +Aufs has a mount option to free the disk blocks for such holes in XINO +files on tmpfs or ramdisk. But it is not so effective actually. If you +meet a problem of disk shortage due to XINO files, then you should try +"tmpfs-ino.patch" (and "vfs-ino.patch" too) in aufs4-standalone.git. +The patch localizes the assignment inumbers per tmpfs-mount and avoid +the holes in XINO files. + +Also a writable branch has three kinds of "whiteout bases". All these +are existed when the branch is joined to aufs, and their names are +whiteout-ed doubly, so that users will never see their names in aufs +hierarchy. +1. a regular file which will be hardlinked to all whiteouts. +2. a directory to store a pseudo-link. +3. a directory to store an "orphan"-ed file temporary. + +1. Whiteout Base + When you remove a file on a readonly branch, aufs handles it as a + logical deletion and creates a whiteout on the upper writable branch + as a hardlink of this file in order not to consume inode on the + writable branch. +2. Pseudo-link Dir + See below, Pseudo-link. +3. Step-Parent Dir + When "fileC" exists on the lower readonly branch only and it is + opened and removed with its parent dir, and then user writes + something into it, then aufs copies-up fileC to this + directory. Because there is no other dir to store fileC. After + creating a file under this dir, the file is unlinked. + +Because aufs supports manipulating branches, ie. add/delete/change +dynamically, a branch has its own id. When the branch order changes, +aufs finds the new index by searching the branch id. + + +Pseudo-link +---------------------------------------------------------------------- +Assume "fileA" exists on the lower readonly branch only and it is +hardlinked to "fileB" on the branch. When you write something to fileA, +aufs copies-up it to the upper writable branch. Additionally aufs +creates a hardlink under the Pseudo-link Directory of the writable +branch. The inode of a pseudo-link is kept in aufs super_block as a +simple list. If fileB is read after unlinking fileA, aufs returns +filedata from the pseudo-link instead of the lower readonly +branch. Because the pseudo-link is based upon the inode, to keep the +inode number by xino (see above) is essentially necessary. + +All the hardlinks under the Pseudo-link Directory of the writable branch +should be restored in a proper location later. Aufs provides a utility +to do this. The userspace helpers executed at remounting and unmounting +aufs by default. +During this utility is running, it puts aufs into the pseudo-link +maintenance mode. In this mode, only the process which began the +maintenance mode (and its child processes) is allowed to operate in +aufs. Some other processes which are not related to the pseudo-link will +be allowed to run too, but the rest have to return an error or wait +until the maintenance mode ends. If a process already acquires an inode +mutex (in VFS), it has to return an error. + + +XIB(external inode number bitmap) +---------------------------------------------------------------------- +Addition to the xino file per a branch, aufs has an external inode number +bitmap in a superblock object. It is also an internal file such like a +xino file. +It is a simple bitmap to mark whether the aufs inode number is in-use or +not. +To reduce the file I/O, aufs prepares a single memory page to cache xib. + +As well as XINO files, aufs has a feature to truncate/refresh XIB to +reduce the number of consumed disk blocks for these files. + + +Virtual or Vertical Dir, and Readdir in Userspace +---------------------------------------------------------------------- +In order to support multiple layers (branches), aufs readdir operation +constructs a virtual dir block on memory. For readdir, aufs calls +vfs_readdir() internally for each dir on branches, merges their entries +with eliminating the whiteout-ed ones, and sets it to file (dir) +object. So the file object has its entry list until it is closed. The +entry list will be updated when the file position is zero and becomes +obsoleted. This decision is made in aufs automatically. + +The dynamically allocated memory block for the name of entries has a +unit of 512 bytes (by default) and stores the names contiguously (no +padding). Another block for each entry is handled by kmem_cache too. +During building dir blocks, aufs creates hash list and judging whether +the entry is whiteouted by its upper branch or already listed. +The merged result is cached in the corresponding inode object and +maintained by a customizable life-time option. + +Some people may call it can be a security hole or invite DoS attack +since the opened and once readdir-ed dir (file object) holds its entry +list and becomes a pressure for system memory. But I'd say it is similar +to files under /proc or /sys. The virtual files in them also holds a +memory page (generally) while they are opened. When an idea to reduce +memory for them is introduced, it will be applied to aufs too. +For those who really hate this situation, I've developed readdir(3) +library which operates this merging in userspace. You just need to set +LD_PRELOAD environment variable, and aufs will not consume no memory in +kernel space for readdir(3). + + +Workqueue +---------------------------------------------------------------------- +Aufs sometimes requires privilege access to a branch. For instance, +in copy-up/down operation. When a user process is going to make changes +to a file which exists in the lower readonly branch only, and the mode +of one of ancestor directories may not be writable by a user +process. Here aufs copy-up the file with its ancestors and they may +require privilege to set its owner/group/mode/etc. +This is a typical case of a application character of aufs (see +Introduction). + +Aufs uses workqueue synchronously for this case. It creates its own +workqueue. The workqueue is a kernel thread and has privilege. Aufs +passes the request to call mkdir or write (for example), and wait for +its completion. This approach solves a problem of a signal handler +simply. +If aufs didn't adopt the workqueue and changed the privilege of the +process, then the process may receive the unexpected SIGXFSZ or other +signals. + +Also aufs uses the system global workqueue ("events" kernel thread) too +for asynchronous tasks, such like handling inotify/fsnotify, re-creating a +whiteout base and etc. This is unrelated to a privilege. +Most of aufs operation tries acquiring a rw_semaphore for aufs +superblock at the beginning, at the same time waits for the completion +of all queued asynchronous tasks. + + +Whiteout +---------------------------------------------------------------------- +The whiteout in aufs is very similar to Unionfs's. That is represented +by its filename. UnionMount takes an approach of a file mode, but I am +afraid several utilities (find(1) or something) will have to support it. + +Basically the whiteout represents "logical deletion" which stops aufs to +lookup further, but also it represents "dir is opaque" which also stop +further lookup. + +In aufs, rmdir(2) and rename(2) for dir uses whiteout alternatively. +In order to make several functions in a single systemcall to be +revertible, aufs adopts an approach to rename a directory to a temporary +unique whiteouted name. +For example, in rename(2) dir where the target dir already existed, aufs +renames the target dir to a temporary unique whiteouted name before the +actual rename on a branch, and then handles other actions (make it opaque, +update the attributes, etc). If an error happens in these actions, aufs +simply renames the whiteouted name back and returns an error. If all are +succeeded, aufs registers a function to remove the whiteouted unique +temporary name completely and asynchronously to the system global +workqueue. + + +Copy-up +---------------------------------------------------------------------- +It is a well-known feature or concept. +When user modifies a file on a readonly branch, aufs operate "copy-up" +internally and makes change to the new file on the upper writable branch. +When the trigger systemcall does not update the timestamps of the parent +dir, aufs reverts it after copy-up. + + +Move-down (aufs3.9 and later) +---------------------------------------------------------------------- +"Copy-up" is one of the essential feature in aufs. It copies a file from +the lower readonly branch to the upper writable branch when a user +changes something about the file. +"Move-down" is an opposite action of copy-up. Basically this action is +ran manually instead of automatically and internally. +For desgin and implementation, aufs has to consider these issues. +- whiteout for the file may exist on the lower branch. +- ancestor directories may not exist on the lower branch. +- diropq for the ancestor directories may exist on the upper branch. +- free space on the lower branch will reduce. +- another access to the file may happen during moving-down, including + UDBA (see "Revalidate Dentry and UDBA"). +- the file should not be hard-linked nor pseudo-linked. they should be + handled by auplink utility later. + +Sometimes users want to move-down a file from the upper writable branch +to the lower readonly or writable branch. For instance, +- the free space of the upper writable branch is going to run out. +- create a new intermediate branch between the upper and lower branch. +- etc. + +For this purpose, use "aumvdown" command in aufs-util.git. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/03atomic_open.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/03atomic_open.txt @@ -0,0 +1,85 @@ + +# Copyright (C) 2015-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Support for a branch who has its ->atomic_open() +---------------------------------------------------------------------- +The filesystems who implement its ->atomic_open() are not majority. For +example NFSv4 does, and aufs should call NFSv4 ->atomic_open, +particularly for open(O_CREAT|O_EXCL, 0400) case. Other than +->atomic_open(), NFSv4 returns an error for this open(2). While I am not +sure whether all filesystems who have ->atomic_open() behave like this, +but NFSv4 surely returns the error. + +In order to support ->atomic_open() for aufs, there are a few +approaches. + +A. Introduce aufs_atomic_open() + - calls one of VFS:do_last(), lookup_open() or atomic_open() for + branch fs. +B. Introduce aufs_atomic_open() calling create, open and chmod. this is + an aufs user Pip Cet's approach + - calls aufs_create(), VFS finish_open() and notify_change(). + - pass fake-mode to finish_open(), and then correct the mode by + notify_change(). +C. Extend aufs_open() to call branch fs's ->atomic_open() + - no aufs_atomic_open(). + - aufs_lookup() registers the TID to an aufs internal object. + - aufs_create() does nothing when the matching TID is registered, but + registers the mode. + - aufs_open() calls branch fs's ->atomic_open() when the matching + TID is registered. +D. Extend aufs_open() to re-try branch fs's ->open() with superuser's + credential + - no aufs_atomic_open(). + - aufs_create() registers the TID to an internal object. this info + represents "this process created this file just now." + - when aufs gets EACCES from branch fs's ->open(), then confirm the + registered TID and re-try open() with superuser's credential. + +Pros and cons for each approach. + +A. + - straightforward but highly depends upon VFS internal. + - the atomic behavaiour is kept. + - some of parameters such as nameidata are hard to reproduce for + branch fs. + - large overhead. +B. + - easy to implement. + - the atomic behavaiour is lost. +C. + - the atomic behavaiour is kept. + - dirty and tricky. + - VFS checks whether the file is created correctly after calling + ->create(), which means this approach doesn't work. +D. + - easy to implement. + - the atomic behavaiour is lost. + - to open a file with superuser's credential and give it to a user + process is a bad idea, since the file object keeps the credential + in it. It may affect LSM or something. This approach doesn't work + either. + +The approach A is ideal, but it hard to implement. So here is a +variation of A, which is to be implemented. + +A-1. Introduce aufs_atomic_open() + - calls branch fs ->atomic_open() if exists. otherwise calls + vfs_create() and finish_open(). + - the demerit is that the several checks after branch fs + ->atomic_open() are lost. in the ordinary case, the checks are + done by VFS:do_last(), lookup_open() and atomic_open(). some can + be implemented in aufs, but not all I am afraid. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/03lookup.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/03lookup.txt @@ -0,0 +1,113 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Lookup in a Branch +---------------------------------------------------------------------- +Since aufs has a character of sub-VFS (see Introduction), it operates +lookup for branches as VFS does. It may be a heavy work. But almost all +lookup operation in aufs is the simplest case, ie. lookup only an entry +directly connected to its parent. Digging down the directory hierarchy +is unnecessary. VFS has a function lookup_one_len() for that use, and +aufs calls it. + +When a branch is a remote filesystem, aufs basically relies upon its +->d_revalidate(), also aufs forces the hardest revalidate tests for +them. +For d_revalidate, aufs implements three levels of revalidate tests. See +"Revalidate Dentry and UDBA" in detail. + + +Test Only the Highest One for the Directory Permission (dirperm1 option) +---------------------------------------------------------------------- +Let's try case study. +- aufs has two branches, upper readwrite and lower readonly. + /au = /rw + /ro +- "dirA" exists under /ro, but /rw. and its mode is 0700. +- user invoked "chmod a+rx /au/dirA" +- the internal copy-up is activated and "/rw/dirA" is created and its + permission bits are set to world readable. +- then "/au/dirA" becomes world readable? + +In this case, /ro/dirA is still 0700 since it exists in readonly branch, +or it may be a natively readonly filesystem. If aufs respects the lower +branch, it should not respond readdir request from other users. But user +allowed it by chmod. Should really aufs rejects showing the entries +under /ro/dirA? + +To be honest, I don't have a good solution for this case. So aufs +implements 'dirperm1' and 'nodirperm1' mount options, and leave it to +users. +When dirperm1 is specified, aufs checks only the highest one for the +directory permission, and shows the entries. Otherwise, as usual, checks +every dir existing on all branches and rejects the request. + +As a side effect, dirperm1 option improves the performance of aufs +because the number of permission check is reduced when the number of +branch is many. + + +Revalidate Dentry and UDBA (User's Direct Branch Access) +---------------------------------------------------------------------- +Generally VFS helpers re-validate a dentry as a part of lookup. +0. digging down the directory hierarchy. +1. lock the parent dir by its i_mutex. +2. lookup the final (child) entry. +3. revalidate it. +4. call the actual operation (create, unlink, etc.) +5. unlock the parent dir + +If the filesystem implements its ->d_revalidate() (step 3), then it is +called. Actually aufs implements it and checks the dentry on a branch is +still valid. +But it is not enough. Because aufs has to release the lock for the +parent dir on a branch at the end of ->lookup() (step 2) and +->d_revalidate() (step 3) while the i_mutex of the aufs dir is still +held by VFS. +If the file on a branch is changed directly, eg. bypassing aufs, after +aufs released the lock, then the subsequent operation may cause +something unpleasant result. + +This situation is a result of VFS architecture, ->lookup() and +->d_revalidate() is separated. But I never say it is wrong. It is a good +design from VFS's point of view. It is just not suitable for sub-VFS +character in aufs. + +Aufs supports such case by three level of revalidation which is +selectable by user. +1. Simple Revalidate + Addition to the native flow in VFS's, confirm the child-parent + relationship on the branch just after locking the parent dir on the + branch in the "actual operation" (step 4). When this validation + fails, aufs returns EBUSY. ->d_revalidate() (step 3) in aufs still + checks the validation of the dentry on branches. +2. Monitor Changes Internally by Inotify/Fsnotify + Addition to above, in the "actual operation" (step 4) aufs re-lookup + the dentry on the branch, and returns EBUSY if it finds different + dentry. + Additionally, aufs sets the inotify/fsnotify watch for every dir on branches + during it is in cache. When the event is notified, aufs registers a + function to kernel 'events' thread by schedule_work(). And the + function sets some special status to the cached aufs dentry and inode + private data. If they are not cached, then aufs has nothing to + do. When the same file is accessed through aufs (step 0-3) later, + aufs will detect the status and refresh all necessary data. + In this mode, aufs has to ignore the event which is fired by aufs + itself. +3. No Extra Validation + This is the simplest test and doesn't add any additional revalidation + test, and skip the revalidation in step 4. It is useful and improves + aufs performance when system surely hide the aufs branches from user, + by over-mounting something (or another method). --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/04branch.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/04branch.txt @@ -0,0 +1,74 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Branch Manipulation + +Since aufs supports dynamic branch manipulation, ie. add/remove a branch +and changing its permission/attribute, there are a lot of works to do. + + +Add a Branch +---------------------------------------------------------------------- +o Confirm the adding dir exists outside of aufs, including loopback + mount, and its various attributes. +o Initialize the xino file and whiteout bases if necessary. + See struct.txt. + +o Check the owner/group/mode of the directory + When the owner/group/mode of the adding directory differs from the + existing branch, aufs issues a warning because it may impose a + security risk. + For example, when a upper writable branch has a world writable empty + top directory, a malicious user can create any files on the writable + branch directly, like copy-up and modify manually. If something like + /etc/{passwd,shadow} exists on the lower readonly branch but the upper + writable branch, and the writable branch is world-writable, then a + malicious guy may create /etc/passwd on the writable branch directly + and the infected file will be valid in aufs. + I am afraid it can be a security issue, but aufs can do nothing except + producing a warning. + + +Delete a Branch +---------------------------------------------------------------------- +o Confirm the deleting branch is not busy + To be general, there is one merit to adopt "remount" interface to + manipulate branches. It is to discard caches. At deleting a branch, + aufs checks the still cached (and connected) dentries and inodes. If + there are any, then they are all in-use. An inode without its + corresponding dentry can be alive alone (for example, inotify/fsnotify case). + + For the cached one, aufs checks whether the same named entry exists on + other branches. + If the cached one is a directory, because aufs provides a merged view + to users, as long as one dir is left on any branch aufs can show the + dir to users. In this case, the branch can be removed from aufs. + Otherwise aufs rejects deleting the branch. + + If any file on the deleting branch is opened by aufs, then aufs + rejects deleting. + + +Modify the Permission of a Branch +---------------------------------------------------------------------- +o Re-initialize or remove the xino file and whiteout bases if necessary. + See struct.txt. + +o rw --> ro: Confirm the modifying branch is not busy + Aufs rejects the request if any of these conditions are true. + - a file on the branch is mmap-ed. + - a regular file on the branch is opened for write and there is no + same named entry on the upper branch. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/05wbr_policy.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/05wbr_policy.txt @@ -0,0 +1,64 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Policies to Select One among Multiple Writable Branches +---------------------------------------------------------------------- +When the number of writable branch is more than one, aufs has to decide +the target branch for file creation or copy-up. By default, the highest +writable branch which has the parent (or ancestor) dir of the target +file is chosen (top-down-parent policy). +By user's request, aufs implements some other policies to select the +writable branch, for file creation several policies, round-robin, +most-free-space, and other policies. For copy-up, top-down-parent, +bottom-up-parent, bottom-up and others. + +As expected, the round-robin policy selects the branch in circular. When +you have two writable branches and creates 10 new files, 5 files will be +created for each branch. mkdir(2) systemcall is an exception. When you +create 10 new directories, all will be created on the same branch. +And the most-free-space policy selects the one which has most free +space among the writable branches. The amount of free space will be +checked by aufs internally, and users can specify its time interval. + +The policies for copy-up is more simple, +top-down-parent is equivalent to the same named on in create policy, +bottom-up-parent selects the writable branch where the parent dir +exists and the nearest upper one from the copyup-source, +bottom-up selects the nearest upper writable branch from the +copyup-source, regardless the existence of the parent dir. + +There are some rules or exceptions to apply these policies. +- If there is a readonly branch above the policy-selected branch and + the parent dir is marked as opaque (a variation of whiteout), or the + target (creating) file is whiteout-ed on the upper readonly branch, + then the result of the policy is ignored and the target file will be + created on the nearest upper writable branch than the readonly branch. +- If there is a writable branch above the policy-selected branch and + the parent dir is marked as opaque or the target file is whiteouted + on the branch, then the result of the policy is ignored and the target + file will be created on the highest one among the upper writable + branches who has diropq or whiteout. In case of whiteout, aufs removes + it as usual. +- link(2) and rename(2) systemcalls are exceptions in every policy. + They try selecting the branch where the source exists as possible + since copyup a large file will take long time. If it can't be, + ie. the branch where the source exists is readonly, then they will + follow the copyup policy. +- There is an exception for rename(2) when the target exists. + If the rename target exists, aufs compares the index of the branches + where the source and the target exists and selects the higher + one. If the selected branch is readonly, then aufs follows the + copyup policy. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/06dirren.dot +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/06dirren.dot @@ -0,0 +1,31 @@ + +// to view this graph, run dot(1) command in GRAPHVIZ. + +digraph G { +node [shape=box]; +whinfo [label="detailed info file\n(lower_brid_root-hinum, h_inum, namelen, old name)"]; + +node [shape=oval]; + +aufs_rename -> whinfo [label="store/remove"]; + +node [shape=oval]; +inode_list [label="h_inum list in branch\ncache"]; + +node [shape=box]; +whinode [label="h_inum list file"]; + +node [shape=oval]; +brmgmt [label="br_add/del/mod/umount"]; + +brmgmt -> inode_list [label="create/remove"]; +brmgmt -> whinode [label="load/store"]; + +inode_list -> whinode [style=dashed,dir=both]; + +aufs_rename -> inode_list [label="add/del"]; + +aufs_lookup -> inode_list [label="search"]; + +aufs_lookup -> whinfo [label="load/remove"]; +} --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/06dirren.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/06dirren.txt @@ -0,0 +1,102 @@ + +# Copyright (C) 2017-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Special handling for renaming a directory (DIRREN) +---------------------------------------------------------------------- +First, let's assume we have a simple usecase. + +- /u = /rw + /ro +- /rw/dirA exists +- /ro/dirA and /ro/dirA/file exist too +- there is no dirB on both branches +- a user issues rename("dirA", "dirB") + +Now, what should aufs behave against this rename(2)? +There are a few possible cases. + +A. returns EROFS. + since dirA exists on a readonly branch which cannot be renamed. +B. returns EXDEV. + it is possible to copy-up dirA (only the dir itself), but the child + entries ("file" in this case) should not be. it must be a bad + approach to copy-up recursively. +C. returns a success. + even the branch /ro is readonly, aufs tries renaming it. Obviously it + is a violation of aufs' policy. +D. construct an extra information which indicates that /ro/dirA should + be handled as the name of dirB. + overlayfs has a similar feature called REDIRECT. + +Until now, aufs implements the case B only which returns EXDEV, and +expects the userspace application behaves like mv(1) which tries +issueing rename(2) recursively. + +A new aufs feature called DIRREN is introduced which implements the case +D. There are several "extra information" added. + +1. detailed info per renamed directory + path: /rw/dirB/$AUFS_WH_DR_INFO_PFX. +2. the inode-number list of directories on a branch + path: /rw/dirB/$AUFS_WH_DR_BRHINO + +The filename of "detailed info per directory" represents the lower +branch, and its format is +- a type of the branch id + one of these. + + uuid (not implemented yet) + + fsid + + dev +- the inode-number of the branch root dir + +And it contains these info in a single regular file. +- magic number +- branch's inode-number of the logically renamed dir +- the name of the before-renamed dir + +The "detailed info per directory" file is created in aufs rename(2), and +loaded in any lookup. +The info is considered in lookup for the matching case only. Here +"matching" means that the root of branch (in the info filename) is same +to the current looking-up branch. After looking-up the before-renamed +name, the inode-number is compared. And the matched dentry is used. + +The "inode-number list of directories" is a regular file which contains +simply the inode-numbers on the branch. The file is created or updated +in removing the branch, and loaded in adding the branch. Its lifetime is +equal to the branch. +The list is refered in lookup, and when the current target inode is +found in the list, the aufs tries loading the "detailed info per +directory" and get the changed and valid name of the dir. + +Theoretically these "extra informaiton" may be able to be put into XATTR +in the dir inode. But aufs doesn't choose this way because +1. XATTR may not be supported by the branch (or its configuration) +2. XATTR may have its size limit. +3. XATTR may be less easy to convert than a regular file, when the + format of the info is changed in the future. +At the same time, I agree that the regular file approach is much slower +than XATTR approach. So, in the future, aufs may take the XATTR or other +better approach. + +This DIRREN feature is enabled by aufs configuration, and is activated +by a new mount option. + +For the more complicated case, there is a work with UDBA option, which +is to dected the direct access to the branches (by-passing aufs) and to +maintain the cashes in aufs. Since a single cached aufs dentry may +contains two names, before- and after-rename, the name comparision in +UDBA handler may not work correctly. In this case, the behaviour will be +equivalen to udba=reval case. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/06fhsm.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/06fhsm.txt @@ -0,0 +1,120 @@ + +# Copyright (C) 2011-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + +File-based Hierarchical Storage Management (FHSM) +---------------------------------------------------------------------- +Hierarchical Storage Management (or HSM) is a well-known feature in the +storage world. Aufs provides this feature as file-based with multiple +writable branches, based upon the principle of "Colder, the Lower". +Here the word "colder" means that the less used files, and "lower" means +that the position in the order of the stacked branches vertically. +These multiple writable branches are prioritized, ie. the topmost one +should be the fastest drive and be used heavily. + +o Characters in aufs FHSM story +- aufs itself and a new branch attribute. +- a new ioctl interface to move-down and to establish a connection with + the daemon ("move-down" is a converse of "copy-up"). +- userspace tool and daemon. + +The userspace daemon establishes a connection with aufs and waits for +the notification. The notified information is very similar to struct +statfs containing the number of consumed blocks and inodes. +When the consumed blocks/inodes of a branch exceeds the user-specified +upper watermark, the daemon activates its move-down process until the +consumed blocks/inodes reaches the user-specified lower watermark. + +The actual move-down is done by aufs based upon the request from +user-space since we need to maintain the inode number and the internal +pointer arrays in aufs. + +Currently aufs FHSM handles the regular files only. Additionally they +must not be hard-linked nor pseudo-linked. + + +o Cowork of aufs and the user-space daemon + During the userspace daemon established the connection, aufs sends a + small notification to it whenever aufs writes something into the + writable branch. But it may cost high since aufs issues statfs(2) + internally. So user can specify a new option to cache the + info. Actually the notification is controlled by these factors. + + the specified cache time. + + classified as "force" by aufs internally. + Until the specified time expires, aufs doesn't send the info + except the forced cases. When aufs decide forcing, the info is always + notified to userspace. + For example, the number of free inodes is generally large enough and + the shortage of it happens rarely. So aufs doesn't force the + notification when creating a new file, directory and others. This is + the typical case which aufs doesn't force. + When aufs writes the actual filedata and the files consumes any of new + blocks, the aufs forces notifying. + + +o Interfaces in aufs +- New branch attribute. + + fhsm + Specifies that the branch is managed by FHSM feature. In other word, + participant in the FHSM. + When nofhsm is set to the branch, it will not be the source/target + branch of the move-down operation. This attribute is set + independently from coo and moo attributes, and if you want full + FHSM, you should specify them as well. +- New mount option. + + fhsm_sec + Specifies a second to suppress many less important info to be + notified. +- New ioctl. + + AUFS_CTL_FHSM_FD + create a new file descriptor which userspace can read the notification + (a subset of struct statfs) from aufs. +- Module parameter 'brs' + It has to be set to 1. Otherwise the new mount option 'fhsm' will not + be set. +- mount helpers /sbin/mount.aufs and /sbin/umount.aufs + When there are two or more branches with fhsm attributes, + /sbin/mount.aufs invokes the user-space daemon and /sbin/umount.aufs + terminates it. As a result of remounting and branch-manipulation, the + number of branches with fhsm attribute can be one. In this case, + /sbin/mount.aufs will terminate the user-space daemon. + + +Finally the operation is done as these steps in kernel-space. +- make sure that, + + no one else is using the file. + + the file is not hard-linked. + + the file is not pseudo-linked. + + the file is a regular file. + + the parent dir is not opaqued. +- find the target writable branch. +- make sure the file is not whiteout-ed by the upper (than the target) + branch. +- make the parent dir on the target branch. +- mutex lock the inode on the branch. +- unlink the whiteout on the target branch (if exists). +- lookup and create the whiteout-ed temporary name on the target branch. +- copy the file as the whiteout-ed temporary name on the target branch. +- rename the whiteout-ed temporary name to the original name. +- unlink the file on the source branch. +- maintain the internal pointer array and the external inode number + table (XINO). +- maintain the timestamps and other attributes of the parent dir and the + file. + +And of course, in every step, an error may happen. So the operation +should restore the original file state after an error happens. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/06mmap.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/06mmap.txt @@ -0,0 +1,72 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +mmap(2) -- File Memory Mapping +---------------------------------------------------------------------- +In aufs, the file-mapped pages are handled by a branch fs directly, no +interaction with aufs. It means aufs_mmap() calls the branch fs's +->mmap(). +This approach is simple and good, but there is one problem. +Under /proc, several entries show the mmapped files by its path (with +device and inode number), and the printed path will be the path on the +branch fs's instead of virtual aufs's. +This is not a problem in most cases, but some utilities lsof(1) (and its +user) may expect the path on aufs. + +To address this issue, aufs adds a new member called vm_prfile in struct +vm_area_struct (and struct vm_region). The original vm_file points to +the file on the branch fs in order to handle everything correctly as +usual. The new vm_prfile points to a virtual file in aufs, and the +show-functions in procfs refers to vm_prfile if it is set. +Also we need to maintain several other places where touching vm_file +such like +- fork()/clone() copies vma and the reference count of vm_file is + incremented. +- merging vma maintains the ref count too. + +This is not a good approach. It just fakes the printed path. But it +leaves all behaviour around f_mapping unchanged. This is surely an +advantage. +Actually aufs had adopted another complicated approach which calls +generic_file_mmap() and handles struct vm_operations_struct. In this +approach, aufs met a hard problem and I could not solve it without +switching the approach. + +There may be one more another approach which is +- bind-mount the branch-root onto the aufs-root internally +- grab the new vfsmount (ie. struct mount) +- lazy-umount the branch-root internally +- in open(2) the aufs-file, open the branch-file with the hidden + vfsmount (instead of the original branch's vfsmount) +- ideally this "bind-mount and lazy-umount" should be done atomically, + but it may be possible from userspace by the mount helper. + +Adding the internal hidden vfsmount and using it in opening a file, the +file path under /proc will be printed correctly. This approach looks +smarter, but is not possible I am afraid. +- aufs-root may be bind-mount later. when it happens, another hidden + vfsmount will be required. +- it is hard to get the chance to bind-mount and lazy-umount + + in kernel-space, FS can have vfsmount in open(2) via + file->f_path, and aufs can know its vfsmount. But several locks are + already acquired, and if aufs tries to bind-mount and lazy-umount + here, then it may cause a deadlock. + + in user-space, bind-mount doesn't invoke the mount helper. +- since /proc shows dev and ino, aufs has to give vma these info. it + means a new member vm_prinode will be necessary. this is essentially + equivalent to vm_prfile described above. + +I have to give up this "looks-smater" approach. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/06xattr.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/06xattr.txt @@ -0,0 +1,96 @@ + +# Copyright (C) 2014-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + +Listing XATTR/EA and getting the value +---------------------------------------------------------------------- +For the inode standard attributes (owner, group, timestamps, etc.), aufs +shows the values from the topmost existing file. This behaviour is good +for the non-dir entries since the bahaviour exactly matches the shown +information. But for the directories, aufs considers all the same named +entries on the lower branches. Which means, if one of the lower entry +rejects readdir call, then aufs returns an error even if the topmost +entry allows it. This behaviour is necessary to respect the branch fs's +security, but can make users confused since the user-visible standard +attributes don't match the behaviour. +To address this issue, aufs has a mount option called dirperm1 which +checks the permission for the topmost entry only, and ignores the lower +entry's permission. + +A similar issue can happen around XATTR. +getxattr(2) and listxattr(2) families behave as if dirperm1 option is +always set. Otherwise these very unpleasant situation would happen. +- listxattr(2) may return the duplicated entries. +- users may not be able to remove or reset the XATTR forever, + + +XATTR/EA support in the internal (copy,move)-(up,down) +---------------------------------------------------------------------- +Generally the extended attributes of inode are categorized as these. +- "security" for LSM and capability. +- "system" for posix ACL, 'acl' mount option is required for the branch + fs generally. +- "trusted" for userspace, CAP_SYS_ADMIN is required. +- "user" for userspace, 'user_xattr' mount option is required for the + branch fs generally. + +Moreover there are some other categories. Aufs handles these rather +unpopular categories as the ordinary ones, ie. there is no special +condition nor exception. + +In copy-up, the support for XATTR on the dst branch may differ from the +src branch. In this case, the copy-up operation will get an error and +the original user operation which triggered the copy-up will fail. It +can happen that even all copy-up will fail. +When both of src and dst branches support XATTR and if an error occurs +during copying XATTR, then the copy-up should fail obviously. That is a +good reason and aufs should return an error to userspace. But when only +the src branch support that XATTR, aufs should not return an error. +For example, the src branch supports ACL but the dst branch doesn't +because the dst branch may natively un-support it or temporary +un-support it due to "noacl" mount option. Of course, the dst branch fs +may NOT return an error even if the XATTR is not supported. It is +totally up to the branch fs. + +Anyway when the aufs internal copy-up gets an error from the dst branch +fs, then aufs tries removing the just copied entry and returns the error +to the userspace. The worst case of this situation will be all copy-up +will fail. + +For the copy-up operation, there two basic approaches. +- copy the specified XATTR only (by category above), and return the + error unconditionally if it happens. +- copy all XATTR, and ignore the error on the specified category only. + +In order to support XATTR and to implement the correct behaviour, aufs +chooses the latter approach and introduces some new branch attributes, +"icexsec", "icexsys", "icextr", "icexusr", and "icexoth". +They correspond to the XATTR namespaces (see above). Additionally, to be +convenient, "icex" is also provided which means all "icex*" attributes +are set (here the word "icex" stands for "ignore copy-error on XATTR"). + +The meaning of these attributes is to ignore the error from setting +XATTR on that branch. +Note that aufs tries copying all XATTR unconditionally, and ignores the +error from the dst branch according to the specified attributes. + +Some XATTR may have its default value. The default value may come from +the parent dir or the environment. If the default value is set at the +file creating-time, it will be overwritten by copy-up. +Some contradiction may happen I am afraid. +Do we need another attribute to stop copying XATTR? I am unsure. For +now, aufs implements the branch attributes to ignore the error. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/07export.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/07export.txt @@ -0,0 +1,58 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Export Aufs via NFS +---------------------------------------------------------------------- +Here is an approach. +- like xino/xib, add a new file 'xigen' which stores aufs inode + generation. +- iget_locked(): initialize aufs inode generation for a new inode, and + store it in xigen file. +- destroy_inode(): increment aufs inode generation and store it in xigen + file. it is necessary even if it is not unlinked, because any data of + inode may be changed by UDBA. +- encode_fh(): for a root dir, simply return FILEID_ROOT. otherwise + build file handle by + + branch id (4 bytes) + + superblock generation (4 bytes) + + inode number (4 or 8 bytes) + + parent dir inode number (4 or 8 bytes) + + inode generation (4 bytes)) + + return value of exportfs_encode_fh() for the parent on a branch (4 + bytes) + + file handle for a branch (by exportfs_encode_fh()) +- fh_to_dentry(): + + find the index of a branch from its id in handle, and check it is + still exist in aufs. + + 1st level: get the inode number from handle and search it in cache. + + 2nd level: if not found in cache, get the parent inode number from + the handle and search it in cache. and then open the found parent + dir, find the matching inode number by vfs_readdir() and get its + name, and call lookup_one_len() for the target dentry. + + 3rd level: if the parent dir is not cached, call + exportfs_decode_fh() for a branch and get the parent on a branch, + build a pathname of it, convert it a pathname in aufs, call + path_lookup(). now aufs gets a parent dir dentry, then handle it as + the 2nd level. + + to open the dir, aufs needs struct vfsmount. aufs keeps vfsmount + for every branch, but not itself. to get this, (currently) aufs + searches in current->nsproxy->mnt_ns list. it may not be a good + idea, but I didn't get other approach. + + test the generation of the gotten inode. +- every inode operation: they may get EBUSY due to UDBA. in this case, + convert it into ESTALE for NFSD. +- readdir(): call lockdep_on/off() because filldir in NFSD calls + lookup_one_len(), vfs_getattr(), encode_fh() and others. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/08shwh.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/08shwh.txt @@ -0,0 +1,52 @@ + +# Copyright (C) 2005-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Show Whiteout Mode (shwh) +---------------------------------------------------------------------- +Generally aufs hides the name of whiteouts. But in some cases, to show +them is very useful for users. For instance, creating a new middle layer +(branch) by merging existing layers. + +(borrowing aufs1 HOW-TO from a user, Michael Towers) +When you have three branches, +- Bottom: 'system', squashfs (underlying base system), read-only +- Middle: 'mods', squashfs, read-only +- Top: 'overlay', ram (tmpfs), read-write + +The top layer is loaded at boot time and saved at shutdown, to preserve +the changes made to the system during the session. +When larger changes have been made, or smaller changes have accumulated, +the size of the saved top layer data grows. At this point, it would be +nice to be able to merge the two overlay branches ('mods' and 'overlay') +and rewrite the 'mods' squashfs, clearing the top layer and thus +restoring save and load speed. + +This merging is simplified by the use of another aufs mount, of just the +two overlay branches using the 'shwh' option. +# mount -t aufs -o ro,shwh,br:/livesys/overlay=ro+wh:/livesys/mods=rr+wh \ + aufs /livesys/merge_union + +A merged view of these two branches is then available at +/livesys/merge_union, and the new feature is that the whiteouts are +visible! +Note that in 'shwh' mode the aufs mount must be 'ro', which will disable +writing to all branches. Also the default mode for all branches is 'ro'. +It is now possible to save the combined contents of the two overlay +branches to a new squashfs, e.g.: +# mksquashfs /livesys/merge_union /path/to/newmods.squash + +This new squashfs archive can be stored on the boot device and the +initramfs will use it to replace the old one at the next boot. --- linux-riscv-5.4.0.orig/Documentation/filesystems/aufs/design/10dynop.txt +++ linux-riscv-5.4.0/Documentation/filesystems/aufs/design/10dynop.txt @@ -0,0 +1,47 @@ + +# Copyright (C) 2010-2020 Junjiro R. Okajima +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Dynamically customizable FS operations +---------------------------------------------------------------------- +Generally FS operations (struct inode_operations, struct +address_space_operations, struct file_operations, etc.) are defined as +"static const", but it never means that FS have only one set of +operation. Some FS have multiple sets of them. For instance, ext2 has +three sets, one for XIP, for NOBH, and for normal. +Since aufs overrides and redirects these operations, sometimes aufs has +to change its behaviour according to the branch FS type. More importantly +VFS acts differently if a function (member in the struct) is set or +not. It means aufs should have several sets of operations and select one +among them according to the branch FS definition. + +In order to solve this problem and not to affect the behaviour of VFS, +aufs defines these operations dynamically. For instance, aufs defines +dummy direct_IO function for struct address_space_operations, but it may +not be set to the address_space_operations actually. When the branch FS +doesn't have it, aufs doesn't set it to its address_space_operations +while the function definition itself is still alive. So the behaviour +itself will not change, and it will return an error when direct_IO is +not set. + +The lifetime of these dynamically generated operation object is +maintained by aufs branch object. When the branch is removed from aufs, +the reference counter of the object is decremented. When it reaches +zero, the dynamically generated operation object will be freed. + +This approach is designed to support AIO (io_submit), Direct I/O and +XIP (DAX) mainly. +Currently this approach is applied to address_space_operations for +regular files only. --- linux-riscv-5.4.0.orig/Documentation/filesystems/porting.rst +++ linux-riscv-5.4.0/Documentation/filesystems/porting.rst @@ -850,3 +850,11 @@ d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are very suspect (and won't work in modules). Such uses are very likely to be misspelled d_alloc_anon(). + +--- + +**mandatory** + +[should've been added in 2016] stale comment in finish_open() nonwithstanding, +failure exits in ->atomic_open() instances should *NOT* fput() the file, +no matter what. Everything is handled by the caller. --- linux-riscv-5.4.0.orig/Documentation/kbuild/makefiles.rst +++ linux-riscv-5.4.0/Documentation/kbuild/makefiles.rst @@ -1115,23 +1115,6 @@ In this example, extra-y is used to list object files that shall be built, but shall not be linked as part of built-in.a. - header-test-y - - header-test-y specifies headers (`*.h`) in the current directory that - should be compile tested to ensure they are self-contained, - i.e. compilable as standalone units. If CONFIG_HEADER_TEST is enabled, - this builds them as part of extra-y. - - header-test-pattern-y - - This works as a weaker version of header-test-y, and accepts wildcard - patterns. The typical usage is:: - - header-test-pattern-y += *.h - - This specifies all the files that matches to `*.h` in the current - directory, but the files in 'header-test-' are excluded. - 6.7 Commands useful for building a boot image --------------------------------------------- --- linux-riscv-5.4.0.orig/Documentation/kbuild/modules.rst +++ linux-riscv-5.4.0/Documentation/kbuild/modules.rst @@ -470,9 +470,9 @@ The syntax of the Module.symvers file is:: - + - 0xe1cc2a05 usb_stor_suspend USB_STORAGE drivers/usb/storage/usb-storage EXPORT_SYMBOL_GPL + 0xe1cc2a05 usb_stor_suspend drivers/usb/storage/usb-storage EXPORT_SYMBOL_GPL USB_STORAGE The fields are separated by tabs and values may be empty (e.g. if no namespace is defined for an exported symbol). --- linux-riscv-5.4.0.orig/Documentation/kmsg/IPVS +++ linux-riscv-5.4.0/Documentation/kmsg/IPVS @@ -0,0 +1,81 @@ +/*? Text: "%s(): NULL arg\n" */ +/*? Text: "%s(): NULL scheduler_name\n" */ +/*? Text: "%s(): [%s] pe already existed in the system\n" */ +/*? Text: "%s(): [%s] pe already linked\n" */ +/*? Text: "%s(): [%s] pe is not in the list. failed\n" */ +/*? Text: "%s(): [%s] scheduler already existed in the system\n" */ +/*? Text: "%s(): [%s] scheduler already linked\n" */ +/*? Text: "%s(): [%s] scheduler is not in the list. failed\n" */ +/*? Text: "%s(): done error\n" */ +/*? Text: "%s(): init error\n" */ +/*? Text: "%s(): lower threshold is higher than upper threshold\n" */ +/*? Text: "%s(): no memory\n" */ +/*? Text: "%s(): request for already hashed, called from %pF\n" */ +/*? Text: "%s(): request for unhash flagged, called from %pF\n" */ +/*? Text: "%s(): server weight less than zero\n" */ +/*? Text: "%s: %s %pI4:%d - %s\n" */ +/*? Text: "%s: %s [%pI6]:%d - %s\n" */ +/*? Text: "%s: %s [%pI6c]:%d - %s\n" */ +/*? Text: "%s: FWM %u 0x%08X - %s\n" */ +/*? Text: "%s: enter\n" */ +/*? Text: "%s: loaded support on port[%d] = %d\n" */ +/*? Text: "BACKUP v0, Dropping buffer bogus conn options\n" */ +/*? Text: "BACKUP v0, bogus conn\n" */ +/*? Text: "BACKUP, Dropping buffer, Err: %d in decoding\n" */ +/*? Text: "BACKUP, Dropping buffer, Unknown version %d\n" */ +/*? Text: "BACKUP, Dropping buffer, msg > buffer\n" */ +/*? Text: "BACKUP, Dropping buffer, to small\n" */ +/*? Text: "BACKUP, Invalid PE parameters\n" */ +/*? Text: "BUG control DEL with n=0 : %s:%d to %s:%d\n" */ +/*? Text: "Connection hash table configured (size=%d, memory=%ldKbytes)\n" */ +/*? Text: "Error binding address of the mcast interface\n" */ +/*? Text: "Error binding to the multicast addr\n" */ +/*? Text: "Error connecting to the multicast addr\n" */ +/*? Text: "Error during creation of socket; terminating\n" */ +/*? Text: "Error joining to the multicast group\n" */ +/*? Text: "Error setting outbound mcast interface\n" */ +/*? Text: "Failed to stop Backup Daemon\n" */ +/*? Text: "Failed to stop Master Daemon\n" */ +/*? Text: "Registered protocols (%s)\n" */ +/*? Text: "SYNC, connection pe_data invalid\n" */ +/*? Text: "Schedule: port zero only supported in persistent services, check your ipvs configuration\n" */ +/*? Text: "Scheduler module ip_vs_%s not found\n" */ +/*? Text: "There is no net ptr to find in the skb in %s() line:%d\n" */ +/*? Text: "UDP no ns data\n" */ +/*? Text: "You probably need to specify IP address on multicast interface.\n" */ +/*? Text: "[%s] pe registered.\n" */ +/*? Text: "[%s] pe unregistered.\n" */ +/*? Text: "[%s] scheduler registered.\n" */ +/*? Text: "[%s] scheduler unregistered.\n" */ +/*? Text: "can't register hooks.\n" */ +/*? Text: "can't register netlink/ioctl.\n" */ +/*? Text: "can't setup connection table.\n" */ +/*? Text: "can't setup control.\n" */ +/*? Text: "cannot register Generic Netlink interface.\n" */ +/*? Text: "cannot register sockopt.\n" */ +/*? Text: "get_ctl: len %u < %u\n" */ +/*? Text: "ip_vs_send_async error %d\n" */ +/*? Text: "ip_vs_sync_buff_create failed.\n" */ +/*? Text: "ipvs loaded.\n" */ +/*? Text: "ipvs unloaded.\n" */ +/*? Text: "length: %u != %u\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "not enough space in Netlink message\n" */ +/*? Text: "persistence engine module ip_vs_pe_%s not found\n" */ +/*? Text: "receiving message error\n" */ +/*? Text: "request control ADD for already controlled: %s:%d to %s:%d\n" */ +/*? Text: "request control DEL for uncontrolled: %s:%d to %s:%d\n" */ +/*? Text: "set_ctl: invalid protocol: %d %pI4:%d %s\n" */ +/*? Text: "set_ctl: len %u != %u\n" */ +/*? Text: "shouldn't reach here, because the box is on the half connection in the tun/dr module.\n" */ +/*? Text: "stopping backup sync thread %d ...\n" */ +/*? Text: "stopping master sync thread %d ...\n" */ +/*? Text: "sync thread started: state = BACKUP, mcast_ifn = %s, syncid = %d\n" */ +/*? Text: "sync thread started: state = MASTER, mcast_ifn = %s, syncid = %d\n" */ +/*? Text: "unknown Generic Netlink command\n" */ +/*? Text: "sync thread started: state = MASTER, mcast_ifn = %s, syncid = %d, id = %d\n" */ +/*? Text: "sync thread started: state = BACKUP, mcast_ifn = %s, syncid = %d, id = %d\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "Unknown mcast interface: %s\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/aes_s390 +++ linux-riscv-5.4.0/Documentation/kmsg/s390/aes_s390 @@ -0,0 +1,45 @@ +/*? + * Text: "Allocating XTS fallback algorithm %s failed\n" + * Severity: Error + * Parameter: + * @1: algorithm name + * Description: + * The aes_s390 module failed to allocate a software fallback for the AES + * modes that are not supported by the hardware. A possible reason for this + * problem is that the aes_generic module that provides the fallback + * algorithms is not available. + * User action: + * Ensure that the aes_generic module is available and loaded and reload + * the aes_s390 module. + */ + +/*? + * Text: "Allocating AES fallback algorithm %s failed\n" + * Severity: Error + * Parameter: + * @1: algorithm name + * Description: + * The advanced encryption standard (AES) algorithm includes three modes with + * 128-bit, 192-bit, and 256-bit keys. Your hardware system only provides + * hardware acceleration for the 128-bit mode. The aes_s390 module failed to + * allocate a software fallback for the AES modes that are not supported by the + * hardware. A possible reason for this problem is that the aes_generic module + * that provides the fallback algorithms is not available. + * User action: + * Use the 128-bit mode only or ensure that the aes_generic module is available + * and loaded and reload the aes_s390 module. + */ + +/*? + * Text: "AES hardware acceleration is only available for 128-bit keys\n" + * Severity: Informational + * Description: + * The advanced encryption standard (AES) algorithm includes three modes with + * 128-bit, 192-bit, and 256-bit keys. Your hardware system only provides + * hardware acceleration for the 128-bit key mode. The aes_s390 module + * will use the less performant software fallback algorithm for the 192-bit + * and 256-bit key modes. + * User action: + * None. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/af_iucv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/af_iucv @@ -0,0 +1,23 @@ +/*? + * Text: "Application %s on z/VM guest %s exceeds message limit\n" + * Severity: Error + * Parameter: + * @1: application name + * @2: z/VM user ID + * Description: + * Messages or packets destined for the application have accumulated and + * reached the maximum value. The default for the message limit is 65535. + * You can specify a different limit as the value for MSGLIMIT within + * the IUCV statement of the z/VM virtual machine on which the application + * runs. + * User action: + * Ensure that you do not send data faster than the application retrieves + * them. Ensure that the message limit on the z/VM guest virtual machine + * on which the application runs is high enough. + */ + +/*? Text: "Attempt to release alive iucv socket %p\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/ap +++ linux-riscv-5.4.0/Documentation/kmsg/s390/ap @@ -0,0 +1,49 @@ +/*? + * Text: "%d is not a valid cryptographic domain\n" + * Severity: Warning + * Parameter: + * @1: AP domain index + * Description: + * The cryptographic domain specified for the 'domain=' module or kernel + * parameter must be an integer in the range 0 to 15. + * User action: + * Reload the cryptographic device driver with a correct module parameter. + * If the device driver has been compiled into the kernel, correct the value + * in the kernel parameter line and reboot Linux. + */ + +/*? + * Text: "The hardware system does not support AP instructions\n" + * Severity: Warning + * Description: + * The ap module addresses AP adapters through AP instructions. The hardware + * system on which the Linux instance runs does not support AP instructions. + * The ap module cannot detect any AP adapters. + * User action: + * Load the ap module only if your Linux instance runs on hardware that + * supports AP instructions. If the ap module has been compiled into the kernel, + * ignore this message. + */ + +/*? + * Text: "Registering adapter interrupts for AP device %02x.%04x failed\n" + * Severity: Error + * Parameter: + * @1: AP device ID + * @2: AP queue + * Description: + * The hardware system supports AP adapter interrupts but failed to enable + * an adapter for interrupts. Possible causes for this error are: + * i) The AP adapter firmware does not support AP interrupts. + * ii) An AP adapter firmware update to a firmware level that supports AP + * adapter interrupts failed. + * iii) The AP adapter firmware has been successfully updated to a level that + * supports AP interrupts but the new firmware has not been activated. + * User action: + * Ensure that the firmware on your AP adapters support AP interrupts and that + * any firmware updates have completed successfully. If necessary, deconfigure + * your cryptographic adapters and reconfigure them to ensure that any firmware + * updates become active, then reload the ap module. If the ap module has been + * compiled into the kernel, reboot Linux. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/appldata +++ linux-riscv-5.4.0/Documentation/kmsg/s390/appldata @@ -0,0 +1,91 @@ +/*? + * Text: "Starting the data collection for %s failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: appldata module + * @2: return code + * Description: + * The specified data collection module used the z/VM diagnose call + * DIAG 0xDC to start writing data. z/VM returned an error and the data + * collection could not start. If the return code is 5, your z/VM guest + * virtual machine is not authorized to write data records. + * User action: + * If the return code is 5, ensure that your z/VM guest virtual machine's + * entry in the z/VM directory includes the OPTION APPLMON statement. + * For other return codes see the section about DIAGNOSE Code X'DC' + * in "z/VM CP Programming Services". + */ + +/*? + * Text: "Stopping the data collection for %s failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: appldata module + * @2: return code + * Description: + * The specified data collection module used the z/VM diagnose call DIAG 0xDC + * to stop writing data. z/VM returned an error and the data collection + * continues. + * User action: + * See the section about DIAGNOSE Code X'DC' in "z/VM CP Programming Services". + */ + +/*? + * Text: "Starting a new OS data collection failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * After a CPU hotplug event, the record size for the running operating + * system data collection is no longer correct. The appldata_os module tried + * to start a new data collection with the correct record size but received + * an error from the z/VM diagnose call DIAG 0xDC. Any data collected with + * the current record size might be faulty. + * User action: + * Start a new data collection with the cappldata_os module. For information + * about starting data collections see "Device Drivers, Features, and + * Commands". For information about the return codes see the section about + * DIAGNOSE Code X'DC' in "z/VM CP Programming Services". + */ + +/*? + * Text: "Stopping a faulty OS data collection failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * After a CPU hotplug event, the record size for the running operating + * system data collection is no longer correct. The appldata_os module tried + * to stop the faulty data collection but received an error from the z/VM + * diagnose call DIAG 0xDC. Any data collected with the current record size + * might be faulty. + * User action: + * Try to restart appldata_os monitoring. For information about stopping + * and starting data collections see "Device Drivers, Features, and + * Commands". For information about the return codes see the section about + * DIAGNOSE Code X'DC' in "z/VM CP Programming Services". + */ + +/*? + * Text: "Maximum OS record size %i exceeds the maximum record size %i\n" + * Severity: Error + * Parameter: + * @1: no of bytes + * @2: no of bytes + * Description: + * The OS record size grows with the number of CPUs and is adjusted by the + * appldata_os module in response to CPU hotplug events. For more than 110 + * CPUs the record size would exceed the maximum record size of 4024 bytes + * that is supported by the z/VM hypervisor. To prevent the maximum supported + * record size from being exceeded while data collection is in progress, + * you cannot load the appldata_os module on Linux instances that are + * configured for a maximum of more than 110 CPUs. + * User action: + * If you do not want to collect operating system data, you can ignore this + * message. If you want to collect operating system data, reconfigure your + * Linux instance to support less than 110 CPUs. + */ + +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/bpf_jit +++ linux-riscv-5.4.0/Documentation/kmsg/s390/bpf_jit @@ -0,0 +1,16 @@ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ + +/*? + * Text: "Unknown opcode %02x\n" + * Severity: Error + * Parameter: + * @1: Instruction opcode + * Description: + * The BPF JIT compiler has found an unknown instruction in the BPF program + * and therefore stops the compilation. As a fallback, the interpreter is used. + * User action: + * Report this problem and the error message to your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/cio +++ linux-riscv-5.4.0/Documentation/kmsg/s390/cio @@ -0,0 +1,247 @@ +/*? + * Text: "%s is not a valid device for the cio_ignore kernel parameter\n" + * Severity: Warning + * Parameter: + * @1: device bus-ID + * Description: + * The device specification for the cio_ignore kernel parameter is + * syntactically incorrect or specifies an unknown device. This device is not + * excluded from being sensed and analyzed. + * User action: + * Correct your device specification in the kernel parameter line to have the + * device excluded when you next reboot Linux. You can write the correct + * device specification to /proc/cio_ignore to add the device to the list of + * devices to be excluded. This does not immediately make the device + * inaccessible but the device is ignored if it disappears and later reappears. + */ + +/*? + * Text: "0.%x.%04x to 0.%x.%04x is not a valid range for cio_ignore\n" + * Severity: Warning + * Parameter: + * @1: from subchannel set ID + * @2: from device number + * @3: to subchannel set ID + * @4: to device number + * Description: + * The device range specified for the cio_ignore kernel parameter is + * syntactically incorrect. No devices specified with this range are + * excluded from being sensed and analyzed. + * User action: + * Correct your range specification in the kernel parameter line to have the + * range of devices excluded when you next reboot Linux. You can write the + * correct range specification to /proc/cio_ignore to add the range of devices + * to the list of devices to be excluded. This does not immediately make the + * devices in the range inaccessible but any of these devices are ignored if + * they disappear and later reappear. + */ + +/*? + * Text: "Processing %s for channel path %x.%02x\n" + * Severity: Notice + * Parameter: + * @1: configuration change + * @2: channel subsystem ID + * @3: CHPID + * Description: + * A configuration change is in progress for the given channel path. + * User action: + * None. + */ + +/*? + * Text: "No CCW console was found\n" + * Severity: Warning + * Description: + * Linux did not find the expected CCW console and tries to use an alternative + * console. A possible reason why the console was not found is that the console + * has been specified in the cio_ignore list. + * User action: + * None, if an appropriate alternative console has been found, and you want + * to use this alternative console. If you want to use the CCW console, ensure + * that is not specified in the cio_ignore list, explicitly specify the console + * with the 'condev=' kernel parameter, and reboot Linux. + */ + +/*? + * Text: "Channel measurement facility initialized using format %s (mode %s)\n" + * Severity: Informational + * Parameter: + * @1: format + * @2: mode + * Description: + * The channel measurement facility has been initialized successfully. + * Format 'extended' should be used for z990 and later mainframe systems. + * Format 'basic' is intended for earlier mainframes. Mode 'autodetected' means + * that the format has been set automatically. Mode 'parameter' means that the + * format has been set according to the 'format=' kernel parameter. + * User action: + * None. + */ + +/*? + * Text: "The CSS device driver initialization failed with errno=%d\n" + * Severity: Alert + * Parameter: + * @1: Return code + * Description: + * The channel subsystem bus could not be established. + * User action: + * See the errno man page to find out what caused the problem. + */ + /*? Text: "%s: Got subchannel machine check but no sch_event handler provided.\n" */ + +/*? + * Text: "%s: Setting the device online failed because it is boxed\n" + * Severity: Warning + * Parameter: + * @1: Device bus-ID + * Description: + * Initialization of a device did not complete because it did not respond in + * time or it was reserved by another operating system. + * User action: + * Make sure that the device is working correctly, then try again to set it + * online. For devices that support the reserve/release mechanism (for example + * DASDs), you can try to override the reservation of the other system by + * writing 'force' to the 'online' sysfs attribute of the affected device. + */ + +/*? + * Text: "%s: Setting the device online failed because it is not operational\n" + * Severity: Warning + * Parameter: + * @1: Device bus-ID + * Description: + * Initialization of a device did not complete because it is not present or + * not operational. + * User action: + * Make sure that the device is present and working correctly, then try again + * to set it online. + */ + +/*? + * Text: "%s: The device stopped operating while being set offline\n" + * Severity: Warning + * Parameter: + * @1: Device bus-ID + * Description: + * While the device was set offline, it was not present or not operational. + * The device is now inactive, but setting it online again might fail. + * User action: + * None. + */ + +/*? + * Text: "%s: The device entered boxed state while being set offline\n" + * Severity: Warning + * Parameter: + * @1: Device bus-ID + * Description: + * While the device was set offline, it did not respond in time or it was + * reserved by another operating system. The device is now inactive, but + * setting it online again might fail. + * User action: + * None. + */ + +/*? + * Text: "Logging for subchannel 0.%x.%04x failed with errno=%d\n" + * Severity: Warning + * Parameter: + * @1: subchannel set ID + * @2: subchannel number + * @3: errno + * Description: + * Capturing model-dependent logs and traces could not be triggered for the + * specified subchannel. + * User action: + * See the errno man page to find out what caused the problem. + */ + +/*? + * Text: "Logging for subchannel 0.%x.%04x was triggered\n" + * Severity: Notice + * Parameter: + * @1: subchannel set ID + * @2: subchannel number + * Description: + * Model-dependent logs and traces may be captured for the specified + * subchannel. + * User action: + * None. + */ + +/*? + * Text: "%s: No interrupt was received within %lus (CS=%02x, DS=%02x, CHPID=%x.%02x)\n" + * Severity: Warning + * Parameter: + * @1: device number + * @2: timeout value + * @3: channel status + * @4: device status + * @5: channel subsystem ID + * @6: CHPID + * Description: + * Internal I/Os are used by the common I/O layer to ensure that devices are + * operational and accessible. + * The common I/O layer did not receive an interrupt for an internal I/O + * during the specified timeout period. + * As a result, the device might assume a state that makes the device + * unusable to Linux until the problem is resolved. + * User action: + * Make sure that the device is working correctly and try the action again. + */ + +/*? + * Text: "Link stopped: RS=%02x RSID=%04x IC=%02x IUPARAMS=%s IUNODEID=%s AUPARAMS=%s AUNODEID=%s\n" + * Severity: Error + * Parameter: + * @1: reporting source + * @2: reporting source ID + * @3: incident code + * @4: incident unit parameters + * @5: incident unit node ID + * @6: attached unit parameters + * @7: attached unit node ID + * + * Description: + * A hardware error has occurred. A unit at one end of an interface + * link has detected a failure in the link or in one of the units attached to + * the link. As a result, data transfer across the link has stopped. In the + * message text, the node IDs of involved units are represented in the + * following format: TTTTTT/MDL,MMM.PPSSSSSSSSSSSS,XXXX where TTTTTT refers to + * the machine type, MDL the model number, MMM the manufacturer, PP the + * manufacturing plant, SSSSSSSSSSSS the unit sequence number and XXXX the + * machine type-dependent physical interface number. If no data is available + * for the unit parameters or node ID field, "n/a" is used instead. + * + * User action: + * Report the problem to your support organization. + */ + +/*? + * Text: "Link degraded: RS=%02x RSID=%04x IC=%02x IUPARAMS=%s IUNODEID=%s AUPARAMS=%s AUNODEID=%s\n" + * Severity: Warning + * Parameter: + * @1: reporting source + * @2: reporting source ID + * @3: incident code + * @4: incident unit parameters + * @5: incident unit node ID + * @6: attached unit parameters + * @7: attached unit node ID + * Description: + * A hardware error has occurred. A unit at one end of an interface + * link has detected a failure in the link or in one of the units attached to + * the link. As a result, data transfer across the link is degraded. In the + * message text, the node IDs of involved units are represented in the + * following format: TTTTTT/MDL,MMM.PPSSSSSSSSSSSS,XXXX where TTTTTT refers to + * the machine type, MDL the model number, MMM the manufacturer, PP the + * manufacturing plant, SSSSSSSSSSSS the unit sequence number and XXXX the + * machine type-dependent physical interface number. If no data is available + * for the unit parameters or node ID field, "n/a" is used instead. + * + * User action: + * Report the problem to your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/cpcmd +++ linux-riscv-5.4.0/Documentation/kmsg/s390/cpcmd @@ -0,0 +1,16 @@ +/*? + * Text: "The cpcmd kernel function failed to allocate a response buffer\n" + * Severity: Warning + * Description: + * IPL code, console detection, and device drivers like vmcp or vmlogrdr use + * the cpcmd kernel function to send commands to the z/VM control program (CP). + * If a program that uses the cpcmd function does not allocate a contiguous + * response buffer below 2 GB guest real storage, cpcmd creates a bounce buffer + * to be used as the response buffer. Because of low memory or memory + * fragmentation, cpcmd could not create the bounce buffer. + * User action: + * Look for related page allocation failure messages and at the stack trace to + * find out which program or operation failed. Free some memory and retry the + * failed operation. Consider allocating more memory to your z/VM guest virtual + * machine. + */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/cpu +++ linux-riscv-5.4.0/Documentation/kmsg/s390/cpu @@ -0,0 +1,46 @@ +/*? + * Text: "%d configured CPUs, %d standby CPUs\n" + * Severity: Informational + * Parameter: + * @1: number of configured CPUs + * @2: number of standby CPUs + * Description: + * The kernel detected the given number of configured and standby CPUs. + * User action: + * None. + */ + +/*? + * Text: "The CPU configuration topology of the machine is:" + * Severity: Informational + * Description: + * The first six values of the topology information represent fields Mag6 to + * Mag1 of system-information block (SYSIB) 15.1.2. These fields specify the + * maximum numbers of topology-list entries (TLE) at successive topology nesting + * levels. The last value represents the MNest value of SYSIB 15.1.2 which + * specifies the maximum possible nesting that can be configured through + * dynamic changes. For details see the SYSIB 15.1.2 information in the + * "Principles of Operation." + * User action: + * None. + */ + +/*? + * Text: "CPU %i exceeds the maximum %i and is excluded from the dump\n" + * Severity: Warning + * Parameter: + * @1: CPU number + * @2: maximum CPU number + * Description: + * The Linux kernel is used as a system dumper but it runs on more CPUs than + * it has been compiled for with the CONFIG_NR_CPUS kernel configuration + * option. The system dump will be created but information on one or more + * CPUs will be missing. + * User action: + * Update the system dump kernel to a newer version that supports more + * CPUs or reduce the number of installed CPUs and reproduce the problem + * that should be analyzed. If you send the system dump that prompted this + * message to a support organization, be sure to communicate that the dump + * does not include all CPU information. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/cpum_cf +++ linux-riscv-5.4.0/Documentation/kmsg/s390/cpum_cf @@ -0,0 +1,68 @@ +/*? + * Text: "Enabling the performance measuring unit failed with rc=%x\n" + * Severity: Error + * Parameter: + * @1: error condition + * Description: + * The device driver failed to enable CPU counter sets with the + * load counter controls (lcctl) instruction. + * See the section about lcctl in "The Load-Program-Parameter and the CPU-Measurement + * Facilities", SA23-2260, for an explanation of the error conditions. + * User action: + * Stop the performance measurement programs and try again. + */ + +/*? + * Text: "Disabling the performance measuring unit failed with rc=%x\n" + * Severity: Error + * Parameter: + * @1: error condition + * Description: + * The device driver failed to disable CPU counter sets with the + * load counter controls (lcctl) instruction. + * See the section about lcctl in "The Load-Program-Parameter and the CPU-Measurement + * Facilities", SA23-2260, for an explanation of the error conditions. + * User action: + * Stop the performance measurement programs and try again. + */ + +/*? + * Text: "Registering the cpum_cf PMU failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: error code + * Description: + * The device driver could not register the Performance Measurement Unit (PMU) + * for the CPU-measurement counter facility. + * A possible cause of this problem is memory constraints. + * User action: + * If the error code is -12 (ENOMEM), consider assigning more memory + * to your Linux instance. + */ + +/*? + * Text: "CPU[%i] Counter data was lost\n" + * Severity: Error + * Parameter: + * @1: cpu number + * Description: + * CPU counter data was lost because of machine internal + * high-priority activities. + * User action: + * None. + */ + +/*? + * Text: "Registering for CPU-measurement alerts failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: error code + * Description: + * The device driver could not register to receive CPU-measurement alerts. + * Alerts make you aware of measurement errors. + * A possible cause of this problem is memory constraints. + * User action: + * If the error code is -12 (ENOMEM), consider assigning more memory + * to your Linux instance. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/cpum_sf +++ linux-riscv-5.4.0/Documentation/kmsg/s390/cpum_sf @@ -0,0 +1,104 @@ +/*? + * Text: "The sampling buffer limits have changed to: min=%lu max=%lu (diag=x%lu)\n" + * Severity: Informational + * Parameter: + * @1: minimum size in sample-data-blocks + * @2: maximum size in sample-data-blocks + * @3: size factor for buffering diagnostic-sampling data entries + * Description: + * The minimum or maximum size limit for the sampling facility buffer was + * changed. The change is effective immediately. + * User action: + * None. + */ + +/*? + * Text: "Switching off the sampling facility failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: error condition + * Description: + * The CPU-measurement sampling facility could not be switched off and continues + * to run. For details, see LOAD SAMPLING CONTROLS in + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * If this problem persists, reboot your Linux instance. + */ + +/*? + * Text: "Sample data was lost\n" + * Severity: Error + * Description: + * Sample data was lost because of machine-internal high-priority activities. + * The sampling facility is stopped. + * User action: + * End all performance measurement sessions. Discard the measurement data, + * which are likely to be flawed. Repeat your measurements. + * If the problem persists, contact your hardware administrator. + */ + +/*? + * Text: "Sampling facility support for perf is not available: reason=%04x\n" + * Severity: Error + * Parameter: + * @1: reason code + * Description: + * The device driver could not initialize the sampling facility support. + * Possible reason codes are: + * 0001: The device driver failed to query CPU-measurement sampling facility + * information. + * + * 0002: The device driver does not support the basic-sampling function that + * is available on the LPAR within which the Linux instance runs. + * + * 0003: The device driver could not register to receive CPU-measurement alerts. + * A possible cause of this problem is memory constraints. + * + * 0004: The device driver could not register the Performance Measurement Unit + * (PMU) for the CPU-measurement sampling facility. + * A possible cause of this problem is memory constraints. + * User action: + * Consider assigning more memory to your Linux instance. + */ + +/*? + * Text: "Loading sampling controls failed: op=%i err=%i\n" + * Severity: Error + * Parameter: + * @1: Type of operation + * @2: Error condition + * Description: + * The sampling facility support could not load sampling controls to enable + * (operation type 1) or disable (operation type 2) the CPU-measurement sampling + * facility. For details of the error condition, see LOAD SAMPLING CONTROLS in + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * If the problem persists, reboot your Linux instance. + */ + +/*? + * Text: "A sampling buffer entry is incorrect (alert=0x%x)\n" + * Severity: Error + * Parameter: + * @1: Alert code + * Description: + * An incorrect sampling facility buffer entry was detected. The alert code + * indicates the root cause, for example, an incorrect entry address or an + * incorrect sample-data-block-table entry. + * User action: + * End active performance measurement sessions, for example, perf processes. If + * the problem persists, reboot your Linux instance. + */ + +/*? + * Text: "Registering for s390dbf failed\n" + * Severity: Error + * Description: + * The device driver failed to register for the s390 debug feature. You will + * not receive any debug information. A possible cause of this problem is + * memory constraints. + * User action: + * Consider assigning more memory + * to your Linux instance. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/crc32-vx +++ linux-riscv-5.4.0/Documentation/kmsg/s390/crc32-vx @@ -0,0 +1 @@ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/ctcm +++ linux-riscv-5.4.0/Documentation/kmsg/s390/ctcm @@ -0,0 +1,202 @@ +/*? + * Text: "%s: An I/O-error occurred on the CTCM device\n" + * Severity: Error + * Parameter: + * @1: bus ID of the CTCM device + * Description: + * An I/O error was detected on one of the subchannels of the CTCM device. + * Depending on the error, the CTCM device driver might attempt an automatic + * recovery. + * User action: + * Check the status of the CTCM device, for example, with ifconfig. If the + * device is not operational, perform a manual recovery. See "Device Drivers, + * Features, and Commands" for details about how to recover a CTCM device. + */ + +/*? + * Text: "%s: An adapter hardware operation timed out\n" + * Severity: Error + * Parameter: + * @1: bus ID of the CTCM device + * Description: + * The CTCM device uses an adapter to physically connect to its communication + * peer. An operation on this adapter timed out. + * User action: + * Check the status of the CTCM device, for example, with ifconfig. If the + * device is not operational, perform a manual recovery. See "Device Drivers, + * Features, and Commands" for details about how to recover a CTCM device. + */ + +/*? + * Text: "%s: An error occurred on the adapter hardware\n" + * Severity: Error + * Parameter: + * @1: bus ID of the CTCM device + * Description: + * The CTCM device uses an adapter to physically connect to its communication + * peer. An operation on this adapter returned an error. + * User action: + * Check the status of the CTCM device, for example, with ifconfig. If the + * device is not operational, perform a manual recovery. See "Device Drivers, + * Features, and Commands" for details about how to recover a CTCM device. + */ + +/*? + * Text: "%s: The communication peer has disconnected\n" + * Severity: Notice + * Parameter: + * @1: channel ID + * Description: + * The remote device has disconnected. Possible reasons are that the remote + * interface has been closed or that the operating system instance with the + * communication peer has been rebooted or shut down. + * User action: + * Check the status of the peer device. Ensure that the peer operating system + * instance is running and that the peer interface is operational. + */ + +/*? + * Text: "%s: The remote operating system is not available\n" + * Severity: Notice + * Parameter: + * @1: channel ID + * Description: + * The operating system instance with the communication peer has disconnected. + * Possible reasons are that the operating system instance has been rebooted + * or shut down. + * User action: + * Ensure that the peer operating system instance is running and that the peer + * interface is operational. + */ + +/*? + * Text: "%s: The adapter received a non-specific IRQ\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the CTCM device + * Description: + * The adapter hardware used by the CTCM device received an IRQ that cannot + * be mapped to a particular device. This is a hardware problem. + * User action: + * Check the status of the CTCM device, for example, with ifconfig. Check if + * the connection to the remote device still works. If the CTCM device is not + * operational, set it offline and back online. If this does not resolve the + * problem, perform a manual recovery. See "Device Drivers, Features, and + * Commands" for details about how to recover a CTCM device. If this problem + * persists, gather Linux debug data, collect the hardware logs, and report the + * problem to your support organization. + */ + +/*? + * Text: "%s: A check occurred on the subchannel\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the CTCM device + * Description: + * A check condition has been detected on the subchannel. + * User action: + * Check if the connection to the remote device still works. If the CTCM device + * is not operational, set it offline and back online. If this does not resolve + * the problem, perform a manual recovery. See "Device Drivers, Features, and + * Commands" for details about how to recover a CTCM device. If this problem + * persists, gather Linux debug data and report the problem to your support + * organization. + */ + +/*? + * Text: "%s: The communication peer is busy\n" + * Severity: Informational + * Parameter: + * @1: channel ID + * Description: + * A busy target device was reported. This might be a temporary problem. + * User action: + * If this problem persists or is reported frequently ensure that the target + * device is working properly. + */ + +/*? + * Text: "%s: The specified target device is not valid\n" + * Severity: Error + * Parameter: + * @1: channel ID + * Description: + * A target device was called with a faulty device specification. This is an + * adapter hardware problem. + * User action: + * Gather Linux debug data, collect the hardware logs, and contact IBM support. + */ + +/*? + * Text: "An I/O operation resulted in error %04x\n" + * Severity: Error + * Parameter: + * @1: channel ID + * @2: error information + * Description: + * A hardware operation ended with an error. + * User action: + * Check the status of the CTCM device, for example, with ifconfig. If the + * device is not operational, perform a manual recovery. See "Device Drivers, + * Features, and Commands" for details about how to recover a CTCM device. + * If this problem persists, gather Linux debug data, collect the hardware logs, + * and report the problem to your support organization. + */ + +/*? + * Text: "%s: Initialization failed with RX/TX init handshake error %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the CTCM device + * @2: error information + * Description: + * A problem occurred during the initialization of the connection. If the + * connection can be established after an automatic recovery, a success message + * is issued. + * User action: + * If the problem is not resolved by the automatic recovery process, check the + * local and remote device. If this problem persists, gather Linux debug data + * and report the problem to your support organization. + */ + +/*? + * Text: "%s: The network backlog for %s is exceeded, package dropped\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the CTCM device + * @2: calling function + * Description: + * There is more network traffic than can be handled by the device. The device + * is closed and some data has not been transmitted. The device might be + * recovered automatically. + * User action: + * Investigate and resolve the congestion. If necessary, set the device + * online to make it operational. + */ + +/*? + * Text: "%s: The XID used in the MPC protocol is not valid, rc = %d\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the CTCM device + * @2: return code + * Description: + * The exchange identification (XID) used by the CTCM device driver when + * in MPC mode is not valid. + * User action: + * Note the error information provided with this message and contact your + * support organization. + */ + +/*? Text: "CTCM driver unloaded\n" */ +/*? Text: "%s: %s Internal error: net_device is NULL, ch = 0x%p\n" */ +/*? Text: "%s / Initializing the ctcm device driver failed, ret = %d\n" */ +/*? Text: "%s: %s: Internal error: Can't determine channel for interrupt device %s\n" */ +/*? Text: "CTCM driver initialized\n" */ +/*? Text: "%s: setup OK : r/w = %s/%s, protocol : %d\n" */ +/*? Text: "%s: Connected with remote side\n" */ +/*? Text: "%s: Restarting device\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/dasd +++ linux-riscv-5.4.0/Documentation/kmsg/s390/dasd @@ -0,0 +1,704 @@ +/* dasd_ioctl */ + +/*? + * Text: "%s: The DASD has been put in the quiesce state\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * No I/O operation is possible on this device. + * User action: + * Resume the DASD to enable I/O operations. + */ + +/*? + * Text: "%s: I/O operations have been resumed on the DASD\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD is no longer in state quiesce and I/O operations can be performed + * on the device. + * User action: + * None. + */ + +/*? + * Text: "%s: The DASD cannot be formatted while it is enabled\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD you try to format is enabled. Enabled devices cannot be formatted. + * User action: + * Contact the owner of the formatting tool. + */ + +/*? + * Text: "%s: The specified DASD is a partition and cannot be formatted\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD you try to format is a partition. Partitions cannot be formatted + * separately. You can only format a complete DASD including all its partitions. + * User action: + * Format the complete DASD. + * ATTENTION: Formatting irreversibly destroys all data on all partitions + * of the DASD. + */ + +/*? + * Text: "%s: The specified DASD is a partition and cannot be checked\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD you try to check is a partition. Partitions cannot be checked + * separately. You can only check a complete DASD including all its partitions. + * User action: + * Check the complete DASD. + */ + +/*? + * Text: "%s: Formatting unit %d failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: start track + * @3: return code + * Description: + * The formatting process might have been interrupted by a signal, for example, + * CTRL+C. If the process was not interrupted intentionally, an I/O error + * might have occurred. + * User action: + * Retry to format the device. If the error persists, check the log file for + * related error messages. If you cannot resolve the error, note the return + * code and contact your support organization. + */ + + +/* dasd */ + +/*? + * Text: "%s: Cancelling request %p failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to request + * @3: return code of previous function + * Description: + * In response to a user action, the DASD device driver tried but failed to + * cancel a previously started I/O operation. + * User action: + * Try the action again. + */ + +/*? + * Text: "%s: Flushing the DASD request queue failed for request %p\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to request + * Description: + * As part of the unloading process, the DASD device driver flushes the + * request queue. This failed because a previously started I/O operation + * could not be canceled. + * User action: + * Try again to unload the DASD device driver or to shut down Linux. + */ + +/*? + * Text: "The DASD device driver could not be initialized\n" + * Severity: Informational + * Description: + * The initialization of the DASD device driver failed because of previous + * errors. + * User action: + * Check for related previous error messages. + */ + +/*? + * Text: "%s: Accessing the DASD failed because it is in probeonly mode\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * The dasd= module or kernel parameter specified the probeonly attribute for + * the DASD you are trying to access. The DASD device driver cannot access + * DASDs that are in probeonly mode. + * User action: + * Change the dasd= parameter as to omit probeonly for the DASD and reload + * the DASD device driver. If the DASD device driver has been compiled into + * the kernel, reboot Linux. + */ + +/*? + * Text: "%s: cqr %p timed out (%lus), %i retries remaining\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: request + * @3: timeout value + * @4: number of retries left + * Description: + * A try of the error recovery procedure (ERP) for the channel queued request + * (cqr) timed out and failed to recover the error. ERP continues for the DASD. + * User action: + * Ignore this message if it occurs infrequently and if the recovery succeeds + * during one of the retries. If this error persists, check for related + * previous error messages and report the problem to your support organization. + * + * The timeout can be changed by writing a new value to the sysfs 'expires' attribute of the DASD. The value specifies the timeout in seconds. + */ + +/*? + * Text: "%s: cqr %p timed out (%lus) but cannot be ended, retrying in 5 s\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: request + * @3: timeout value + * Description: + * A try of the error recovery procedure (ERP) for the channel queued request + * (cqr) timed out and failed to recover the error. The I/O request submitted + * during the try could not be canceled. The ERP waits for 5 seconds before + * trying again. + * User action: + * Ignore this message if it occurs infrequently and if the recovery succeeds + * during one of the retries. If this error persists, check for related + * previous error messages and report the problem to your support organization. + * + * The timeout can be changed by writing a new value to the sysfs 'expires' attribute of the DASD. The value specifies the timeout in seconds. + */ + +/*? + * Text: "%s: The DASD cannot be set offline while it is in use\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD cannot be set offline because it is in use by an internal process. + * An action to free the DASD might not have completed yet. + * User action: + * Wait some time and set the DASD offline later. + */ + +/*? + * Text: "%s: The DASD cannot be set offline with open count %i\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: count + * Description: + * The DASD is being used by one or more processes and cannot be set offline. + * User action: + * Ensure that the DASD is not in use anymore, for example, unmount all + * partitions. Then try again to set the DASD offline. + */ + +/*? + * Text: "%s: Setting the DASD online failed with rc=%d\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * Description: + * The DASD could not be set online because of previous errors. + * User action: + * Look for previous error messages. If you cannot resolve the error, note + * the return code and contact your support organization. + */ + +/*? + * Text: "%s Setting the DASD online with discipline %s failed with rc=%i\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: discipline + * @3: return code + * Description: + * The DASD could not be set online because of previous errors. + * User action: + * Look for previous error messages. If you cannot resolve the error, note the + * return code and contact your support organization. + */ + +/*? + * Text: "%s Setting the DASD online failed because of missing DIAG discipline\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD was to be set online with discipline DIAG but this discipline of + * the DASD device driver is not available. + * User action: + * Ensure that the dasd_diag_mod module is loaded. If your Linux system does + * not include this module, you cannot set DASDs online with the DIAG + * discipline. + */ + +/*? + * Text: "%s Setting the DASD online failed because of a missing discipline\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD was to be set online with a DASD device driver discipline that + * is not available. + * User action: + * Ensure that all DASD modules are loaded correctly. + */ + +--------------------------- + +/*? + * Text: "The statistics feature has been switched off\n" + * Severity: Informational + * Description: + * The statistics feature of the DASD device driver has been switched off. + * User action: + * None. + */ + +/*? + * Text: "The statistics feature has been switched on\n" + * Severity: Informational + * Description: + * The statistics feature of the DASD device driver has been switched on. + * User action: + * None. + */ + +/*? + * Text: "The statistics have been reset\n" + * Severity: Informational + * Description: + * The DASD statistics data have been reset. + * User action: + * None. + */ + +/*? + * Text: "%s is not a supported value for /proc/dasd/statistics\n" + * Severity: Warning + * Parameter: + * @1: value + * Description: + * An incorrect value has been written to /proc/dasd/statistics. + * The supported values are: 'set on', 'set off', and 'reset'. + * User action: + * Write a supported value to /proc/dasd/statistics. + */ + +/*? + * Text: "%s is not a valid device range\n" + * Severity: Error + * Parameter: + * @1: range + * Description: + * A device range specified with the dasd= parameter is not valid. + * User action: + * Examine the dasd= parameter and correct the device range. + */ + +/*? + * Text: "The probeonly mode has been activated\n" + * Severity: Informational + * Description: + * The probeonly mode of the DASD device driver has been activated. In this + * mode the device driver rejects any 'open' syscalls with EPERM. + * User action: + * None. + */ + +/*? + * Text: "The IPL device is not a CCW device\n" + * Severity: Error + * Description: + * The value for the dasd= parameter contains the 'ipldev' keyword. During + * the boot process this keyword is replaced with the device from which the + * IPL was performed. The 'ipldev' keyword is not valid if the IPL device is + * not a CCW device. + * User action: + * Do not specify the 'ipldev' keyword when performing an IPL from a device + * other than a CCW device. + */ + +/*? + * Text: "A closing parenthesis ')' is missing in the dasd= parameter\n" + * Severity: Warning + * Description: + * The specification for the dasd= kernel or module parameter has an opening + * parenthesis '(' * without a matching closing parenthesis ')'. + * User action: + * Correct the parameter value. + */ + +/*? + * Text: "The autodetection mode has been activated\n" + * Severity: Informational + * Description: + * The autodetection mode of the DASD device driver has been activated. In + * this mode the DASD device driver sets all detected DASDs online. + * User action: + * None. + */ + +/*? + * Text: "%*s is not a supported device option\n" + * Severity: Warning + * Parameter: + * @1: length of option code + * @2: option code + * Description: + * The dasd= parameter includes an unknown option for a DASD or a device range. + * Options are specified in parenthesis and immediately follow a device or + * device range. + * User action: + * Check the dasd= syntax and remove any unsupported options from the dasd= + * parameter specification. + */ + +/*? + * Text: "PAV support has be deactivated\n" + * Severity: Informational + * Description: + * The 'nopav' keyword has been specified with the dasd= kernel or module + * parameter. The Parallel Access Volume (PAV) support of the DASD device + * driver has been deactivated. + * User action: + * None. + */ + +/*? + * Text: "'nopav' is not supported on z/VM\n" + * Severity: Informational + * Description: + * For Linux instances that run as guest operating systems of the z/VM + * hypervisor Parallel Access Volume (PAV) support is controlled by z/VM not + * by Linux. + * User action: + * Remove 'nopav' from the dasd= module or kernel parameter specification. + */ + +/*? + * Text: "High Performance FICON support has been deactivated\n" + * Severity: Informational + * Description: + * The 'nofcx' keyword has been specified with the dasd= kernel or module + * parameter. The High Performance FICON (transport mode) support of the DASD + * device driver has been deactivated. + * User action: + * None. + */ + +/*? + * Text: "The dasd= parameter value %s has an invalid ending\n" + * Severity: Warning + * Parameter: + * @1: parameter value + * Description: + * The specified value for the dasd= kernel or module parameter is not correct. + * User action: + * Check the module or the kernel parameter. + */ + +/*? + * Text: "Registering the device driver with major number %d failed\n" + * Severity: Warning + * Parameter: + * @1: DASD major + * Description: + * Major number 94 is reserved for the DASD device driver. The DASD device + * driver failed to register with this major number. Another device driver + * might have used major number 94. + * User action: + * Determine which device driver uses major number 94 instead of the DASD + * device driver and unload this device driver. Then try again to load the + * DASD device driver. + */ + +/*? + * Text: "%s: default ERP has run out of retries and failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The error recovery procedure (ERP) tried to recover an error but the number + * of retries for the I/O was exceeded before the error could be resolved. + * User action: + * Check for related previous error messages. + */ + +/*? + * Text: "%s: Unable to terminate request %p on suspend\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to request + * Description: + * As part of the suspend process, the DASD device driver terminates requests + * on the request queue. This failed because a previously started I/O operation + * could not be canceled. The suspend process will be stopped. + * User action: + * Try again to suspend the system. + */ + +/*? + * Text: "%s: ERP failed for the DASD\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * An error recovery procedure (ERP) was performed for the DASD but failed. + * User action: + * Check the message log for previous related error messages. + */ + +/*? + * Text: "%s: An error occurred in the DASD device driver, reason=%s\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: reason code + * Description: + * This problem indicates a program error in the DASD device driver. + * User action: + * Note the reason code and contact your support organization. +*/ + +/*? + * Text: "%s: No operational channel path is left for the device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * All channel paths to the device have become non-operational. The DASD + * device driver suspends I/O operations and queues I/O requests for this + * device until at least one channel path becomes operational again. + * User action: + * Ensure that each channel path to the device has been set up correctly + * and that the related physical cable connections are in place. + */ + +/*? + * Text: "%s: No verified channel paths remain for the device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * All verified channel paths to the device have become non-operational. + * Any other paths to the device have previously been identified as not usable. + * The DASD device driver suspends I/O operations and queues I/O requests + * for this device until at least one channel path becomes operational + * again. + * User action: + * Ensure that each channel path to the device has been set up correctly + * and that the related physical cable connections are in place. + * Set all paths to the device offline and online again to repeat the path + * verification. Alternatively, set the device offline and online again to + * verify all available paths for this device. + * If this problem persists, gather Linux debug data and report the problem + * to your support organization. + */ + +/*? + * Text: "%s: A channel path to the device has become operational\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * At least one channel path of this device has become operational again. + * The DASD device driver resumes I/O operations to the device and processes + * the I/O requests that were queued while there was no operational channel path. + * User action: + * None. + */ + +------------------------------------------------------------------------------------ +/* dasd_diag */ + +/*? + * Text: "%s: A 64-bit DIAG call failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * 64-bit DIAG calls require a 64-bit z/VM version. + * User action: + * Use z/VM 5.2 or later or set the sysfs 'use_diag' attribute of the DASD to 0 + * to switch off DIAG. + */ + +/*? + * Text: "%s: Accessing the DASD failed because of an incorrect format (rc=%d)\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * Description: + * The format of the DASD is not correct. + * User action: + * Check the device format. For details about the return code see the + * section about the INITIALIZE function for DIAGNOSE Code X'250' + * in "z/VM CP Programming Services". If you cannot resolve the error, note + * the return code and contact your support organization. + */ + +/*? + * Text: "%s: New DASD with %ld byte/block, total size %ld KB%s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * @2: bytes per block + * @3: size + * @4: access mode + * Description: + * A DASD with the indicated block size and total size has been set online. + * If the DASD is configured as read-only to the real or virtual hardware, + * the message includes an indication of this hardware access mode. The + * hardware access mode is independent from the 'readonly' attribute of + * the device in sysfs. + * User action: + * None. + */ + +/*? + * Text: "%s: DIAG ERP failed with rc=%d\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * Description: + * An error in the DIAG processing could not be recovered by the error + * recovery procedure (ERP) of the DIAG discipline. + * User action: + * Note the return code, check for related I/O errors, and report this problem + * to your support organization. + */ + +/*? + * Text: "%s: DIAG initialization failed with rc=%d\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * Description: + * Initializing the DASD with the DIAG discipline failed. Possible reasons for + * this problem are that the device has a device type other than FBA or ECKD, + * or has a block size other than one of the supported sizes: + * 512 byte, 1024 byte, 2048 byte, or 4096 byte. + * User action: + * Ensure that the device can be written to and has a supported device type + * and block size. For details about the return code see the section about + * the INITIALIZE function for DIAGNOSE Code X'250' in "z/VM CP Programming + * Services". If you cannot resolve the error, note the error code and contact + * your support organization. + */ + +/*? + * Text: "%s: Device type %d is not supported in DIAG mode\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: device type + * Description: + * Only DASD of type FBA and ECKD are supported in DIAG mode. + * User action: + * Set the sysfs 'use_diag' attribute of the DASD to 0 and try again to access + * the DASD. + */ + +/*? + * Text: "Discipline %s cannot be used without z/VM\n" + * Severity: Informational + * Parameter: + * @1: discipline name + * Description: + * The discipline that is specified with the dasd= kernel or module parameter + * is only available for Linux instances that run as guest operating + * systems of the z/VM hypervisor. + * User action: + * Remove the unsupported discipline from the parameter string. + */ + +/*? + * Text: "%s: The access mode of a DIAG device changed to read-only\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A device changed its access mode from writeable to + * read-only while in use. + * User action: + * Set the device offline, ensure that the device is configured correctly in + * z/VM, then set the device online again. + */ + +------------------------------------------------------------------------------------ +/* dasd_erp */ + +/*? + * Text: "%s: A timeout error occurred for cqr %p\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to request + * Description: + * A channel queued request (cqr) failed because it timed out. + * One possible reason for this error is that a request did not + * complete within the timeout interval specified for the DASD. + * The timeout interval is set as the value of the 'timeout' sysfs + * attribute of a DASD. A value of 0 disables the timeout function. + * The timeout function can be used; for example, by mirroring setups; + * to quickly process a request queue for a DASD that has become unavailable. + * User action: + * Check the message log for previous related error messages. Verify + * that the storage server and the connection from host to storage + * server are operational. If the 'timeout' sysfs attribute of the + * DASD has been set to a value other than 0, verify that this + * setting is intentional and change it if required. + */ + +/*? + * Text: "%s: A transport error occurred for cqr %p\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to request + * Description: + * A channel queued request (cqr) failed because the connection to the + * device was lost and the 'failfast' flag is set for the request. + * This flag can result from, for example: + * + * - A software layer above the DASD device driver; + * for example, in a host based mirroring setup. + * + * - Value 1 for the 'failfast' sysfs attribute of the DASD. + * This setting applies to all requests on the DASD. + * + * User action: + * Ensure that each channel path to the device has been set up + * correctly and that the related physical cable connections are in + * place. If the 'failfast' attribute of the DASD is set to 1, + * verify that this setting is intentional and change it to 0 if required. + */ + +/*? + * Text: "%s Setting the DASD online failed because the required module %s could not be loaded (rc=%d)\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: kernel module name + * @3: return code + * Description: + * The DASD was to be set online with discipline DIAG but this discipline of + * the DASD device driver is not available and an attempt to load the + * corresponding kernel module failed with the specified return code. + * + * User action: + * Ensure that the kernel module with the specified name is correctly installed + * or set the sysfs 'use_diag' attribute of the DASD to 0 to switch off DIAG. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/dasd-eckd +++ linux-riscv-5.4.0/Documentation/kmsg/s390/dasd-eckd @@ -0,0 +1,2154 @@ +/* dasd_eckd */ + +/*? + * Text: "%s: ERP failed for the DASD\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * An error recovery procedure (ERP) was performed for the DASD but failed. + * User action: + * Check the message log for previous related error messages. + */ + +/*? + * Text: "%s: An error occurred in the DASD device driver, reason=%s\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: reason code + * Description: + * This problem indicates a program error in the DASD device driver. + * User action: + * Note the reason code and contact your support organization. +*/ + +/*? + * Text: "%s: Allocating memory for private DASD data failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD device driver maintains data structures for each DASD it manages. + * There is not enough memory to allocate these data structures for one or + * more DASD. + * User action: + * Free some memory and try the operation again. + */ + +/*? + * Text: "%s: DASD with %d KB/block, %d KB total size, %d KB/track, %s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * @2: block size + * @3: DASD size + * @4: track size + * @5: disc layout + * Description: + * A DASD with the shown characteristics has been set online. + * User action: + * None. + */ + +/*? + * Text: "%s: Start track number %u used in formatting is too big\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: track number + * Description: + * The DASD format I/O control was used incorrectly by a formatting tool. + * User action: + * Contact the owner of the formatting tool. + */ + +/*? + * Text: "%s: Stop track number %u used in formatting is too big\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: track number + * Description: + * The DASD format I/O control was used incorrectly by a formatting tool. + * User action: + * Contact the owner of the formatting tool. + */ + +/*? + * Text: "%s: The DASD is not formatted\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A DASD has been set online but it has not been formatted yet. You must + * format the DASD before you can use it. + * User action: + * Format the DASD, for example, with dasdfmt. + */ + +/*? + * Text: "%s: 0x%x is not a known command\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: command + * Description: + * This problem is likely to be caused by a programming error. + * User action: + * Contact your support organization. + */ + +/*? + * Text: "%s: Track 0 has no records following the VTOC\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * Linux has identified a volume table of contents (VTOC) on the DASD but + * cannot read any data records following the VTOC. A possible cause of this + * problem is that the DASD has been used with another System z operating + * system. + * User action: + * Format the DASD for usage with Linux, for example, with dasdfmt. + * ATTENTION: Formatting irreversibly destroys all data on the DASD. + */ + +/*? + * Text: "%s: An I/O control call used incorrect flags 0x%x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: flags + * Description: + * The DASD format I/O control was used incorrectly. + * User action: + * Contact the owner of the formatting tool. + */ + +/*? + * Text: "%s: New DASD %04X/%02X (CU %04X/%02X) with %d cylinders, %d heads, %d sectors%s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * @2: device type + * @3: device model + * @4: control unit type + * @5: control unit model + * @6: number of cylinders + * @7: tracks per cylinder + * @8: sectors per track + * @9: access mode + * Description: + * A DASD with the shown characteristics has been set online. + * If the DASD is configured as read-only to the real or virtual hardware, + * the message includes an indication of this hardware access mode. The + * hardware access mode is independent from the 'readonly' attribute of + * the device in sysfs. + * User action: + * None. + */ + +/*? + * Text: "%s: The disk layout of the DASD is not supported\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD device driver only supports the following disk layouts: CDL, LDL, + * FBA, CMS, and CMS RESERVED. + * User action: + * None. + */ + +/*? + * Text: "%s: Start track %u used in formatting exceeds end track\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: track number + * Description: + * The DASD format I/O control was used incorrectly by a formatting tool. + * User action: + * Contact the owner of the formatting tool. + */ + +/*? + * Text: "%s: The DASD cache mode was set to %x (%i cylinder prestage)\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * @2: operation mode + * @3: number of cylinders + * Description: + * The DASD cache mode has been changed. See the storage system documentation + * for information about the different cache operation modes. + * User action: + * None. + */ + +/*? + * Text: "%s: The DASD cannot be formatted with block size %u\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: block size + * Description: + * The block size specified for a format instruction is not valid. The block + * size must be between 512 and 4096 byte and must be a power of 2. + * User action: + * Call the format command with a supported block size. + */ + +/*? + * Text: "%s: The UID of the DASD has changed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The Unique Identifier (UID) of a DASD that is currently in use has changed. + * This indicates that the physical disk has been replaced. + * User action: + * None if the replacement was intentional. + * If the disk change is not expected, stop using the disk to prevent possible + * data loss. +*/ + + +/* dasd_3990_erp */ + +/*? + * Text: "%s: is offline or not installed - INTERVENTION REQUIRED!!\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD to be accessed is not in an accessible state. The I/O operation + * will wait until the device is operational again. This is an operating system + * independent message that is issued by the storage system. + * User action: + * Make the DASD accessible again. For details see the storage system + * documentation. + */ + +/*? + * Text: "%s: The DASD cannot be reached on any path (lpum=%x/opm=%x)\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: last path used mask + * @3: online path mask + * Description: + * After a path to the DASD failed, the error recovery procedure of the DASD + * device driver tried but failed to reconnect the DASD through an alternative + * path. + * User action: + * Ensure that the cabling between the storage server and the mainframe + * system is securely in place. Check the file systems on the DASD when it is + * accessible again. + */ + +/*? + * Text: "%s: Unable to allocate DCTL-CQR\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an internal error. + * User action: + * Contact your support organization. + */ + +/*? + * Text: "%s: FORMAT 0 - Invalid Parameter\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A data argument of a command is not valid. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - DPS Installation Check\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This operating system independent message is issued by the storage system + * for one of the following reasons: + * - A 3380 Model D or E DASD does not have the Dynamic Path Selection (DPS) + * feature in the DASD A-unit. + * - The device type of an attached DASD is not supported by the firmware. + * - A type 3390 DASD is attached to a 3 MB channel. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 2 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Drive motor switch is off\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - CCW Count less than required\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The CCW count of a command is less than required. This is an operating + * system independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Channel requested ... %02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: reason code + * Description: + * This is an operating system independent message that is issued by the + * storage system. The possible reason codes indicate the following problems: + * 00 No Message. + * 01 The channel has requested unit check sense data. + * 02 The channel has requested retry and retry is exhausted. + * 03 A SA Check-2 error has occurred. This sense is presented with + * Equipment Check. + * 04 The channel has requested retry and retry is not possible. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Status Not As Required: reason %02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: reason code + * Description: + * This is an operating system independent message that is issued by the + * storage system. There are several potential reasons for this message; + * byte 8 contains the reason code. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Device status 1 not valid\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Storage Path Restart\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * An operation for an active channel program was queued in a Storage Control + * when a warm start was received by the path. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Reset Notification\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A system reset or its equivalent was received on an interface. The Unit + * Check that generates this sense is posted to the next channel initiated + * selection following the resetting event. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Invalid Command Sequence\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * An incorrect sequence of commands has occurred. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Missing device address bit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Subsystem Processing Error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A firmware logic error has been detected. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Seek incomplete\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Invalid Command\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A command was issued that is not in the 2107/1750 command set. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Command Invalid on Secondary Address\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A command or order not allowed on a PPRC secondary device has been received + * by the secondary device. This is an operating system independent message + * that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Invalid Defective/Alternate Track Pointer\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A defective track has been accessed. The subsystem generates an invalid + * Defective/Alternate Track Pointer as a part of RAID Recovery. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Channel Returned with Incorrect retry CCW\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A command portion of the CCW returned after a command retry sequence does + * not match the command for which retry was signaled. This is an operating + * system independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Diagnostic of Special Command Violates File Mask\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A command is not allowed under the Access Authorization specified by the + * File Mask. This is an operating system independent message that is issued + * by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Head address does not compare\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Device did not respond to selection\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Device check-2 error or Set Sector is not complete\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Device Error Source\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The device has completed soft error logging. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Data Pinned for Device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * Modified data in cache or in persistent storage exists for the DASD. The + * data cannot be destaged to the device. This track is the first track pinned + * for this device. This is an operating system independent message that is + * issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel C\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Device Status 1 not as expected\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 0 - Device Fenced - device = %02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: sense data byte 4 + * Description: + * The device shown in sense byte 4 has been fenced. This is an operating + * system independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Interruption cannot be reset\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Index missing\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - DASD Fast Write inhibited\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * DASD Fast Write is not allowed because of a nonvolatile storage battery + * check condition. This is an operating system independent message that is + * issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Invalid tag-in for an extended command sequence\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Key area error; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Count area error; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Track physical address did not compare\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 2 - 3990 check-2 error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Offset active cannot be reset\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - RCC 1 and RCC 2 sequences not successful\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in count address area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Data area error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel A\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in count address area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the key area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Caching status reset to default\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The storage director has assigned two new subsystem status devices and + * resets the status to its default value. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the data area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Device not ready\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in key area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - DASD controller failed to set or reset the long busy latch\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 1 - Cylinder address did not compare\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 3 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in data area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 2 - Support facility errors\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Key area error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - End operation with transfer count not zero\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 2 - Microcode detected error %02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: error code + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the count area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 3 - Allegiance terminated\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * Allegiance terminated because of a Reset Allegiance or an Unconditional + * Reserve command on another channel. This is an operating system independent + * message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Home address area error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Count area error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Invalid tag-in during selection sequence\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in data area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in home address area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Home address area error; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - Data area error; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in home address area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the home address area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the home address area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the count area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 4 - No sync byte in key area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Invalid DCC selection response or timeout\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the data area\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Operation Terminated\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The storage system ends an operation related to an active channel program + * when termination and redrive are required and logging is not desired. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel B\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 5 - Data Check in the key area; offset active\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Volume is suspended duplex\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The duplex pair volume has entered the suspended duplex state because of a + * failure. This is an operating system independent message that is issued by + * the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel D\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - RCC 1 sequence not successful\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel E\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - 3990 microcode time out when stopping selection\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel F\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - RCC initiated by a connection check alert\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel G\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - extra RCC required\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 6 - Overrun on channel H\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - Unexpected end operation response code\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Permanent path error (DASD controller not available)\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Missing end operation; device transfer incomplete\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Cache or nonvolatile storage equipment failure\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * An equipment failure has occurred in the cache storage or nonvolatile + * storage of the storage system. This is an operating system independent + * message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - DPS cannot be filled\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - Error correction code hardware fault\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Missing end operation; device transfer complete\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - DASD controller not available on disconnected command chain\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - No interruption from device during a command chain\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - No response to selection after a poll interruption\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 9 - Track physical address did not compare while oriented\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 9 - Head address did not compare\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Invalid tag-in for an immediate command sequence\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 9 - Cylinder address did not compare\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - DPS checks after a system reset or selective reset\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Caching reinitiated\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * Caching has been automatically reinitiated following an error. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - End operation with transfer count zero\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 7 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 9 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - Short busy time-out during device selection\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Caching terminated\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The storage system was unable to initiate caching or had to suspend caching + * for a 3990 control unit. If this problem is caused by a failure condition, + * an additional message will provide more information about the failure. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * Check for additional messages that point out possible failures. For more + * information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Subsystem status cannot be determined\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The status of a DASD Fast Write or PPRC volume cannot be determined. + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Nonvolatile storage terminated\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The storage director has stopped using nonvolatile storage or cannot + * initiate nonvolatile storage. If this problem is caused by a failure, an + * additional message will provide more information about the failure. This is + * an operating system independent message that is issued by the storage system. + * User action: + * Check for additional messages that point out possible failures. For more + * information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT 8 - Reserved\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: Write inhibited path encountered\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an informational message. + * User action: + * None. + */ + +/*? + * Text: "%s: FORMAT 9 - Device check-2 error\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * This is an operating system independent message that is issued by the + * storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Track format incorrect\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A track format error occurred while data was being written to the DASD or + * while a duplex pair was being established. This is an operating system + * independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: FORMAT F - Cache fast write access not authorized\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * A request for Cache Fast Write Data access cannot be satisfied because + * of missing access authorization for the storage system. This is an operating + * system independent message that is issued by the storage system. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: Data recovered during retry with PCI fetch mode active\n" + * Severity: Emerg + * Parameter: + * @1: bus ID of the DASD + * Description: + * A data error has been recovered on the storages system but the Linux file + * system cannot be informed about the data mismatch. To prevent Linux from + * running with incorrect data, the DASD device driver will trigger a kernel + * panic. + * User action: + * Reset your real or virtual hardware and reboot Linux. + */ + +/*? + * Text: "%s: The specified record was not found\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The record to be accessed does not exist. The DASD might be unformatted + * or defect. + * User action: + * Try to format the DASD or replace it. + * ATTENTION: Formatting irreversibly destroys all data on the DASD. + */ + +/*? + * Text: "%s: ERP %p (%02x) refers to %p\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: pointer to ERP + * @3: ERP status + * @4: cqr + * Description: + * This message provides debug information for the enhanced error recovery + * procedure (ERP). + * User action: + * If you do not need this information, you can suppress this message by + * switching off ERP logging, for example, by writing '1' to the 'erplog' + * sysfs attribute of the DASD. + */ + +/*? + * Text: "%s: ERP chain at END of ERP-ACTION\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * This message provides debug information for the enhanced error recovery + * procedure (ERP). + * User action: + * If you do not need this information, you can suppress this message by + * switching off ERP logging, for example, by writing '1' to the 'erplog' + * sysfs attribute of the DASD. + */ + +/*? + * Text: "%s: The cylinder data for accessing the DASD is inconsistent\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * An error occurred in the storage system hardware. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: Accessing the DASD failed because of a hardware error\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * An error occurred in the storage system hardware. + * User action: + * For more information see the documentation of your storage system. + */ + +/*? + * Text: "%s: ERP chain at BEGINNING of ERP-ACTION\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * This message provides debug information for the enhanced error recovery + * procedure (ERP). + * User action: + * If you do not need this information, you can suppress this message by + * switching off ERP logging, for example, by writing '1' to the 'erplog' + * sysfs attribute of the DASD. + */ + +/*? + * Text: "%s: ERP %p has run out of retries and failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: ERP pointer + * Description: + * The error recovery procedure (ERP) tried to recover an error but the number + * of retries for the I/O was exceeded before the error could be resolved. + * User action: + * Check for related previous error messages. + */ + +/*? + * Text: "%s: SIM - SRC: %02x%02x%02x%02x\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: sense byte + * @3: sense byte + * @4: sense byte + * @5: sense byte + * Description: + * This error message is a System Information Message (SIM) generated by the + * storage system. The System Reference Code (SRC) defines the error in detail. + * User action: + * Look up the SRC in the storage server documentation. + */ + +/*? + * Text: "%s: log SIM - SRC: %02x%02x%02x%02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: sense byte + * @3: sense byte + * @4: sense byte + * @5: sense byte + * Description: + * This System Information Message (SIM) is generated by the storage system. + * The System Reference Code (SRC) defines the error in detail. + * User action: + * Look up the SRC in the storage server documentation. + */ + +/*? + * Text: "%s: Reading device feature codes failed with rc=%d\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * Description: + * The device feature codes state which advanced features are supported by a + * device. + * Examples for advanced features are PAV or high performance FICON. + * Some early devices do not provide feature codes and no advanced features are + * available on these devices. + * User action: + * None, if the DASD does not provide feature codes. If the DASD provides + * feature codes, make sure that it is working correctly, then set it offline + * and back online. + */ + +/*? + * Text: "%s: A channel path group could not be established\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * Initialization of a DASD did not complete because a channel path group + * could not be established. + * User action: + * Make sure that the DASD is working correctly, then try again to set it + * online. If initialization still fails, reboot. + */ + +/*? + * Text: "%s: The DASD is not operating in multipath mode\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD channel path group could not be configured to use multipath mode. + * This might negatively affect I/O performance on this DASD. + * User action: + * Make sure that the DASD is working correctly, then try again to set it + * online. If initialization still fails, reboot. + */ + +/*? + * Text: "%s: Detecting the DASD disk layout failed because of an I/O error\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The disk layout of the DASD could not be detected because of an unexpected + * I/O error. The DASD device driver treats the device like an unformatted DASD, + * and partitions on the device are not accessible. + * User action: + * If the DASD is formatted, make sure that the DASD is working correctly, + * then set it offline and back online. If the DASD is unformatted, format the + * DASD, for example, with dasdfmt. + * ATTENTION: Formatting irreversibly destroys all data on the DASD. + */ + +/*? + * Text: "%s: An I/O request was rejected because writing is inhibited\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * An I/O request was returned with an error indication of 'command reject' + * and 'write inhibited'. The most likely reason for this error is a + * failed write request to a device that was attached as read-only in z/VM. + * User action: + * Set the device offline, ensure that the device is configured correctly in + * z/VM, then set the device online again. + */ + +/*? + * Text: "%s: An Alias device was reassigned to a new base device with UID: %s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the alias + * @2: UID of new base device + * Description: + * The alias device with the indicated bus ID has been reassigned. The UID of the new base device is shown in the message. + * User action: + * None. + */ + +/*? + * Text: "%s: Detecting the maximum supported data size for zHPF requests failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * High Performance FICON (zHPF) requests are limited to a hardware-dependent + * maximum data size. The DASD device driver failed to detect this size and zHPF + * is not available for this device. + * User action: + * Set the device offline and online again. If this problem persists, gather + * Linux debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Reading device feature codes failed (rc=%d) for new path %x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * @3: path mask + * Description: + * A new path has been made available to the a device. + * A command to read the device feature codes on this device returned an error. + * The new path will not be used for I/O. + * User action: + * Set the new path offline and online again to repeat the path verification. + * Alternatively, set the device offline and online again to + * verify all available paths for this device. + * If this problem persists, gather Linux debug data and report the problem + * to your support organization. + */ + +/*? + * Text: "%s: Detecting the maximum data size for zHPF requests failed (rc=%d) for a new path %x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: return code + * @3: path mask + * Description: + * High Performance FICON (zHPF) requests are limited to a hardware-dependent + * maximum data size. A command to detect this size for + * a new path returned an error. The new path will not be used for I/O. + * User action: + * Set the new path offline and online again to repeat the path verification. + * Alternatively, set the device offline and online again to + * verify all available paths for this device. + * If this problem persists, gather Linux debug data and report the problem + * to your support organization. + */ + +/*? + * Text: "%s: The maximum data size for zHPF requests %u on a new path %x is below the active maximum %u\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * @2: size in bytes + * @3: path mask + * @4: size in bytes + * Description: + * High Performance FICON (zHPF) requests are limited to a hardware-dependent + * maximum data size. The maximum of the new path is below + * the previously established common maximum for the + * existing paths for this device. This could cause requests on the new + * path to fail. The new path will not be used for I/O. + * User action: + * Set the device offline and online again to establish a new common maximum + * data size for the device. + */ + +/*? + * Text: "%s: The device reservation was lost\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * This Linux instance has lost its reservation of the device to another + * operating system instance. Depending on the reservation policy for the + * device, I/O might be blocked until the other operating system instance + * surrenders the reservation or all I/O requests might fail until the + * device is reset. + * User action: + * None, if this situation is handled by system automation software. + * If this situation is not handled by automation, check the + * last_known_reservation_state attribute of the device in sysfs. + * If the value is 'lost', verify that the device is no longer reserved + * by another operating system instance, then set the device offline and + * online again. For any other value of the last_known_reservation_state + * no action is required. I/O will resume when the device reservation is + * surrendered by the other operating system instance. + */ + +/*? + * Text: "%s: The storage server does not support raw-track access\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD cannot be accessed in raw-track access mode because the storage + * server does not have all required features for this access mode. + * In raw-track access mode, the DASD device driver accesses complete ECKD + * tracks. + * By default, the DASD device driver accesses only the data fields of ECKD + * devices and omits the count and key data fields. + * User action: + * Ensure that the raw_track_access sysfs attribute of the DASD has the value + * 0 to access the device in default ECKD mode. + */ + +/*? + * Text: "%s: The newly added channel path %02X will not be used because it leads to a different device %s\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: logical path mask + * @3: UID + * Description: + * The newly added channel path has a different UID than the DASD device. This indicates + * an incorrect cabling. This path is not going to be used. + * User action: + * Check the cabling of the DASD device. Disconnect and reconnect the cable. + */ + +/*? + * Text: "%s: Not all channel paths lead to the same device, path %02X leads to device %s instead of %s\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: logical path mask + * @3: UID + * @4: UID + * Description: + * Some channel paths have a different UID than others. This indicates + * an incorrect cabling. The DASD device is not enabled. + * User action: + * Check cabling of the DASD device and retry to enable the device. + */ + +/*? + * Text: "Service on the storage server caused path %x.%02x to go offline" + * Severity: Warning + * Parameter: + * @1: channel subsystem ID + * @2: CHPID + * Description: + * A channel path to the DASD has been set offline because of + * a service action on the storage server. The path will be set back + * online automatically when the service action is completed. + * User action: + * None. + */ + +/*? + * Text: "Path %x.%02x is back online after service on the storage server" + * Severity: Informational + * Parameter: + * @1: channel subsystem ID + * @2: CHPID + * Description: + * A path had been set offline temporarily because of a service + * action on the storage server. + * The service action has completed, and the channel path is available + * again. + * User action: + * None. + */ + +/*? + * Text: "%s: High Performance FICON disabled\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * Description: + * High Performance FICON (HPF) has been disabled. Either the device + * lost HPF functionality, or none of the remaining channel paths are + * HPF capable. + * User action: + * Report the problem to your support organization. + * Ensure that the cabling between the storage server and the mainframe + * system is securely in place. + * Reset the device and channel paths by writing "all" or a logical path mask + * to the path_reset sysfs attribute of the device. + */ + +/*? + * Text: "%s: Channel path %02X lost HPF functionality and is disabled\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: logical path mask + * Description: + * A channel path has lost High Performance FICON (HPF) functionality + * and was removed from regular operations. + * User action: + * Report the problem to your support organization. + * Ensure that the cabling between the storage server and the mainframe + * system is securely in place. + * Reset the device and channel paths by writing "all" or a logical path mask + * to the path_reset sysfs attribute of the device. + */ + +/*? + * Text: "%s: Path %x.%02x (pathmask %02x) is disabled - IFCC threshold exceeded\n" + * Severity: Error + * Parameter: + * @1: bus ID of the DASD + * @2: cssid + * @3: chpid + * @4: logical path mask + * Description: + * Due to numerous interface or channel control checks (IFCCs), a channel path + * was removed from regular operations to retain good I/O performance. + * User action: + * Ensure that the cabling between the storage server and the mainframe + * system is securely in place. + * Reset the device and channel paths by writing "all" or a logical path mask + * to the path_reset sysfs attribute of the device. + * If the problem persists, report it to your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/dasd-fba +++ linux-riscv-5.4.0/Documentation/kmsg/s390/dasd-fba @@ -0,0 +1,36 @@ + +/*? + * Text: "%s: New FBA DASD %04X/%02X (CU %04X/%02X) with %d MB and %d B/blk%s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the DASD + * @2: device type + * @3: device model + * @4: control unit type + * @5: control unit model + * @6: size + * @7: bytes per block + * @8: access mode + * Description: + * A DASD with the shown characteristics has been set online. + * If the DASD is configured as read-only to the real or virtual hardware, + * the message includes an indication of this hardware access mode. The + * hardware access mode is independent from the 'readonly' attribute of + * the device in sysfs. + * User action: + * None. + */ + +/*? + * Text: "%s: Allocating memory for private DASD data failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the DASD + * Description: + * The DASD device driver maintains data structures for each DASD it manages. + * There is not enough memory to allocate these data structures for one or + * more DASD. + * User action: + * Free some memory and try the operation again. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/dcssblk +++ linux-riscv-5.4.0/Documentation/kmsg/s390/dcssblk @@ -0,0 +1,206 @@ +/*? + * Text: "Adjacent DCSSs %s and %s are not contiguous\n" + * Severity: Error + * Parameter: + * @1: name 1 + * @2: name 2 + * Description: + * You can only map a set of two or more DCSSs to a single DCSS device if the + * DCSSs in the set form a contiguous memory space. The DCSS device cannot be + * created because there is a memory gap between two adjacent DCSSs. + * User action: + * Ensure that you have specified all DCSSs that belong to the set. Check the + * definitions of the DCSSs on the z/VM hypervisor to verify that they form + * a contiguous memory space. + */ + +/*? + * Text: "DCSS %s and DCSS %s have incompatible types\n" + * Severity: Error + * Parameter: + * @1: name 1 + * @2: name 2 + * Description: + * You can only map a set of two or more DCSSs to a single DCSS device if + * either all DCSSs in the set have the same type or if the set contains DCSSs + * of the two types EW and EN but no other type. The DCSS device cannot be + * created because at least two of the specified DCSSs are not compatible. + * User action: + * Check the definitions of the DCSSs on the z/VM hypervisor to verify that + * their types are compatible. + */ + +/*? + * Text: "DCSS %s is of type SC and cannot be loaded as exclusive-writable\n" + * Severity: Error + * Parameter: + * @1: device name + * Description: + * You cannot load a DCSS device in exclusive-writable access mode if the DCSS + * devise maps to one or more DCSSs of type SC. + * User action: + * Load the DCSS in shared access mode. + */ + +/*? + * Text: "DCSS device %s is removed after a failed access mode change\n" + * Severity: Error + * Parameter: + * @1: device name + * Description: + * To change the access mode of a DCSS device, all DCSSs that map to the device + * were unloaded. Reloading the DCSSs for the new access mode failed and the + * device is removed. + * User action: + * Look for related messages to find out why the DCSSs could not be reloaded. + * If necessary, add the device again. + */ + +/*? + * Text: "All DCSSs that map to device %s are saved\n" + * Severity: Informational + * Parameter: + * @1: device name + * Description: + * A save request has been submitted for the DCSS device. Changes to all DCSSs + * that map to the device are saved permanently. + * User action: + * None. + */ + +/*? + * Text: "Device %s is in use, its DCSSs will be saved when it becomes idle\n" + * Severity: Informational + * Parameter: + * @1: device name + * Description: + * A save request for the device has been deferred until the device becomes + * idle. Then changes to all DCSSs that the device maps to will be saved + * permanently. + * User action: + * None. + */ + +/*? + * Text: "A pending save request for device %s has been canceled\n" + * Severity: Informational + * Parameter: + * @1: device name + * Description: + * A save request for the DCSSs that map to a DCSS device has been pending + * while the device was in use. This save request has been canceled. Changes to + * the DCSSs will not be saved permanently. + * User action: + * None. + */ + +/*? + * Text: "Loaded %s with total size %lu bytes and capacity %lu sectors\n" + * Severity: Informational + * Parameter: + * @1: DCSS names + * @2: total size in bytes + * @3: total size in 512 byte sectors + * Description: + * The listed DCSSs have been verified as contiguous and successfully loaded. + * The displayed sizes are the sums of all DCSSs. + * User action: + * None. + */ + +/*? + * Text: "Device %s cannot be removed because it is not a known device\n" + * Severity: Warning + * Parameter: + * @1: device name + * Description: + * The DCSS device you are trying to remove is not known to the DCSS device + * driver. + * User action: + * List the entries under /sys/devices/dcssblk/ to see the names of the + * existing DCSS devices. + */ + +/*? + * Text: "Device %s cannot be removed while it is in use\n" + * Severity: Warning + * Parameter: + * @1: device name + * Description: + * You are trying to remove a device that is in use. + * User action: + * Make sure that all users of the device close the device before you try to + * remove it. + */ + +/*? + * Text: "Device %s has become idle and is being saved now\n" + * Severity: Informational + * Parameter: + * @1: device name + * Description: + * A save request for the DCSSs that map to a DCSS device has been pending + * while the device was in use. The device has become idle and all changes + * to the DCSSs are now saved permanently. + * User action: + * None. + */ + +/*? + * Text: "Writing to %s failed because it is a read-only device\n" + * Severity: Warning + * Parameter: + * @1: device name + * Description: + * The DCSS device is in shared access mode and cannot be written to. Depending + * on the type of the DCSSs that the device maps to, you might be able to + * change the access mode to exclusive-writable. + * User action: + * If the DCSSs of the device are of type SC, do not attempt to write to the + * device. If the DCSSs of the device are of type ER or SR, change the access + * mode to exclusive-writable before writing to the device. + */ + +/*? + * Text: "The address range of DCSS %s changed while the system was suspended\n" + * Severity: Error + * Parameter: + * @1: device name + * Description: + * After resuming the system, the start address or end address of a DCSS does + * not match the address when the system was suspended. DCSSs must not be + * changed after the system was suspended. + * This error cannot be recovered. The system is stopped with a kernel panic. + * User action: + * Reboot Linux. + */ + +/*? + * Text: "Suspending the system failed because DCSS device %s is writable\n" + * Severity: Error + * Parameter: + * @1: device name + * Description: + * A system cannot be suspended if one or more DCSSs are accessed in exclusive- + * writable mode. DCSS segment types EW, SW, and EN are always writable and + * must be removed before a system is suspended. + * User action: + * Remove all DCSSs of segment types EW, SW, and EN by writing the DCSS name to + * the sysfs 'remove' attribute. Set the access mode for all DCSSs of segment + * types SR and ER to read-only by writing 1 to the sysfs 'shared' attribute of + * the DCSS. Then try again to suspend the system. + */ + +/*? + * Text: "DCSS %s is of type SN or EN and cannot be saved\n" + * Severity: Warning + * Parameter: + * @1: DCSS name + * Description: + * DCSSs of type SN or EN cannot be saved. + * User action: + * If the DCSS was set up with the intention to prevent the content from being saved, + * no action is necessary. + * To be able to save the content, you must define the DCSS with a type other than SN or EN. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/diag288_wdt +++ linux-riscv-5.4.0/Documentation/kmsg/s390/diag288_wdt @@ -0,0 +1,66 @@ +/*? + * Text: "The watchdog cannot be activated\n" + * Severity: Error + * Description: + * Diagnose instruction 0x288 was called to activate the diag288 watchdog. + * The diagnose call returned an error that cannot be handled by the device driver. + * The watchdog stays inactive. + * User action: + * Contact your support organization. + */ + +/*? + * Text: "The watchdog cannot be initialized\n" + * Severity: Error + * Description: + * Diagnose instruction 0x288 was called to initialize the diag288 watchdog. + * The diagnose call returned an error that cannot be handled by the device driver. + * The watchdog stays inactive. + * A possible reason for this error is that your real or virtual hardware does not support + * the diag288 watchdog. + * User action: + * Confirm that the diag288 watchdog is supported in your environment. + * Use a watchdog that is supported in your environment. + */ + +/*? + * Text: "The watchdog cannot be deactivated\n" + * Severity: Error + * Description: + * Diagnose instruction 0x288 was called to deactivate the diag288 watchdog. + * The diagnose call returned an error that cannot be handled by the device driver. + * The watchdog stays active and a watchdog timeout will trigger the configured timeout action. + * The diag288 watchdog device driver might intentionally be configured to prevent deactivation. + * User action: + * You can configure the diag288 watchdog device driver such that it can be deactivated. + * If the diag288 device driver has been compiled as a separate module, diag288_wdt, reload the module + * without specifying the 'nowayout' module parameter. + * If the diag288 device driver has been compiled into your kernel, + * reboot Linux without specifying the 'diag288.nowayout' kernel parameter'. + */ + +/*? + * Text: "The watchdog timer cannot be started or reset\n" + * Severity: Error + * Description: + * Diagnose instruction 0x288 was called to start the diag288 watchdog or to set timer back to zero. + * The diagnose call returned an error that cannot be handled by the device driver. + * The watchdog stays inactive or becomes inactive. + * User action: + * Contact your support organization. + */ + +/*? + * Text: "Linux cannot be suspended while the watchdog is in use\n" + * Severity: Error + * Description: + * The watchdog must not time out while Linux is suspended. + * Therefore, the diag288 watchdog device driver prevents Linux from being suspended + * while the watchdog is in use. + * User action: + * i) Stop the watchdog application. ii) If the problem persists, close the watchdog + * device node by issuing 'echo V > /dev/watchdog'. + * iii) If the device driver still prevents Linux from being suspended, + * contact your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/extmem +++ linux-riscv-5.4.0/Documentation/kmsg/s390/extmem @@ -0,0 +1,293 @@ +/*? + * Text: "Querying a DCSS type failed with rc=%ld\n" + * Severity: Warning + * Parameter: + * @1: return code + * Description: + * The DCSS kernel interface used z/VM diagnose call X'64' to query the + * type of a DCSS. z/VM failed to determine the type and returned an error. + * User action: + * Look for related messages to find out which DCSS is affected. + * For details about the return codes see the section about DIAGNOSE Code + * X'64' in "z/VM CP Programming Services". + */ + +/*? + * Text: "Loading DCSS %s failed with rc=%ld\n" + * Severity: Warning + * Parameter: + * @1: DCSS name + * @2: return code + * Description: + * The DCSS kernel interface used diagnose call X'64' to load a DCSS. z/VM + * failed to load the DCSS and returned an error. + * User action: + * For details about the return codes see the section about DIAGNOSE Code + * X'64' in "z/VM CP Programming Services". + */ + +/*? + * Text: "DCSS %s of range %p to %p and type %s loaded as exclusive-writable\n" + * Severity: Informational + * Parameter: + * @1: DCSS name + * @2: starting page address + * @3: ending page address + * @4: DCSS type + * Description: + * The DCSS was loaded successfully in exclusive-writable access mode. + * User action: + * None. + */ + +/*? + * Text: "DCSS %s of range %p to %p and type %s loaded in shared access mode\n" + * Severity: Informational + * Parameter: + * @1: DCSS name + * @2: starting page address + * @3: ending page address + * @4: DCSS type + * Description: + * The DCSS was loaded successfully in shared access mode. + * User action: + * None. + */ + +/*? + * Text: "DCSS %s is already in the requested access mode\n" + * Severity: Informational + * Parameter: + * @1: DCSS name + * Description: + * A request to reload a DCSS with a new access mode has been rejected + * because the new access mode is the same as the current access mode. + * User action: + * None. + */ + +/*? + * Text: "DCSS %s is in use and cannot be reloaded\n" + * Severity: Warning + * Parameter: + * @1: DCSS name + * Description: + * Reloading a DCSS in a different access mode has failed because the DCSS is + * being used by one or more device drivers. The DCSS remains loaded with the + * current access mode. + * User action: + * Ensure that the DCSS is not used by any device driver then try again to + * load the DCSS with the new access mode. + */ + +/*? + * Text: "DCSS %s overlaps with used memory resources and cannot be reloaded\n" + * Severity: Warning + * Parameter: + * @1: DCSS name + * Description: + * The DCSS has been unloaded and cannot be reloaded because it overlaps with + * another loaded DCSS or with the memory of the z/VM guest virtual machine + * (guest storage). + * User action: + * Ensure that no DCSS is loaded that has overlapping memory resources + * with the DCSS you want to reload. If the DCSS overlaps with guest storage, + * use the DEF STORE CONFIG z/VM CP command to create a sufficient storage gap + * for the DCSS. For details, see the section about the DCSS device driver in + * "Device Drivers, Features, and Commands". + */ + +/*? + * Text: "Reloading DCSS %s failed with rc=%ld\n" + * Severity: Warning + * Parameter: + * @1: DCSS name + * @2: return code + * Description: + * The DCSS kernel interface used z/VM diagnose call X'64' to reload a DCSS + * in a different access mode. The DCSS was unloaded but z/VM failed to reload + * the DCSS. + * User action: + * For details about the return codes see the section about DIAGNOSE Code + * X'64' in "z/VM CP Programming Services". + */ + +/*? + * Text: "Unloading unknown DCSS %s failed\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * The specified DCSS cannot be unloaded. The DCSS is known to the DCSS device + * driver but not to the DCSS kernel interface. This problem indicates a + * program error in extmem.c. + * User action: + * Report this problem to your support organization. + */ + +/*? + * Text: "Saving unknown DCSS %s failed\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * The specified DCSS cannot be saved. The DCSS is known to the DCSS device + * driver but not to the DCSS kernel interface. This problem indicates a + * program error in extmem.c. + * User action: + * Report this problem to your support organization. + */ + +/*? + * Text: "Saving a DCSS failed with DEFSEG response code %i\n" + * Severity: Error + * Parameter: + * @1: response-code + * Description: + * The DEFSEG z/VM CP command failed to permanently save changes to a DCSS. + * User action: + * Ensure that the z/VM guest virtual machine is authorized to issue + * the CP DEFSEG command (typically privilege class E). + * Look for related messages to find the cause of this error. See also message + * HCPE in the DEFSEG section of the "z/VM CP Command and + * Utility Reference". + */ + +/*? + * Text: "Saving a DCSS failed with SAVESEG response code %i\n" + * Severity: Error + * Parameter: + * @1: response-code + * Description: + * The SAVESEG z/VM CP command failed to permanently save changes to a DCSS. + * User action: + * Ensure that the z/VM guest virtual machine is authorized to issue + * the CP SAVESEG command (typically privilege class E). + * Look for related messages to find the cause of this error. See also message + * HCPE in the SAVESEG section of the "z/VM CP Command and + * Utility Reference". + */ + +/*? + * Text: "DCSS %s cannot be loaded or queried\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * You cannot load or query the specified DCSS because it either is not defined + * in the z/VM hypervisor, or it is a class S DCSS, or it is above 2047 MB + * and the Linux system is a 31-bit system. + * User action: + * Use the CP command "QUERY NSS" to find out if the DCSS is a valid + * DCSS that can be loaded. + */ + +/*? + * Text: "DCSS %s cannot be loaded or queried without z/VM\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * A DCSS is a z/VM resource. Your Linux instance is not running as a z/VM + * guest operating system and, therefore, cannot load DCSSs. + * User action: + * Load DCSSs only on Linux instances that run as z/VM guest operating systems. + */ + +/*? + * Text: "Loading or querying DCSS %s resulted in a hardware error\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * Either the z/VM DIAGNOSE X'64' query or load call issued for the DCSS + * returned with an error. + * User action: + * Look for previous extmem message to find the return code from the + * DIAGNOSE X'64' query or load call. For details about the return codes see + * the section about DIAGNOSE Code X'64' in "z/VM CP Programming Services". + */ + +/*? + * Text: "DCSS %s has multiple page ranges and cannot be loaded or queried\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * You can only load or query a DCSS with multiple page ranges if: + * - The DCSS has 6 or fewer page ranges + * - The page ranges form a contiguous address space + * - The page ranges are of type EW or EN + * User action: + * Check the definition of the DCSS to make sure that the conditions for + * DCSSs with multiple page ranges are met. + */ + +/*? + * Text: "%s needs used memory resources and cannot be loaded or queried\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * You cannot load or query the DCSS because it overlaps with an already + * loaded DCSS or with the memory of the z/VM guest virtual machine + * (guest storage). + * User action: + * Ensure that no DCSS is loaded that has overlapping memory resources + * with the DCSS you want to load or query. If the DCSS overlaps with guest + * storage, use the DEF STORE CONFIG z/VM CP command to create a sufficient + * storage gap for the DCSS. For details, see the section about the DCSS + * device driver in "Device Drivers, Features, and Commands". + */ + +/*? + * Text: "DCSS %s is already loaded in a different access mode\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * The DCSS you are trying to load has already been loaded in a different + * access mode. You cannot simultaneously load the DCSS in different modes. + * User action: + * Reload the DCSS in a different mode or load it with the same mode in which + * it has already been loaded. + */ + +/*? + * Text: "There is not enough memory to load or query DCSS %s\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * The available memory is not enough to load or query the DCSS. + * User action: + * Free some memory and repeat the failed operation. + */ + +/*? + * Text: "DCSS %s overlaps with used storage and cannot be loaded\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * You cannot load the DCSS because it overlaps with an already loaded DCSS + * or with the memory of the z/VM guest virtual machine (guest storage). + * User action: + * Ensure that no DCSS is loaded that has overlapping memory resources + * with the DCSS you want to load. If the DCSS overlaps with guest storage, + * use the DEF STORE CONFIG z/VM CP command to create a sufficient storage gap + * for the DCSS. For details, see the section about the DCSS device driver in + * "Device Drivers, Features, and Commands". + */ + +/*? + * Text: "DCSS %s exceeds the kernel mapping range (%lu) and cannot be loaded\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * @2: kernel mapping range in bytes + * Description: + * You cannot load the DCSS because it exceeds the kernel mapping range limit. + * User action: + * Ensure that the DCSS range is defined below the kernel mapping range. + */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/hmcdrv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/hmcdrv @@ -0,0 +1,22 @@ +/*? + * Text: "Allocating the requested cache size of %zu bytes failed\n" + * Severity: Error + * Parameter: + * @1: size + * Description: + * You cannot use the 'hmcdrv' module. + * Either the cache size that was specified for the 'hmcdrv' module exceeded + * the maximum of 1048576 (1 megabyte), or not enough free memory was + * available. + * If the 'hmcdrv' module was compiled into the kernel, the cache size was + * specified with the 'hmcdrv.cachesize' kernel parameter. + * For a separate 'hmcdrv' module, the cache size was specified with the + * 'cachesize=' module parameter. + * User action: + * Specify a smaller cache size and try again to load the module. + * Do not exceed the maximum specification of 1048576 (1 megabyte). + * If necessary, free some memory and try again. + * If the module is compiled into the kernel, you must reboot Linux to change + * the cache size specification. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/hugetlb +++ linux-riscv-5.4.0/Documentation/kmsg/s390/hugetlb @@ -0,0 +1,13 @@ +/*? + * Text: "hugepagesz= specifies an unsupported page size %s\n" + * Severity: Error + * Parameter: + * @1: size + * Description: + * The hugepagesz= kernel parameter specifies a huge page size + * that is not supported. + * User action: + * Specify "1M" for 1 MB huge pages. These are supported as of z10. + * Specify "2G" for 2 GB huge pages. These are supported as of zEC12 + * and zBC12 machines. + */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/hvc_iucv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/hvc_iucv @@ -0,0 +1,123 @@ +/*? + * Text: "The z/VM IUCV HVC device driver cannot be used without z/VM\n" + * Severity: Notice + * Description: + * The z/VM IUCV hypervisor console (HVC) device driver requires the + * z/VM inter-user communication vehicle (IUCV). + * User action: + * Set "hvc_iucv=" to zero in the kernel parameter line and reboot Linux. + */ + +/*? + * Text: "%lu is not a valid value for the hvc_iucv= kernel parameter\n" + * Severity: Error + * Parameter: + * @1: hvc_iucv_devices + * Description: + * The "hvc_iucv=" kernel parameter specifies the number of z/VM IUCV + * hypervisor console (HVC) terminal devices. + * The parameter value ranges from 0 to 8. + * If zero is specified, the z/VM IUCV HVC device driver is disabled + * and no IUCV-based terminal access is available. + * User action: + * Correct the "hvc_iucv=" setting in the kernel parameter line and + * reboot Linux. + */ + +/*? + * Text: "Creating a new HVC terminal device failed with error code=%d\n" + * Severity: Error + * Parameter: + * @1: errno + * Description: + * The device driver initialization failed to allocate a new + * HVC terminal device. + * A possible cause of this problem is memory constraints. + * User action: + * If the error code is -12 (ENOMEM), consider assigning more memory + * to your z/VM guest virtual machine. + */ + +/*? + * Text: "Registering HVC terminal device as Linux console failed\n" + * Severity: Error + * Description: + * The device driver initialization failed to set up the first HVC terminal + * device for use as Linux console. + * User action: + * If the error code is -12 (ENOMEM), consider assigning more memory + * to your z/VM guest virtual machine. + */ + +/*? + * Text: "Registering IUCV handlers failed with error code=%d\n" + * Severity: Error + * Parameter: + * @1: errno + * Description: + * The device driver initialization failed to register with z/VM IUCV to + * handle IUCV connections, as well as sending and receiving of IUCV messages. + * User action: + * Check for related IUCV error messages and see the errno manual page + * to find out what caused the problem. + */ + +/*? + * Text: "Allocating memory failed with reason code=%d\n" + * Severity: Error + * Parameter: + * @1: reason + * Description: + * The z/VM IUCV hypervisor console (HVC) device driver initialization failed, + * because of a general memory allocation failure. The reason code indicates + * the memory operation that has failed: + * kmem_cache (reason code=1), + * mempool (reason code=2), or + * hvc_iucv_allow= (reason code=3) + * User action: + * Consider assigning more memory to your z/VM guest virtual machine. + */ + +/*? + * Text: "hvc_iucv_allow= does not specify a valid z/VM user ID list\n" + * Severity: Error + * Description: + * The "hvc_iucv_allow=" kernel parameter specifies a comma-separated list + * of z/VM user IDs that are permitted to connect to the z/VM IUCV hypervisor + * device driver. + * The z/VM user IDs in the list must not exceed eight characters and must + * not contain spaces. + * User action: + * Correct the "hvc_iucv_allow=" setting in the kernel parameter line and reboot + * Linux. + */ + +/*? + * Text: "hvc_iucv_allow= specifies too many z/VM user IDs\n" + * Severity: Error + * Description: + * The "hvc_iucv_allow=" kernel parameter specifies a comma-separated list + * of z/VM user IDs that are permitted to connect to the z/VM IUCV hypervisor + * device driver. + * The number of z/VM user IDs that are specified with the "hvc_iucv_allow=" + * kernel parameter exceeds the maximum of 500. + * User action: + * Correct the "hvc_iucv_allow=" setting by reducing the z/VM user IDs in + * the list and reboot Linux. + */ + +/*? + * Text: "A connection request from z/VM user ID %s was refused\n" + * Severity: Informational + * Parameter: + * @1: ID + * Description: + * An IUCV connection request from another z/VM guest virtual machine has been + * refused. The request was from a z/VM guest virtual machine that is not + * listed by the "hvc_iucv_allow=" kernel parameter. + * User action: + * Check the "hvc_iucv_allow=" kernel parameter setting. + * Consider adding the z/VM user ID to the "hvc_iucv_allow=" list in the kernel + * parameter line and reboot Linux. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/hypfs +++ linux-riscv-5.4.0/Documentation/kmsg/s390/hypfs @@ -0,0 +1,56 @@ +/*? + * Text: "The hardware system does not support hypfs\n" + * Severity: Error + * Description: + * hypfs requires DIAGNOSE Code X'204' but this diagnose code is not available + * on your hardware. You need more recent hardware to use hypfs. + * User action: + * None. + */ + +/*? + * Text: "The hardware system does not provide all functions required by hypfs\n" + * Severity: Error + * Description: + * hypfs requires DIAGNOSE Code X'224' but this diagnode code is not available + * on your hardware. You need more recent hardware to use hypfs. + * User action: + * None. + */ + +/*? + * Text: "Updating the hypfs tree failed\n" + * Severity: Error + * Description: + * There was not enough memory available to update the hypfs tree. + * User action: + * Free some memory and try again to update the hypfs tree. Consider assigning + * more memory to your LPAR or z/VM guest virtual machine. + */ + +/*? + * Text: "%s is not a valid mount option\n" + * Severity: Error + * Parameter: + * @1: mount option + * Description: + * hypfs has detected mount options that are not valid. + * User action: + * See "Device Drivers Features and Commands" for information about valid + * mount options for hypfs. + */ + +/*? + * Text: "Initialization of hypfs failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: error code + * Description: + * Initialization of hypfs failed because of resource or hardware constraints. + * Possible reasons for this problem are insufficient free memory or missing + * hardware interfaces. + * User action: + * See errno.h for information about the error codes. + */ + +/*? Text: "Hypervisor filesystem mounted\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/iucv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/iucv @@ -0,0 +1,33 @@ +/*? + * Text: "Defining an interrupt buffer on CPU %i failed with 0x%02x (%s)\n" + * Severity: Warning + * Parameter: + * @1: CPU number + * @2: hexadecimal error value + * @3: short error code explanation + * Description: + * Defining an interrupt buffer for external interrupts failed. Error + * value 0x03 indicates a problem with the z/VM directory entry of the + * z/VM guest virtual machine. This problem can also be caused by a + * program error. + * User action: + * If the error value is 0x03, examine the z/VM directory entry of your + * z/VM guest virtual machine. If the directory entry is correct or if the + * error value is not 0x03, report this problem to your support organization. + */ + +/*? + * Text: "Suspending Linux did not completely close all IUCV connections\n" + * Severity: Warning + * Description: + * When resuming a suspended Linux instance, the IUCV base code found + * data structures from one or more IUCV connections that existed before the + * Linux instance was suspended. Modules that use IUCV connections must close + * these connections when a Linux instance is suspended. This problem + * indicates an error in a program that used an IUCV connection. + * User action: + * Report this problem to your support organization. + */ + +/*? Text: "iucv_external_interrupt: out of memory\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/lcs +++ linux-riscv-5.4.0/Documentation/kmsg/s390/lcs @@ -0,0 +1,169 @@ +/*? + * Text: "%s: Allocating a socket buffer to interface %s failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the LCS device + * @2: network interface + * Description: + * LAN channel station (LCS) devices require a socket buffer (SKB) structure + * for storing incoming data. The LCS device driver failed to allocate an SKB + * structure to the LCS device. A likely cause of this problem is memory + * constraints. + * User action: + * Free some memory and repeat the failed operation. + */ + +/*? + * Text: "%s: Shutting down the LCS device failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the LCS device + * Description: + * A request to shut down a LAN channel station (LCS) device resulted in an + * error. The error is logged in the LCS trace at trace level 4. + * User action: + * Try again to shut down the device. If the error persists, see the LCS trace + * to find out what causes the error. + */ + +/*? + * Text: "%s: Detecting a network adapter for LCS devices failed with rc=%d (0x%x)\n" + * Severity: Error + * Parameter: + * @1: bus ID of the LCS device + * @2: lcs_detect return code in decimal notation + * @3: lcs_detect return code in hexadecimal notation + * Description: + * The LCS device driver could not initialize a network adapter. + * User action: + * Ensure that the physical connection from the port to the network is + * in place. If the error persists, note the return code from the error + * message and contact IBM support. + */ + +/*? + * Text: "%s: A recovery process has been started for the LCS device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the LCS device + * Description: + * The LAN channel station (LCS) device is shut down and restarted. The recovery + * process might have been initiated by a user or started automatically as a + * response to a device problem. + * User action: + * Wait until a message indicates the completion of the recovery process. + */ + +/*? + * Text: "%s: An I/O-error occurred on the LCS device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the LCS device + * Description: + * The LAN channel station (LCS) device reported a problem that can be recovered + * by the LCS device driver. Repeated occurrences of this problem indicate a + * malfunctioning device. + * User action: + * If this problem occurs frequently, initiate a recovery process for the + * device, for example, by writing '1' to the 'recover' sysfs attribute of the + * device. + */ + +/*? + * Text: "%s: A command timed out on the LCS device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the LCS device + * Description: + * The LAN channel station (LCS) device reported a problem that can be recovered + * by the LCS device driver. Repeated occurrences of this problem indicate a + * malfunctioning device. + * User action: + * If this problem occurs frequently, initiate a recovery process for the + * device, for example, by writing '1' to the 'recover' sysfs attribute of the + * device. + */ + +/*? + * Text: "%s: An error occurred on the LCS device, rc=%ld\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the LCS device + * @2: return code + * Description: + * The LAN channel station (LCS) device reported a problem that can be recovered + * by the LCS device driver. Repeated occurrences of this problem indicate a + * malfunctioning device. + * User action: + * If this problem occurs frequently, initiate a recovery process for the + * device, for example, by writing '1' to the 'recover' sysfs attribute of the + * device. + */ + +/*? + * Text: "%s: The LCS device stopped because of an error, dstat=0x%X, cstat=0x%X \n" + * Severity: Warning + * Parameter: + * @1: bus ID of the LCS device + * @2: device status + * @3: subchannel status + * Description: + * The LAN channel station (LCS) device reported an error. The LCS device driver + * might start a device recovery process. + * User action: + * If the device driver does not start a recovery process, initiate a recovery + * process, for example, by writing '1' to the 'recover' sysfs attribute of the + * device. If the problem persists, note the status information provided with + * the message and contact IBM support. + */ + +/*? + * Text: "%s: Starting an LCS device resulted in an error, rc=%d!\n" + * Severity: Error + * Parameter: + * @1: bus ID of the LCS device + * @2: ccw_device_start return code in decimal notation + * Description: + * The LAN channel station (LCS) device driver failed to initialize an LCS + * device. The device is not operational. + * User action: + * Initiate a recovery process, for example, by writing '1' to the 'recover' + * sysfs attribute of the device. If the problem persists, contact IBM support. + */ + +/*? + * Text: "%s: Sending data from the LCS device to the LAN failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: bus ID of the LCS device + * @2: ccw_device_resume return code in decimal notation + * Description: + * The LAN channel station (LCS) device driver could not send data to the LAN + * using the LCS device. This might be a temporary problem. Operations continue + * on the LCS device. + * User action: + * If this problem occurs frequently, initiate a recovery process, for example, + * by writing '1' to the 'recover' sysfs attribute of the device. If the + * problem persists, contact IBM support. + */ + +/*? Text: "Query IPAssist failed. Assuming unsupported!\n" */ +/*? Text: "Stoplan for %s initiated by LGW\n" */ +/*? Text: "Not enough memory to add new multicast entry!\n" */ +/*? Text: "Not enough memory for debug facility.\n" */ +/*? Text: "Adding multicast address failed. Table possibly full!\n" */ +/*? Text: "Error in opening device!\n" */ +/*? Text: "LCS device %s %s IPv6 support\n" */ +/*? Text: "Device %s successfully recovered!\n" */ +/*? Text: "LCS device %s %s Multicast support\n" */ +/*? Text: " Initialization failed\n" */ +/*? Text: "Loading %s\n" */ +/*? Text: "Initialization failed\n" */ +/*? Text: "Terminating lcs module.\n" */ +/*? Text: "Device %s could not be recovered!\n" */ +/*? Text: "Initializing the lcs device driver failed\n" */ +/*? Text: "%s: The lcs device driver failed to recover the device\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/monreader +++ linux-riscv-5.4.0/Documentation/kmsg/s390/monreader @@ -0,0 +1,128 @@ +/*? + * Text: "Reading monitor data failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * The z/VM *MONITOR record device driver failed to read monitor data + * because the IUCV REPLY function failed. The read function against + * the monitor record device returns EIO. All monitor data that has been read + * since the last read with 0 size is incorrect. + * User action: + * Disregard all monitor data that has been read since the last read with + * 0 size. If the device driver has been compiled as a separate module, unload + * and reload the monreader module. If the device driver has been compiled + * into the kernel, reboot Linux. For more information about possible causes + * of the error see the IUCV section in "z/VM CP Programming Services" and + * the *MONITOR section in "z/VM Performance". + */ + +/*? + * Text: "z/VM *MONITOR system service disconnected with rc=%i\n" + * Severity: Error + * Parameter: + * @1: IPUSER SEVER return code + * Description: + * The z/VM *MONITOR record device driver receives monitor records through + * an IUCV connection to the z/VM *MONITOR system service. This connection + * has been severed and the read function of the z/VM *MONITOR device driver + * returns EIO. All data received since the last read with 0 size is incorrect. + * User action: + * Disregard all monitor data read since the last read with 0 size. Close and + * reopen the monitor record device. For information about the IPUSER SEVER + * return codes see "z/VM Performance". + */ + +/*? + * Text: "The read queue for monitor data is full\n" + * Severity: Warning + * Description: + * The read function of the z/VM *MONITOR device driver returns EOVERFLOW + * because not enough monitor data has been read since the monitor device + * has been opened. Monitor data already read are valid and subsequent reads + * return valid data but some intermediate data might be missing. + * User action: + * Be aware that monitor data might be missing. Assure that you regularly + * read monitor data after opening the monitor record device. + */ + +/*? + * Text: "Connecting to the z/VM *MONITOR system service failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: IUCV CONNECT return code + * Description: + * The z/VM *MONITOR record device driver receives monitor records through + * an IUCV connection to the z/VM *MONITOR system service. This connection + * could not be established when the monitor record device was opened. If + * the return code is 15, your z/VM guest virtual machine is not authorized + * to connect to the *MONITOR system service. + * User action: + * If the return code is 15, ensure that the IUCV *MONITOR statement is + * included in the z/VM directory entry for your z/VM guest virtual machine. + * For other IUCV CONNECT return codes see the IUCV section in "CP Programming + * Services" and the *MONITOR section in "z/VM Performance". + */ + +/*? + * Text: "Disconnecting the z/VM *MONITOR system service failed with rc=%i\n" + * Severity: Warning + * Parameter: + * @1: IUCV SEVER return code + * Description: + * The z/VM *MONITOR record device driver receives monitor data through an + * IUCV connection to the z/VM *MONITOR system service. This connection + * could not be closed when the monitor record device was closed. You might + * not be able to resume monitoring. + * User action: + * No immediate action is necessary. If you cannot open the monitor record + * device in the future, reboot Linux. For information about the IUCV SEVER + * return codes see the IUCV section in "CP Programming Services" and the + * *MONITOR section in "z/VM Performance". + */ + +/*? + * Text: "The z/VM *MONITOR record device driver cannot be loaded without z/VM\n" + * Severity: Error + * Description: + * The z/VM *MONITOR record device driver uses z/VM system services to provide + * monitor data about z/VM guest operating systems to applications on Linux. + * On Linux instances that run in environments other than the z/VM hypervisor, + * the z/VM *MONITOR record device driver does not provide any useful + * function and the corresponding monreader module cannot be loaded. + * User action: + * Load the z/VM *MONITOR record device driver only on Linux instances that run + * as guest operating systems of the z/VM hypervisor. If the z/VM *MONITOR + * record device driver has been compiled into the kernel, ignore this message. + */ + +/*? + * Text: "The z/VM *MONITOR record device driver failed to register with IUCV\n" + * Severity: Error + * Description: + * The z/VM *MONITOR record device driver receives monitor data through an IUCV + * connection and needs to register with the IUCV device driver. This + * registration failed and the z/VM *MONITOR record device driver was not + * loaded. A possible cause of this problem is insufficient memory. + * User action: + * Free some memory and try again to load the module. If the z/VM *MONITOR + * record device driver has been compiled into the kernel, you might have to + * configure more memory and reboot Linux. If you do not want to read monitor + * data, ignore this message. + */ + +/*? + * Text: "The specified *MONITOR DCSS %s does not have the required type SC\n" + * Severity: Error + * Parameter: + * @1: DCSS name + * Description: + * The DCSS that was specified with the monreader.mondcss kernel parameter or + * with the mondcss module parameter cannot be a *MONITOR DCSS because it is + * not of type SC. + * User action: + * Confirm that you are using the name of the DCSS that has been configured as + * the *MONITOR DCSS on the z/VM hypervisor. If the default name, MONDCSS, is + * used, omit the monreader.mondcss or mondcss parameter. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/monwriter +++ linux-riscv-5.4.0/Documentation/kmsg/s390/monwriter @@ -0,0 +1,17 @@ +/*? + * Text: "Writing monitor data failed with rc=%i\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * The monitor stream application device driver used the z/VM diagnose call + * DIAG X'DC' to start writing monitor data. z/VM returned an error and the + * monitor data cannot be written. If the return code is 5, your z/VM guest + * virtual machine is not authorized to write monitor data. + * User action: + * If the return code is 5, ensure that your z/VM guest virtual machine's + * entry in the z/VM directory includes the OPTION APPLMON statement. + * For other return codes see the section about DIAGNOSE Code X'DC' + * in "z/VM CP Programming Services". + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/netiucv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/netiucv @@ -0,0 +1,156 @@ +/*? + * Text: "%s: The peer interface of the IUCV device has closed the connection\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the IUCV device + * Description: + * The peer interface on the remote z/VM guest virtual machine has closed the + * connection. Do not expect further packets on this interface. Any packets + * you send to this interface will be dropped. + * User action: + * None. + */ + +/*? + * Text: "%s: The IUCV device failed to connect to z/VM guest %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the IUCV device + * @2: z/VM user ID + * Description: + * The connection cannot be established because the z/VM guest virtual + * machine with the peer interface is not running. + * User action: + * Ensure that the z/VM guest virtual machine with the peer interface is + * running; then try again to establish the connection. + */ + +/*? + * Text: "%s: The IUCV device failed to connect to the peer on z/VM guest %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the IUCV device + * @2: z/VM user ID + * Description: + * The connection cannot be established because the z/VM guest virtual machine + * with the peer interface is not configured for IUCV connections. + * User action: + * Configure the z/VM guest virtual machine with the peer interface for IUCV + * connections; then try again to establish the connection. + */ + +/*? + * Text: "%s: Connecting the IUCV device would exceed the maximum number of IUCV connections\n" + * Severity: Error + * Parameter: + * @1: bus ID of the IUCV device + * Description: + * The connection cannot be established because the maximum number of IUCV + * connections has been reached on the local z/VM guest virtual machine. + * User action: + * Close some of the established IUCV connections on the local z/VM guest + * virtual machine; then try again to establish the connection. + */ + +/*? + * Text: "%s: z/VM guest %s has too many IUCV connections to connect with the IUCV device\n" + * Severity: Error + * Parameter: + * @1: bus ID of the IUCV device + * @2: remote z/VM user ID + * Description: + * Connecting to the remote z/VM guest virtual machine failed because the + * maximum number of IUCV connections for the remote z/VM guest virtual + * machine has been reached. + * User action: + * Close some of the established IUCV connections on the remote z/VM guest + * virtual machine; then try again to establish the connection. + */ + +/*? + * Text: "%s: The IUCV device cannot connect to a z/VM guest with no IUCV authorization\n" + * Severity: Error + * Parameter: + * @1: bus ID of the IUCV device + * Description: + * Because the remote z/VM guest virtual machine is not authorized for IUCV + * connections, the connection cannot be established. + * User action: + * Add the statements 'IUCV ALLOW' and 'IUCV ANY' to the z/VM directory + * entry of the remote z/VM guest virtual machine; then try again to + * establish the connection. See "z/VM CP Planning and Administration" + * for details about the IUCV statements. + */ + +/*? + * Text: "%s: Connecting the IUCV device failed with error %d\n" + * Severity: Error + * Parameter: + * @1: bus ID of the IUCV device + * @2: error code + * Description: + * The connection cannot be established because of an IUCV CONNECT error. + * User action: + * Report this problem to your support organization. + */ + +/*? + * Text: "%s: The IUCV device has been connected successfully to %s\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the IUCV device + * @2: remote z/VM user ID + * Description: + * The connection has been established and the interface is ready to + * transmit communication packages. + * User action: + * None. + */ + +/*? + * Text: "%s: The IUCV interface to %s has been established successfully\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the IUCV device + * @2: remote z/VM user ID + * Description: + * The IUCV interface to the remote z/VM guest virtual machine has been + * established and can be activated with "ifconfig up" or an equivalent + * command. + * User action: + * None. + */ + +/*? + * Text: "%s: The IUCV device is connected to %s and cannot be removed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the IUCV device + * @2: remote z/VM user ID + * Description: + * Removing a connection failed because the interface is active with a peer + * interface on a remote z/VM guest virtual machine. + * User action: + * Deactivate the interface with "ifconfig down" or an equivalent command; + * then try again to remove the interface. + */ + +/*? + * Text: "%s: The peer z/VM guest %s has closed the connection\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the IUCV device + * @2: remote z/VM user ID + * Description: + * The peer interface is no longer available. + * User action: + * Either deactivate and remove the interface, or wait for the peer + * z/VM guest to re-establish the interface. + */ + +/*? Text: "driver unloaded\n" */ +/*? Text: "driver initialized\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/numa +++ linux-riscv-5.4.0/Documentation/kmsg/s390/numa @@ -0,0 +1,11 @@ +/*? + * Text: "NUMA mode: %s\n" + * Severity: Informational + * Parameter: + * @1: mode + * Description: + * Linux started with the specified NUMA mode. + * User action: + * None. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/numa_emu +++ linux-riscv-5.4.0/Documentation/kmsg/s390/numa_emu @@ -0,0 +1,50 @@ +/*? + * Text: "Not enough memory for %d nodes, reducing node count\n" + * Severity: Warning + * Parameter: + * @1: requested number of nodes + * Description: + * Using the requested memory stripe size for emulating the requested number of + * NUMA nodes requires more than the available memory. The number of nodes is + * specified with the emu_nodes= kernel parameter. The memory stripe size to + * be used for distributing the available memory among the nodes is specified + * with the emu_size= kernel parameter. Fewer nodes were created than the + * requested number; each node has one memory stripe of the requested size. + * User action: + * Specify fewer nodes, reduce the memory stripe size, or make more memory + * available to your Linux instance. + */ + +/*? + * Text: "Creating %d nodes with memory stripe size %ld MB\n" + * Severity: Informational + * Parameter: + * @1: number of nodes + * @2: stripe size + * Description: + * NUMA emulation is activated with the reported number of NUMA nodes. + * The specified memory stripe size is used to distribute, in round-robin + * fashion, the available memory among the nodes. + * User action: + * None. + */ + +/*? + * Text: "Increasing memory stripe size from %ld MB to %ld MB\n" + * Severity: Warning + * Parameter: + * @1: requested memory stripe size + * @2: adjusted memory stripe size + * Description: + * NUMA emulation could not use the requested memory stripe size and + * therefore has increased it to the next possible value. + * The requested memory stripe size is a default value or it was specified + * with the emu_size= kernel parameter. + * The memory stripe size must be a multiple of the memory block size that + * can be read in hexadecimal notation from + * /sys/devices/system/memory/block_size_bytes. + * User action: + * To avoid this message in the future, specify a valid memory stripe size + * with the emu_size= kernel parameter. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/os_info +++ linux-riscv-5.4.0/Documentation/kmsg/s390/os_info @@ -0,0 +1,36 @@ +/*? + * Text: "entry %i: %s (addr=0x%lx size=%lu)\n" + * Severity: Informational + * Parameter: + * @1: entry ID + * @2: entry state + * @3: entry address + * @4: entry size + * Description: + * Linux is running in kdump mode and reports information defined by the + * previously running production kernel. Possible values for + * "entry state" are: + * + * - copied: The entry has been found, verified, and copied + * + * - not available: The entry has not been defined + * + * - checksum failed: The entry has been found, but it is not valid + * User action: + * If kdump fails, contact your service organization and include this message + * in the error report. + */ + +/*? + * Text: "crashkernel: addr=0x%lx size=%lu\n" + * Severity: Informational + * Parameter: + * @1: address + * @2: size + * Description: + * Linux is running in kdump mode and reports the address and size of + * the memory area that was reserved for kdump by the previously running + * production kernel. + * User action: + * None. + */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/perf +++ linux-riscv-5.4.0/Documentation/kmsg/s390/perf @@ -0,0 +1,90 @@ +/*? + * Text: "CPU[%i] CPUM_CF: ver=%u.%u A=%04x E=%04x C=%04x\n" + * Severity: Informational + * Parameter: + * @1: cpu number + * @2: first version number + * @3: second version number + * @4: counter set authorization + * @5: counter set enable controls + * @6: counter set activation controls + * Description: + * This message displays information about the CPU-measurement counter facility + * (CPUM_CF) on a particular CPU. For details, see + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * None. + */ + +/*? + * Text: "CPU[%i] CPUM_SF: basic=%i diag=%i min=%lu max=%lu cpu_speed=%u\n" + * Severity: Informational + * Parameter: + * @1: cpu number + * @2: authorization status for the basic-sampling function + * @3: authorization status for the diagnostic-sampling function + * @4: minimum sampling interval + * @5: maximum sampling interval + * @6: cpu speed + * Description: + * This message displays generic information about the CPU-measurement sampling + * facility (CPUM_SF) on a particular CPU. For details, see + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * None. + */ + +/*? + * Text: "CPU[%i] CPUM_SF: Basic-sampling: a=%i e=%i c=%i bsdes=%i tear=%016lx dear=%016lx\n" + * Severity: Informational + * Parameter: + * @1: cpu number + * @2: authorization control + * @3: enable control + * @4: activation control + * @5: basic-sampling-data-entry size + * @6: tear register contents + * @7: dear register contents + * Description: + * This message displays information about the basic-sampling function of the + * CPU-measurement sampling facility (CPUM_SF) on a particular CPU. + * For details, see + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * None. + */ + +/*? + * Text: "CPU[%i] CPUM_SF: Diagnostic-sampling: a=%i e=%i c=%i dsdes=%i tear=%016lx dear=%016lx\n" + * Severity: Informational + * Parameter: + * @1: cpu number + * @2: authorization control + * @3: enable control + * @4: activation control + * @5: diagnostic-sampling-data-entry size + * @6: tear register contents + * @7: dear register contents + * Description: + * This message displays information about the diagnostic-sampling function of the + * CPU-measurement sampling facility (CPUM_SF) on a particular CPU. + * For details, see + * "The Load-Program-Parameter and the CPU-Measurement Facilities", SA23-2260. + * User action: + * None. + */ + +/*? + * Text: "The sampling facility is already reserved by %p\n" + * Severity: Warning + * Parameter: + * @1: address of perf sampling support owner + * Description: + * A process tried to reserve the sampling facility support, but it was already + * reserved by another process. + * User action: + * Check whether another process, for example, the perf program or OProfile is + * currently active. Retry activating the sampling facility after the other + * process has ended. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/prng +++ linux-riscv-5.4.0/Documentation/kmsg/s390/prng @@ -0,0 +1,103 @@ +/* prng */ + +/*? + * Text: "prng runs in TDES mode with chunksize=%d and reseed_limit=%u\n" + * Severity: Informational + * Parameter: + * @1: read chunk size in bytes + * @2: reseed limit + * Description: + * The pseudo-random number device driver started in triple DES mode. + * For IBM mainframes earlier than IBM zEnterprise EC12 (zEC12), + * triple DES is the only available mode. + * As of zEC12, the preferred mode is SHA-512. + * User action: + * If triple DES is the expected mode, no action is required. + * Otherwise, verify that the prng started with the mode= module or + * prng.mode= kernel parameter set to a value other than 1. + * The value 1 forces triple DES mode. Also ensure that the mainframe + * runs with the latest firmware level. + */ + +/*? + * Text: "The prng module stopped after running in triple DES mode\n" + * Severity: Informational + * Description: + * The pseudo-random number device driver was running in triple DES mode. + * The device driver module, prng, was unloaded, or it stopped + * because Linux shut down. + * User action: + * None. + */ + +/*? + * Text: "The prng module cannot start in SHA-512 mode\n" + * Severity: Error + * Description: + * The pseudo-random number device driver was loaded with the mode= module parameter + * or the prng.mode= kernel parameter set to 2. This setting forces SHA-512 mode, + * but the required support for MSA 5 is not available. This support requires an IBM + * zEnterprise EC12 (zEC12) or later mainframe. + * User action: + * If your mainframe is earlier than zEC12, set the mode= module or + * prng.mode= kernel parameter to 0 or 1 to run the + * pseudo-random number device driver in triple DES mode. + * Otherwise, ensure that MSA 5 support available. + */ + +/*? + * Text: "prng runs in SHA-512 mode with chunksize=%d and reseed_limit=%u\n" + * Severity: Informational + * Parameter: + * @1: read chunk size in bytes + * @2: reseed limit + * Description: + * The pseudo-random number device driver started in SHA-512 mode. + * As of IBM zEnterprise EC12, this is the preferred mode. + * User action: + * None. + */ + +/*? + * Text: "The prng module stopped after running in SHA-512 mode\n" + * Severity: Informational + * Description: + * The pseudo-random number device driver was running in SHA-512 mode. + * The device driver module, prng, was unloaded, or stopped + * because Linux shut down. + * User action: + * None. + */ + +/*? + * Text: "The prng self test state test for the SHA-512 mode failed\n" + * Severity: Error + * Description: + * The pseudo-random number device driver is not operational because the self test failed. + * After processing a published National Institute of Standards and Technology (NIST) test vector for the + * Deterministic Random Bit Generator (DRBG) algorithm, the device driver + * was not in the expected working state. This failure might indicate + * that the cryptographic software or hardware is not working correctly. + * The processed NIST test vector was: Hash Drbg, Sha-512, Count #0. + * User action: + * Unload and reload the prng module, or + * if prng was compiled into the kernel, restart Linux. + * If the error persists, contact your support organization. + */ + +/*? + * Text: "The prng self test data test for the SHA-512 mode failed\n" + * Severity: Error + * Description: + * The pseudo-random number device driver is not operational because the self test failed. + * After processing a published National Institute of Standards and Technology (NIST) test vector for the + * Deterministic Random Bit Generator (DRBG) algorithm, the device driver + * did not produce the expected pseudo-random data. This failure might indicate + * that the cryptographic software or hardware is not working correctly. + * The processed NIST test vector was: Hash Drbg, Sha-512, Count #0. + * User action: + * Unload and reload the prng module, or + * if prng was compiled into the kernel, restart Linux. + * If the error persists, contact your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/qeth +++ linux-riscv-5.4.0/Documentation/kmsg/s390/qeth @@ -0,0 +1,929 @@ +/*? + * Text: "%s: The LAN is offline\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A start LAN command was sent by the qeth device driver but the physical or + * virtual adapter has not started the LAN. The LAN might take a few seconds + * to become available. + * User action: + * Check the status of the qeth device, for example, with the lsqeth command. + * If the device does not become operational within a few seconds, initiate a + * recovery process, for example, by writing '1' to the 'recover' sysfs + * attribute of the device. + */ + +/*? + * Text: "%s: A recovery process has been started for the device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A recovery process was started either by the qeth device driver or through + * a user command. + * User action: + * Wait until a message indicates the completion of the recovery process. + */ + +/*? + * Text: "%s: The qeth device driver failed to recover an error on the device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth device driver performed an automatic recovery operation to recover + * an error on a qeth device. The recovery operation failed. + * User action: + * Try the following actions in the given order: i) Check the status of the + * qeth device, for example, with the lsqeth command. ii) Initiate a recovery + * process by writing '1' to the 'recover' sysfs attribute of the device. + * iii) Ungroup and regroup the subchannel triplet of the device. vi) Reboot + * Linux. v) If the problem persists, gather Linux debug data and report the + * problem to your support organization. + */ + +/*? + * Text: "%s: Device recovery failed to restore all offload features\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth device driver performed a recovery operation on a qeth device. Part + * of the recovery is to restore the offload features that were enabled before + * the recovery. At least one of those offload features could not be restored. + * User action: + * Check which offload features are enabled on the device, for example with + * the "ethtool -k" command. Try to explicitly re-enable the missing offload + * features for the device, for example with the "ethtool -K" command. + */ + +/*? + * Text: "%s: The link for interface %s on CHPID 0x%X failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * @3: CHPID + * Description: + * A network link failed. A possible reason for this error is that a physical + * network cable has been disconnected. + * User action: + * Ensure that the network cable on the adapter hardware is connected properly. + * If the connection is to a guest LAN, ensure that the device is still coupled + * to the guest LAN. + */ + +/*? + * Text: "%s: The link for %s on CHPID 0x%X has been restored\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * @3: CHPID + * Description: + * A failed network link has been re-established. A device recovery is in + * progress. + * User action: + * Wait until a message indicates the completion of the recovery process. + */ + +/*? + * Text: "%s: A hardware operation timed out on the device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A hardware operation timed out on the qeth device. + * User action: + * Check the status of the qeth device, for example, with the lsqeth command. + * If the device is not operational, initiate a recovery process, for example, + * by writing '1' to the 'recover' sysfs attribute of the device. + */ + +/*? + * Text: "%s: The adapter hardware is of an unknown type\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth device driver does not recognize the adapter hardware. The cause + * of this problem could be a hardware error or a Linux level that does not + * support your adapter hardware. + * User action: + * i) Investigate if your adapter hardware is supported by your Linux level. + * Consider using hardware that is supported by your Linux level or upgrading + * to a Linux level that supports your hardware. ii) Install the latest + * firmware on your adapter hardware. iii) If the problem persists and is not + * caused by a version mismatch, contact IBM support. + */ + +/*? + * Text: "%s: The adapter is used exclusively by another host\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth adapter is exclusively used by another host. + * User action: + * Use another qeth adapter or configure this one not exclusively to a + * particular host. + */ + +/*? + * Text: "%s: QDIO reported an error, rc=%i\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: return code + * Description: + * The QDIO subsystem reported an error. + * User action: + * Check for related QDIO errors. Check the status of the qeth device, for + * example, with the lsqeth command. If the device is not operational, initiate + * a recovery process, for example, by writing '1' to the 'recover' sysfs + * attribute of the device. + */ + +/*? + * Text: "%s: There is no kernel module to support discipline %d\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * @2: discipline + * Description: + * The qeth device driver or a user command requested a kernel module for a + * particular qeth discipline. Either the discipline is not supported by the + * qeth device driver or the requested module is not available to your Linux + * system. + * User action: + * Check if the requested discipline module has been compiled into the kernel + * or is present in /lib/modules//kernel/drivers/s390/net. + */ + +/*? + * Text: "Initializing the qeth device driver failed\n" + * Severity: Error + * Parameter: + * Description: + * The base module of the qeth device driver could not be initialized. + * User action: + * See errno.h to determine the reason for the error. + * i) Reboot Linux. ii) If the problem persists, gather Linux debug data and + * report the problem to your support organization. + */ + +/*? + * Text: "%s: Registering IP address %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: IP address + * Description: + * An IP address could not be registered with the network adapter. + * User action: + * Check if another operating system instance has already registered the + * IP address with the same network adapter or at the same logical IP subnet. + */ + +/*? + * Text: "%s: Reading the adapter MAC address failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth device driver could not read the MAC address from the network + * adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Starting ARP processing support for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not start ARP support on the network adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Starting IP fragmentation support for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not start IP fragmentation support on the + * network adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Starting VLAN support for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not start VLAN support on the network adapter. + * User action: + * None if you do not require VLAN support. If you need VLAN support, + * ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Starting multicast support for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not start multicast support on the network + * adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Activating IPv6 support for %s failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not activate IPv6 support on the network + * adapter. + * User action: + * None if you do not require IPv6 communication. If you need IPv6 support, + * ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Enabling the passthrough mode for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not enable the passthrough mode on the + * network adapter. The passthrough mode is required for all network traffic + * other than IPv4. In particular, the passthrough mode is required for IPv6 + * traffic. + * User action: + * None if all you want to support is IPv4 communication. If you want to support + * IPv6 or other network traffic apart from IPv4, ungroup and regroup the + * subchannel triplet of the device. If this does not resolve the problem, + * reboot Linux. If the problem persists, gather Linux debug data and report + * the problem to your support organization. + */ + +/*? + * Text: "%s: Enabling broadcast filtering for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not enable broadcast filtering on the network + * adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Setting up broadcast filtering for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not set up broadcast filtering on the network + * adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Setting up broadcast echo filtering for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not set up broadcast echo filtering on the + * network adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: Starting HW checksumming for %s failed, using SW checksumming\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The network adapter supports hardware checksumming for IP packages + * but the qeth device driver could not start hardware checksumming on the + * adapter. The qeth device driver continues to use software checksumming for + * IP packages. + * User action: + * None if you do not require hardware checksumming for network + * traffic. If you want to enable hardware checksumming, ungroup and regroup + * the subchannel triplet of the device. If this does not resolve the problem, + * reboot Linux. If the problem persists, gather Linux debug data and report + * the problem to your support organization. + */ + +/*? + * Text: "%s: Enabling HW checksumming for %s failed, using SW checksumming\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The network adapter supports hardware checksumming for IP packages + * but the qeth device driver could not enable hardware checksumming on the + * adapter. The qeth device driver continues to use software checksumming for + * IP packages. + * User action: + * None if you do not require hardware checksumming for network + * traffic. If you want to enable hardware checksumming, ungroup and regroup + * the subchannel triplet of the device. If this does not resolve the problem, + * reboot Linux. If the problem persists, gather Linux debug data and report + * the problem to your support organization. + */ + +/*? + * Text: "%s: Starting outbound TCP segmentation offload for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The network adapter supports TCP segmentation offload, but the qeth device + * driver could not start this support on the adapter. + * User action: + * None if you do not require TCP segmentation offload. If you want to + * enable TCP segmentation offload, ungroup and regroup the subchannel triplet + * of the device. If this does not resolve the problem, reboot Linux. If the + * problem persists, gather Linux debug data and report the problem to your + * support organization. + */ + +/*? + * Text: "%s: The network adapter failed to generate a unique ID\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * In IBM mainframe environments, network interfaces are not identified by + * a specific MAC address. Therefore, the network adapters provide the network + * interfaces with unique IDs to be used in their IPv6 link local addresses. + * Without such a unique ID, duplicate addresses might be assigned in other + * LPARs. + * User action: + * Install the latest firmware on the adapter hardware. Manually, configure + * an IPv6 link local address for this device. + */ + +/*? + * Text: "There is no IPv6 support for the layer 3 discipline\n" + * Severity: Warning + * Description: + * If you want to use IPv6 with the layer 3 discipline, you need a Linux kernel + * with IPv6 support. Because your Linux kernel has not been compiled with + * IPv6 support, you cannot use IPv6 with the layer 3 discipline, even if your + * adapter supports IPv6. + * User action: + * Use a Linux kernel that has been complied to include IPv6 support if you + * want to use IPv6 with layer 3 qeth devices. + */ + +/*? + * Text: "%s: The qeth device is not configured for the OSI layer required by z/VM\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A qeth device that connects to a virtual network on z/VM must be configured for the + * same Open Systems Interconnection (OSI) layer as the virtual network. An ETHERNET + * guest LAN or VSWITCH uses the data link layer (layer 2) while an IP guest LAN + * or VSWITCH uses the network layer (layer 3). + * User action: + * If you are connecting to an ETHERNET guest LAN or VSWITCH, set the layer2 sysfs + * attribute of the qeth device to 1. If you are connecting to an IP guest LAN or + * VSWITCH, set the layer2 sysfs attribute of the qeth device to 0. + */ + +/*? + * Text: "%s: Starting source MAC-address support for %s failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The qeth device driver could not enable source MAC-address on the network + * adapter. + * User action: + * Ungroup and regroup the subchannel triplet of the device. If this does not + * resolve the problem, reboot Linux. If the problem persists, gather Linux + * debug data and report the problem to your support organization. + */ + +/*? + * Text: "%s: MAC address %pM already exists\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: MAC-address + * Description: + * Setting the MAC address for the qeth device fails, because this + * MAC address is already defined on the OSA CHPID. + * User action: + * Use a different MAC address for this qeth device. + */ + +/*? + * Text: "%s: MAC address %pM is not authorized\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: MAC-address + * Description: + * This qeth device is a virtual network interface card (NIC), to which z/VM + * has already assigned a MAC address. z/VM MAC address verification does + * not allow you to change this predefined address. + * User action: + * None; use the MAC address that has been assigned by z/VM. + */ + +/*? + * Text: "%s: The HiperSockets network traffic analyzer is activated\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The sysfs 'sniffer' attribute of the HiperSockets device has the value '1'. + * The corresponding HiperSockets interface has been switched into promiscuous mode. + * As a result, the HiperSockets network traffic analyzer is started on the device. + * User action: + * None. + */ + + /*? + * Text: "%s: The HiperSockets network traffic analyzer is deactivated\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The sysfs 'sniffer' attribute of the HiperSockets device has the value '1'. + * Promiscuous mode has been switched off for the corresponding HiperSockets interface + * As a result, the HiperSockets network traffic analyzer is stopped on the device. + * User action: + * None. + */ + +/*? + * Text: "%s: The device is not authorized to run as a HiperSockets network traffic analyzer\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The sysfs 'sniffer' attribute of the HiperSockets device has the value '1'. + * The corresponding HiperSockets interface is switched into promiscuous mode + * but the network traffic analyzer (NTA) rules configured at the Support Element (SE) + * do not allow tracing. Possible reasons are: + * - Tracing is not authorized for all HiperSockets LANs in the mainframe system + * - Tracing is not authorized for this HiperSockets LAN + * - LPAR is not authorized to enable an NTA + * User action: + * Configure appropriate HiperSockets NTA rules at the SE. + */ + +/*? + * Text: "%s: A HiperSockets network traffic analyzer is already active in the HiperSockets LAN\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The sysfs 'sniffer' attribute of the HiperSockets device has the value '1'. + * The HiperSockets interface is switched into promiscuous mode but another + * HiperSockets device on the same HiperSockets LAN is already running as + * a network traffic analyzer. + * A HiperSockets LAN can only have one active network traffic analyzer. + * User action: + * Do not configure multiple HiperSockets devices in the same HiperSockets LAN as + * tracing devices. + */ + +/*? + * Text: "%s: Enabling HW TX checksumming for %s failed, using SW TX checksumming\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * Description: + * The network adapter supports hardware checksumming for outgoing IP packages + * but the qeth device driver could not enable hardware TX checksumming on the + * adapter. The qeth device driver continues to use software checksumming for + * outgoing IP packages. + * User action: + * None if you do not require hardware checksumming for outgoing network + * traffic. If you want to enable hardware checksumming, ungroup and regroup + * the subchannel triplet of the device. If this does not resolve the problem, + * reboot Linux. If the problem persists, gather Linux debug data and report + * the problem to your support organization. + */ + +/*? + * Text: "%s: A connection could not be established because of an OLM limit\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * z/OS has activated Optimized Latency Mode (OLM) for a connection through an OSA Express3 adapter. + * This reduces the maximum number of concurrent connections per physical port for shared adapters. + * The new connection would exceed the maximum. Linux cannot establish further connections using + * this adapter. + * User action: + * If possible, deactivate an existing connection that uses this adapter and try again to establish + * the new connection. If you cannot free an existing connection, use a different adapter for the + * new connection. + */ + +/*? + * Text: "%s: Setting the device online failed because of insufficient authorization\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The qeth device is configured with OSX CHPIDs. An OSX CHPID cannot be activated unless the LPAR is explicitly authorized to access it. + * For z/VM guest operating systems, the z/VM user ID must be explicitly authorized in addition to the LPAR. + * You grant these authorizations through the Service Element. + * User action: + * At the Service Element, authorize the LPAR and, if applicable, the z/VM user ID for using the OSX CHPIDs with which the qeth device has been configured. + * Then try again to set the device online. + */ + +/*? + * Text: "%s: portname is deprecated and is ignored\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * An OSA-Express port name was required to identify a shared OSA port. + * All operating system instances that shared the port had to use the same port name. + * This requirement no longer applies, and the specified portname attribute is ignored. + * User action: + * For future upgrades, remove OSA port name specifications from your + * network configuration. + */ + +/*? Text: "core functions removed\n" */ +/*? Text: "%s: Device is a%s card%s%s%s\nwith link type %s.\n" */ +/*? Text: "%s: issue_next_read failed: no iob available!\n" */ +/*? Text: "%s: Priority Queueing not supported\n" */ +/*? Text: "%s: sense data available. cstat 0x%X dstat 0x%X\n" */ +/*? Text: "loading core functions\n" */ +/*? Text: "%s: MAC address %pM successfully registered on device %s\n" */ +/*? Text: "%s: Device successfully recovered!\n" */ +/*? Text: "register layer 2 discipline\n" */ +/*? Text: "unregister layer 2 discipline\n" */ +/*? Text: "%s: Hardware IP fragmentation not supported on %s\n" */ +/*? Text: "%s: IPv6 not supported on %s\n" */ +/*? Text: "%s: VLAN not supported on %s\n" */ +/*? Text: "%s: Inbound source MAC-address not supported on %s\n" */ +/*? Text: "%s: IPV6 enabled\n" */ +/*? Text: "%s: ARP processing not supported on %s!\n" */ +/*? Text: "%s: Hardware IP fragmentation enabled \n" */ +/*? Text: "%s: set adapter parameters not supported.\n" */ +/*? Text: "%s: VLAN enabled\n" */ +/*? Text: "register layer 3 discipline\n" */ +/*? Text: "%s: Outbound TSO enabled\n" */ +/*? Text: "%s: Broadcast not supported on %s\n" */ +/*? Text: "%s: Outbound TSO not supported on %s\n" */ +/*? Text: "%s: Inbound HW Checksumming not supported on %s,\ncontinuing using Inbound SW Checksumming\n" */ +/*? Text: "%s: Using no checksumming on %s.\n" */ +/*? Text: "%s: Broadcast enabled\n" */ +/*? Text: "%s: Multicast not supported on %s\n" */ +/*? Text: "%s: Using SW checksumming on %s.\n" */ +/*? Text: "%s: HW Checksumming (%sbound) enabled\n" */ +/*? Text: "unregister layer 3 discipline\n" */ +/*? Text: "%s: Multicast enabled\n" */ +/*? Text: "%s: QDIO data connection isolation is deactivated\n" */ +/*? Text: "%s: QDIO data connection isolation is activated\n" */ +/*? Text: "%s: Adapter does not support QDIO data connection isolation\n" */ +/*? Text: "%s: Adapter is dedicated. QDIO data connection isolation not supported\n" */ +/*? Text: "%s: TSO does not permit QDIO data connection isolation\n" */ +/*? Text: "%s: HW TX Checksumming enabled\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "qeth_l3: ignoring TR device\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ + +/*? + * Text: "%s: Turning off reflective relay mode at the adjacent switch failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The policy for the QDIO data connection isolation was + * changed successfully, and communications are now handled according to the + * new policy. The ISOLATION_FORWARD policy is no longer used, but the qeth + * device driver could not turn off the reflective relay mode on the adjacent + * switch port. + * User action: + * Check the adjacent switch for errors and correct the problem. + */ + +/*? + * Text: "%s: The adjacent switch port does not support reflective relay mode\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The 'isolation' sysfs attribute of the qeth device could not be set to 'forward'. + * This setting selects the ISOLATION_FORWARD policy for the QDIO data connection + * isolation. The ISOLATION_FORWARD policy requires a network adapter in Virtual + * Ethernet Port Aggregator (VEPA) mode with an adjacent switch port in reflective + * relay mode. + * User action: + * Use a switch port that supports reflective relay mode if you want to use the + * ISOLATION_FORWARD policy for the qeth device. + */ + +/*? + * Text: "%s: The reflective relay mode cannot be enabled at the adjacent switch port" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The 'isolation' sysfs attribute of the qeth device could not be set to 'forward'. + * This setting selects the ISOLATION_FORWARD policy for the QDIO data connection + * isolation. The ISOLATION_FORWARD policy requires a network adapter in Virtual + * Ethernet Port Aggregator (VEPA) mode with an adjacent switch port in reflective relay + * mode. The qeth device driver failed to enable the required reflective relay mode on + * the adjacent switch port although the switch port supports this mode. + * User action: + * Enable reflective relay mode on the switch for the adjacent port and try again. + */ + +/*? + * Text: "%s: Interface %s is down because the adjacent port is no longer in reflective relay mode\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * @2: interface name + * Description: + * The ISOLATION_FORWARD policy is active for the QDIO data connection isolation + * of the qeth device. This policy requires a network adapter in Virtual Ethernet + * Port Aggregator (VEPA) mode with an adjacent switch port in reflective relay mode. + * The reflective relay mode on the adjacent switch port was disabled. The qeth device + * was set offline and the interface was deactivated to prevent any unintended network traffic. + * User action: + * Enable the reflective relay mode again on the adjacent port or use the 'isolation' + * sysfs attribute of the qeth device to set a different policy for the QDIO data connection + * isolation. You can then resume operations by setting the qeth device back + * online and activating the interface. + */ + +/*? + * Text: "%s: Failed to create completion queue\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The HiperSockets device could not be configured with a completion queue. + * A completion queue is required to operate AF_IUCV communication in an LPAR. + * User action: + * i) Investigate if you have the latest firmware level in place. + * ii) If the problem persists and is not caused by a version mismatch, contact IBM + * support. + */ + +/*? + * Text: "%s: Completion Queueing supported\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The HiperSockets device supports completion queueing. This is required to + * set up AF_IUCV communication in an LPAR. + */ + +/*? + * Text: "%s: Completion Queue support enabled" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The HiperSockets device is enabled for completion queueing. This is part of + * the process to set up AF_IUCV communication in an LPAR. + */ + +/*? + * Text: "%s: Completion Queue support disabled" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The HiperSockets device is disabled for completion queueing. This device + * cannot or no longer be used to set up AF_IUCV communication in an LPAR. + */ + +/*? + * Text: "%s: The device represents a Bridge Capable Port\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * Description: + * You can configure this device as a Bridge Port. + * User action: + * None. + */ + +/*? + * Text: "%s: The device is not configured as a Bridge Port\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The Bridge Port role cannot be withdrawn from a device + * that is not configured as a Bridge Port. + * User action: + * None. + */ + +/*? + * Text: "%s: The LAN already has a primary Bridge Port\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A LAN can have multiple secondary Bridge Ports, but only + * one primary Bridge Port. Configuring the device as a + * primary Bridge Port failed because another port on the + * LAN has been configured as the primary Bridge Port. + * User action: + * Find out which operating system instance has configured the primary + * Bridge Port. Assure that the primary role for this port is withdrawn + * before trying again to configure your device as the primary Bridge + * Port. Alternatively, consider configuring your device as a secondary + * Bridge Port. + */ + +/*? + * Text: "%s: The device is already a secondary Bridge Port\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A device cannot be configured as a primary or secondary + * Bridge Port if it is already configured as a secondary Bridge Port. + * User action: + * None, if you want the device to be a secondary Bridge Port. + * If you want to configure the device as the primary Bridge Port, + * withdraw the secondary role by writing 'none' to the 'bridgeport_role' + * sysfs attribute of the device. Then try again to configure the + * device as the primary Bridge Port. + */ + +/*? + * Text: "%s: The LAN cannot have more secondary Bridge Ports\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A LAN can have up to five secondary Bridge Ports. + * You cannot configure a further device as a secondary + * Bridge Port unless the Bridge Ports role is withdrawn from one of + * the existing secondary Bridge Ports. + * User action: + * Assure that the Bridge Port role is withdrawn from one of the + * existing secondary Bridge Ports before trying again to configure your + * device as a secondary Bridge Port. + */ + +/*? + * Text: "%s: The device is already a primary Bridge Port\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * A device cannot be configured as a primary or secondary + * Bridge Port if it is already configured as a primary Bridge Port. + * User action: + * None, if you want the device to be a primary Bridge Port. + * If you want to configure the device as a secondary Bridge Port, + * withdraw the primary role by writing 'none' to the 'bridgeport_role' + * sysfs attribute of the device. Then try again to configure the + * device as the secondary Bridge Port. + */ + +/*? + * Text: "%s: The device is not authorized to be a Bridge Port\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * The device cannot be configured as a Bridge Port because + * the required authorizations in the hardware are not in place. + * User action: + * See your hardware documentation about how to authorize + * ports for becoming a Bridge Port. + */ + +/*? + * Text: "%s: A Bridge Port is already configured by a different operating system\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * Linux instances cannot configure the target port as a Bridge Port. + * Another operating system already uses a Bridge Port on the HiperSockets + * or on the OSA adapter. For example, a z/VM instance might be using + * a port in a VSWITCH configuration. Multiple Bridge Ports on the same + * HiperSockets or OSA adapter must be configured by instances of the same + * operating system, for example, all Linux or all z/VM. + * User action: + * Reconsider your network topology. Configure Bridge Ports only for ports + * on adapters where any other Bridge Ports are configured by other Linux + * instances. + */ + +/*? + * Text: "%s: Setting address notification failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the qeth device + * Description: + * Enabling or disabling the address notification feature of a + * HiperSockets device failed. The device might not be configured as a + * Bridge Port. + * User action: + * None, unless you need address notifications for this device. + * If you need notifications, confirm that your device is attached to a + * HiperSockets LAN that supports Bridge Capable Ports and that your + * device is configured as a Bridge Port. If the 'bridgeport_role' + * sysfs attribute of the device contains, one of the values 'primary' + * or 'secondary' and you cannot set the address notification, contact + * your support organization. + */ + +/*? + * Text: "%s: Address notification from the Bridge Port stopped %s (%s)\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the qeth device + * @2: network interface name + * @3: error reported by the hardware + * Description: + * A Bridge Port no longer provides address notifications. + * Possible reasons include traffic overflow and that the device is no + * longer configured as a Bridge Port. A udev event with + * BRIDGEDHOST=abort was emitted to alert applications that rely on the + * address notifications. + * User action: + * None. + */ + +/*? + * Text: "%s: The qeth driver ran out of channel command buffers\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the qeth device + * Description: + * Command buffers can temporarily run out during periods of + * intense network configuration activities. + * The device driver recovers from this condition as outstanding + * commands are completed. + * User action: + * Wait for a short time. If the problem persists, + * initiate a recovery process by writing '1' to the 'recover' + * sysfs attribute of the device. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/s390dbf +++ linux-riscv-5.4.0/Documentation/kmsg/s390/s390dbf @@ -0,0 +1,83 @@ +/*? + * Text: "Root becomes the owner of all s390dbf files in sysfs\n" + * Severity: Warning + * Description: + * The S/390 debug feature you are using only supports uid/gid = 0. + * User action: + * None. + */ + +/*? + * Text: "Registering debug feature %s failed\n" + * Severity: Error + * Parameter: + * @1: feature name + * Description: + * The initialization of an S/390 debug feature failed. A likely cause of this + * problem is memory constraints. The system keeps running, but the debug + * data for this feature will not be available in sysfs. + * User action: + * Consider assigning more memory to your LPAR or z/VM guest virtual machine. + */ + +/*? + * Text: "Registering view %s/%s would exceed the maximum number of views %i\n" + * Severity: Error + * Parameter: + * @1: feature name + * @2: view name + * @3: maximum + * Description: + * The maximum number of allowed debug feature views has been reached. The + * view has not been registered. The system keeps running but the new view + * will not be available in sysfs. This is a program error. + * User action: + * Report this problem to your support partner. + */ + +/*? + * Text: "%s is not a valid level for a debug feature\n" + * Severity: Warning + * Parameter: + * @1: level + * Description: + * Setting a new level for a debug feature by using the 'level' sysfs attribute + * failed. Valid levels are the minus sign (-) and the integers in the + * range 0 to 6. The minus sign switches off the feature. The numbers switch + * the feature on, where higher numbers produce more debug output. + * User action: + * Write a valid value to the 'level' sysfs attribute. + */ + +/*? + * Text: "Flushing debug data failed because %c is not a valid area\n" + * Severity: Informational + * Parameter: + * @1: debug area number + * Description: + * Flushing a debug area by using the 'flush' sysfs attribute failed. Valid + * values are the minus sign (-) for flushing all areas, or the number of the + * respective area for flushing a single area. + * User action: + * Write a valid area number or the minus sign (-) to the 'flush' sysfs + * attribute. + */ + +/*? + * Text: "Allocating memory for %i pages failed\n" + * Severity: Informational + * Parameter: + * @1: number of pages + * Description: + * Setting the debug feature size by using the 'page' sysfs attribute failed. + * Linux did not have enough memory for expanding the debug feature to the + * requested size. + * User action: + * Use a smaller number of pages for the debug feature or allocate more + * memory to your LPAR or z/VM guest virtual machine. + */ + +/*? Text: "%s: set new size (%i pages)\n" */ +/*? Text: "%s: switched off\n" */ +/*? Text: "%s: level %i is out of range (%i - %i)\n" */ +/*? Text: "Registering view %s/%s failed due to out of memory\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/sclp_cmd +++ linux-riscv-5.4.0/Documentation/kmsg/s390/sclp_cmd @@ -0,0 +1,44 @@ +/*? Text: "sync request failed (cmd=0x%08x, status=0x%02x)\n" */ +/*? Text: "readcpuinfo failed (response=0x%04x)\n" */ +/*? Text: "configure cpu failed (cmd=0x%08x, response=0x%04x)\n" */ +/*? Text: "configure channel-path failed (cmd=0x%08x, response=0x%04x)\n" */ +/*? Text: "read channel-path info failed (response=0x%04x)\n" */ +/*? Text: "assign storage failed (cmd=0x%08x, response=0x%04x, rn=0x%04x)\n" */ +/*? Text: "configure PCI I/O adapter failed: cmd=0x%08x response=0x%04x\n" */ +/*? Text: "request failed (status=0x%02x)\n" */ +/*? Text: "request failed with response code 0x%x\n" */ + +/*? + * Text: "Memory hotplug state changed, suspend refused.\n" + * Severity: Error + * Description: + * Suspend is refused after a memory hotplug operation was performed. + * User action: + * The system needs to be restarted and no memory hotplug operation must be + * performed in order to allow suspend. + */ + +/*? + * Text: "Standby memory at 0x%llx (%lluM of %lluM usable)\n" + * Severity: Informational + * Parameter: + * @1: start address of standby memory + * @2: usable memory in MB + * @3: total detected memory in MB + * Description: + * Standby memory was detected. It can be used for memory hotplug only + * if it is aligned to the Linux hotplug memory block size. + * If the aligned amount of memory matches the total amount, + * all detected standby memory can be used. Otherwise, some of the detected + * memory is unaligned and cannot be used. + * User action: + * None, if the usable and the total amount of detected standby memory match. + * If the amounts of memory do not match, + * check the memory setup of your guest virtual machine and ensure that + * the standby memory start and end + * address is aligned to the Linux hotplug memory block size. + * On Linux, issue "cat /sys/devices/system/memory/block_size_bytes" + * to find the hotplug memory block size value in hexadecimal notation. + * On z/VM, query your memory setup with "vmcp q v store". + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/sclp_config +++ linux-riscv-5.4.0/Documentation/kmsg/s390/sclp_config @@ -0,0 +1,15 @@ +/*? + * Text: "CPU capability may have changed\n" + * Severity: Informational + * Description: + * The capability of the CPUs in the configuration may have been upgraded + * or downgraded. This message may also appear if the capability of the + * CPUs in the configuration did not change. + * For details see the STORE SYSTEM INFORMATION description in the + * "Principles of Operation." + * User action: + * The user can examine /proc/sysinfo for CPU capability values. + */ +/*? Text: "Open for Business request failed with response code 0x%04x\n" */ +/*? Text: "SCLP receiver did not register to receive Configuration Management Data Events.\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/sclp_cpi +++ linux-riscv-5.4.0/Documentation/kmsg/s390/sclp_cpi @@ -0,0 +1,3 @@ +/*? Text: "request failed (status=0x%02x)\n" */ +/*? Text: "request failed with response code 0x%x\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/sclp_ocf +++ linux-riscv-5.4.0/Documentation/kmsg/s390/sclp_ocf @@ -0,0 +1 @@ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/sclp_sdias +++ linux-riscv-5.4.0/Documentation/kmsg/s390/sclp_sdias @@ -0,0 +1,4 @@ +/*? Text: "sclp_send failed for get_nr_blocks\n" */ +/*? Text: "SCLP error: %x\n" */ +/*? Text: "sclp_send failed: %x\n" */ +/*? Text: "Error from SCLP while copying hsa. Event status = %x\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/scm_block +++ linux-riscv-5.4.0/Documentation/kmsg/s390/scm_block @@ -0,0 +1,51 @@ +/*? + * Text: "%lx: The capabilities of the SCM increment changed\n" + * Severity: Informational + * Parameter: + * @1: start address of the SCM increment + * Description: + * A configuration change is in progress for the storage class memory (SCM) + * increment. + * User action: + * Verify that the capability of the SCM increment is as intended; for + * example, with lsscm. + */ + +/*? + * Text: "An I/O operation to SCM failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * An error occurred during I/O to storage class memory (SCM). The operation + * was repeated, but the maximum number of retries was exceeded before the + * request could be fulfilled. + * User action: + * Contact your support organization. + */ + +/*? + * Text: "%lx: Write access to the SCM increment is suspended\n" + * Severity: Informational + * Parameter: + * @1: start address of the SCM increment + * Description: + * A concurrent firmware upgrade is in progress. For the duration of the + * upgrade, write access to the storage class memory (SCM) increment has been + * suspended. + * User action: + * None. + */ + +/*? + * Text: "%lx: Write access to the SCM increment is restored\n" + * Severity: Informational + * Parameter: + * @1: start address of the SCM increment + * Description: + * Write access to the storage class memory (SCM) increment was restored + * after a temporary suspension during a concurrent firmware upgrade. + * User action: + * None. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/setup +++ linux-riscv-5.4.0/Documentation/kmsg/s390/setup @@ -0,0 +1,165 @@ +/*? + * Text: "The initial RAM disk does not fit into the memory\n" + * Severity: Error + * Description: + * The load address and the size of the initial RAM disk specify a memory + * area that is not available. + * User action: + * Lower the load address of the initial RAM disk, reduce the size of the + * initial RAM disk, or increase the size of the system memory to make the + * initial RAM disk fit into the memory. + */ + +/*? + * Text: "The maximum memory size is %luMB\n" + * Severity: Notice + * Parameter: + * @1: size in MB + * Description: + * The system memory size cannot exceed the amount of memory that is + * provided by the real or virtual hardware. It can be further reduced + * through an upper memory address limit that is specified with the + * mem= kernel parameter. + * User action: + * None. + */ + +/*? + * Text: "Linux is running as a z/VM guest operating system in 31-bit mode\n" + * Severity: Informational + * Description: + * The 31-bit Linux kernel detected that it is running as a guest operating + * system of the z/VM hypervisor. + * User action: + * None. + */ + +/*? + * Text: "Linux is running natively in 31-bit mode\n" + * Severity: Informational + * Description: + * The 31-bit Linux kernel detected that it is running on an IBM mainframe, + * either as the sole operating system in an LPAR or as the sole operating + * system on the entire mainframe. The Linux kernel is not running as a + * guest operating system of the z/VM hypervisor. + * User action: + * None. + */ + +/*? + * Text: "The hardware system has IEEE compatible floating point units\n" + * Severity: Informational + * Description: + * The Linux kernel detected that it is running on a hardware system with + * CPUs that have IEEE compatible floating point units. + * User action: + * None. + */ + +/*? + * Text: "The hardware system has no IEEE compatible floating point units\n" + * Severity: Informational + * Description: + * The Linux kernel detected that it is running on a hardware system with + * CPUs that do not have IEEE compatible floating point units. + * User action: + * None. + */ + +/*? + * Text: "Linux is running as a z/VM guest operating system in 64-bit mode\n" + * Severity: Informational + * Description: + * The 64-bit Linux kernel detected that it is running as a guest operating + * system of the z/VM hypervisor. + * User action: + * None. + */ + +/*? + * Text: "Linux is running under KVM in 64-bit mode\n" + * Severity: Informational + * Description: + * The 64-bit Linux kernel detected that it is running as a guest operating + * system of the KVM hypervisor. + * User action: + * None. + */ + +/*? + * Text: "Linux is running natively in 64-bit mode\n" + * Severity: Informational + * Description: + * The 64-bit Linux kernel detected that it is running on an IBM mainframe, + * either as the sole operating system in an LPAR or as the sole operating + * system on the entire mainframe. The Linux kernel is not running as a + * guest operating system of the z/VM hypervisor. + * User action: + * None. + */ + +/*? + * Text: "Defining the Linux kernel NSS failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * The Linux kernel could not define the named saved system (NSS) with + * the z/VM CP DEFSYS command. The return code represents the numeric + * portion of the CP DEFSYS error message. + * User action: + * For return code 1, the z/VM guest virtual machine is not authorized + * to define named saved systems. + * Ensure that the z/VM guest virtual machine is authorized to issue + * the CP DEFSYS command (typically privilege class E). + * For other return codes, see the help and message documentation for + * the CP DEFSYS command. + */ + +/*? + * Text: "Saving the Linux kernel NSS failed with rc=%d\n" + * Severity: Error + * Parameter: + * @1: return code + * Description: + * The Linux kernel could not save the named saved system (NSS) with + * the z/VM CP SAVESYS command. The return code represents the numeric + * portion of the CP SAVESYS error message. + * User action: + * For return code 1, the z/VM guest virtual machine is not authorized + * to save named saved systems. + * Ensure that the z/VM guest virtual machine is authorized to issue + * the CP SAVESYS command (typically privilege class E). + * For other return codes, see the help and message documentation for + * the CP SAVESYS command. + */ + +/*? + * Text: "crashkernel reservation failed: %s\n" + * Severity: Informational + * Parameter: + * @1: reason string + * Description: + * The memory reservation for the kdump "crashkernel" parameter was not + * successful. The Linux kernel was either not able to find a free memory + * area or an invalid area has been defined. The reason string describes the + * cause of the failure in more detail. + * User action: + * Increase the memory footprint of your virtual machine or adjust the values + * for the "crashkernel" kernel parameter. Then boot your Linux system again. + */ + +/*? + * Text: "Reserving %lluMB of memory at %lluMB for crashkernel (System RAM: %luMB)\n" + * Severity: Informational + * Parameter: + * @1: amount of reserved memory + * @2: storage location of reserved memory + * @3: amount of system RAM + * Description: + * The memory reservation for the kdump "crashkernel" parameter was successful + * and a kdump kernel can now be loaded with the kexec tool. + * User action: + * None. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/smsgiucv +++ linux-riscv-5.4.0/Documentation/kmsg/s390/smsgiucv @@ -0,0 +1 @@ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/smsgiucv_app +++ linux-riscv-5.4.0/Documentation/kmsg/s390/smsgiucv_app @@ -0,0 +1 @@ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/tape +++ linux-riscv-5.4.0/Documentation/kmsg/s390/tape @@ -0,0 +1,63 @@ +/*? + * Text: "%s: A tape unit was detached while in use\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A tape unit has been detached from the I/O configuration while a tape + * was being accessed. This typically results in I/O error messages and + * potentially in damaged data on the tape. + * User action: + * Check the output of the application that accesses the tape device. + * If this problem occurred during a write-type operation, consider repeating + * the operation after bringing the tape device back online. + */ + +/*? + * Text: "%s: A tape cartridge has been mounted\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the tape device + * Description: + * A tape cartridge has been inserted into the tape unit. The tape in the + * tape unit is ready to be accessed. + * User action: + * None. + */ + +/*? + * Text: "%s: The tape cartridge has been successfully unloaded\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape cartridge has been unloaded from the tape unit. Insert a tape + * cartridge before accessing the tape device. + * User action: + * None. + */ + +/*? + * Text: "A cartridge is loaded in tape device %s, refusing to suspend\n" + * Severity: Error + * Parameter: + * @1: bus ID of the tape device + * Description: + * A request to suspend a tape device currently loaded with a cartridge is + * rejected. + * User action: + * Unload the tape device. Then try to suspend the system again. + */ + +/*? + * Text: "Tape device %s is busy, refusing to suspend\n" + * Severity: Error + * Parameter: + * @1: bus ID of the tape device + * Description: + * A request to suspend a tape device being currently in use is rejected. + * User action: + * Terminate applications performing tape operations + * and then try to suspend the system again. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/tape_34xx +++ linux-riscv-5.4.0/Documentation/kmsg/s390/tape_34xx @@ -0,0 +1,418 @@ +/*? + * Text: "%s: An unexpected condition %d occurred in tape error recovery\n" + * Severity: Error + * Parameter: + * @1: bus ID of the tape device + * @2: number + * Description: + * The control unit has reported an error condition that is not recognized by + * the error recovery process of the tape device driver. + * User action: + * Report this problem and the condition number from the message to your + * support organization. + */ + +/*? + * Text: "%s: A data overrun occurred between the control unit and tape unit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A data overrun error has occurred on the connection between the control + * unit and the tape unit. If this problem occurred during a write-type + * operation, the integrity of the data on the tape might be compromised. + * User action: + * Use a faster connection. If this problem occurred during a write-type + * operation, consider repositioning the tape and repeating the operation. + */ + +/*? + * Text: "%s: The block ID sequence on the tape is incorrect\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The control unit has detected an incorrect block ID sequence on the tape. + * This problem typically indicates that the data on the tape is damaged. + * User action: + * If this problem occurred during a write-type operation reposition the tape + * and repeat the operation. + */ + +/*? + * Text: "%s: A read error occurred that cannot be recovered\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A read error has occurred that cannot be recovered. The current tape might + * be damaged. + * User action: + * None. + */ + +/*? + * Text: "%s: A write error on the tape cannot be recovered\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A write error has occurred that could not be recovered by the automatic + * error recovery process. + * User action: + * Use a different tape cartridge. + */ + +/*? + * Text: "%s: Writing the ID-mark failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The ID-mark at the beginning of tape could not be written. The tape medium + * might be write-protected. + * User action: + * Try a different tape cartridge. Ensure that the write-protection on the + * cartridge is switched off. + */ + +/*? + * Text: "%s: Reading the tape beyond the end of the recorded area failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A read-type operation failed because it extended beyond the end of the + * recorded area on the tape medium. + * User action: + * None. + */ + +/*? + * Text: "%s: The tape contains an incorrect block ID sequence\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The control unit has detected an incorrect block ID sequence on the tape. + * This problem typically indicates that the data on the tape is damaged. + * User action: + * If this problem occurred during a write-type operation reposition the tape + * and repeat the operation. + */ + +/*? + * Text: "%s: A path equipment check occurred for the tape device\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A path equipment check has occurred. This check indicates problems with the + * connection between the mainframe system and the tape control unit. + * User action: + * Ensure that the cable connections between the mainframe system and the + * control unit are securely in place and not damaged. + */ + +/*? + * Text: "%s: The tape unit cannot process the tape format\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * Either the tape unit is not able to read the format ID mark, or the + * specified format is not supported by the tape unit. + * User action: + * If you do not need the data recorded on the current tape, use a different + * tape or write a new format ID mark at the beginning of the tape. Be aware + * that writing a new ID mark leads to a loss of all data that has been + * recorded on the tape. If you need the data on the current tape, use a tape + * unit that supports the tape format. + */ + +/*? + * Text: "%s: The tape medium is write-protected\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A write-type operation failed because the tape medium is write-protected. + * User action: + * Eject the tape cartridge, switch off the write protection on the cartridge, + * insert the cartridge, and try the operation again. + */ + +/*? + * Text: "%s: The tape does not have the required tape tension\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape does not have the required tape tension. + * User action: + * Rewind and reposition the tape, then repeat the operation. + */ + +/*? + * Text: "%s: The tape unit failed to load the cartridge\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * An error has occurred while loading the tape cartridge. + * User action: + * Unload the cartridge and load it again. + */ + +/*? + * Text: "%s: Automatic unloading of the tape cartridge failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit failed to unload the cartridge. + * User action: + * Unload the cartridge manually by using the eject button on the tape unit. + */ + +/*? + * Text: "%s: An equipment check has occurred on the tape unit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * Possible reasons for the check condition are a unit adapter error, a buffer + * error on the lower interface, an unusable internal path, or an error that + * has occurred while loading the cartridge. + * User action: + * Examine the tape unit and the cartridge loader. Consult the tape unit + * documentation for details. + */ + +/*? + * Text: "%s: The tape information states an incorrect length\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape is shorter than stated at the beginning of the tape data. A + * possible reason for this problem is that the tape might have been physically + * truncated. Data written to the tape might be incomplete or damaged. + * User action: + * If this problem occurred during a write-type operation, consider repeating + * the operation with a different tape cartridge. + */ + +/*? + * Text: "%s: The tape unit is not ready\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit is online but not ready. + * User action: + * Turn the ready switch on the tape unit to the ready position and try the + * operation again. + */ + +/*? + * Text: "%s: The tape medium has been rewound or unloaded manually\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit rewind button, unload button, or both have been used to + * rewind or unload the tape cartridge. A tape cartridge other than the + * intended cartridge might have been inserted or the tape medium might not + * be at the expected position. + * User action: + * Verify that the correct tape cartridge has been inserted and that the tape + * medium is at the required position before continuing to work with the tape. + */ + +/*? + * Text: "%s: The tape subsystem is running in degraded mode\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape subsystem is not operating at its maximum performance. + * User action: + * Contact your service representative for the tape unit and report this + * problem. + */ + +/*? + * Text: "%s: The tape unit is already assigned\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit is already assigned to another channel path. + * User action: + * Free the tape unit from the operating system instance to which it is + * currently assigned then try again. + */ + +/*? + * Text: "%s: The tape unit is not online\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit is not online to the tape device driver. + * User action: + * Ensure that the tape unit is operational and that the cable connections + * between the control unit and the tape unit are securely in place and not + * damaged. + */ + +/*? + * Text: "%s: The control unit has fenced access to the tape volume\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The control unit fences further access to the current tape volume. The data + * integrity on the tape volume might have been compromised. + * User action: + * Rewind and unload the tape cartridge. + */ + +/*? + * Text: "%s: A parity error occurred on the tape bus\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * A data parity check error occurred on the bus. Data that was read or written + * while the error occurred is not valid. + * User action: + * Reposition the tape and repeat the read-type or write-type operation. + */ + +/*? + * Text: "%s: I/O error recovery failed on the tape control unit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * An I/O error occurred that cannot be recovered by the automatic error + * recovery process of the tape control unit. The application that operates + * the tape unit will receive a return value of -EIO which indicates an + * I/O error. The data on the tape might be damaged. + * User action: + * If this problem occurred during a write-type operation, consider + * repositioning the tape and repeating the operation. + */ + +/*? + * Text: "%s: The tape unit requires a firmware update\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit requires firmware patches from the tape control unit but the + * required patches are not available on the control unit. + * User action: + * Make the require patches available on the control unit then reposition the + * tape and retry the operation. For details about obtaining and installing + * firmware updates see the control unit documentation. + */ + +/*? + * Text: "%s: The maximum block size for buffered mode is exceeded\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The block to be written is larger than allowed for the buffered mode. + * User action: + * Use a smaller block size. + */ + +/*? + * Text: "%s: A channel interface error cannot be recovered\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * An error has occurred on the channel interface. This error cannot + * be recovered by the control unit error recovery process. + * User action: + * See the documentation of the control unit. + */ + +/*? + * Text: "%s: A channel protocol error occurred\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * An error was detected in the channel protocol. + * User action: + * Reposition the tape and try the operation again. + */ + +/*? + * Text: "%s: The tape unit does not support the compaction algorithm\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit cannot read the current tape. The data on the tape has been + * compressed with an algorithm that is not supported by the tape unit. + * User action: + * Use a tape unit that supports the compaction algorithm used for the + * current tape. + */ + +/*? + * Text: "%s: The tape unit does not support tape format 3480-2 XF\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit does not support tapes recorded in the 3480-2 XF format. + * User action: + * If you do not need the data recorded on the current tape, rewind the tape + * and overwrite it with a supported format. If you need the data on the + * current tape, use a tape unit that supports the tape format. + */ + +/*? + * Text: "%s: The tape unit does not support format 3480 XF\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit does not support tapes recorded in the 3480 XF format. + * User action: + * If you do not need the data recorded on the current tape, rewind the tape + * and overwrite it with a supported format. If you need the data on the + * current tape, use a tape unit that supports the tape format. + */ + +/*? + * Text: "%s: The tape unit does not support the current tape length\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The length of the tape in the cartridge is incompatible with the tape unit. + * User action: + * Either use a different tape unit or use a tape with a supported length. + */ + +/*? + * Text: "%s: The tape unit does not support the tape length\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The length of the tape in the cartridge is incompatible with the tape + * unit. + * User action: + * Either use a different tape unit or use a tape with a supported length. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/tape_3590 +++ linux-riscv-5.4.0/Documentation/kmsg/s390/tape_3590 @@ -0,0 +1,183 @@ +/*? + * Text: "%s: The tape medium must be loaded into a different tape unit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape device has indicated an error condition that requires loading + * the tape cartridge into a different tape unit to recover. + * User action: + * Unload the cartridge and use a different tape unit to retry the operation. + */ + +/*? + * Text: "%s: Tape media information: exception %s, service %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: exception + * @3: service + * Description: + * This is an operating system independent tape medium information message + * that was issued by the tape unit. The information in the message is + * intended for the IBM customer engineer. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: Device subsystem information: exception %s, service %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: exception + * @3: required service action + * Description: + * This is an operating system independent device subsystem information message + * that was issued by the tape unit. The information in the message is + * intended for the IBM customer engineer. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: I/O subsystem information: exception %s, service %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: exception + * @3: required service action + * Description: + * This is an operating system independent I/O subsystem information message + * that was issued by the tape unit. The information in the message is + * intended for the IBM customer engineer. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: The tape unit has issued sense message %s\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: sense message code + * Description: + * The tape unit has issued an operating system independent sense message. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: The tape unit has issued an unknown sense message code 0x%x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: code + * Description: + * The tape device driver has received an unknown sense message from the + * tape unit. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: MIM SEV=%i, MC=%02x, ES=%x/%x, RC=%02x-%04x-%02x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: SEV + * @3: message code + * @4: exception + * @5: required service action + * @6: refcode + * @7: mid + * @8: fid + * Description: + * This is an operating system independent information message that was + * issued by the tape unit. The information in the message is intended for + * the IBM customer engineer. + * User action: + * See to the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: IOSIM SEV=%i, DEVTYPE=3590/%02x, MC=%02x, ES=%x/%x, REF=0x%04x-0x%04x-0x%04x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: SEV + * @3: model + * @4: message code + * @5: exception + * @6: required service action + * @7: refcode1 + * @8: refcode2 + * @9: refcode3 + * Description: + * This is an operating system independent I/O subsystem information message + * that was issued by the tape unit. The information in the message is + * intended for the IBM customer engineer. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: DEVSIM SEV=%i, DEVTYPE=3590/%02x, MC=%02x, ES=%x/%x, REF=0x%04x-0x%04x-0x%04x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: SEV + * @3: model + * @4: message code + * @5: exception + * @6: required service action + * @7: refcode1 + * @8: refcode2 + * @9: refcode3 + * Description: + * This is an operating system independent device subsystem information message + * issued by the tape unit. The information in the message is intended for + * the IBM customer engineer. + * User action: + * See the documentation for the tape unit for further information. + */ + +/*? + * Text: "%s: The tape unit has issued an unknown sense message code %x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * @2: code + * Description: + * The tape device has issued a sense message, that is unknown to the device + * driver. + * User action: + * Use the message code printed as hexadecimal value and see the documentation + * for the tape unit for further information. + */ + +/*? + * Text: "%s: The tape unit failed to obtain the encryption key from EKM\n" + * Severity: Error + * Parameter: + * @1: bus ID of the tape device + * Description: + * The tape unit was unable to retrieve the encryption key required to decode + * the data on the tape from the enterprise key manager (EKM). + * User action: + * See the EKM and tape unit documentation for information about how to enable + * the tape unit to retrieve the encryption key. + */ + +/*? + * Text: "%s: A different host has privileged access to the tape unit\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the tape device + * Description: + * You cannot access the tape unit because a different operating system + * instance has privileged access to the unit. + * User action: + * Unload the current cartridge to solve this problem. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/time +++ linux-riscv-5.4.0/Documentation/kmsg/s390/time @@ -0,0 +1,36 @@ +/*? + * Text: "The ETR interface has adjusted the clock by %li microseconds\n" + * Severity: Notice + * Parameter: + * @1: number of microseconds + * Description: + * The external time reference (ETR) interface has synchronized the system + * clock with the external reference and set it to a new value. The time + * difference between the old and new clock value has been passed to the + * network time protocol (NTP) as a single shot adjustment. + * User action: + * None. + */ + +/*? + * Text: "The real or virtual hardware system does not provide an ETR interface\n" + * Severity: Warning + * Description: + * The 'etr=' parameter has been passed on the kernel parameter line for + * a Linux instance that does not have access to the external time reference + * (ETR) facility. + * User action: + * To avoid this warning remove the 'etr=' kernel parameter. + */ + +/*? + * Text: "The real or virtual hardware system does not provide an STP interface\n" + * Severity: Warning + * Description: + * The 'stp=' parameter has been passed on the kernel parameter line for + * a Linux instance that does not have access to the server time protocol + * (STP) facility. + * User action: + * To avoid this warning remove the 'stp=' kernel parameter. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/vmlogrdr +++ linux-riscv-5.4.0/Documentation/kmsg/s390/vmlogrdr @@ -0,0 +1,19 @@ +/*? Text: "vmlogrdr: failed to start recording automatically\n" */ +/*? Text: "vmlogrdr: connection severed with reason %i\n" */ +/*? Text: "vmlogrdr: iucv connection to %s failed with rc %i \n" */ +/*? Text: "vmlogrdr: failed to stop recording automatically\n" */ +/*? Text: "not running under VM, driver not loaded.\n" */ + +/*? + * Text: "vmlogrdr: device %s is busy. Refuse to suspend.\n" + * Severity: Error + * Parameter: + * @1: device name + * Description: + * Suspending vmlogrdr devices that are in uses is not supported. + * A request to suspend such a device is refused. + * User action: + * Close all applications that use any of the vmlogrdr devices + * and then try to suspend the system again. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/vmur +++ linux-riscv-5.4.0/Documentation/kmsg/s390/vmur @@ -0,0 +1,48 @@ +/*? + * Text: "The %s cannot be loaded without z/VM\n" + * Severity: Error + * Parameter: + * @1: z/VM virtual unit record device driver + * Description: + * The z/VM virtual unit record device driver provides Linux with access to + * z/VM virtual unit record devices like punch card readers, card punches, and + * line printers. On Linux instances that run in environments other than the + * z/VM hypervisor, the device driver does not provide any useful function and + * the corresponding vmur module cannot be loaded. + * User action: + * Load the vmur module only on Linux instances that run as guest operating + * systems of the z/VM hypervisor. If the z/VM virtual unit record device + * has been compiled into the kernel, ignore this message. + */ + +/*? + * Text: "Kernel function alloc_chrdev_region failed with error code %d\n" + * Severity: Error + * Parameter: + * @1: error code according to errno definitions + * Description: + * The z/VM virtual unit record device driver (vmur) needs to register a range + * of character device minor numbers from 0x0000 to 0xffff. + * This registration failed, probably because of memory constraints. + * User action: + * Free some memory and reload the vmur module. If the z/VM virtual unit + * record device driver has been compiled into the kernel reboot Linux. + * Consider assigning more memory to your LPAR or z/VM guest virtual machine. + */ + +/*? + * Text: "Unit record device %s is busy, %s refusing to suspend.\n" + * Severity: Error + * Parameter: + * @1: bus ID of the unit record device + * @1: z/VM virtual unit record device driver + * Description: + * Linux cannot be suspended while a unit record device is in use. + * User action: + * Stop all applications that work on z/VM spool file queues, for example, the + * vmur tool. Then try again to suspend Linux. + */ + +/*? Text: "%s loaded.\n" */ +/*? Text: "%s unloaded.\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/xpram +++ linux-riscv-5.4.0/Documentation/kmsg/s390/xpram @@ -0,0 +1,74 @@ +/*? + * Text: "%d is not a valid number of XPRAM devices\n" + * Severity: Error + * Parameter: + * @1: number of partitions + * Description: + * The number of XPRAM partitions specified for the 'devs' module parameter + * or with the 'xpram.parts' kernel parameter must be an integer in the + * range 1 to 32. The XPRAM device driver created a maximum of 32 partitions + * that are probably not configured as intended. + * User action: + * If the XPRAM device driver has been compiled as a separate module, + * unload the module and load it again with a correct value for the 'devs' + * module parameter. If the XPRAM device driver has been compiled + * into the kernel, correct the 'xpram.parts' parameter in the kernel + * command line and restart Linux. + */ + +/*? + * Text: "Not enough expanded memory available\n" + * Severity: Error + * Description: + * The amount of expanded memory required to set up your XPRAM partitions + * depends on the 'sizes' parameter specified for the xpram module or on + * the specifications for the 'xpram.parts' parameter if the XPRAM device + * driver has been compiled into the kernel. Your + * current specification exceed the amount of available expanded memory. + * Your XPRAM partitions are probably not configured as intended. + * User action: + * If the XPRAM device driver has been compiled as a separate module, + * unload the xpram module and load it again with an appropriate value + * for the 'sizes' module parameter. If the XPRAM device driver has been + * compiled into the kernel, adjust the 'xpram.parts' parameter in the + * kernel command line and restart Linux. If you need more than the + * available expanded memory, increase the expanded memory allocation for + * your virtual hardware or LPAR. + */ + +/*? + * Text: "No expanded memory available\n" + * Severity: Error + * Description: + * The XPRAM device driver has been loaded in a Linux instance that runs + * in an LPAR or virtual hardware without expanded memory. + * No XPRAM partitions are created. + * User action: + * Allocate expanded memory for your LPAR or virtual hardware or do not + * load the xpram module. You can ignore this message, if you do not want + * to create XPRAM partitions. + */ + +/*? + * Text: "Resuming the system failed: %s\n" + * Severity: Error + * Parameter: + * @1: cause of the failure + * Description: + * A system cannot be resumed if the expanded memory setup changes + * after hibernation. Possible reasons for the failure are: + * - Expanded memory was removed after hibernation. + * - Size of the expanded memory changed after hibernation. + * The system is stopped with a kernel panic. + * User action: + * Reboot Linux. + */ + +/*? Text: " number of devices (partitions): %d \n" */ +/*? Text: " size of partition %d: %u kB\n" */ +/*? Text: " size of partition %d to be set automatically\n" */ +/*? Text: " memory needed (for sized partitions): %lu kB\n" */ +/*? Text: " partitions to be sized automatically: %d\n" */ +/*? Text: " automatically determined partition size: %lu kB\n" */ +/*? Text: " %u pages expanded memory found (%lu KB).\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/zcrypt +++ linux-riscv-5.4.0/Documentation/kmsg/s390/zcrypt @@ -0,0 +1,22 @@ +/*? + * Text: "Cryptographic device %02x.%04x failed and was set offline\n" + * Severity: Error + * Parameter: + * @1: AP device ID + * @2: AP queue + * Description: + * A cryptographic device failed to process a cryptographic request. + * The cryptographic device driver could not correct the error and + * set the device offline. The application that issued the + * request received an indication that the request has failed. + * User action: + * Use the lszcrypt command to confirm that the cryptographic + * hardware is still configured to your LPAR or z/VM guest virtual + * machine. If the device is available to your Linux instance the + * command output contains a line that begins with 'card', + * where is the two-digit decimal number in the message text. + * After ensuring that the device is available, use the chzcrypt command to + * set it online again. + * If the error persists, contact your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/zdump +++ linux-riscv-5.4.0/Documentation/kmsg/s390/zdump @@ -0,0 +1,27 @@ +/*? + * Text: "The 32-bit dump tool cannot be used for a 64-bit system\n" + * Severity: Alert + * Description: + * The dump process ends without creating a system dump. + * User action: + * Use a 64-bit dump tool to obtain a system dump for 64-bit Linux instance. + */ +/*? + * Text: "The 64-bit dump tool cannot be used for a 32-bit system\n" + * Severity: Alert + * Description: + * The dump process ends without creating a system dump. + * User action: + * Use a 32-bit dump tool to obtain a system dump for 32-bit Linux instance. + */ +/*? + * Text: "The dump process started for a 64-bit operating system\n" + * Severity: Alert + * Description: + * The SCSI dump process started to create a dump for a 64-bit operating + * system instance. + * User action: + * None. + */ +/*? Text: "0x%x is an unknown architecture.\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/zfcp +++ linux-riscv-5.4.0/Documentation/kmsg/s390/zfcp @@ -0,0 +1,709 @@ +/*? + * Text: "%s is not a valid SCSI device\n" + * Severity: Error + * Parameter: + * @1: device specification + * Description: + * The specification for an initial SCSI device provided with the 'zfcp.device' + * kernel parameter or with the 'device' module parameter is syntactically + * incorrect. The specified SCSI device could not be attached to the Linux + * system. + * User action: + * Correct the value for the 'zfcp.device' or 'device' parameter and reboot + * Linux. See "Device Drivers, Features, and Commands" for information about + * the syntax. + */ + +/*? + * Text: "The zfcp device driver could not register with the common I/O layer\n" + * Severity: Error + * Description: + * The device driver initialization failed. A possible cause of this problem is + * memory constraints. + * User action: + * Free some memory and try again to load the zfcp device driver. If the zfcp + * device driver has been compiled into the kernel, reboot Linux. Consider + * assigning more memory to your LPAR or z/VM guest virtual machine. If the + * problem persists, contact your support organization. + */ + +/*? + * Text: "%s: Setting up data structures for the FCP adapter failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The zfcp device driver could not allocate data structures for an FCP adapter. + * A possible reason for this problem is memory constraints. + * User action: + * Set the FCP adapter offline or detach it from the Linux system, free some + * memory and set the FCP adapter online again or attach it again. If this + * problem persists, gather Linux debug data, collect the FCP adapter + * hardware logs, and report the problem to your support organization. + */ + +/*? + * Text: "%s: The FCP device is operational again\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * An FCP device has been unavailable because it had been detached from the + * Linux system or because the corresponding CHPID was offline. The FCP device + * is now available again and the zfcp device driver resumes all operations to + * the FCP device. + * User action: + * None. + */ + +/*? + * Text: "%s: The CHPID for the FCP device is offline\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The CHPID for an FCP device has been set offline, either logically in Linux + * or on the hardware. + * User action: + * Find out which CHPID corresponds to the FCP device, for example, with the + * lscss command. Check if the CHPID has been set logically offline in sysfs. + * Write 'on' to the CHPID's status attribute to set it online. If the CHPID is + * online in sysfs, find out if it has been varied offline through a hardware + * management interface, for example the service element (SE). + */ + +/*? + * Text: "%s: The FCP device has been detached\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * An FCP device is no longer available to Linux. + * User action: + * Ensure that the FCP adapter is operational and attached to the LPAR or z/VM + * virtual machine. + */ + +/*? + * Text: "%s: The FCP device did not respond within the specified time\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The common I/O layer waited for a response from the FCP adapter but + * no response was received within the specified time limit. This might + * indicate a hardware problem. + * User action: + * Consult your hardware administrator. If this problem persists, + * gather Linux debug data, collect the FCP adapter hardware logs, and + * report the problem to your support organization. + */ + +/*? + * Text: "%s: Registering the FCP device with the SCSI stack failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter could not be registered with the Linux SCSI + * stack. A possible reason for this problem is memory constraints. + * User action: + * Set the FCP adapter offline or detach it from the Linux system, free some + * memory and set the FCP adapter online again or attach it again. If this + * problem persists, gather Linux debug data, collect the FCP adapter + * hardware logs, and report the problem to your support organization. + */ + +/*? + * Text: "%s: ERP cannot recover an error on the FCP device\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * An error occurred on an FCP device. The error recovery procedure (ERP) + * could not resolve the error. The FCP device driver cannot use the FCP device. + * User action: + * Check for previous error messages for the same FCP device to find the + * cause of the problem. + */ + +/*? + * Text: "%s: Creating an ERP thread for the FCP device failed.\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The zfcp device driver could not set up error recovery procedure (ERP) + * processing for the FCP device. The FCP device is not available for use + * in Linux. + * User action: + * Free some memory and try again to load the zfcp device driver. If the zfcp + * device driver has been compiled into the kernel, reboot Linux. Consider + * assigning more memory to your LPAR or z/VM guest virtual machine. If the + * problem persists, contact your support organization. + */ + +/*? + * Text: "%s: ERP failed for LUN 0x%016Lx on port 0x%016Lx\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: LUN + * @3: WWPN + * Description: + * An error occurred on the SCSI device at the specified LUN. The error recovery + * procedure (ERP) could not resolve the error. The SCSI device is not + * available. + * User action: + * Verify that the LUN is correct. Check the fibre channel fabric for errors + * related to the specified WWPN and LUN, the storage server, and Linux. + */ + +/*? + * Text: "%s: ERP failed for remote port 0x%016Lx\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: WWPN + * Description: + * An error occurred on a remote port. The error recovery procedure (ERP) + * could not resolve the error. The port is not available. + * User action: + * Verify that the WWPN is correct and check the fibre channel fabric for + * errors related to the WWPN. + */ + +/*? + * Text: "%s: Registering port 0x%016Lx failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: WWPN + * Description: + * The Linux kernel could not allocate enough memory to register the + * remote port with the indicated WWPN with the SCSI stack. The remote + * port is not available. + * User action: + * Free some memory and trigger the rescan for ports. + */ + +/*? + * Text: "%s: A QDIO problem occurred\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * QDIO reported a problem to the zfcp device driver. The zfcp device driver + * tries to recover this problem. + * User action: + * Check for related error messages. If this problem occurs frequently, gather + * Linux debug data and contact your support organization. + */ + +/*? + * Text: "%s: Setting up the QDIO connection to the FCP adapter failed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The zfcp device driver failed to establish a QDIO connection with the FCP + * adapter. + * User action: + * Set the FCP adapter offline or detach it from the Linux system, free some + * memory and set the FCP adapter online again or attach it again. If this + * problem persists, gather Linux debug data, collect the FCP adapter + * hardware logs, and report the problem to your support organization. + */ + +/*? + * Text: "%s: The FCP adapter reported a problem that cannot be recovered\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter has a problem that cannot be recovered by the zfcp device + * driver. The zfcp device driver stopped using the FCP device. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to your support organization. + */ + +/*? + * Text: "%s: There is a wrap plug instead of a fibre channel cable\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter is not physically connected to the fibre channel fabric. + * User action: + * Remove the wrap plug from the FCP adapter and connect the adapter with the + * fibre channel fabric. + */ + +/*? + * Text: "%s: FCP device not operational because of an unsupported FC class\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter hardware does not support the fibre channel service class + * requested by the zfcp device driver. This problem indicates a program error + * in the zfcp device driver. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to your support organization. + */ + +/*? + * Text: "%s: 0x%Lx is an ambiguous request identifier\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: request ID + * Description: + * The FCP adapter reported that it received the same request ID twice. This is + * an error. The zfcp device driver stopped using the FCP device. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to your support organization. + */ + +/*? + * Text: "%s: QTCB version 0x%x not supported by FCP adapter (0x%x to 0x%x)\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: requested version + * @3: lowest supported version + * @4: highest supported version + * Description: + * See message text. + * The queue transfer control block (QTCB) version requested by the zfcp device + * driver is not supported by the FCP adapter hardware. + * User action: + * If the requested version is higher than the highest version supported by the + * hardware, install more recent firmware on the FCP adapter. If the requested + * version is lower then the lowest version supported by the hardware, upgrade + * to a Linux level with a more recent zfcp device driver. + */ + +/*? + * Text: "%s: The FCP adapter could not log in to the fibre channel fabric\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The fibre channel switch rejected the login request from the FCP adapter. + * User action: + * Check the fibre channel fabric or switch logs for possible errors. + */ + +/*? + * Text: "%s: The FCP device is suspended because of a firmware update\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP device is not available while a firmware update is in progress. This + * problem is temporary. The FCP device will resume operations when the + * firmware update is completed. + * User action: + * Wait 10 seconds and try the operation again. + */ + +/*? + * Text: "%s: All NPIV ports on the FCP adapter have been assigned\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The number of N_Port ID Virtualization (NPIV) ports that can be assigned + * on an FCP adapter is limited. Once assigned, NPIV ports are not released + * automatically but have to be released explicitly through the support + * element (SE). + * User action: + * Identify NPIV ports that have been assigned but are no longer in use and + * release them from the SE. + */ + +/*? + * Text: "%s: The link between the FCP adapter and the FC fabric is down\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter is not usable. Specific error information is not available. + * User action: + * Check the cabling and the fibre channel fabric configuration. If this + * problem persists, gather Linux debug data, collect the FCP adapter + * hardware logs, and report the problem to your support organization. + */ + +/*? + * Text: "%s: The QTCB type is not supported by the FCP adapter\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The queue transfer control block (QTCB) type requested by the zfcp device + * driver is not supported by the FCP adapter hardware. + * User action: + * Install the latest firmware on your FCP adapter hardware. If this does not + * resolve the problem, upgrade to a Linux level with a more recent zfcp device + * driver. If the problem persists, contact your support organization. + */ + +/*? + * Text: "%s: The error threshold for checksum statistics has been exceeded\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter has reported a large number of bit errors. This might + * indicate a problem with the physical components of the fibre channel fabric. + * Details about the errors have been written to the HBA trace for the FCP + * adapter. + * User action: + * Check for problems in the fibre channel fabric and ensure that all cables + * are properly plugged. + */ + +/*? + * Text: "%s: The local link has been restored\n" + * Severity: Informational + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * A problem with the connection between the FCP adapter and the adjacent node + * on the fibre channel fabric has been resolved. The FCP adapter is now + * available again. + * User action: + * None. + */ + +/*? + * Text: "%s: The mode table on the FCP adapter has been damaged\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * This is an FCP adapter hardware problem. + * User action: + * Report this problem with FCP hardware logs to IBM support. + */ + +/*? + * Text: "%s: The adjacent fibre channel node does not support FCP\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The fibre channel switch or storage system that is connected to the FCP + * channel does not support the fibre channel protocol (FCP). The zfcp + * device driver stopped using the FCP device. + * User action: + * Check the adjacent fibre channel node. + */ + +/*? + * Text: "%s: The FCP adapter does not recognize the command 0x%x\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: command + * Description: + * A command code that was sent from the zfcp device driver to the FCP adapter + * is not valid. The zfcp device driver stopped using the FCP device. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to your support organization. + */ + +/*? + * Text: "%s: There is no light signal from the local fibre channel cable\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * There is no signal on the fibre channel cable that connects the FCP adapter + * to the fibre channel fabric. + * User action: + * Ensure that the cable is in place and connected properly to the FCP adapter + * and to the adjacent fibre channel switch or storage system. + */ + +/*? + * Text: "%s: The WWPN assignment file on the FCP adapter has been damaged\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * This is an FCP adapter hardware problem. + * User action: + * Report this problem with FCP hardware logs to IBM support. + */ + +/*? + * Text: "%s: The FCP device detected a WWPN that is duplicate or not valid\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * This condition indicates an error in the FCP adapter hardware or in the z/VM + * hypervisor. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to IBM support. + */ + +/*? + * Text: "%s: The fibre channel fabric does not support NPIV\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP adapter requires N_Port ID Virtualization (NPIV) from the adjacent + * fibre channel node. Either the FCP adapter is connected to a fibre channel + * switch that does not support NPIV or the FCP adapter tries to use NPIV in a + * point-to-point setup. The connection is not operational. + * User action: + * Verify that NPIV is correctly used for this connection. Check the FCP adapter + * configuration and the fibre channel switch configuration. If necessary, + * update the fibre channel switch firmware. + */ + +/*? + * Text: "%s: The FCP adapter cannot support more NPIV ports\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * N_Port ID Virtualization (NPIV) ports consume physical resources on the FCP + * adapter. The FCP adapter resources are exhausted. The connection is not + * operational. + * User action: + * Analyze the number of available NPIV ports and which operating system + * instances use them. If necessary, reconfigure your setup to move some + * NPIV ports to an FCP adapter with free resources. + */ + +/*? + * Text: "%s: The adjacent switch cannot support more NPIV ports\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * N_Port ID Virtualization (NPIV) ports consume physical resources. The + * resources of the fibre channel switch that is connected to the FCP adapter + * are exhausted. The connection is not operational. + * User action: + * Analyze the number of available NPIV ports on the adjacent fibre channel + * switch and how they are used. If necessary, reconfigure your fibre channel + * fabric to accommodate the required NPIV ports. + */ + +/*? + * Text: "%s: 0x%x is not a valid transfer protocol status\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: status information + * Description: + * The transfer protocol status information reported by the FCP adapter is not + * a valid status for the zfcp device driver. The zfcp device driver stopped + * using the FCP device. + * User action: + * Gather Linux debug data, collect the FCP adapter hardware logs, and report + * this problem to your support organization. + */ + +/*? + * Text: "%s: Unknown or unsupported arbitrated loop fibre channel topology detected\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The FCP device is connected to a fibre channel arbitrated loop or the FCP adapter + * reported an unknown fibre channel topology. The zfcp device driver supports + * point-to-point connections and switched fibre channel fabrics but not arbitrated + * loop topologies. The FCP device cannot be used. + * User action: + * Check the fibre channel setup and ensure that only supported topologies are + * connected to the FCP adapter. + */ + +/*? + * Text: "%s: FCP adapter maximum QTCB size (%d bytes) is too small\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: maximum supported size + * @3: requested QTCB size + * Description: + * The queue transfer control block (QTCB) size requested by the zfcp + * device driver is not supported by the FCP adapter hardware. + * User action: + * Update the firmware on your FCP adapter hardware to the latest + * available level and update the Linux kernel to the latest supported + * level. If the problem persists, contact your support organization. + */ + +/*? + * Text: "%s: The FCP adapter only supports newer control block versions\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The protocol supported by the FCP adapter is not compatible with the zfcp + * device driver. + * User action: + * Upgrade your Linux kernel to a level that includes a zfcp device driver + * with support for the control block version required by your FCP adapter. + */ + +/*? + * Text: "%s: The FCP adapter only supports older control block versions\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * Description: + * The protocol supported by the FCP adapter is not compatible with the zfcp + * device driver. + * User action: + * Install the latest firmware on your FCP adapter. + */ + +/*? + * Text: "%s: Not enough FCP adapter resources to open remote port 0x%016Lx\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: WWPN + * Description: + * Each port that is opened consumes physical resources of the FCP adapter to + * which it is attached. These resources are exhausted and the specified port + * cannot be opened. + * User action: + * Reduce the total number of remote ports that are attached to the + * FCP adapter. + */ + +/*? + * Text: "%s: LUN 0x%Lx on port 0x%Lx is already in use by CSS%d, MIF Image ID %x\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: LUN + * @3: remote port WWPN + * @4: channel subsystem ID + * @5: MIF Image ID of the LPAR + * Description: + * The SCSI device at the indicated LUN is already in use by another system. + * Only one system at a time can use the SCSI device. + * User action: + * Ensure that the other system stops using the device before trying to use it. + */ + +/*? + * Text: "%s: No handle is available for LUN 0x%016Lx on port 0x%016Lx\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: LUN + * @3: WWPN + * Description: + * The FCP adapter can only open a limited number of SCSI devices. This limit + * has been reached and the SCSI device at the indicated LUN cannot be opened. + * User action: + * For FCP subchannels running in non-NPIV mode, check all SCSI + * devices opened through the FCP adapter and close some of them. For + * FCP subchannels running in NPIV mode, verify the SAN zoning and + * host connections on the storage systems. Ensure that the zoning and + * host connections only allow access to the required LUNs. As a + * workaround, disable the automatic LUN scanning by setting the + * zfcp.allow_lun_scan kernel parameter or the allow_lun_scan module + * parameter to 0. + */ + +/*? + * Text: "%s: Incorrect direction %d, LUN 0x%016Lx on port 0x%016Lx closed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: value in direction field + * @3: LUN + * @4: WWPN + * Description: + * The direction field in a SCSI request contains an incorrect value. The zfcp + * device driver closed down the SCSI device at the indicated LUN. + * User action: + * Gather Linux debug data and report this problem to your support organization. + */ + +/*? + * Text: "%s: Incorrect CDB length %d, LUN 0x%016Lx on port 0x%016Lx closed\n" + * Severity: Error + * Parameter: + * @1: bus ID of the zfcp device + * @2: value in length field + * @3: LUN + * @4: WWPN + * Description: + * The control-data-block (CDB) length field in a SCSI request is not valid or + * too large for the FCP adapter. The zfcp device driver closed down the SCSI + * device at the indicated LUN. + * User action: + * Gather Linux debug data and report this problem to your support organization. + */ + +/*? + * Text: "%s: Opening WKA port 0x%x failed\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: destination ID of the WKA port + * Description: + * The FCP adapter rejected a request to open the specified + * well-known address (WKA) port. No retry is possible. + * User action: + * Verify the setup and check if the maximum number of remote ports + * used through this adapter is below the maximum allowed. If the + * problem persists, gather Linux debug data, collect the FCP adapter + * hardware logs, and report the problem to your support organization. + */ + +/*? + * Text: "%s: The name server reported %d words residual data\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: number of words in residual data + * Description: + * The fibre channel name server sent too much information about remote ports. + * The zfcp device driver did not receive sufficient information to attach all + * available remote ports in the SAN. + * User action: + * Verify that you are running the latest firmware level on the FCP + * adapter. Check your SAN setup and consider reducing the number of ports + * visible to the FCP adapter by using more restrictive zoning in the SAN. + */ + +/*? + * Text: "%s: A port opened with WWPN 0x%016Lx returned data that identifies it as WWPN 0x%016Lx\n" + * Severity: Warning + * Parameter: + * @1: bus ID of the zfcp device + * @2: expected WWPN + * @3: reported WWPN + * Description: + * A remote port was opened successfully, but it reported an + * unexpected WWPN in the returned port login (PLOGI) data. This + * condition might have been caused by a change applied to the SAN + * configuration while the port was being opened. + * User action: + * If this condition is only temporary and access to the remote port + * is possible, no action is required. If the condition persists, + * identify the storage system with the specified WWPN and contact the + * support organization of the storage system. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/s390/zpci +++ linux-riscv-5.4.0/Documentation/kmsg/s390/zpci @@ -0,0 +1,42 @@ +/*? + * Text: "%s: Event 0x%x reconfigured PCI function 0x%x\n" + * Severity: Informational + * Parameter: + * @1: device name of the function + * @2: PCI event code + * @3: function ID + * Description: + * The availability of a PCI function has changed. + * Possible reasons for the change include PCI configuration actions on the + * Hardware Management Console or hypervisor. + * For shared PCI functions, the function might also have been reserved or + * released by another system. + * If the device name of a function is shown as 'n/a', the device registration + * with the PCI device driver has not completed. + * The function ID identifies the function to the I/O configuration (IOCDS). + * The PCI event code can be useful diagnostic information for your support + * organization. + * User action: + * None. + */ + +/*? + * Text: "%s: Event 0x%x reports an error for PCI function 0x%x\n" + * Severity: Error + * Parameter: + * @1: device name of the function + * @2: PCI event code + * @3: function ID + * Description: + * A PCI function entered an error state from which it cannot recover + * automatically. + * User action: + * Trigger a recovery action by writing '1' to the 'recover' sysfs attribute + * of the PCI function. + * In sysfs, PCI functions are represented as /sys/bus/pci/devices/, + * where is the device name of the function. + * If the device name of a function is shown as 'n/a', the device + * registration with the PCI device driver has not completed. + * If the problem persists, contact your support organization. + */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ --- linux-riscv-5.4.0.orig/Documentation/kmsg/sbp_target +++ linux-riscv-5.4.0/Documentation/kmsg/sbp_target @@ -0,0 +1,49 @@ +/*? Text: "ABORT TASK SET not implemented\n" */ +/*? Text: "ABORT TASK not implemented\n" */ +/*? Text: "Cannot change the directory_id on an active target.\n" */ +/*? Text: "Cannot enable a target with no LUNs!\n" */ +/*? Text: "Could not update Config ROM\n" */ +/*? Text: "Ignoring ORB_POINTER write while active.\n" */ +/*? Text: "LOGICAL UNIT RESET not implemented\n" */ +/*? Text: "Node ACL not found for %s\n" */ +/*? Text: "Only one TPG per Unit is possible.\n" */ +/*? Text: "QUERY LOGINS not implemented\n" */ +/*? Text: "Reconnect timer expired for node: %016llx\n" */ +/*? Text: "SET PASSWORD not implemented\n" */ +/*? Text: "TARGET RESET not implemented\n" */ +/*? Text: "Unable to allocate struct sbp_nacl\n" */ +/*? Text: "Unable to allocate struct sbp_tpg\n" */ +/*? Text: "Unable to allocate struct sbp_tport\n" */ +/*? Text: "Waiting for reconnect from node: %016llx\n" */ +/*? Text: "cannot find login: %d\n" */ +/*? Text: "failed to allocate login descriptor\n" */ +/*? Text: "failed to allocate login response block\n" */ +/*? Text: "failed to allocate session descriptor\n" */ +/*? Text: "failed to init se_session\n" */ +/*? Text: "failed to map command block handler: %d\n" */ +/*? Text: "failed to read peer GUID: %d\n" */ +/*? Text: "ignoring management request while busy\n" */ +/*? Text: "ignoring request from foreign node (%x != %x)\n" */ +/*? Text: "ignoring request with wrong generation\n" */ +/*? Text: "initiator already logged-in\n" */ +/*? Text: "login to unknown LUN: %d\n" */ +/*? Text: "logout from different node ID\n" */ +/*? Text: "max number of logins reached\n" */ +/*? Text: "mgt_agent LOGIN to LUN %d from %016llx\n" */ +/*? Text: "mgt_agent LOGOUT from LUN %d session %d\n" */ +/*? Text: "mgt_agent RECONNECT from %016llx\n" */ +/*? Text: "mgt_agent RECONNECT login GUID doesn't match\n" */ +/*? Text: "mgt_agent RECONNECT unknown login ID\n" */ +/*? Text: "mgt_orb bad request\n" */ +/*? Text: "netif_stop_queue() cannot be called before register_netdev()\n" */ +/*? Text: "refusing exclusive login with other active logins\n" */ +/*? Text: "refusing login while another exclusive login present\n" */ +/*? Text: "sbp_run_transaction: page size ignored\n" */ +/*? Text: "sbp_send_sense: unknown sense format: 0x%x\n" */ +/*? Text: "target_fabric_configfs_init() failed\n" */ +/*? Text: "target_fabric_configfs_register() failed for SBP\n" */ +/*? Text: "unknown management function 0x%x\n" */ +/*? Text: "unlink LUN: failed to update unit directory\n" */ +/*? Text: "flen=%u proglen=%u pass=%u image=%pK from=%s pid=%d\n" */ +/*? Text: "%s selects TX queue %d, but real number of TX queues is %d\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ \ No newline at end of file --- linux-riscv-5.4.0.orig/Documentation/kmsg/zram +++ linux-riscv-5.4.0/Documentation/kmsg/zram @@ -0,0 +1,34 @@ +/*? Text: "Error allocating compressor buffer space\n" */ +/*? Text: "Error allocating memory for compressed page: %u, size=%zu\n" */ +/*? Text: "Error creating memory pool\n" */ +/*? Text: "num_devices not specified. Using default: 1\n" */ +/*? Text: "Error allocating compressor working memory!\n" */ +/*? Text: "Error allocating zram address table\n" */ +/*? Text: "Unable to get major number\n" */ +/*? Text: "Compression failed! err=%d\n" */ +/*? Text: "Decompression failed! err=%d, page=%u\n" */ +/*? Text: "There is little point creating a zram of greater than twice the size of memory since we expect a 2:1 compression ratio. Note that zram uses about 0.1%% of the size of the disk when not in use so a huge zram is wasteful.\n\tMemory Size: %zu kB\n\tSize you selected: %llu kB\nContinuing anyway ...\n" */ +/*? Text: "disk size not provided. You can use disksize_kb module param to specify size.\nUsing default: (%u%% of RAM).\n" */ +/*? Text: "Error creating sysfs group" */ +/*? Text: "Error allocating memory for incompressible page: %u\n" */ +/*? Text: "Creating %u devices ...\n" */ +/*? Text: "Initialization failed: err=%d\n" */ +/*? Text: "Error allocating disk queue for device %d\n" */ +/*? Text: "Error allocating disk structure for device %d\n" */ +/*? Text: "Invalid value for num_devices: %u\n" */ +/*? Text: "Error allocating temp memory!\n" */ +/*? Text: "Unable to allocate temp memory\n" */ +/*? Text: "Created %u device(s) ...\n" */ +/*? Text: "There is little point creating a zram of greater than twice the size of memory since we expect a 2:1 compression ratio. Note that zram uses about 0.1%% of the size of the disk when not in use so a huge zram is wasteful.\n\tMemory Size: %lu kB\n\tSize you selected: %llu kB\nContinuing anyway ...\n" */ +/*? Text: "Cannot change disksize for initialized device\n" */ +/*? Text: "Can't change algorithm for initialized device\n" */ +/*? Text: "Cannot initialise %s compressing backend\n" */ +/*? Text: "Cannot change max compression streams\n" */ +/*? Text: "Destroyed %u device(s)\n" */ +/*? Text: "Created %u device(s)\n" */ +/*? Text: "Unable to register zram-control class\n" */ +/*? Text: "Removed device: %s\n" */ +/*? Text: "Added device: %s\n" */ +/*? Text: "Error creating sysfs group for device %d\n" */ +/*? Text: "Error allocating memory for compressed page: %u, size=%u\n" */ +/*? Text: "%s: %d output lines suppressed due to ratelimiting\n" */ \ No newline at end of file --- linux-riscv-5.4.0.orig/Documentation/networking/j1939.rst +++ linux-riscv-5.4.0/Documentation/networking/j1939.rst @@ -339,7 +339,7 @@ .pgn = J1939_PGN_ADDRESS_CLAIMED, .pgn_mask = J1939_PGN_PDU1_MAX, }, { - .pgn = J1939_PGN_ADDRESS_REQUEST, + .pgn = J1939_PGN_REQUEST, .pgn_mask = J1939_PGN_PDU1_MAX, }, { .pgn = J1939_PGN_ADDRESS_COMMANDED, --- linux-riscv-5.4.0.orig/Documentation/networking/nf_flowtable.txt +++ linux-riscv-5.4.0/Documentation/networking/nf_flowtable.txt @@ -76,7 +76,7 @@ table inet x { flowtable f { - hook ingress priority 0 devices = { eth0, eth1 }; + hook ingress priority 0; devices = { eth0, eth1 }; } chain y { type filter hook forward priority 0; policy accept; --- linux-riscv-5.4.0.orig/Documentation/scsi/smartpqi.txt +++ linux-riscv-5.4.0/Documentation/scsi/smartpqi.txt @@ -29,7 +29,7 @@ smartpqi host attributes: ------------------------- /sys/class/scsi_host/host*/rescan - /sys/class/scsi_host/host*/version + /sys/class/scsi_host/host*/driver_version The host rescan attribute is a write only attribute. Writing to this attribute will trigger the driver to scan for new, changed, or removed --- linux-riscv-5.4.0.orig/Documentation/virt/kvm/api.txt +++ linux-riscv-5.4.0/Documentation/virt/kvm/api.txt @@ -1897,9 +1897,11 @@ Parameters: struct kvm_one_reg (in) Returns: 0 on success, negative value on failure Errors: -  ENOENT:   no such register -  EINVAL:   invalid register ID, or no such register -  EPERM:    (arm64) register access not allowed before vcpu finalization +  ENOENT   no such register +  EINVAL   invalid register ID, or no such register or used with VMs in + protected virtualization mode on s390 +  EPERM    (arm64) register access not allowed before vcpu finalization + (These error codes are indicative only: do not rely on a specific error code being returned in a specific situation.) @@ -2290,9 +2292,11 @@ Parameters: struct kvm_one_reg (in and out) Returns: 0 on success, negative value on failure Errors include: -  ENOENT:   no such register -  EINVAL:   invalid register ID, or no such register -  EPERM:    (arm64) register access not allowed before vcpu finalization +  ENOENT   no such register +  EINVAL   invalid register ID, or no such register or used with VMs in + protected virtualization mode on s390 +  EPERM    (arm64) register access not allowed before vcpu finalization + (These error codes are indicative only: do not rely on a specific error code being returned in a specific situation.) @@ -4127,6 +4131,90 @@ #define KVM_PMU_EVENT_DENY 1 +4.122 KVM_S390_NORMAL_RESET + +Capability: KVM_CAP_S390_VCPU_RESETS +Architectures: s390 +Type: vcpu ioctl +Parameters: none +Returns: 0 + +This ioctl resets VCPU registers and control structures according to +the cpu reset definition in the POP (Principles Of Operation). + +4.123 KVM_S390_INITIAL_RESET + +Capability: none +Architectures: s390 +Type: vcpu ioctl +Parameters: none +Returns: 0 + +This ioctl resets VCPU registers and control structures according to +the initial cpu reset definition in the POP. However, the cpu is not +put into ESA mode. This reset is a superset of the normal reset. + +4.124 KVM_S390_CLEAR_RESET + +Capability: KVM_CAP_S390_VCPU_RESETS +Architectures: s390 +Type: vcpu ioctl +Parameters: none +Returns: 0 + +This ioctl resets VCPU registers and control structures according to +the clear cpu reset definition in the POP. However, the cpu is not put +into ESA mode. This reset is a superset of the initial reset. + + +4.125 KVM_S390_PV_COMMAND +------------------------- + +:Capability: KVM_CAP_S390_PROTECTED +:Architectures: s390 +:Type: vm ioctl +:Parameters: struct kvm_pv_cmd +:Returns: 0 on success, < 0 on error + +:: + + struct kvm_pv_cmd { + __u32 cmd; /* Command to be executed */ + __u16 rc; /* Ultravisor return code */ + __u16 rrc; /* Ultravisor return reason code */ + __u64 data; /* Data or address */ + __u32 flags; /* flags for future extensions. Must be 0 for now */ + __u32 reserved[3]; + }; + +cmd values: + +KVM_PV_ENABLE + Allocate memory and register the VM with the Ultravisor, thereby + donating memory to the Ultravisor that will become inaccessible to + KVM. All existing CPUs are converted to protected ones. After this + command has succeeded, any CPU added via hotplug will become + protected during its creation as well. + +KVM_PV_DISABLE + + Deregister the VM from the Ultravisor and reclaim the memory that + had been donated to the Ultravisor, making it usable by the kernel + again. All registered VCPUs are converted back to non-protected + ones. + +KVM_PV_VM_SET_SEC_PARMS + Pass the image header from VM memory to the Ultravisor in + preparation of image unpacking and verification. + +KVM_PV_VM_UNPACK + Unpack (protect and decrypt) a page of the encrypted boot image. + +KVM_PV_VM_VERIFY + Verify the integrity of the unpacked image. Only if this succeeds, + KVM is allowed to start protected VCPUs. + + 5. The kvm_run structure ------------------------ @@ -5322,3 +5410,21 @@ flush hypercalls by Hyper-V) so userspace should disable KVM identification in CPUID and only exposes Hyper-V identification. In this case, guest thinks it's running on Hyper-V and only use Hyper-V hypercalls. + +8.22 KVM_CAP_S390_VCPU_RESETS + +Architectures: s390 + +This capability indicates that the KVM_S390_NORMAL_RESET and +KVM_S390_CLEAR_RESET ioctls are available. + +8.23 KVM_CAP_S390_PROTECTED + +Architecture: s390 + + +This capability indicates that the Ultravisor has been initialized and +KVM can therefore start protected VMs. +This capability governs the KVM_S390_PV_COMMAND ioctl and the +KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected +guests when the state change is invalid. --- linux-riscv-5.4.0.orig/Documentation/virt/kvm/devices/s390_flic.txt +++ linux-riscv-5.4.0/Documentation/virt/kvm/devices/s390_flic.txt @@ -100,15 +100,9 @@ mask or unmask the adapter, as specified in mask KVM_S390_IO_ADAPTER_MAP - perform a gmap translation for the guest address provided in addr, - pin a userspace page for the translated address and add it to the - list of mappings - Note: A new mapping will be created unconditionally; therefore, - the calling code should avoid making duplicate mappings. - + This is now a no-op. The mapping is purely done by the irq route. KVM_S390_IO_ADAPTER_UNMAP - release a userspace page for the translated address specified in addr - from the list of mappings + This is now a no-op. The mapping is purely done by the irq route. KVM_DEV_FLIC_AISM modify the adapter-interruption-suppression mode for a given isc if the --- linux-riscv-5.4.0.orig/Documentation/virt/kvm/index.rst +++ linux-riscv-5.4.0/Documentation/virt/kvm/index.rst @@ -9,4 +9,6 @@ amd-memory-encryption cpuid + s390-pv + s390-pv-boot vcpu-requests --- linux-riscv-5.4.0.orig/Documentation/virt/kvm/s390-pv-boot.rst +++ linux-riscv-5.4.0/Documentation/virt/kvm/s390-pv-boot.rst @@ -0,0 +1,84 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================================== +s390 (IBM Z) Boot/IPL of Protected VMs +====================================== + +Summary +------- +The memory of Protected Virtual Machines (PVMs) is not accessible to +I/O or the hypervisor. In those cases where the hypervisor needs to +access the memory of a PVM, that memory must be made accessible. +Memory made accessible to the hypervisor will be encrypted. See +:doc:`s390-pv` for details." + +On IPL (boot) a small plaintext bootloader is started, which provides +information about the encrypted components and necessary metadata to +KVM to decrypt the protected virtual machine. + +Based on this data, KVM will make the protected virtual machine known +to the Ultravisor (UV) and instruct it to secure the memory of the +PVM, decrypt the components and verify the data and address list +hashes, to ensure integrity. Afterwards KVM can run the PVM via the +SIE instruction which the UV will intercept and execute on KVM's +behalf. + +As the guest image is just like an opaque kernel image that does the +switch into PV mode itself, the user can load encrypted guest +executables and data via every available method (network, dasd, scsi, +direct kernel, ...) without the need to change the boot process. + + +Diag308 +------- +This diagnose instruction is the basic mechanism to handle IPL and +related operations for virtual machines. The VM can set and retrieve +IPL information blocks, that specify the IPL method/devices and +request VM memory and subsystem resets, as well as IPLs. + +For PVMs this concept has been extended with new subcodes: + +Subcode 8: Set an IPL Information Block of type 5 (information block +for PVMs) +Subcode 9: Store the saved block in guest memory +Subcode 10: Move into Protected Virtualization mode + +The new PV load-device-specific-parameters field specifies all data +that is necessary to move into PV mode. + +* PV Header origin +* PV Header length +* List of Components composed of + * AES-XTS Tweak prefix + * Origin + * Size + +The PV header contains the keys and hashes, which the UV will use to +decrypt and verify the PV, as well as control flags and a start PSW. + +The components are for instance an encrypted kernel, kernel parameters +and initrd. The components are decrypted by the UV. + +After the initial import of the encrypted data, all defined pages will +contain the guest content. All non-specified pages will start out as +zero pages on first access. + + +When running in protected virtualization mode, some subcodes will result in +exceptions or return error codes. + +Subcodes 4 and 7, which specify operations that do not clear the guest +memory, will result in specification exceptions. This is because the +UV will clear all memory when a secure VM is removed, and therefore +non-clearing IPL subcodes are not allowed. + +Subcodes 8, 9, 10 will result in specification exceptions. +Re-IPL into a protected mode is only possible via a detour into non +protected mode. + +Keys +---- +Every CEC will have a unique public key to enable tooling to build +encrypted images. +See `s390-tools `_ +for the tooling. --- linux-riscv-5.4.0.orig/Documentation/virt/kvm/s390-pv.rst +++ linux-riscv-5.4.0/Documentation/virt/kvm/s390-pv.rst @@ -0,0 +1,116 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +s390 (IBM Z) Ultravisor and Protected VMs +========================================= + +Summary +------- +Protected virtual machines (PVM) are KVM VMs that do not allow KVM to +access VM state like guest memory or guest registers. Instead, the +PVMs are mostly managed by a new entity called Ultravisor (UV). The UV +provides an API that can be used by PVMs and KVM to request management +actions. + +Each guest starts in non-protected mode and then may make a request to +transition into protected mode. On transition, KVM registers the guest +and its VCPUs with the Ultravisor and prepares everything for running +it. + +The Ultravisor will secure and decrypt the guest's boot memory +(i.e. kernel/initrd). It will safeguard state changes like VCPU +starts/stops and injected interrupts while the guest is running. + +As access to the guest's state, such as the SIE state description, is +normally needed to be able to run a VM, some changes have been made in +the behavior of the SIE instruction. A new format 4 state description +has been introduced, where some fields have different meanings for a +PVM. SIE exits are minimized as much as possible to improve speed and +reduce exposed guest state. + + +Interrupt injection +------------------- +Interrupt injection is safeguarded by the Ultravisor. As KVM doesn't +have access to the VCPUs' lowcores, injection is handled via the +format 4 state description. + +Machine check, external, IO and restart interruptions each can be +injected on SIE entry via a bit in the interrupt injection control +field (offset 0x54). If the guest cpu is not enabled for the interrupt +at the time of injection, a validity interception is recognized. The +format 4 state description contains fields in the interception data +block where data associated with the interrupt can be transported. + +Program and Service Call exceptions have another layer of +safeguarding; they can only be injected for instructions that have +been intercepted into KVM. The exceptions need to be a valid outcome +of an instruction emulation by KVM, e.g. we can never inject a +addressing exception as they are reported by SIE since KVM has no +access to the guest memory. + + +Mask notification interceptions +------------------------------- +KVM cannot intercept lctl(g) and lpsw(e) anymore in order to be +notified when a PVM enables a certain class of interrupt. As a +replacement, two new interception codes have been introduced: One +indicating that the contents of CRs 0, 6, or 14 have been changed, +indicating different interruption subclasses; and one indicating that +PSW bit 13 has been changed, indicating that a machine check +intervention was requested and those are now enabled. + +Instruction emulation +--------------------- +With the format 4 state description for PVMs, the SIE instruction already +interprets more instructions than it does with format 2. It is not able +to interpret every instruction, but needs to hand some tasks to KVM; +therefore, the SIE and the ultravisor safeguard emulation inputs and outputs. + +The control structures associated with SIE provide the Secure +Instruction Data Area (SIDA), the Interception Parameters (IP) and the +Secure Interception General Register Save Area. Guest GRs and most of +the instruction data, such as I/O data structures, are filtered. +Instruction data is copied to and from the SIDA when needed. Guest +GRs are put into / retrieved from the Secure Interception General +Register Save Area. + +Only GR values needed to emulate an instruction will be copied into this +save area and the real register numbers will be hidden. + +The Interception Parameters state description field still contains the +the bytes of the instruction text, but with pre-set register values +instead of the actual ones. I.e. each instruction always uses the same +instruction text, in order not to leak guest instruction text. +This also implies that the register content that a guest had in r +may be in r from the hypervisor's point of view. + +The Secure Instruction Data Area contains instruction storage +data. Instruction data, i.e. data being referenced by an instruction +like the SCCB for sclp, is moved via the SIDA. When an instruction is +intercepted, the SIE will only allow data and program interrupts for +this instruction to be moved to the guest via the two data areas +discussed before. Other data is either ignored or results in validity +interceptions. + + +Instruction emulation interceptions +----------------------------------- +There are two types of SIE secure instruction intercepts: the normal +and the notification type. Normal secure instruction intercepts will +make the guest pending for instruction completion of the intercepted +instruction type, i.e. on SIE entry it is attempted to complete +emulation of the instruction with the data provided by KVM. That might +be a program exception or instruction completion. + +The notification type intercepts inform KVM about guest environment +changes due to guest instruction interpretation. Such an interception +is recognized, for example, for the store prefix instruction to provide +the new lowcore location. On SIE reentry, any KVM data in the data areas +is ignored and execution continues as if the guest instruction had +completed. For that reason KVM is not allowed to inject a program +interrupt. + +Links +----- +`KVM Forum 2019 presentation `_ --- linux-riscv-5.4.0.orig/Kconfig +++ linux-riscv-5.4.0/Kconfig @@ -21,6 +21,8 @@ source "drivers/Kconfig" +source "ubuntu/Kconfig" + source "fs/Kconfig" source "security/Kconfig" --- linux-riscv-5.4.0.orig/MAINTAINERS +++ linux-riscv-5.4.0/MAINTAINERS @@ -2832,6 +2832,19 @@ F: include/uapi/linux/audit.h F: kernel/audit* +AUFS (advanced multi layered unification filesystem) FILESYSTEM +M: "J. R. Okajima" +L: aufs-users@lists.sourceforge.net (members only) +L: linux-unionfs@vger.kernel.org +W: http://aufs.sourceforge.net +T: git://github.com/sfjro/aufs4-linux.git +S: Supported +F: Documentation/filesystems/aufs/ +F: Documentation/ABI/testing/debugfs-aufs +F: Documentation/ABI/testing/sysfs-aufs +F: fs/aufs/ +F: include/uapi/linux/aufs_type.h + AUXILIARY DISPLAY DRIVERS M: Miguel Ojeda Sandonis S: Maintained @@ -6973,6 +6986,7 @@ S: Maintained F: Documentation/firmware-guide/acpi/gpio-properties.rst F: drivers/gpio/gpiolib-acpi.c +F: drivers/gpio/gpiolib-acpi.h GPIO IR Transmitter M: Sean Young @@ -7364,6 +7378,25 @@ F: net/802/hippi.c F: drivers/net/hippi/ +HISILICON SECURITY ENGINE V2 DRIVER (SEC2) +M: Zaibo Xu +L: linux-crypto@vger.kernel.org +S: Maintained +F: drivers/crypto/hisilicon/sec2/sec_crypto.c +F: drivers/crypto/hisilicon/sec2/sec_main.c +F: drivers/crypto/hisilicon/sec2/sec_crypto.h +F: drivers/crypto/hisilicon/sec2/sec.h +F: Documentation/ABI/testing/debugfs-hisi-sec + +HISILICON HIGH PERFORMANCE RSA ENGINE DRIVER (HPRE) +M: Zaibo Xu +L: linux-crypto@vger.kernel.org +S: Maintained +F: drivers/crypto/hisilicon/hpre/hpre_crypto.c +F: drivers/crypto/hisilicon/hpre/hpre_main.c +F: drivers/crypto/hisilicon/hpre/hpre.h +F: Documentation/ABI/testing/debugfs-hisi-hpre + HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3) M: Yisen Zhuang M: Salil Mehta @@ -7372,6 +7405,11 @@ S: Maintained F: drivers/net/ethernet/hisilicon/hns3/ +HISILICON TRUE RANDOM NUMBER GENERATOR V2 SUPPORT +M: Zaibo Xu +S: Maintained +F: drivers/char/hw_random/hisi-trng-v2.c + HISILICON LPC BUS DRIVER M: john.garry@huawei.com W: http://www.hisilicon.com @@ -7410,6 +7448,12 @@ F: drivers/scsi/hisi_sas/ F: Documentation/devicetree/bindings/scsi/hisilicon-sas.txt +HISILICON V3XX SPI NOR FLASH Controller Driver +M: John Garry +W: http://www.hisilicon.com +S: Maintained +F: drivers/spi/spi-hisi-sfc-v3xx.c + HISILICON QM AND ZIP Controller DRIVER M: Zhou Wang L: linux-crypto@vger.kernel.org @@ -7417,7 +7461,6 @@ F: drivers/crypto/hisilicon/qm.c F: drivers/crypto/hisilicon/qm.h F: drivers/crypto/hisilicon/sgl.c -F: drivers/crypto/hisilicon/sgl.h F: drivers/crypto/hisilicon/zip/ F: Documentation/ABI/testing/debugfs-hisi-zip @@ -8200,7 +8243,7 @@ M: Rodrigo Vivi L: intel-gfx@lists.freedesktop.org W: https://01.org/linuxgraphics/ -B: https://01.org/linuxgraphics/documentation/how-report-bugs +B: https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs C: irc://chat.freenode.net/intel-gfx Q: http://patchwork.freedesktop.org/project/intel-gfx/ T: git git://anongit.freedesktop.org/drm-intel @@ -8703,8 +8746,10 @@ L: netdev@vger.kernel.org W: http://www.isdn4linux.de S: Maintained -F: drivers/isdn/mISDN -F: drivers/isdn/hardware +F: drivers/isdn/mISDN/ +F: drivers/isdn/hardware/ +F: drivers/isdn/Kconfig +F: drivers/isdn/Makefile ISDN/CAPI SUBSYSTEM M: Karsten Keil @@ -8991,6 +9036,7 @@ W: http://www.ibm.com/developerworks/linux/linux390/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git S: Supported +F: Documentation/virt/kvm/s390* F: arch/s390/include/uapi/asm/kvm* F: arch/s390/include/asm/gmap.h F: arch/s390/include/asm/kvm* --- linux-riscv-5.4.0.orig/Makefile +++ linux-riscv-5.4.0/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 5 PATCHLEVEL = 4 -SUBLEVEL = 0 +SUBLEVEL = 30 EXTRAVERSION = NAME = Kleptomaniac Octopus @@ -206,6 +206,20 @@ KBUILD_CHECKSRC = 0 endif +# Call message checker as part of the C compilation +# +# Use 'make D=1' to enable checking +# Use 'make D=2' to create the message catalog + +ifdef D + ifeq ("$(origin D)", "command line") + KBUILD_KMSG_CHECK = $(D) + endif +endif +ifndef KBUILD_KMSG_CHECK + KBUILD_KMSG_CHECK = 0 +endif + # Use make M=dir or set the environment variable KBUILD_EXTMOD to specify the # directory of external module to build. Setting M= takes precedence. ifeq ("$(origin M)", "command line") @@ -429,6 +443,7 @@ CHECKFLAGS := -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ \ -Wbitwise -Wno-return-void -Wno-unknown-attribute $(CF) +KMSG_CHECK = $(srctree)/scripts/kmsg-doc NOSTDINC_FLAGS := CFLAGS_MODULE = AFLAGS_MODULE = @@ -437,6 +452,13 @@ AFLAGS_KERNEL = LDFLAGS_vmlinux = +# Prefer linux-backports-modules +ifneq ($(KBUILD_SRC),) +ifneq ($(shell if test -e $(KBUILD_OUTPUT)/ubuntu-build; then echo yes; fi),yes) +UBUNTUINCLUDE := -I/usr/src/linux-headers-lbm-$(KERNELRELEASE) +endif +endif + # Use USERINCLUDE when you must reference the UAPI directories only. USERINCLUDE := \ -I$(srctree)/arch/$(SRCARCH)/include/uapi \ @@ -448,12 +470,16 @@ # Use LINUXINCLUDE when you must reference the include/ directory. # Needed to be compatible with the O= option LINUXINCLUDE := \ + $(UBUNTUINCLUDE) \ -I$(srctree)/arch/$(SRCARCH)/include \ -I$(objtree)/arch/$(SRCARCH)/include/generated \ $(if $(building_out_of_srctree),-I$(srctree)/include) \ -I$(objtree)/include \ $(USERINCLUDE) +# UBUNTU: Include our third party driver stuff too +LINUXINCLUDE += -Iubuntu/include $(if $(KBUILD_SRC),-I$(srctree)/ubuntu/include) + KBUILD_AFLAGS := -D__ASSEMBLY__ -fno-PIE KBUILD_CFLAGS := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \ -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \ @@ -480,6 +506,7 @@ export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE +export KBUILD_KMSG_CHECK KMSG_CHECK export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL @@ -616,9 +643,8 @@ ifeq ($(KBUILD_EXTMOD),) # Objects we will link into vmlinux / subdirs we need to visit init-y := init/ -drivers-y := drivers/ sound/ +drivers-y := drivers/ sound/ ubuntu/ drivers-$(CONFIG_SAMPLES) += samples/ -drivers-$(CONFIG_KERNEL_HEADER_TEST) += include/ net-y := net/ libs-y := lib/ core-y := usr/ @@ -1195,20 +1221,17 @@ $(error Headers not exportable for the $(SRCARCH) architecture)) $(Q)$(MAKE) $(hdr-inst)=include/uapi $(Q)$(MAKE) $(hdr-inst)=arch/$(SRCARCH)/include/uapi + $(Q)$(MAKE) $(hdr-inst)=ubuntu/include dst=include oldheaders= +# Deprecated. It is no-op now. PHONY += headers_check -headers_check: headers - $(Q)$(MAKE) $(hdr-inst)=include/uapi HDRCHECK=1 - $(Q)$(MAKE) $(hdr-inst)=arch/$(SRCARCH)/include/uapi HDRCHECK=1 +headers_check: + @: ifdef CONFIG_HEADERS_INSTALL prepare: headers endif -ifdef CONFIG_HEADERS_CHECK -all: headers_check -endif - PHONY += scripts_unifdef scripts_unifdef: scripts_basic $(Q)$(MAKE) $(build)=scripts scripts/unifdef @@ -1242,7 +1265,7 @@ %.dtb: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@ -PHONY += dtbs dtbs_install dt_binding_check +PHONY += dtbs dtbs_install dtbs_check dtbs dtbs_check: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) @@ -1262,6 +1285,7 @@ scripts_dtc: scripts_basic $(Q)$(MAKE) $(build)=scripts/dtc +PHONY += dt_binding_check dt_binding_check: scripts_dtc $(Q)$(MAKE) $(build)=Documentation/devicetree/bindings @@ -1476,7 +1500,6 @@ @echo ' versioncheck - Sanity check on version.h usage' @echo ' includecheck - Check for duplicate included header files' @echo ' export_report - List the usages of all exported symbols' - @echo ' headers_check - Sanity check on exported headers' @echo ' headerdep - Detect inclusion cycles in headers' @echo ' coccicheck - Check with Coccinelle' @echo '' @@ -1641,6 +1664,50 @@ PHONY += prepare endif # KBUILD_EXTMOD +# Single targets +# --------------------------------------------------------------------------- +# To build individual files in subdirectories, you can do like this: +# +# make foo/bar/baz.s +# +# The supported suffixes for single-target are listed in 'single-targets' +# +# To build only under specific subdirectories, you can do like this: +# +# make foo/bar/baz/ + +ifdef single-build + +# .ko is special because modpost is needed +single-ko := $(sort $(filter %.ko, $(MAKECMDGOALS))) +single-no-ko := $(sort $(patsubst %.ko,%.mod, $(MAKECMDGOALS))) + +$(single-ko): single_modpost + @: +$(single-no-ko): descend + @: + +ifeq ($(KBUILD_EXTMOD),) +# For the single build of in-tree modules, use a temporary file to avoid +# the situation of modules_install installing an invalid modules.order. +MODORDER := .modules.tmp +endif + +PHONY += single_modpost +single_modpost: $(single-no-ko) + $(Q){ $(foreach m, $(single-ko), echo $(extmod-prefix)$m;) } > $(MODORDER) + $(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost + +KBUILD_MODULES := 1 + +export KBUILD_SINGLE_TARGETS := $(addprefix $(extmod-prefix), $(single-no-ko)) + +# trim unrelated directories +build-dirs := $(foreach d, $(build-dirs), \ + $(if $(filter $(d)/%, $(KBUILD_SINGLE_TARGETS)), $(d))) + +endif + # Handle descending into subdirectories listed in $(build-dirs) # Preset locale variables to speed up the build process. Limit locale # tweaks to this spot to avoid wrong language settings when running @@ -1649,7 +1716,9 @@ PHONY += descend $(build-dirs) descend: $(build-dirs) $(build-dirs): prepare - $(Q)$(MAKE) $(build)=$@ single-build=$(single-build) need-builtin=1 need-modorder=1 + $(Q)$(MAKE) $(build)=$@ \ + single-build=$(if $(filter-out $@/, $(single-no-ko)),1) \ + need-builtin=1 need-modorder=1 clean-dirs := $(addprefix _clean_, $(clean-dirs)) PHONY += $(clean-dirs) clean @@ -1753,50 +1822,6 @@ $(Q)mkdir -p $(objtree)/tools $(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/ $* -# Single targets -# --------------------------------------------------------------------------- -# To build individual files in subdirectories, you can do like this: -# -# make foo/bar/baz.s -# -# The supported suffixes for single-target are listed in 'single-targets' -# -# To build only under specific subdirectories, you can do like this: -# -# make foo/bar/baz/ - -ifdef single-build - -single-all := $(filter $(single-targets), $(MAKECMDGOALS)) - -# .ko is special because modpost is needed -single-ko := $(sort $(filter %.ko, $(single-all))) -single-no-ko := $(sort $(patsubst %.ko,%.mod, $(single-all))) - -$(single-ko): single_modpost - @: -$(single-no-ko): descend - @: - -ifeq ($(KBUILD_EXTMOD),) -# For the single build of in-tree modules, use a temporary file to avoid -# the situation of modules_install installing an invalid modules.order. -MODORDER := .modules.tmp -endif - -PHONY += single_modpost -single_modpost: $(single-no-ko) - $(Q){ $(foreach m, $(single-ko), echo $(extmod-prefix)$m;) } > $(MODORDER) - $(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost - -KBUILD_MODULES := 1 - -export KBUILD_SINGLE_TARGETS := $(addprefix $(extmod-prefix), $(single-no-ko)) - -single-build = $(if $(filter-out $@/, $(single-no-ko)),1) - -endif - # FIXME Should go into a make.lib or something # =========================================================================== --- linux-riscv-5.4.0.orig/arch/Kconfig +++ linux-riscv-5.4.0/arch/Kconfig @@ -396,10 +396,10 @@ config HAVE_RCU_TABLE_FREE bool -config HAVE_RCU_TABLE_NO_INVALIDATE +config HAVE_MMU_GATHER_PAGE_SIZE bool -config HAVE_MMU_GATHER_PAGE_SIZE +config MMU_GATHER_NO_RANGE bool config HAVE_MMU_GATHER_NO_GATHER --- linux-riscv-5.4.0.orig/arch/arc/boot/dts/axs10x_mb.dtsi +++ linux-riscv-5.4.0/arch/arc/boot/dts/axs10x_mb.dtsi @@ -77,6 +77,7 @@ interrupt-names = "macirq"; phy-mode = "rgmii"; snps,pbl = < 32 >; + snps,multicast-filter-bins = <256>; clocks = <&apbclk>; clock-names = "stmmaceth"; max-speed = <100>; --- linux-riscv-5.4.0.orig/arch/arc/include/asm/linkage.h +++ linux-riscv-5.4.0/arch/arc/include/asm/linkage.h @@ -29,6 +29,8 @@ .endm #define ASM_NL ` /* use '`' to mark new line in macro */ +#define __ALIGN .align 4 +#define __ALIGN_STR __stringify(__ALIGN) /* annotation for data we want in DCCM - if enabled in .config */ .macro ARCFP_DATA nm --- linux-riscv-5.4.0.orig/arch/arc/plat-eznps/Kconfig +++ linux-riscv-5.4.0/arch/arc/plat-eznps/Kconfig @@ -7,7 +7,7 @@ menuconfig ARC_PLAT_EZNPS bool "\"EZchip\" ARC dev platform" select CPU_BIG_ENDIAN - select CLKSRC_NPS + select CLKSRC_NPS if !PHYS_ADDR_T_64BIT select EZNPS_GIC select EZCHIP_NPS_MANAGEMENT_ENET if ETHERNET help --- linux-riscv-5.4.0.orig/arch/arm/Kconfig +++ linux-riscv-5.4.0/arch/arm/Kconfig @@ -73,8 +73,9 @@ select HAVE_ARM_SMCCC if CPU_V7 select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32 select HAVE_CONTEXT_TRACKING + select HAVE_COPY_THREAD_TLS select HAVE_C_RECORDMCOUNT - select HAVE_DEBUG_KMEMLEAK + select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL select HAVE_DMA_CONTIGUOUS if MMU select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE @@ -1906,7 +1907,7 @@ config KEXEC bool "Kexec system call (EXPERIMENTAL)" depends on (!SMP || PM_SLEEP_SMP) - depends on !CPU_V7M + depends on MMU select KEXEC_CORE help kexec is a system call that implements the ability to shutdown your --- linux-riscv-5.4.0.orig/arch/arm/Makefile +++ linux-riscv-5.4.0/arch/arm/Makefile @@ -307,13 +307,15 @@ ifeq ($(CONFIG_STACKPROTECTOR_PER_TASK),y) prepare: stack_protector_prepare stack_protector_prepare: prepare0 - $(eval KBUILD_CFLAGS += \ + $(eval SSP_PLUGIN_CFLAGS := \ -fplugin-arg-arm_ssp_per_task_plugin-tso=$(shell \ awk '{if ($$2 == "THREAD_SZ_ORDER") print $$3;}'\ include/generated/asm-offsets.h) \ -fplugin-arg-arm_ssp_per_task_plugin-offset=$(shell \ awk '{if ($$2 == "TI_STACK_CANARY") print $$3;}'\ include/generated/asm-offsets.h)) + $(eval KBUILD_CFLAGS += $(SSP_PLUGIN_CFLAGS)) + $(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS)) endif all: $(notdir $(KBUILD_IMAGE)) --- linux-riscv-5.4.0.orig/arch/arm/boot/compressed/Makefile +++ linux-riscv-5.4.0/arch/arm/boot/compressed/Makefile @@ -101,7 +101,6 @@ $(libfdt) $(libfdt_hdrs) hyp-stub.S KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING -KBUILD_CFLAGS += $(DISABLE_ARM_SSP_PER_TASK_PLUGIN) ifeq ($(CONFIG_FUNCTION_TRACER),y) ORIG_CFLAGS := $(KBUILD_CFLAGS) @@ -117,7 +116,8 @@ CFLAGS_fdt_rw.o := $(nossp_flags) CFLAGS_fdt_wip.o := $(nossp_flags) -ccflags-y := -fpic $(call cc-option,-mno-single-pic-base,) -fno-builtin -I$(obj) +ccflags-y := -fpic $(call cc-option,-mno-single-pic-base,) -fno-builtin \ + -I$(obj) $(DISABLE_ARM_SSP_PER_TASK_PLUGIN) asflags-y := -DZIMAGE # Supply kernel BSS size to the decompressor via a linker symbol. --- linux-riscv-5.4.0.orig/arch/arm/boot/compressed/libfdt_env.h +++ linux-riscv-5.4.0/arch/arm/boot/compressed/libfdt_env.h @@ -2,11 +2,13 @@ #ifndef _ARM_LIBFDT_ENV_H #define _ARM_LIBFDT_ENV_H +#include #include #include #include -#define INT_MAX ((int)(~0U>>1)) +#define INT32_MAX S32_MAX +#define UINT32_MAX U32_MAX typedef __be16 fdt16_t; typedef __be32 fdt32_t; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am335x-boneblack-common.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/am335x-boneblack-common.dtsi @@ -131,6 +131,11 @@ }; / { + memory@80000000 { + device_type = "memory"; + reg = <0x80000000 0x20000000>; /* 512 MB */ + }; + clk_mcasp0_fixed: clk_mcasp0_fixed { #clock-cells = <0>; compatible = "fixed-clock"; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am335x-sancloud-bbe.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/am335x-sancloud-bbe.dts @@ -108,7 +108,7 @@ &cpsw_emac0 { phy-handle = <ðphy0>; - phy-mode = "rgmii-txid"; + phy-mode = "rgmii-id"; }; &i2c0 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am437x-gp-evm.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/am437x-gp-evm.dts @@ -86,7 +86,7 @@ }; lcd0: display { - compatible = "osddisplays,osd057T0559-34ts", "panel-dpi"; + compatible = "osddisplays,osd070t1718-19ts", "panel-dpi"; label = "lcd"; backlight = <&lcd_bl>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am437x-idk-evm.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/am437x-idk-evm.dts @@ -526,11 +526,11 @@ * Supply voltage supervisor on board will not allow opp50 so * disable it and set opp100 as suspend OPP. */ - opp50@300000000 { + opp50-300000000 { status = "disabled"; }; - opp100@600000000 { + opp100-600000000 { opp-suspend; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am43x-epos-evm.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/am43x-epos-evm.dts @@ -42,7 +42,7 @@ }; lcd0: display { - compatible = "osddisplays,osd057T0559-34ts", "panel-dpi"; + compatible = "osddisplays,osd070t1718-19ts", "panel-dpi"; label = "lcd"; backlight = <&lcd_bl>; @@ -848,6 +848,7 @@ pinctrl-names = "default", "sleep"; pinctrl-0 = <&spi0_pins_default>; pinctrl-1 = <&spi0_pins_sleep>; + ti,pindir-d0-out-d1-in = <1>; }; &spi1 { @@ -855,6 +856,7 @@ pinctrl-names = "default", "sleep"; pinctrl-0 = <&spi1_pins_default>; pinctrl-1 = <&spi1_pins_sleep>; + ti,pindir-d0-out-d1-in = <1>; }; &usb2_phy1 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am43xx-clocks.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/am43xx-clocks.dtsi @@ -704,6 +704,60 @@ ti,bit-shift = <8>; reg = <0x2a48>; }; + + clkout1_osc_div_ck: clkout1-osc-div-ck { + #clock-cells = <0>; + compatible = "ti,divider-clock"; + clocks = <&sys_clkin_ck>; + ti,bit-shift = <20>; + ti,max-div = <4>; + reg = <0x4100>; + }; + + clkout1_src2_mux_ck: clkout1-src2-mux-ck { + #clock-cells = <0>; + compatible = "ti,mux-clock"; + clocks = <&clk_rc32k_ck>, <&sysclk_div>, <&dpll_ddr_m2_ck>, + <&dpll_per_m2_ck>, <&dpll_disp_m2_ck>, + <&dpll_mpu_m2_ck>; + reg = <0x4100>; + }; + + clkout1_src2_pre_div_ck: clkout1-src2-pre-div-ck { + #clock-cells = <0>; + compatible = "ti,divider-clock"; + clocks = <&clkout1_src2_mux_ck>; + ti,bit-shift = <4>; + ti,max-div = <8>; + reg = <0x4100>; + }; + + clkout1_src2_post_div_ck: clkout1-src2-post-div-ck { + #clock-cells = <0>; + compatible = "ti,divider-clock"; + clocks = <&clkout1_src2_pre_div_ck>; + ti,bit-shift = <8>; + ti,max-div = <32>; + ti,index-power-of-two; + reg = <0x4100>; + }; + + clkout1_mux_ck: clkout1-mux-ck { + #clock-cells = <0>; + compatible = "ti,mux-clock"; + clocks = <&clkout1_osc_div_ck>, <&clk_rc32k_ck>, + <&clkout1_src2_post_div_ck>, <&dpll_extdev_m2_ck>; + ti,bit-shift = <16>; + reg = <0x4100>; + }; + + clkout1_ck: clkout1-ck { + #clock-cells = <0>; + compatible = "ti,gate-clock"; + clocks = <&clkout1_mux_ck>; + ti,bit-shift = <23>; + reg = <0x4100>; + }; }; &prcm { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am571x-idk.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/am571x-idk.dts @@ -167,11 +167,7 @@ &pcie1_rc { status = "okay"; - gpios = <&gpio3 23 GPIO_ACTIVE_HIGH>; -}; - -&pcie1_ep { - gpios = <&gpio3 23 GPIO_ACTIVE_HIGH>; + gpios = <&gpio5 18 GPIO_ACTIVE_HIGH>; }; &mmc1 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am572x-idk-common.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/am572x-idk-common.dtsi @@ -147,10 +147,6 @@ gpios = <&gpio3 23 GPIO_ACTIVE_HIGH>; }; -&pcie1_ep { - gpios = <&gpio3 23 GPIO_ACTIVE_HIGH>; -}; - &mailbox5 { status = "okay"; mbox_ipu1_ipc3x: mbox_ipu1_ipc3x { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi @@ -29,6 +29,27 @@ reg = <0x0 0x80000000 0x0 0x80000000>; }; + main_12v0: fixedregulator-main_12v0 { + /* main supply */ + compatible = "regulator-fixed"; + regulator-name = "main_12v0"; + regulator-min-microvolt = <12000000>; + regulator-max-microvolt = <12000000>; + regulator-always-on; + regulator-boot-on; + }; + + evm_5v0: fixedregulator-evm_5v0 { + /* Output of TPS54531D */ + compatible = "regulator-fixed"; + regulator-name = "evm_5v0"; + regulator-min-microvolt = <5000000>; + regulator-max-microvolt = <5000000>; + vin-supply = <&main_12v0>; + regulator-always-on; + regulator-boot-on; + }; + vdd_3v3: fixedregulator-vdd_3v3 { compatible = "regulator-fixed"; regulator-name = "vdd_3v3"; @@ -547,10 +568,6 @@ gpios = <&gpio2 8 GPIO_ACTIVE_LOW>; }; -&pcie1_ep { - gpios = <&gpio2 8 GPIO_ACTIVE_LOW>; -}; - &mcasp3 { #sound-dai-cells = <0>; assigned-clocks = <&l4per2_clkctrl DRA7_L4PER2_MCASP3_CLKCTRL 24>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/at91sam9260.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/at91sam9260.dtsi @@ -187,7 +187,7 @@ usart0 { pinctrl_usart0: usart0-0 { atmel,pins = - ; }; @@ -221,7 +221,7 @@ usart1 { pinctrl_usart1: usart1-0 { atmel,pins = - ; }; @@ -239,7 +239,7 @@ usart2 { pinctrl_usart2: usart2-0 { atmel,pins = - ; }; @@ -257,7 +257,7 @@ usart3 { pinctrl_usart3: usart3-0 { atmel,pins = - ; }; @@ -275,7 +275,7 @@ uart0 { pinctrl_uart0: uart0-0 { atmel,pins = - ; }; }; @@ -283,7 +283,7 @@ uart1 { pinctrl_uart1: uart1-0 { atmel,pins = - ; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/at91sam9261.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/at91sam9261.dtsi @@ -329,7 +329,7 @@ usart0 { pinctrl_usart0: usart0-0 { atmel,pins = - , + , ; }; @@ -347,7 +347,7 @@ usart1 { pinctrl_usart1: usart1-0 { atmel,pins = - , + , ; }; @@ -365,7 +365,7 @@ usart2 { pinctrl_usart2: usart2-0 { atmel,pins = - , + , ; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/at91sam9263.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/at91sam9263.dtsi @@ -183,7 +183,7 @@ usart0 { pinctrl_usart0: usart0-0 { atmel,pins = - ; }; @@ -201,7 +201,7 @@ usart1 { pinctrl_usart1: usart1-0 { atmel,pins = - ; }; @@ -219,7 +219,7 @@ usart2 { pinctrl_usart2: usart2-0 { atmel,pins = - ; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/at91sam9g45.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/at91sam9g45.dtsi @@ -556,7 +556,7 @@ usart0 { pinctrl_usart0: usart0-0 { atmel,pins = - ; }; @@ -574,7 +574,7 @@ usart1 { pinctrl_usart1: usart1-0 { atmel,pins = - ; }; @@ -592,7 +592,7 @@ usart2 { pinctrl_usart2: usart2-0 { atmel,pins = - ; }; @@ -610,7 +610,7 @@ usart3 { pinctrl_usart3: usart3-0 { atmel,pins = - ; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/at91sam9rl.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/at91sam9rl.dtsi @@ -682,7 +682,7 @@ usart0 { pinctrl_usart0: usart0-0 { atmel,pins = - , + , ; }; @@ -721,7 +721,7 @@ usart1 { pinctrl_usart1: usart1-0 { atmel,pins = - , + , ; }; @@ -744,7 +744,7 @@ usart2 { pinctrl_usart2: usart2-0 { atmel,pins = - , + , ; }; @@ -767,7 +767,7 @@ usart3 { pinctrl_usart3: usart3-0 { atmel,pins = - , + , ; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/bcm-cygnus.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/bcm-cygnus.dtsi @@ -174,8 +174,8 @@ mdio: mdio@18002000 { compatible = "brcm,iproc-mdio"; reg = <0x18002000 0x8>; - #size-cells = <1>; - #address-cells = <0>; + #size-cells = <0>; + #address-cells = <1>; status = "disabled"; gphy0: ethernet-phy@0 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts @@ -112,6 +112,7 @@ &sdhci { #address-cells = <1>; #size-cells = <0>; + pinctrl-names = "default"; pinctrl-0 = <&emmc_gpio34 &gpclk2_gpio43>; bus-width = <4>; mmc-pwrseq = <&wifi_pwrseq>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/bcm283x.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/bcm283x.dtsi @@ -40,7 +40,7 @@ trips { cpu-crit { - temperature = <80000>; + temperature = <90000>; hysteresis = <0>; type = "critical"; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/bcm5301x.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/bcm5301x.dtsi @@ -353,8 +353,8 @@ mdio: mdio@18003000 { compatible = "brcm,iproc-mdio"; reg = <0x18003000 0x8>; - #size-cells = <1>; - #address-cells = <0>; + #size-cells = <0>; + #address-cells = <1>; }; mdio-bus-mux@18003000 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/dra7-l4.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/dra7-l4.dtsi @@ -3059,7 +3059,7 @@ davinci_mdio: mdio@1000 { compatible = "ti,cpsw-mdio","ti,davinci_mdio"; - clocks = <&gmac_clkctrl DRA7_GMAC_GMAC_CLKCTRL 0>; + clocks = <&gmac_main_clk>; clock-names = "fck"; #address-cells = <1>; #size-cells = <0>; @@ -3413,6 +3413,7 @@ clocks = <&l4per3_clkctrl DRA7_L4PER3_TIMER13_CLKCTRL 24>; clock-names = "fck"; interrupts = ; + ti,timer-pwm; }; }; @@ -3441,6 +3442,7 @@ clocks = <&l4per3_clkctrl DRA7_L4PER3_TIMER14_CLKCTRL 24>; clock-names = "fck"; interrupts = ; + ti,timer-pwm; }; }; @@ -3469,6 +3471,7 @@ clocks = <&l4per3_clkctrl DRA7_L4PER3_TIMER15_CLKCTRL 24>; clock-names = "fck"; interrupts = ; + ti,timer-pwm; }; }; @@ -3497,6 +3500,7 @@ clocks = <&l4per3_clkctrl DRA7_L4PER3_TIMER16_CLKCTRL 24>; clock-names = "fck"; interrupts = ; + ti,timer-pwm; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/dra7.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/dra7.dtsi @@ -148,6 +148,7 @@ #address-cells = <1>; #size-cells = <1>; ranges = <0x0 0x0 0x0 0xc0000000>; + dma-ranges = <0x80000000 0x0 0x80000000 0x80000000>; ti,hwmods = "l3_main_1", "l3_main_2"; reg = <0x0 0x44000000 0x0 0x1000000>, <0x0 0x45000000 0x0 0x1000>; @@ -184,6 +185,7 @@ device_type = "pci"; ranges = <0x81000000 0 0 0x03000 0 0x00010000 0x82000000 0 0x20013000 0x13000 0 0xffed000>; + dma-ranges = <0x02000000 0x0 0x00000000 0x00000000 0x1 0x00000000>; bus-range = <0x00 0xff>; #interrupt-cells = <1>; num-lanes = <1>; @@ -238,6 +240,7 @@ device_type = "pci"; ranges = <0x81000000 0 0 0x03000 0 0x00010000 0x82000000 0 0x30013000 0x13000 0 0xffed000>; + dma-ranges = <0x02000000 0x0 0x00000000 0x00000000 0x1 0x00000000>; bus-range = <0x00 0xff>; #interrupt-cells = <1>; num-lanes = <1>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/dra76x.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/dra76x.dtsi @@ -86,3 +86,8 @@ &usb4_tm { status = "disabled"; }; + +&mmc3 { + /* dra76x is not affected by i887 */ + max-frequency = <96000000>; +}; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/dra7xx-clocks.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/dra7xx-clocks.dtsi @@ -796,16 +796,6 @@ clock-div = <1>; }; - ipu1_gfclk_mux: ipu1_gfclk_mux@520 { - #clock-cells = <0>; - compatible = "ti,mux-clock"; - clocks = <&dpll_abe_m2x2_ck>, <&dpll_core_h22x2_ck>; - ti,bit-shift = <24>; - reg = <0x0520>; - assigned-clocks = <&ipu1_gfclk_mux>; - assigned-clock-parents = <&dpll_core_h22x2_ck>; - }; - dummy_ck: dummy_ck { #clock-cells = <0>; compatible = "fixed-clock"; @@ -1564,6 +1554,8 @@ compatible = "ti,clkctrl"; reg = <0x20 0x4>; #clock-cells = <2>; + assigned-clocks = <&ipu1_clkctrl DRA7_IPU1_MMU_IPU1_CLKCTRL 24>; + assigned-clock-parents = <&dpll_core_h22x2_ck>; }; ipu_clkctrl: ipu-clkctrl@50 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6dl-icore-mipi.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6dl-icore-mipi.dts @@ -8,7 +8,7 @@ /dts-v1/; #include "imx6dl.dtsi" -#include "imx6qdl-icore.dtsi" +#include "imx6qdl-icore-1.5.dtsi" / { model = "Engicam i.CoreM6 DualLite/Solo MIPI Starter Kit"; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6q-dhcom-pdk2.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6q-dhcom-pdk2.dts @@ -55,7 +55,7 @@ #sound-dai-cells = <0>; clocks = <&clk_ext_audio_codec>; VDDA-supply = <®_3p3v>; - VDDIO-supply = <®_3p3v>; + VDDIO-supply = <&sw2_reg>; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6q-dhcom-som.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6q-dhcom-som.dtsi @@ -206,7 +206,7 @@ }; rtc@56 { - compatible = "rv3029c2"; + compatible = "microcrystal,rv3029"; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_rtc_hw300>; reg = <0x56>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6qdl-phytec-phycore-som.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6qdl-phytec-phycore-som.dtsi @@ -107,14 +107,14 @@ regulators { vdd_arm: buck1 { regulator-name = "vdd_arm"; - regulator-min-microvolt = <730000>; + regulator-min-microvolt = <925000>; regulator-max-microvolt = <1380000>; regulator-always-on; }; vdd_soc: buck2 { regulator-name = "vdd_soc"; - regulator-min-microvolt = <730000>; + regulator-min-microvolt = <1150000>; regulator-max-microvolt = <1380000>; regulator-always-on; }; @@ -183,7 +183,6 @@ pinctrl-0 = <&pinctrl_usdhc4>; bus-width = <8>; non-removable; - vmmc-supply = <&vdd_emmc_1p8>; status = "disabled"; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6qdl-sabresd.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6qdl-sabresd.dtsi @@ -749,10 +749,6 @@ vin-supply = <&vgen5_reg>; }; -®_vdd3p0 { - vin-supply = <&sw2_reg>; -}; - ®_vdd2p5 { vin-supply = <&vgen5_reg>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi @@ -627,7 +627,7 @@ pinctrl-0 = <&pinctrl_usdhc2>; bus-width = <4>; cd-gpios = <&gpio2 2 GPIO_ACTIVE_LOW>; - wp-gpios = <&gpio2 3 GPIO_ACTIVE_HIGH>; + disable-wp; vmmc-supply = <®_3p3v_sd>; vqmmc-supply = <®_3p3v>; no-1-8-v; @@ -640,7 +640,7 @@ pinctrl-0 = <&pinctrl_usdhc3>; bus-width = <4>; cd-gpios = <&gpio2 0 GPIO_ACTIVE_LOW>; - wp-gpios = <&gpio2 1 GPIO_ACTIVE_HIGH>; + disable-wp; vmmc-supply = <®_3p3v_sd>; vqmmc-supply = <®_3p3v>; no-1-8-v; @@ -774,6 +774,7 @@ &usbh1 { vbus-supply = <®_5p0v_main>; disable-over-current; + maximum-speed = "full-speed"; status = "okay"; }; @@ -1055,7 +1056,6 @@ MX6QDL_PAD_SD2_DAT1__SD2_DATA1 0x17059 MX6QDL_PAD_SD2_DAT2__SD2_DATA2 0x17059 MX6QDL_PAD_SD2_DAT3__SD2_DATA3 0x17059 - MX6QDL_PAD_NANDF_D3__GPIO2_IO03 0x40010040 MX6QDL_PAD_NANDF_D2__GPIO2_IO02 0x40010040 >; }; @@ -1068,7 +1068,6 @@ MX6QDL_PAD_SD3_DAT1__SD3_DATA1 0x17059 MX6QDL_PAD_SD3_DAT2__SD3_DATA2 0x17059 MX6QDL_PAD_SD3_DAT3__SD3_DATA3 0x17059 - MX6QDL_PAD_NANDF_D1__GPIO2_IO01 0x40010040 MX6QDL_PAD_NANDF_D0__GPIO2_IO00 0x40010040 >; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6sl-evk.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6sl-evk.dts @@ -584,10 +584,6 @@ vin-supply = <&sw2_reg>; }; -®_vdd3p0 { - vin-supply = <&sw2_reg>; -}; - ®_vdd2p5 { vin-supply = <&sw2_reg>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6sll-evk.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6sll-evk.dts @@ -265,10 +265,6 @@ status = "okay"; }; -®_3p0 { - vin-supply = <&sw2_reg>; -}; - &snvs_poweroff { status = "okay"; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6sx-sdb-reva.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6sx-sdb-reva.dts @@ -159,10 +159,6 @@ vin-supply = <&vgen6_reg>; }; -®_vdd3p0 { - vin-supply = <&sw2_reg>; -}; - ®_vdd2p5 { vin-supply = <&vgen6_reg>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6sx-sdb.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6sx-sdb.dts @@ -141,10 +141,6 @@ vin-supply = <&vgen6_reg>; }; -®_vdd3p0 { - vin-supply = <&sw2_reg>; -}; - ®_vdd2p5 { vin-supply = <&vgen6_reg>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6ul-14x14-evk.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6ul-14x14-evk.dtsi @@ -215,7 +215,7 @@ flash0: n25q256a@0 { #address-cells = <1>; #size-cells = <1>; - compatible = "micron,n25q256a"; + compatible = "micron,n25q256a", "jedec,spi-nor"; spi-max-frequency = <29000000>; spi-rx-bus-width = <4>; spi-tx-bus-width = <4>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx6ul-kontron-n6310-s.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx6ul-kontron-n6310-s.dts @@ -157,10 +157,6 @@ status = "okay"; }; -&snvs_poweroff { - status = "okay"; -}; - &uart1 { pinctrl-names = "default"; pinctrl-0 = <&pinctrl_uart1>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx7-colibri.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx7-colibri.dtsi @@ -337,7 +337,6 @@ assigned-clock-rates = <400000000>; bus-width = <8>; fsl,tuning-step = <2>; - max-frequency = <100000000>; vmmc-supply = <®_module_3v3>; vqmmc-supply = <®_DCDC3>; non-removable; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx7s-colibri.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx7s-colibri.dtsi @@ -49,3 +49,7 @@ reg = <0x80000000 0x10000000>; }; }; + +&gpmi { + status = "okay"; +}; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/imx7ulp.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/imx7ulp.dtsi @@ -37,10 +37,10 @@ #address-cells = <1>; #size-cells = <0>; - cpu0: cpu@0 { + cpu0: cpu@f00 { compatible = "arm,cortex-a7"; device_type = "cpu"; - reg = <0>; + reg = <0xf00>; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit-28.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/logicpd-torpedo-37xx-devkit-28.dts @@ -11,22 +11,6 @@ #include "logicpd-torpedo-37xx-devkit.dts" &lcd0 { - - label = "28"; - - panel-timing { - clock-frequency = <9000000>; - hactive = <480>; - vactive = <272>; - hfront-porch = <3>; - hback-porch = <2>; - hsync-len = <42>; - vback-porch = <3>; - vfront-porch = <2>; - vsync-len = <11>; - hsync-active = <1>; - vsync-active = <1>; - de-active = <1>; - pixelclk-active = <0>; - }; + /* To make it work, set CONFIG_OMAP2_DSS_MIN_FCK_PER_PCK=4 */ + compatible = "logicpd,type28"; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/ls1021a.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/ls1021a.dtsi @@ -728,7 +728,7 @@ }; mdio0: mdio@2d24000 { - compatible = "fsl,etsec2-mdio"; + compatible = "gianfar"; device_type = "mdio"; #address-cells = <1>; #size-cells = <0>; @@ -737,7 +737,7 @@ }; mdio1: mdio@2d64000 { - compatible = "fsl,etsec2-mdio"; + compatible = "gianfar"; device_type = "mdio"; #address-cells = <1>; #size-cells = <0>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/meson8.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/meson8.dtsi @@ -129,8 +129,8 @@ gpu_opp_table: gpu-opp-table { compatible = "operating-points-v2"; - opp-182150000 { - opp-hz = /bits/ 64 <182150000>; + opp-182142857 { + opp-hz = /bits/ 64 <182142857>; opp-microvolt = <1150000>; }; opp-318750000 { @@ -253,7 +253,7 @@ &aobus { pmu: pmu@e0 { compatible = "amlogic,meson8-pmu", "syscon"; - reg = <0xe0 0x8>; + reg = <0xe0 0x18>; }; pinctrl_aobus: pinctrl@84 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/meson8b.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/meson8b.dtsi @@ -125,8 +125,8 @@ opp-hz = /bits/ 64 <255000000>; opp-microvolt = <1100000>; }; - opp-364300000 { - opp-hz = /bits/ 64 <364300000>; + opp-364285714 { + opp-hz = /bits/ 64 <364285714>; opp-microvolt = <1100000>; }; opp-425000000 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/motorola-cpcap-mapphone.dtsi @@ -160,12 +160,12 @@ regulator-enable-ramp-delay = <1000>; }; - /* Used by DSS */ + /* Used by DSS and is the "zerov_regulator" trigger for SoC off mode */ vcsi: VCSI { regulator-min-microvolt = <1800000>; regulator-max-microvolt = <1800000>; regulator-enable-ramp-delay = <1000>; - regulator-boot-on; + regulator-always-on; }; vdac: VDAC { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/omap3-n900.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/omap3-n900.dts @@ -155,6 +155,12 @@ pwms = <&pwm9 0 26316 0>; /* 38000 Hz */ }; + rom_rng: rng { + compatible = "nokia,n900-rom-rng"; + clocks = <&rng_ick>; + clock-names = "ick"; + }; + /* controlled (enabled/disabled) directly by bcm2048 and wl1251 */ vctcxo: vctcxo { compatible = "fixed-clock"; @@ -843,34 +849,46 @@ compatible = "ti,omap2-onenand"; reg = <0 0 0x20000>; /* CS0, offset 0, IO size 128K */ + /* + * These timings are based on CONFIG_OMAP_GPMC_DEBUG=y reported + * bootloader set values when booted with v5.1 + * (OneNAND Manufacturer: Samsung): + * + * cs0 GPMC_CS_CONFIG1: 0xfb001202 + * cs0 GPMC_CS_CONFIG2: 0x00111100 + * cs0 GPMC_CS_CONFIG3: 0x00020200 + * cs0 GPMC_CS_CONFIG4: 0x11001102 + * cs0 GPMC_CS_CONFIG5: 0x03101616 + * cs0 GPMC_CS_CONFIG6: 0x90060000 + */ gpmc,sync-read; gpmc,sync-write; gpmc,burst-length = <16>; gpmc,burst-read; gpmc,burst-wrap; gpmc,burst-write; - gpmc,device-width = <2>; /* GPMC_DEVWIDTH_16BIT */ - gpmc,mux-add-data = <2>; /* GPMC_MUX_AD */ + gpmc,device-width = <2>; + gpmc,mux-add-data = <2>; gpmc,cs-on-ns = <0>; - gpmc,cs-rd-off-ns = <87>; - gpmc,cs-wr-off-ns = <87>; + gpmc,cs-rd-off-ns = <102>; + gpmc,cs-wr-off-ns = <102>; gpmc,adv-on-ns = <0>; - gpmc,adv-rd-off-ns = <10>; - gpmc,adv-wr-off-ns = <10>; - gpmc,oe-on-ns = <15>; - gpmc,oe-off-ns = <87>; + gpmc,adv-rd-off-ns = <12>; + gpmc,adv-wr-off-ns = <12>; + gpmc,oe-on-ns = <12>; + gpmc,oe-off-ns = <102>; gpmc,we-on-ns = <0>; - gpmc,we-off-ns = <87>; - gpmc,rd-cycle-ns = <112>; - gpmc,wr-cycle-ns = <112>; - gpmc,access-ns = <81>; - gpmc,page-burst-access-ns = <15>; + gpmc,we-off-ns = <102>; + gpmc,rd-cycle-ns = <132>; + gpmc,wr-cycle-ns = <132>; + gpmc,access-ns = <96>; + gpmc,page-burst-access-ns = <18>; gpmc,bus-turnaround-ns = <0>; gpmc,cycle2cycle-delay-ns = <0>; gpmc,wait-monitoring-ns = <0>; - gpmc,clk-activation-ns = <5>; - gpmc,wr-data-mux-bus-ns = <30>; - gpmc,wr-access-ns = <81>; + gpmc,clk-activation-ns = <6>; + gpmc,wr-data-mux-bus-ns = <36>; + gpmc,wr-access-ns = <96>; gpmc,sync-clk-ps = <15000>; /* --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/omap3-pandora-common.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/omap3-pandora-common.dtsi @@ -226,6 +226,17 @@ gpio = <&gpio6 4 GPIO_ACTIVE_HIGH>; /* GPIO_164 */ }; + /* wl1251 wifi+bt module */ + wlan_en: fixed-regulator-wg7210_en { + compatible = "regulator-fixed"; + regulator-name = "vwlan"; + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + startup-delay-us = <50000>; + enable-active-high; + gpio = <&gpio1 23 GPIO_ACTIVE_HIGH>; + }; + /* wg7210 (wifi+bt module) 32k clock buffer */ wg7210_32k: fixed-regulator-wg7210_32k { compatible = "regulator-fixed"; @@ -522,9 +533,30 @@ /*wp-gpios = <&gpio4 31 GPIO_ACTIVE_HIGH>;*/ /* GPIO_127 */ }; -/* mmc3 is probed using pdata-quirks to pass wl1251 card data */ &mmc3 { - status = "disabled"; + vmmc-supply = <&wlan_en>; + + bus-width = <4>; + non-removable; + ti,non-removable; + cap-power-off-card; + + pinctrl-names = "default"; + pinctrl-0 = <&mmc3_pins>; + + #address-cells = <1>; + #size-cells = <0>; + + wlan: wifi@1 { + compatible = "ti,wl1251"; + + reg = <1>; + + interrupt-parent = <&gpio1>; + interrupts = <21 IRQ_TYPE_LEVEL_HIGH>; /* GPIO_21 */ + + ti,wl1251-has-eeprom; + }; }; /* bluetooth*/ --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/omap3-tao3530.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/omap3-tao3530.dtsi @@ -222,7 +222,7 @@ pinctrl-0 = <&mmc1_pins>; vmmc-supply = <&vmmc1>; vqmmc-supply = <&vsim>; - cd-gpios = <&twl_gpio 0 GPIO_ACTIVE_HIGH>; + cd-gpios = <&twl_gpio 0 GPIO_ACTIVE_LOW>; bus-width = <8>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/omap4.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/omap4.dtsi @@ -330,8 +330,8 @@ target-module@56000000 { compatible = "ti,sysc-omap4", "ti,sysc"; - reg = <0x5601fc00 0x4>, - <0x5601fc10 0x4>; + reg = <0x5600fe00 0x4>, + <0x5600fe10 0x4>; reg-names = "rev", "sysc"; ti,sysc-midle = , , --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/omap5.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/omap5.dtsi @@ -143,6 +143,7 @@ #address-cells = <1>; #size-cells = <1>; ranges = <0 0 0 0xc0000000>; + dma-ranges = <0x80000000 0x0 0x80000000 0x80000000>; ti,hwmods = "l3_main_1", "l3_main_2", "l3_main_3"; reg = <0 0x44000000 0 0x2000>, <0 0x44800000 0 0x3000>, --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/ox810se.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/ox810se.dtsi @@ -323,8 +323,8 @@ interrupt-controller; reg = <0 0x200>; #interrupt-cells = <1>; - valid-mask = <0xFFFFFFFF>; - clear-mask = <0>; + valid-mask = <0xffffffff>; + clear-mask = <0xffffffff>; }; timer0: timer@200 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/ox820.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/ox820.dtsi @@ -240,8 +240,8 @@ reg = <0 0x200>; interrupts = ; #interrupt-cells = <1>; - valid-mask = <0xFFFFFFFF>; - clear-mask = <0>; + valid-mask = <0xffffffff>; + clear-mask = <0xffffffff>; }; timer0: timer@200 { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/r8a7779.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/r8a7779.dtsi @@ -68,6 +68,14 @@ <0xf0000100 0x100>; }; + timer@f0000200 { + compatible = "arm,cortex-a9-global-timer"; + reg = <0xf0000200 0x100>; + interrupts = ; + clocks = <&cpg_clocks R8A7779_CLK_ZS>; + }; + timer@f0000600 { compatible = "arm,cortex-a9-twd-timer"; reg = <0xf0000600 0x20>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/rk3188-bqedison2qc.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/rk3188-bqedison2qc.dts @@ -466,9 +466,12 @@ pinctrl-names = "default"; pinctrl-0 = <&sd1_clk>, <&sd1_cmd>, <&sd1_bus4>; vmmcq-supply = <&vccio_wl>; + #address-cells = <1>; + #size-cells = <0>; status = "okay"; brcmf: wifi@1 { + reg = <1>; compatible = "brcm,bcm4329-fmac"; interrupt-parent = <&gpio3>; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/s3c6410-mini6410.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/s3c6410-mini6410.dts @@ -165,6 +165,10 @@ }; }; +&clocks { + clocks = <&fin_pll>; +}; + &sdhci0 { pinctrl-names = "default"; pinctrl-0 = <&sd0_clk>, <&sd0_cmd>, <&sd0_cd>, <&sd0_bus4>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/s3c6410-smdk6410.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/s3c6410-smdk6410.dts @@ -69,6 +69,10 @@ }; }; +&clocks { + clocks = <&fin_pll>; +}; + &sdhci0 { pinctrl-names = "default"; pinctrl-0 = <&sd0_clk>, <&sd0_cmd>, <&sd0_cd>, <&sd0_bus4>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sama5d3.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sama5d3.dtsi @@ -1188,49 +1188,49 @@ usart0_clk: usart0_clk { #clock-cells = <0>; reg = <12>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; usart1_clk: usart1_clk { #clock-cells = <0>; reg = <13>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; usart2_clk: usart2_clk { #clock-cells = <0>; reg = <14>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; usart3_clk: usart3_clk { #clock-cells = <0>; reg = <15>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; uart0_clk: uart0_clk { #clock-cells = <0>; reg = <16>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; twi0_clk: twi0_clk { reg = <18>; #clock-cells = <0>; - atmel,clk-output-range = <0 16625000>; + atmel,clk-output-range = <0 41500000>; }; twi1_clk: twi1_clk { #clock-cells = <0>; reg = <19>; - atmel,clk-output-range = <0 16625000>; + atmel,clk-output-range = <0 41500000>; }; twi2_clk: twi2_clk { #clock-cells = <0>; reg = <20>; - atmel,clk-output-range = <0 16625000>; + atmel,clk-output-range = <0 41500000>; }; mci0_clk: mci0_clk { @@ -1246,19 +1246,19 @@ spi0_clk: spi0_clk { #clock-cells = <0>; reg = <24>; - atmel,clk-output-range = <0 133000000>; + atmel,clk-output-range = <0 166000000>; }; spi1_clk: spi1_clk { #clock-cells = <0>; reg = <25>; - atmel,clk-output-range = <0 133000000>; + atmel,clk-output-range = <0 166000000>; }; tcb0_clk: tcb0_clk { #clock-cells = <0>; reg = <26>; - atmel,clk-output-range = <0 133000000>; + atmel,clk-output-range = <0 166000000>; }; pwm_clk: pwm_clk { @@ -1269,7 +1269,7 @@ adc_clk: adc_clk { #clock-cells = <0>; reg = <29>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; dma0_clk: dma0_clk { @@ -1300,13 +1300,13 @@ ssc0_clk: ssc0_clk { #clock-cells = <0>; reg = <38>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; ssc1_clk: ssc1_clk { #clock-cells = <0>; reg = <39>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; sha_clk: sha_clk { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sama5d3_can.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sama5d3_can.dtsi @@ -36,13 +36,13 @@ can0_clk: can0_clk { #clock-cells = <0>; reg = <40>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; can1_clk: can1_clk { #clock-cells = <0>; reg = <41>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sama5d3_tcb1.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sama5d3_tcb1.dtsi @@ -22,6 +22,7 @@ tcb1_clk: tcb1_clk { #clock-cells = <0>; reg = <27>; + atmel,clk-output-range = <0 166000000>; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sama5d3_uart.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sama5d3_uart.dtsi @@ -41,13 +41,13 @@ uart0_clk: uart0_clk { #clock-cells = <0>; reg = <16>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; uart1_clk: uart1_clk { #clock-cells = <0>; reg = <17>; - atmel,clk-output-range = <0 66000000>; + atmel,clk-output-range = <0 83000000>; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/stihxxx-b2120.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/stihxxx-b2120.dtsi @@ -46,7 +46,7 @@ /* DAC */ format = "i2s"; mclk-fs = <256>; - frame-inversion = <1>; + frame-inversion; cpu { sound-dai = <&sti_uni_player2>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/stm32f469-disco.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/stm32f469-disco.dts @@ -76,6 +76,13 @@ regulator-max-microvolt = <3300000>; }; + vdd_dsi: vdd-dsi { + compatible = "regulator-fixed"; + regulator-name = "vdd_dsi"; + regulator-min-microvolt = <3300000>; + regulator-max-microvolt = <3300000>; + }; + soc { dma-ranges = <0xc0000000 0x0 0x10000000>; }; @@ -155,6 +162,7 @@ compatible = "orisetech,otm8009a"; reg = <0>; /* dsi virtual channel (0..3) */ reset-gpios = <&gpioh 7 GPIO_ACTIVE_LOW>; + power-supply = <&vdd_dsi>; status = "okay"; port { --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sun8i-a83t-cubietruck-plus.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/sun8i-a83t-cubietruck-plus.dts @@ -101,7 +101,7 @@ initial-mode = <1>; /* initialize in HUB mode */ disabled-ports = <1>; intn-gpios = <&pio 7 5 GPIO_ACTIVE_HIGH>; /* PH5 */ - reset-gpios = <&pio 4 16 GPIO_ACTIVE_HIGH>; /* PE16 */ + reset-gpios = <&pio 4 16 GPIO_ACTIVE_LOW>; /* PE16 */ connect-gpios = <&pio 4 17 GPIO_ACTIVE_HIGH>; /* PE17 */ refclk-frequency = <19200000>; }; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts +++ linux-riscv-5.4.0/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts @@ -482,7 +482,8 @@ }; &usbphy { - usb0_id_det-gpios = <&pio 7 11 GPIO_ACTIVE_HIGH>; /* PH11 */ + usb0_id_det-gpios = <&pio 7 11 (GPIO_ACTIVE_HIGH | GPIO_PULL_UP)>; /* PH11 */ + usb0_vbus_power-supply = <&usb_power_supply>; usb0_vbus-supply = <®_drivevbus>; usb1_vbus-supply = <®_vmain>; usb2_vbus-supply = <®_vmain>; --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sun8i-h3.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sun8i-h3.dtsi @@ -80,7 +80,7 @@ #cooling-cells = <2>; }; - cpu@1 { + cpu1: cpu@1 { compatible = "arm,cortex-a7"; device_type = "cpu"; reg = <1>; @@ -90,7 +90,7 @@ #cooling-cells = <2>; }; - cpu@2 { + cpu2: cpu@2 { compatible = "arm,cortex-a7"; device_type = "cpu"; reg = <2>; @@ -100,7 +100,7 @@ #cooling-cells = <2>; }; - cpu@3 { + cpu3: cpu@3 { compatible = "arm,cortex-a7"; device_type = "cpu"; reg = <3>; @@ -111,6 +111,15 @@ }; }; + pmu { + compatible = "arm,cortex-a7-pmu"; + interrupts = , + , + , + ; + interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>; + }; + timer { compatible = "arm,armv7-timer"; interrupts = , --- linux-riscv-5.4.0.orig/arch/arm/boot/dts/sun8i-r40.dtsi +++ linux-riscv-5.4.0/arch/arm/boot/dts/sun8i-r40.dtsi @@ -266,6 +266,16 @@ #phy-cells = <1>; }; + ahci: sata@1c18000 { + compatible = "allwinner,sun8i-r40-ahci"; + reg = <0x01c18000 0x1000>; + interrupts = ; + clocks = <&ccu CLK_BUS_SATA>, <&ccu CLK_SATA>; + resets = <&ccu RST_BUS_SATA>; + reset-names = "ahci"; + status = "disabled"; + }; + ehci1: usb@1c19000 { compatible = "allwinner,sun8i-r40-ehci", "generic-ehci"; reg = <0x01c19000 0x100>; @@ -557,17 +567,6 @@ #size-cells = <0>; }; - ahci: sata@1c18000 { - compatible = "allwinner,sun8i-r40-ahci"; - reg = <0x01c18000 0x1000>; - interrupts = ; - clocks = <&ccu CLK_BUS_SATA>, <&ccu CLK_SATA>; - resets = <&ccu RST_BUS_SATA>; - reset-names = "ahci"; - status = "disabled"; - - }; - gmac: ethernet@1c50000 { compatible = "allwinner,sun8i-r40-gmac"; syscon = <&ccu>; --- linux-riscv-5.4.0.orig/arch/arm/configs/aspeed_g5_defconfig +++ linux-riscv-5.4.0/arch/arm/configs/aspeed_g5_defconfig @@ -139,6 +139,7 @@ CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_ASPEED_VUART=y CONFIG_SERIAL_8250_SHARE_IRQ=y +CONFIG_SERIAL_8250_DW=y CONFIG_SERIAL_OF_PLATFORM=y CONFIG_ASPEED_KCS_IPMI_BMC=y CONFIG_ASPEED_BT_IPMI_BMC=y --- linux-riscv-5.4.0.orig/arch/arm/configs/exynos_defconfig +++ linux-riscv-5.4.0/arch/arm/configs/exynos_defconfig @@ -38,6 +38,7 @@ CONFIG_CRYPTO_SHA512_ARM=m CONFIG_CRYPTO_AES_ARM_BS=m CONFIG_CRYPTO_CHACHA20_NEON=m +CONFIG_KALLSYMS_ALL=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_PARTITION_ADVANCED=y @@ -92,6 +93,7 @@ CONFIG_BLK_DEV_CRYPTOLOOP=y CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=8192 +CONFIG_SCSI=y CONFIG_BLK_DEV_SD=y CONFIG_CHR_DEV_SG=y CONFIG_ATA=y @@ -290,6 +292,7 @@ CONFIG_COMMON_CLK_MAX77686=y CONFIG_COMMON_CLK_S2MPS11=y CONFIG_EXYNOS_IOMMU=y +CONFIG_PM_DEVFREQ=y CONFIG_DEVFREQ_GOV_PERFORMANCE=y CONFIG_DEVFREQ_GOV_POWERSAVE=y CONFIG_DEVFREQ_GOV_USERSPACE=y @@ -348,9 +351,13 @@ CONFIG_DYNAMIC_DEBUG=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_FS=y CONFIG_DEBUG_KERNEL=y CONFIG_SOFTLOCKUP_DETECTOR=y # CONFIG_DETECT_HUNG_TASK is not set CONFIG_PROVE_LOCKING=y CONFIG_DEBUG_ATOMIC_SLEEP=y +CONFIG_DEBUG_RT_MUTEXES=y +CONFIG_DEBUG_SPINLOCK=y +CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_USER=y --- linux-riscv-5.4.0.orig/arch/arm/configs/imx_v6_v7_defconfig +++ linux-riscv-5.4.0/arch/arm/configs/imx_v6_v7_defconfig @@ -460,6 +460,7 @@ CONFIG_FONT_8x16=y CONFIG_PRINTK_TIME=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_FS=y # CONFIG_SCHED_DEBUG is not set CONFIG_PROVE_LOCKING=y # CONFIG_DEBUG_BUGVERBOSE is not set --- linux-riscv-5.4.0.orig/arch/arm/configs/omap2plus_defconfig +++ linux-riscv-5.4.0/arch/arm/configs/omap2plus_defconfig @@ -552,5 +552,6 @@ CONFIG_DEBUG_INFO_SPLIT=y CONFIG_DEBUG_INFO_DWARF4=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_FS=y CONFIG_SCHEDSTATS=y # CONFIG_DEBUG_BUGVERBOSE is not set --- linux-riscv-5.4.0.orig/arch/arm/include/asm/kvm_emulate.h +++ linux-riscv-5.4.0/arch/arm/include/asm/kvm_emulate.h @@ -14,13 +14,25 @@ #include /* arm64 compatibility macros */ +#define PSR_AA32_MODE_FIQ FIQ_MODE +#define PSR_AA32_MODE_SVC SVC_MODE #define PSR_AA32_MODE_ABT ABT_MODE #define PSR_AA32_MODE_UND UND_MODE #define PSR_AA32_T_BIT PSR_T_BIT +#define PSR_AA32_F_BIT PSR_F_BIT #define PSR_AA32_I_BIT PSR_I_BIT #define PSR_AA32_A_BIT PSR_A_BIT #define PSR_AA32_E_BIT PSR_E_BIT #define PSR_AA32_IT_MASK PSR_IT_MASK +#define PSR_AA32_GE_MASK 0x000f0000 +#define PSR_AA32_DIT_BIT 0x00200000 +#define PSR_AA32_PAN_BIT 0x00400000 +#define PSR_AA32_SSBS_BIT 0x00800000 +#define PSR_AA32_Q_BIT PSR_Q_BIT +#define PSR_AA32_V_BIT PSR_V_BIT +#define PSR_AA32_C_BIT PSR_C_BIT +#define PSR_AA32_Z_BIT PSR_Z_BIT +#define PSR_AA32_N_BIT PSR_N_BIT unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num); @@ -41,6 +53,11 @@ *__vcpu_spsr(vcpu) = v; } +static inline unsigned long host_spsr_to_spsr32(unsigned long spsr) +{ + return spsr; +} + static inline unsigned long vcpu_get_reg(struct kvm_vcpu *vcpu, u8 reg_num) { @@ -177,6 +194,11 @@ return kvm_vcpu_get_hsr(vcpu) & HSR_SSE; } +static inline bool kvm_vcpu_dabt_issf(const struct kvm_vcpu *vcpu) +{ + return false; +} + static inline int kvm_vcpu_dabt_get_rd(struct kvm_vcpu *vcpu) { return (kvm_vcpu_get_hsr(vcpu) & HSR_SRT_MASK) >> HSR_SRT_SHIFT; --- linux-riscv-5.4.0.orig/arch/arm/include/asm/kvm_mmio.h +++ linux-riscv-5.4.0/arch/arm/include/asm/kvm_mmio.h @@ -14,6 +14,8 @@ struct kvm_decode { unsigned long rt; bool sign_extend; + /* Not used on 32-bit arm */ + bool sixty_four; }; void kvm_mmio_write_buf(void *buf, unsigned int len, unsigned long data); --- linux-riscv-5.4.0.orig/arch/arm/kernel/hyp-stub.S +++ linux-riscv-5.4.0/arch/arm/kernel/hyp-stub.S @@ -146,10 +146,9 @@ #if !defined(ZIMAGE) && defined(CONFIG_ARM_ARCH_TIMER) @ make CNTP_* and CNTPCT accessible from PL1 mrc p15, 0, r7, c0, c1, 1 @ ID_PFR1 - lsr r7, #16 - and r7, #0xf - cmp r7, #1 - bne 1f + ubfx r7, r7, #16, #4 + teq r7, #0 + beq 1f mrc p15, 4, r7, c14, c1, 0 @ CNTHCTL orr r7, r7, #3 @ PL1PCEN | PL1PCTEN mcr p15, 4, r7, c14, c1, 0 @ CNTHCTL --- linux-riscv-5.4.0.orig/arch/arm/kernel/process.c +++ linux-riscv-5.4.0/arch/arm/kernel/process.c @@ -224,8 +224,8 @@ asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); int -copy_thread(unsigned long clone_flags, unsigned long stack_start, - unsigned long stk_sz, struct task_struct *p) +copy_thread_tls(unsigned long clone_flags, unsigned long stack_start, + unsigned long stk_sz, struct task_struct *p, unsigned long tls) { struct thread_info *thread = task_thread_info(p); struct pt_regs *childregs = task_pt_regs(p); @@ -259,7 +259,7 @@ clear_ptrace_hw_breakpoint(p); if (clone_flags & CLONE_SETTLS) - thread->tp_value[0] = childregs->ARM_r3; + thread->tp_value[0] = tls; thread->tp_value[1] = get_tpuser(); thread_notify(THREAD_NOTIFY_COPY, thread); --- linux-riscv-5.4.0.orig/arch/arm/kernel/smp.c +++ linux-riscv-5.4.0/arch/arm/kernel/smp.c @@ -240,6 +240,10 @@ if (ret) return ret; +#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY + remove_cpu_topology(cpu); +#endif + /* * Take this CPU offline. Once we clear this, we can't return, * and we must not schedule until we're ready to give up the cpu. --- linux-riscv-5.4.0.orig/arch/arm/kernel/topology.c +++ linux-riscv-5.4.0/arch/arm/kernel/topology.c @@ -196,9 +196,8 @@ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid]; unsigned int mpidr; - /* If the cpu topology has been already set, just return */ - if (cpuid_topo->core_id != -1) - return; + if (cpuid_topo->package_id != -1) + goto topology_populated; mpidr = read_cpuid_mpidr(); @@ -231,14 +230,15 @@ cpuid_topo->package_id = -1; } - update_siblings_masks(cpuid); - update_cpu_capacity(cpuid); pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n", cpuid, cpu_topology[cpuid].thread_id, cpu_topology[cpuid].core_id, cpu_topology[cpuid].package_id, mpidr); + +topology_populated: + update_siblings_masks(cpuid); } static inline int cpu_corepower_flags(void) --- linux-riscv-5.4.0.orig/arch/arm/kernel/vdso.c +++ linux-riscv-5.4.0/arch/arm/kernel/vdso.c @@ -93,6 +93,8 @@ */ np = of_find_compatible_node(NULL, NULL, "arm,armv7-timer"); if (!np) + np = of_find_compatible_node(NULL, NULL, "arm,armv8-timer"); + if (!np) goto out_put; if (of_property_read_bool(np, "arm,cpu-registers-not-fw-configured")) --- linux-riscv-5.4.0.orig/arch/arm/lib/copy_from_user.S +++ linux-riscv-5.4.0/arch/arm/lib/copy_from_user.S @@ -118,7 +118,7 @@ ENDPROC(arm_copy_from_user) - .pushsection .fixup,"ax" + .pushsection .text.fixup,"ax" .align 0 copy_abort_preamble ldmfd sp!, {r1, r2, r3} --- linux-riscv-5.4.0.orig/arch/arm/mach-at91/pm.c +++ linux-riscv-5.4.0/arch/arm/mach-at91/pm.c @@ -691,6 +691,12 @@ soc_pm.data.suspend_mode = AT91_PM_ULP0; } +static const struct of_device_id atmel_shdwc_ids[] = { + { .compatible = "atmel,sama5d2-shdwc" }, + { .compatible = "microchip,sam9x60-shdwc" }, + { /* sentinel. */ } +}; + static void __init at91_pm_modes_init(void) { struct device_node *np; @@ -700,7 +706,7 @@ !at91_is_pm_mode_active(AT91_PM_ULP1)) return; - np = of_find_compatible_node(NULL, NULL, "atmel,sama5d2-shdwc"); + np = of_find_matching_node(NULL, atmel_shdwc_ids); if (!np) { pr_warn("%s: failed to find shdwc!\n", __func__); goto ulp1_default; @@ -751,6 +757,7 @@ { .compatible = "atmel,sama5d3-pmc", .data = &pmc_infos[1] }, { .compatible = "atmel,sama5d4-pmc", .data = &pmc_infos[1] }, { .compatible = "atmel,sama5d2-pmc", .data = &pmc_infos[1] }, + { .compatible = "microchip,sam9x60-pmc", .data = &pmc_infos[1] }, { /* sentinel */ }, }; --- linux-riscv-5.4.0.orig/arch/arm/mach-davinci/Kconfig +++ linux-riscv-5.4.0/arch/arm/mach-davinci/Kconfig @@ -9,6 +9,7 @@ select PM_GENERIC_DOMAINS if PM select PM_GENERIC_DOMAINS_OF if PM && OF select REGMAP_MMIO + select RESET_CONTROLLER select HAVE_IDE select PINCTRL_SINGLE --- linux-riscv-5.4.0.orig/arch/arm/mach-imx/Makefile +++ linux-riscv-5.4.0/arch/arm/mach-imx/Makefile @@ -91,6 +91,8 @@ obj-$(CONFIG_SOC_IMX6) += suspend-imx6.o obj-$(CONFIG_SOC_IMX53) += suspend-imx53.o endif +AFLAGS_resume-imx6.o :=-Wa,-march=armv7-a +obj-$(CONFIG_SOC_IMX6) += resume-imx6.o obj-$(CONFIG_SOC_IMX6) += pm-imx6.o obj-$(CONFIG_SOC_IMX1) += mach-imx1.o --- linux-riscv-5.4.0.orig/arch/arm/mach-imx/common.h +++ linux-riscv-5.4.0/arch/arm/mach-imx/common.h @@ -109,17 +109,17 @@ int imx_cpu_kill(unsigned int cpu); #ifdef CONFIG_SUSPEND -void v7_cpu_resume(void); void imx53_suspend(void __iomem *ocram_vbase); extern const u32 imx53_suspend_sz; void imx6_suspend(void __iomem *ocram_vbase); #else -static inline void v7_cpu_resume(void) {} static inline void imx53_suspend(void __iomem *ocram_vbase) {} static const u32 imx53_suspend_sz; static inline void imx6_suspend(void __iomem *ocram_vbase) {} #endif +void v7_cpu_resume(void); + void imx6_pm_ccm_init(const char *ccm_compat); void imx6q_pm_init(void); void imx6dl_pm_init(void); --- linux-riscv-5.4.0.orig/arch/arm/mach-imx/resume-imx6.S +++ linux-riscv-5.4.0/arch/arm/mach-imx/resume-imx6.S @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright 2014 Freescale Semiconductor, Inc. + */ + +#include +#include +#include +#include +#include "hardware.h" + +/* + * The following code must assume it is running from physical address + * where absolute virtual addresses to the data section have to be + * turned into relative ones. + */ + +ENTRY(v7_cpu_resume) + bl v7_invalidate_l1 +#ifdef CONFIG_CACHE_L2X0 + bl l2c310_early_resume +#endif + b cpu_resume +ENDPROC(v7_cpu_resume) --- linux-riscv-5.4.0.orig/arch/arm/mach-imx/suspend-imx6.S +++ linux-riscv-5.4.0/arch/arm/mach-imx/suspend-imx6.S @@ -327,17 +327,3 @@ ret lr ENDPROC(imx6_suspend) - -/* - * The following code must assume it is running from physical address - * where absolute virtual addresses to the data section have to be - * turned into relative ones. - */ - -ENTRY(v7_cpu_resume) - bl v7_invalidate_l1 -#ifdef CONFIG_CACHE_L2X0 - bl l2c310_early_resume -#endif - b cpu_resume -ENDPROC(v7_cpu_resume) --- linux-riscv-5.4.0.orig/arch/arm/mach-npcm/Kconfig +++ linux-riscv-5.4.0/arch/arm/mach-npcm/Kconfig @@ -11,7 +11,7 @@ depends on ARCH_MULTI_V7 select PINCTRL_NPCM7XX select NPCM7XX_TIMER - select ARCH_REQUIRE_GPIOLIB + select GPIOLIB select CACHE_L2X0 select ARM_GIC select HAVE_ARM_TWD if SMP --- linux-riscv-5.4.0.orig/arch/arm/mach-omap2/display.c +++ linux-riscv-5.4.0/arch/arm/mach-omap2/display.c @@ -265,6 +265,7 @@ r = of_platform_populate(node, NULL, NULL, &pdev->dev); if (r) { pr_err("Unable to populate DSS submodule devices\n"); + put_device(&pdev->dev); return r; } --- linux-riscv-5.4.0.orig/arch/arm/mach-omap2/pdata-quirks.c +++ linux-riscv-5.4.0/arch/arm/mach-omap2/pdata-quirks.c @@ -7,7 +7,6 @@ #include #include #include -#include #include #include #include @@ -269,14 +268,6 @@ am35xx_emac_reset(); } -static struct platform_device omap3_rom_rng_device = { - .name = "omap3-rom-rng", - .id = -1, - .dev = { - .platform_data = rx51_secure_rng_call, - }, -}; - static void __init nokia_n900_legacy_init(void) { hsmmc2_internal_input_clk(); @@ -292,9 +283,6 @@ pr_warn("RX-51: Not enabling ARM errata 430973 workaround\n"); pr_warn("Thumb binaries may crash randomly without this workaround\n"); } - - pr_info("RX-51: Registering OMAP3 HWRNG device\n"); - platform_device_register(&omap3_rom_rng_device); } } @@ -311,118 +299,15 @@ } /* omap3pandora legacy devices */ -#define PANDORA_WIFI_IRQ_GPIO 21 -#define PANDORA_WIFI_NRESET_GPIO 23 static struct platform_device pandora_backlight = { .name = "pandora-backlight", .id = -1, }; -static struct regulator_consumer_supply pandora_vmmc3_supply[] = { - REGULATOR_SUPPLY("vmmc", "omap_hsmmc.2"), -}; - -static struct regulator_init_data pandora_vmmc3 = { - .constraints = { - .valid_ops_mask = REGULATOR_CHANGE_STATUS, - }, - .num_consumer_supplies = ARRAY_SIZE(pandora_vmmc3_supply), - .consumer_supplies = pandora_vmmc3_supply, -}; - -static struct fixed_voltage_config pandora_vwlan = { - .supply_name = "vwlan", - .microvolts = 1800000, /* 1.8V */ - .startup_delay = 50000, /* 50ms */ - .init_data = &pandora_vmmc3, -}; - -static struct platform_device pandora_vwlan_device = { - .name = "reg-fixed-voltage", - .id = 1, - .dev = { - .platform_data = &pandora_vwlan, - }, -}; - -static struct gpiod_lookup_table pandora_vwlan_gpiod_table = { - .dev_id = "reg-fixed-voltage.1", - .table = { - /* - * As this is a low GPIO number it should be at the first - * GPIO bank. - */ - GPIO_LOOKUP("gpio-0-31", PANDORA_WIFI_NRESET_GPIO, - NULL, GPIO_ACTIVE_HIGH), - { }, - }, -}; - -static void pandora_wl1251_init_card(struct mmc_card *card) -{ - /* - * We have TI wl1251 attached to MMC3. Pass this information to - * SDIO core because it can't be probed by normal methods. - */ - if (card->type == MMC_TYPE_SDIO || card->type == MMC_TYPE_SD_COMBO) { - card->quirks |= MMC_QUIRK_NONSTD_SDIO; - card->cccr.wide_bus = 1; - card->cis.vendor = 0x104c; - card->cis.device = 0x9066; - card->cis.blksize = 512; - card->cis.max_dtr = 24000000; - card->ocr = 0x80; - } -} - -static struct omap2_hsmmc_info pandora_mmc3[] = { - { - .mmc = 3, - .caps = MMC_CAP_4_BIT_DATA | MMC_CAP_POWER_OFF_CARD, - .init_card = pandora_wl1251_init_card, - }, - {} /* Terminator */ -}; - -static void __init pandora_wl1251_init(void) -{ - struct wl1251_platform_data pandora_wl1251_pdata; - int ret; - - memset(&pandora_wl1251_pdata, 0, sizeof(pandora_wl1251_pdata)); - - pandora_wl1251_pdata.power_gpio = -1; - - ret = gpio_request_one(PANDORA_WIFI_IRQ_GPIO, GPIOF_IN, "wl1251 irq"); - if (ret < 0) - goto fail; - - pandora_wl1251_pdata.irq = gpio_to_irq(PANDORA_WIFI_IRQ_GPIO); - if (pandora_wl1251_pdata.irq < 0) - goto fail_irq; - - pandora_wl1251_pdata.use_eeprom = true; - ret = wl1251_set_platform_data(&pandora_wl1251_pdata); - if (ret < 0) - goto fail_irq; - - return; - -fail_irq: - gpio_free(PANDORA_WIFI_IRQ_GPIO); -fail: - pr_err("wl1251 board initialisation failed\n"); -} - static void __init omap3_pandora_legacy_init(void) { platform_device_register(&pandora_backlight); - gpiod_add_lookup_table(&pandora_vwlan_gpiod_table); - platform_device_register(&pandora_vwlan_device); - omap_hsmmc_init(pandora_mmc3); - omap_hsmmc_late_init(pandora_mmc3); - pandora_wl1251_init(); } #endif /* CONFIG_ARCH_OMAP3 */ @@ -472,10 +357,14 @@ static struct clockdomain *ti_sysc_find_one_clockdomain(struct clk *clk) { + struct clk_hw *hw = __clk_get_hw(clk); struct clockdomain *clkdm = NULL; struct clk_hw_omap *hwclk; - hwclk = to_clk_hw_omap(__clk_get_hw(clk)); + hwclk = to_clk_hw_omap(hw); + if (!omap2_clk_is_hw_omap(hw)) + return NULL; + if (hwclk && hwclk->clkdm_name) clkdm = clkdm_lookup(hwclk->clkdm_name); @@ -638,6 +527,7 @@ OF_DEV_AUXDATA("ti,davinci_mdio", 0x5c030000, "davinci_mdio.0", NULL), OF_DEV_AUXDATA("ti,am3517-emac", 0x5c000000, "davinci_emac.0", &am35xx_emac_pdata), + OF_DEV_AUXDATA("nokia,n900-rom-rng", 0, NULL, rx51_secure_rng_call), /* McBSP modules with sidetone core */ #if IS_ENABLED(CONFIG_SND_SOC_OMAP_MCBSP) OF_DEV_AUXDATA("ti,omap3-mcbsp", 0x49022000, "49022000.mcbsp", &mcbsp_pdata), --- linux-riscv-5.4.0.orig/arch/arm/mach-tegra/reset-handler.S +++ linux-riscv-5.4.0/arch/arm/mach-tegra/reset-handler.S @@ -44,16 +44,16 @@ cmp r6, #TEGRA20 beq 1f @ Yes /* Clear the flow controller flags for this CPU. */ - cpu_to_csr_reg r1, r0 + cpu_to_csr_reg r3, r0 mov32 r2, TEGRA_FLOW_CTRL_BASE - ldr r1, [r2, r1] + ldr r1, [r2, r3] /* Clear event & intr flag */ orr r1, r1, \ #FLOW_CTRL_CSR_INTR_FLAG | FLOW_CTRL_CSR_EVENT_FLAG movw r0, #0x3FFD @ enable, cluster_switch, immed, bitmaps @ & ext flags for CPU power mgnt bic r1, r1, r0 - str r1, [r2] + str r1, [r2, r3] 1: mov32 r9, 0xc09 --- linux-riscv-5.4.0.orig/arch/arm/mach-tegra/sleep-tegra30.S +++ linux-riscv-5.4.0/arch/arm/mach-tegra/sleep-tegra30.S @@ -370,6 +370,14 @@ pll_locked r1, r0, CLK_RESET_PLLC_BASE pll_locked r1, r0, CLK_RESET_PLLX_BASE + tegra_get_soc_id TEGRA_APB_MISC_BASE, r1 + cmp r1, #TEGRA30 + beq 1f + ldr r1, [r0, #CLK_RESET_PLLP_BASE] + bic r1, r1, #(1<<31) @ disable PllP bypass + str r1, [r0, #CLK_RESET_PLLP_BASE] +1: + mov32 r7, TEGRA_TMRUS_BASE ldr r1, [r7] add r1, r1, #LOCK_DELAY @@ -630,7 +638,10 @@ str r0, [r4, #PMC_PLLP_WB0_OVERRIDE] /* disable PLLP, PLLA, PLLC and PLLX */ + tegra_get_soc_id TEGRA_APB_MISC_BASE, r1 + cmp r1, #TEGRA30 ldr r0, [r5, #CLK_RESET_PLLP_BASE] + orrne r0, r0, #(1 << 31) @ enable PllP bypass on fast cluster bic r0, r0, #(1 << 30) str r0, [r5, #CLK_RESET_PLLP_BASE] ldr r0, [r5, #CLK_RESET_PLLA_BASE] --- linux-riscv-5.4.0.orig/arch/arm/mach-vexpress/spc.c +++ linux-riscv-5.4.0/arch/arm/mach-vexpress/spc.c @@ -551,8 +551,9 @@ static int __init ve_spc_clk_init(void) { - int cpu; + int cpu, cluster; struct clk *clk; + bool init_opp_table[MAX_CLUSTERS] = { false }; if (!info) return 0; /* Continue only if SPC is initialised */ @@ -578,8 +579,17 @@ continue; } + cluster = topology_physical_package_id(cpu_dev->id); + if (init_opp_table[cluster]) + continue; + if (ve_init_opp_table(cpu_dev)) pr_warn("failed to initialise cpu%d opp table\n", cpu); + else if (dev_pm_opp_set_sharing_cpus(cpu_dev, + topology_core_cpumask(cpu_dev->id))) + pr_warn("failed to mark OPPs shared for cpu%d\n", cpu); + else + init_opp_table[cluster] = true; } platform_device_register_simple("vexpress-spc-cpufreq", -1, NULL, 0); --- linux-riscv-5.4.0.orig/arch/arm/mm/dma-mapping-nommu.c +++ linux-riscv-5.4.0/arch/arm/mm/dma-mapping-nommu.c @@ -35,7 +35,7 @@ unsigned long attrs) { - void *ret = dma_alloc_from_global_coherent(size, dma_handle); + void *ret = dma_alloc_from_global_coherent(dev, size, dma_handle); /* * dma_alloc_from_global_coherent() may fail because: --- linux-riscv-5.4.0.orig/arch/arm/mm/dma-mapping.c +++ linux-riscv-5.4.0/arch/arm/mm/dma-mapping.c @@ -221,7 +221,7 @@ static int __dma_supported(struct device *dev, u64 mask, bool warn) { - unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); + unsigned long max_dma_pfn = min(max_pfn - 1, arm_dma_pfn_limit); /* * Translate the device's DMA mask to a PFN limit. This --- linux-riscv-5.4.0.orig/arch/arm/mm/init.c +++ linux-riscv-5.4.0/arch/arm/mm/init.c @@ -323,7 +323,7 @@ *p++ = 0xe7fddef0; } -static inline void +static inline void __init free_memmap(unsigned long start_pfn, unsigned long end_pfn) { struct page *start_pg, *end_pg; --- linux-riscv-5.4.0.orig/arch/arm/mm/proc-v7-bugs.c +++ linux-riscv-5.4.0/arch/arm/mm/proc-v7-bugs.c @@ -65,6 +65,9 @@ break; #ifdef CONFIG_ARM_PSCI + case ARM_CPU_PART_BRAHMA_B53: + /* Requires no workaround */ + break; default: /* Other ARM CPUs require no workaround */ if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) --- linux-riscv-5.4.0.orig/arch/arm64/Kconfig +++ linux-riscv-5.4.0/arch/arm64/Kconfig @@ -139,6 +139,7 @@ select HAVE_CMPXCHG_DOUBLE select HAVE_CMPXCHG_LOCAL select HAVE_CONTEXT_TRACKING + select HAVE_COPY_THREAD_TLS select HAVE_DEBUG_BUGVERBOSE select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_CONTIGUOUS @@ -1048,6 +1049,7 @@ config FORCE_MAX_ZONEORDER int default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE) + default "13" if (ARCH_THUNDER && ARM64_4K_PAGES) default "12" if (ARM64_16K_PAGES && TRANSPARENT_HUGEPAGE) default "11" help --- linux-riscv-5.4.0.orig/arch/arm64/boot/Makefile +++ linux-riscv-5.4.0/arch/arm64/boot/Makefile @@ -16,7 +16,7 @@ OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R .comment -S -targets := Image Image.gz +targets := Image Image.bz2 Image.gz Image.lz4 Image.lzma Image.lzo $(obj)/Image: vmlinux FORCE $(call if_changed,objcopy) --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino-emmc.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino-emmc.dts @@ -15,7 +15,7 @@ pinctrl-names = "default"; pinctrl-0 = <&mmc2_pins>; vmmc-supply = <®_dcdc1>; - vqmmc-supply = <®_dcdc1>; + vqmmc-supply = <®_eldo1>; bus-width = <8>; non-removable; cap-mmc-hw-reset; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/allwinner/sun50i-a64-olinuxino.dts @@ -140,7 +140,7 @@ &mmc1 { pinctrl-names = "default"; pinctrl-0 = <&mmc1_pins>; - vmmc-supply = <®_aldo2>; + vmmc-supply = <®_dcdc1>; vqmmc-supply = <®_dldo4>; mmc-pwrseq = <&wifi_pwrseq>; bus-width = <4>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi @@ -142,6 +142,15 @@ clock-output-names = "ext-osc32k"; }; + pmu { + compatible = "arm,cortex-a53-pmu"; + interrupts = , + , + , + ; + interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>; + }; + psci { compatible = "arm,psci-0.2"; method = "smc"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi @@ -54,21 +54,21 @@ enable-method = "psci"; }; - cpu@1 { + cpu1: cpu@1 { compatible = "arm,cortex-a53"; device_type = "cpu"; reg = <1>; enable-method = "psci"; }; - cpu@2 { + cpu2: cpu@2 { compatible = "arm,cortex-a53"; device_type = "cpu"; reg = <2>; enable-method = "psci"; }; - cpu@3 { + cpu3: cpu@3 { compatible = "arm,cortex-a53"; device_type = "cpu"; reg = <3>; @@ -76,6 +76,16 @@ }; }; + pmu { + compatible = "arm,cortex-a53-pmu", + "arm,armv8-pmuv3"; + interrupts = , + , + , + ; + interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>; + }; + psci { compatible = "arm,psci-0.2"; method = "smc"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi @@ -70,6 +70,16 @@ clock-output-names = "ext_osc32k"; }; + pmu { + compatible = "arm,cortex-a53-pmu", + "arm,armv8-pmuv3"; + interrupts = , + , + , + ; + interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>; + }; + psci { compatible = "arm,psci-0.2"; method = "smc"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi @@ -61,10 +61,10 @@ pmu { compatible = "arm,armv8-pmuv3"; - interrupts = <0 120 8>, - <0 121 8>, - <0 122 8>, - <0 123 8>; + interrupts = <0 170 4>, + <0 171 4>, + <0 172 4>, + <0 173 4>; interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-axg.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-axg.dtsi @@ -1162,7 +1162,7 @@ toddr_a: audio-controller@100 { compatible = "amlogic,axg-toddr"; - reg = <0x0 0x100 0x0 0x1c>; + reg = <0x0 0x100 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_A"; interrupts = ; @@ -1173,7 +1173,7 @@ toddr_b: audio-controller@140 { compatible = "amlogic,axg-toddr"; - reg = <0x0 0x140 0x0 0x1c>; + reg = <0x0 0x140 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_B"; interrupts = ; @@ -1184,7 +1184,7 @@ toddr_c: audio-controller@180 { compatible = "amlogic,axg-toddr"; - reg = <0x0 0x180 0x0 0x1c>; + reg = <0x0 0x180 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_C"; interrupts = ; @@ -1195,7 +1195,7 @@ frddr_a: audio-controller@1c0 { compatible = "amlogic,axg-frddr"; - reg = <0x0 0x1c0 0x0 0x1c>; + reg = <0x0 0x1c0 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_A"; interrupts = ; @@ -1206,7 +1206,7 @@ frddr_b: audio-controller@200 { compatible = "amlogic,axg-frddr"; - reg = <0x0 0x200 0x0 0x1c>; + reg = <0x0 0x200 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_B"; interrupts = ; @@ -1217,7 +1217,7 @@ frddr_c: audio-controller@240 { compatible = "amlogic,axg-frddr"; - reg = <0x0 0x240 0x0 0x1c>; + reg = <0x0 0x240 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_C"; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi @@ -1509,7 +1509,7 @@ toddr_a: audio-controller@100 { compatible = "amlogic,g12a-toddr", "amlogic,axg-toddr"; - reg = <0x0 0x100 0x0 0x1c>; + reg = <0x0 0x100 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_A"; interrupts = ; @@ -1521,7 +1521,7 @@ toddr_b: audio-controller@140 { compatible = "amlogic,g12a-toddr", "amlogic,axg-toddr"; - reg = <0x0 0x140 0x0 0x1c>; + reg = <0x0 0x140 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_B"; interrupts = ; @@ -1533,7 +1533,7 @@ toddr_c: audio-controller@180 { compatible = "amlogic,g12a-toddr", "amlogic,axg-toddr"; - reg = <0x0 0x180 0x0 0x1c>; + reg = <0x0 0x180 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "TODDR_C"; interrupts = ; @@ -1545,7 +1545,7 @@ frddr_a: audio-controller@1c0 { compatible = "amlogic,g12a-frddr", "amlogic,axg-frddr"; - reg = <0x0 0x1c0 0x0 0x1c>; + reg = <0x0 0x1c0 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_A"; interrupts = ; @@ -1557,7 +1557,7 @@ frddr_b: audio-controller@200 { compatible = "amlogic,g12a-frddr", "amlogic,axg-frddr"; - reg = <0x0 0x200 0x0 0x1c>; + reg = <0x0 0x200 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_B"; interrupts = ; @@ -1569,7 +1569,7 @@ frddr_c: audio-controller@240 { compatible = "amlogic,g12a-frddr", "amlogic,axg-frddr"; - reg = <0x0 0x240 0x0 0x1c>; + reg = <0x0 0x240 0x0 0x2c>; #sound-dai-cells = <0>; sound-name-prefix = "FRDDR_C"; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts @@ -296,7 +296,7 @@ }; &usb0_phy { - status = "okay"; + status = "disabled"; phy-supply = <&usb_otg_pwr>; }; @@ -306,7 +306,7 @@ }; &usb0 { - status = "okay"; + status = "disabled"; }; &usb1 { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts @@ -33,11 +33,9 @@ gpio-keys-polled { compatible = "gpio-keys-polled"; - #address-cells = <1>; - #size-cells = <0>; poll-interval = <100>; - button@0 { + power-button { label = "power"; linux,code = ; gpios = <&gpio_ao GPIOAO_2 GPIO_ACTIVE_LOW>; @@ -192,6 +190,9 @@ bluetooth { compatible = "brcm,bcm43438-bt"; shutdown-gpios = <&gpio GPIOX_17 GPIO_ACTIVE_HIGH>; + max-speed = <2000000>; + clocks = <&wifi32k>; + clock-names = "lpo"; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts @@ -327,7 +327,7 @@ #size-cells = <0>; bus-width = <4>; - max-frequency = <50000000>; + max-frequency = <60000000>; non-removable; disable-wp; @@ -409,6 +409,9 @@ bluetooth { compatible = "brcm,bcm43438-bt"; shutdown-gpios = <&gpio GPIOX_17 GPIO_ACTIVE_HIGH>; + max-speed = <2000000>; + clocks = <&wifi32k>; + clock-names = "lpo"; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts @@ -361,6 +361,9 @@ bluetooth { compatible = "brcm,bcm43438-bt"; + interrupt-parent = <&gpio_intc>; + interrupts = <95 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "host-wakeup"; shutdown-gpios = <&gpio GPIOX_17 GPIO_ACTIVE_HIGH>; max-speed = <2000000>; clocks = <&wifi32k>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/arm/fvp-base-revc.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/arm/fvp-base-revc.dts @@ -161,10 +161,10 @@ bus-range = <0x0 0x1>; reg = <0x0 0x40000000 0x0 0x10000000>; ranges = <0x2000000 0x0 0x50000000 0x0 0x50000000 0x0 0x10000000>; - interrupt-map = <0 0 0 1 &gic GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>, - <0 0 0 2 &gic GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>, - <0 0 0 3 &gic GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>, - <0 0 0 4 &gic GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>; + interrupt-map = <0 0 0 1 &gic 0 0 GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>, + <0 0 0 2 &gic 0 0 GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>, + <0 0 0 3 &gic 0 0 GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>, + <0 0 0 4 &gic 0 0 GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>; interrupt-map-mask = <0x0 0x0 0x0 0x7>; msi-map = <0x0 &its 0x0 0x10000>; iommu-map = <0x0 &smmu 0x0 0x10000>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/arm/juno-base.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/arm/juno-base.dtsi @@ -6,7 +6,6 @@ /* * Devices shared by all Juno boards */ - dma-ranges = <0 0 0 0 0x100 0>; memtimer: timer@2a810000 { compatible = "arm,armv7-timer-mem"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/arm/juno-clocks.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/arm/juno-clocks.dtsi @@ -8,10 +8,10 @@ */ / { /* SoC fixed clocks */ - soc_uartclk: refclk7273800hz { + soc_uartclk: refclk7372800hz { compatible = "fixed-clock"; #clock-cells = <0>; - clock-frequency = <7273800>; + clock-frequency = <7372800>; clock-output-names = "juno:uartclk"; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/exynos/exynos5433.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/exynos/exynos5433.dtsi @@ -18,8 +18,8 @@ / { compatible = "samsung,exynos5433"; - #address-cells = <1>; - #size-cells = <1>; + #address-cells = <2>; + #size-cells = <2>; interrupt-parent = <&gic>; @@ -311,7 +311,7 @@ compatible = "simple-bus"; #address-cells = <1>; #size-cells = <1>; - ranges; + ranges = <0x0 0x0 0x0 0x18000000>; chipid@10000000 { compatible = "samsung,exynos4210-chipid"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/exynos/exynos7.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/exynos/exynos7.dtsi @@ -12,8 +12,8 @@ / { compatible = "samsung,exynos7"; interrupt-parent = <&gic>; - #address-cells = <1>; - #size-cells = <1>; + #address-cells = <2>; + #size-cells = <2>; aliases { pinctrl0 = &pinctrl_alive; @@ -98,7 +98,7 @@ compatible = "simple-bus"; #address-cells = <1>; #size-cells = <1>; - ranges; + ranges = <0 0 0 0x18000000>; chipid@10000000 { compatible = "samsung,exynos4210-chipid"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi @@ -102,7 +102,7 @@ reboot { compatible ="syscon-reboot"; - regmap = <&dcfg>; + regmap = <&rst>; offset = <0xb0>; mask = <0x02>; }; @@ -158,7 +158,13 @@ dcfg: syscon@1e00000 { compatible = "fsl,ls1028a-dcfg", "syscon"; reg = <0x0 0x1e00000 0x0 0x10000>; - big-endian; + little-endian; + }; + + rst: syscon@1e60000 { + compatible = "syscon"; + reg = <0x0 0x1e60000 0x0 0x10000>; + little-endian; }; scfg: syscon@1fc0000 { @@ -567,7 +573,7 @@ 0x00010004 0x0000003d 0x00010005 0x00000045 0x00010006 0x0000004d - 0x00010007 0x00000045 + 0x00010007 0x00000055 0x00010008 0x0000005e 0x00010009 0x00000066 0x0001000a 0x0000006e --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/fsl-ls1043-post.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/fsl-ls1043-post.dtsi @@ -20,6 +20,8 @@ }; &fman0 { + fsl,erratum-a050385; + /* these aliases provide the FMan ports mapping */ enet0: ethernet@e0000 { }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dts @@ -119,12 +119,12 @@ ethernet@e4000 { phy-handle = <&rgmii_phy1>; - phy-connection-type = "rgmii-txid"; + phy-connection-type = "rgmii-id"; }; ethernet@e6000 { phy-handle = <&rgmii_phy2>; - phy-connection-type = "rgmii-txid"; + phy-connection-type = "rgmii-id"; }; ethernet@e8000 { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts @@ -127,12 +127,12 @@ &fman0 { ethernet@e4000 { phy-handle = <&rgmii_phy1>; - phy-connection-type = "rgmii"; + phy-connection-type = "rgmii-id"; }; ethernet@e6000 { phy-handle = <&rgmii_phy2>; - phy-connection-type = "rgmii"; + phy-connection-type = "rgmii-id"; }; ethernet@e8000 { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/imx8mm-evk.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/imx8mm-evk.dts @@ -62,6 +62,8 @@ cpudai: simple-audio-card,cpu { sound-dai = <&sai3>; + dai-tdm-slot-num = <2>; + dai-tdm-slot-width = <32>; }; simple-audio-card,codec { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/imx8mm.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/imx8mm.dtsi @@ -479,14 +479,18 @@ <&clk IMX8MM_CLK_AUDIO_AHB>, <&clk IMX8MM_CLK_IPG_AUDIO_ROOT>, <&clk IMX8MM_SYS_PLL3>, - <&clk IMX8MM_VIDEO_PLL1>; + <&clk IMX8MM_VIDEO_PLL1>, + <&clk IMX8MM_AUDIO_PLL1>, + <&clk IMX8MM_AUDIO_PLL2>; assigned-clock-parents = <&clk IMX8MM_SYS_PLL3_OUT>, <&clk IMX8MM_SYS_PLL1_800M>; assigned-clock-rates = <0>, <400000000>, <400000000>, <750000000>, - <594000000>; + <594000000>, + <393216000>, + <361267200>; }; src: reset-controller@30390000 { @@ -741,7 +745,7 @@ reg = <0x30bd0000 0x10000>; interrupts = ; clocks = <&clk IMX8MM_CLK_SDMA1_ROOT>, - <&clk IMX8MM_CLK_SDMA1_ROOT>; + <&clk IMX8MM_CLK_AHB>; clock-names = "ipg", "ahb"; #dma-cells = <3>; fsl,sdma-ram-script-name = "imx/sdma/sdma-imx7d.bin"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts @@ -421,7 +421,7 @@ pinctrl-names = "default"; pinctrl-0 = <&pinctrl_imu>; interrupt-parent = <&gpio3>; - interrupts = <19 IRQ_TYPE_LEVEL_LOW>; + interrupts = <19 IRQ_TYPE_LEVEL_HIGH>; vdd-supply = <®_3v3_p>; vddio-supply = <®_3v3_p>; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/freescale/imx8qxp-mek.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/freescale/imx8qxp-mek.dts @@ -52,11 +52,6 @@ compatible = "ethernet-phy-ieee802.3-c22"; reg = <0>; }; - - ethphy1: ethernet-phy@1 { - compatible = "ethernet-phy-ieee802.3-c22"; - reg = <1>; - }; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/intel/socfpga_agilex.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/intel/socfpga_agilex.dtsi @@ -47,10 +47,10 @@ pmu { compatible = "arm,armv8-pmuv3"; - interrupts = <0 120 8>, - <0 121 8>, - <0 122 8>, - <0 123 8>; + interrupts = <0 170 4>, + <0 171 4>, + <0 172 4>, + <0 173 4>; interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, @@ -82,7 +82,7 @@ ranges = <0 0 0 0xffffffff>; gmac0: ethernet@ff800000 { - compatible = "altr,socfpga-stmmac", "snps,dwmac-3.74a", "snps,dwmac"; + compatible = "altr,socfpga-stmmac-a10-s10", "snps,dwmac-3.74a", "snps,dwmac"; reg = <0xff800000 0x2000>; interrupts = <0 90 4>; interrupt-names = "macirq"; @@ -97,7 +97,7 @@ }; gmac1: ethernet@ff802000 { - compatible = "altr,socfpga-stmmac", "snps,dwmac-3.74a", "snps,dwmac"; + compatible = "altr,socfpga-stmmac-a10-s10", "snps,dwmac-3.74a", "snps,dwmac"; reg = <0xff802000 0x2000>; interrupts = <0 91 4>; interrupt-names = "macirq"; @@ -112,7 +112,7 @@ }; gmac2: ethernet@ff804000 { - compatible = "altr,socfpga-stmmac", "snps,dwmac-3.74a", "snps,dwmac"; + compatible = "altr,socfpga-stmmac-a10-s10", "snps,dwmac-3.74a", "snps,dwmac"; reg = <0xff804000 0x2000>; interrupts = <0 92 4>; interrupt-names = "macirq"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dts @@ -143,6 +143,7 @@ phy-mode = "sgmii"; status = "okay"; managed = "in-band-status"; + phys = <&comphy1 0>; sfp = <&sfp_eth0>; }; @@ -150,11 +151,14 @@ phy-mode = "sgmii"; status = "okay"; managed = "in-band-status"; + phys = <&comphy0 1>; sfp = <&sfp_eth1>; }; &usb3 { status = "okay"; + phys = <&usb2_utmi_otg_phy>; + phy-names = "usb2-utmi-otg-phy"; }; &uart0 { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts @@ -408,6 +408,8 @@ reg = <5>; label = "cpu"; ethernet = <&cp1_eth2>; + phy-mode = "2500base-x"; + managed = "in-band-status"; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/marvell/armada-ap806-dual.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/marvell/armada-ap806-dual.dtsi @@ -21,6 +21,7 @@ reg = <0x000>; enable-method = "psci"; #cooling-cells = <2>; + clocks = <&cpu_clk 0>; }; cpu1: cpu@1 { device_type = "cpu"; @@ -28,6 +29,7 @@ reg = <0x001>; enable-method = "psci"; #cooling-cells = <2>; + clocks = <&cpu_clk 0>; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/marvell/armada-cp110.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/marvell/armada-cp110.dtsi @@ -438,10 +438,10 @@ CP110_LABEL(nand_controller): nand@720000 { /* - * Due to the limitation of the pins available - * this controller is only usable on the CPM - * for A7K and on the CPS for A8K. - */ + * Due to the limitation of the pins available + * this controller is only usable on the CPM + * for A7K and on the CPS for A8K. + */ compatible = "marvell,armada-8k-nand-controller", "marvell,armada370-nand-controller"; reg = <0x720000 0x54>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi @@ -309,9 +309,8 @@ regulator-name = "VDD_12V"; regulator-min-microvolt = <1200000>; regulator-max-microvolt = <1200000>; - gpio = <&gpio TEGRA194_MAIN_GPIO(A, 1) GPIO_ACTIVE_LOW>; + gpio = <&gpio TEGRA194_MAIN_GPIO(A, 1) GPIO_ACTIVE_HIGH>; regulator-boot-on; - enable-active-low; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/nvidia/tegra210-p2597.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/nvidia/tegra210-p2597.dtsi @@ -1612,7 +1612,7 @@ regulator-name = "VDD_HDMI_5V0"; regulator-min-microvolt = <5000000>; regulator-max-microvolt = <5000000>; - gpio = <&exp1 12 GPIO_ACTIVE_LOW>; + gpio = <&exp1 12 GPIO_ACTIVE_HIGH>; enable-active-high; vin-supply = <&vdd_5v0_sys>; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/apq8096-db820c.dtsi @@ -623,6 +623,8 @@ l21 { regulator-min-microvolt = <2950000>; regulator-max-microvolt = <2950000>; + regulator-allow-set-load; + regulator-system-load = <200000>; }; l22 { regulator-min-microvolt = <3300000>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/msm8996.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/msm8996.dtsi @@ -1598,6 +1598,8 @@ interrupts = <0 138 IRQ_TYPE_LEVEL_HIGH>; phys = <&hsusb_phy2>; phy-names = "usb2-phy"; + snps,dis_u2_susphy_quirk; + snps,dis_enblslpm_quirk; }; }; @@ -1628,6 +1630,8 @@ interrupts = <0 131 IRQ_TYPE_LEVEL_HIGH>; phys = <&hsusb_phy1>, <&ssusb_phy_0>; phy-names = "usb2-phy", "usb3-phy"; + snps,dis_u2_susphy_quirk; + snps,dis_enblslpm_quirk; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi @@ -23,6 +23,43 @@ }; }; +/* + * The laptop FW does not appear to support the retention state as it is + * not advertised as enabled in ACPI, and enabling it in DT can cause boot + * hangs. + */ +&CPU0 { + cpu-idle-states = <&LITTLE_CPU_SLEEP_1>; +}; + +&CPU1 { + cpu-idle-states = <&LITTLE_CPU_SLEEP_1>; +}; + +&CPU2 { + cpu-idle-states = <&LITTLE_CPU_SLEEP_1>; +}; + +&CPU3 { + cpu-idle-states = <&LITTLE_CPU_SLEEP_1>; +}; + +&CPU4 { + cpu-idle-states = <&BIG_CPU_SLEEP_1>; +}; + +&CPU5 { + cpu-idle-states = <&BIG_CPU_SLEEP_1>; +}; + +&CPU6 { + cpu-idle-states = <&BIG_CPU_SLEEP_1>; +}; + +&CPU7 { + cpu-idle-states = <&BIG_CPU_SLEEP_1>; +}; + &qusb2phy { status = "okay"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi @@ -27,6 +27,66 @@ status = "okay"; }; +&etf { + status = "okay"; +}; + +&etm1 { + status = "okay"; +}; + +&etm2 { + status = "okay"; +}; + +&etm3 { + status = "okay"; +}; + +&etm4 { + status = "okay"; +}; + +&etm5 { + status = "okay"; +}; + +&etm6 { + status = "okay"; +}; + +&etm7 { + status = "okay"; +}; + +&etm8 { + status = "okay"; +}; + +&etr { + status = "okay"; +}; + +&funnel1 { + status = "okay"; +}; + +&funnel2 { + status = "okay"; +}; + +&funnel3 { + status = "okay"; +}; + +&funnel4 { + status = "okay"; +}; + +&funnel5 { + status = "okay"; +}; + &pm8005_lsid1 { pm8005-regulators { compatible = "qcom,pm8005-regulators"; @@ -51,6 +111,10 @@ vdda-phy-dpdm-supply = <&vreg_l24a_3p075>; }; +&replicator1 { + status = "okay"; +}; + &rpm_requests { pm8998-regulators { compatible = "qcom,rpm-pm8998-regulators"; @@ -249,6 +313,10 @@ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>; }; +&stm { + status = "okay"; +}; + &ufshc { vcc-supply = <&vreg_l20a_2p95>; vccq-supply = <&vreg_l26a_1p2>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/msm8998.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/msm8998.dtsi @@ -985,7 +985,7 @@ tcsr_mutex_regs: syscon@1f40000 { compatible = "syscon"; - reg = <0x01f40000 0x20000>; + reg = <0x01f40000 0x40000>; }; tlmm: pinctrl@3400000 { @@ -998,11 +998,12 @@ #interrupt-cells = <0x2>; }; - stm@6002000 { + stm: stm@6002000 { compatible = "arm,coresight-stm", "arm,primecell"; reg = <0x06002000 0x1000>, <0x16280000 0x180000>; reg-names = "stm-base", "stm-data-base"; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1016,9 +1017,10 @@ }; }; - funnel@6041000 { + funnel1: funnel@6041000 { compatible = "arm,coresight-dynamic-funnel", "arm,primecell"; reg = <0x06041000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1045,9 +1047,10 @@ }; }; - funnel@6042000 { + funnel2: funnel@6042000 { compatible = "arm,coresight-dynamic-funnel", "arm,primecell"; reg = <0x06042000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1075,9 +1078,10 @@ }; }; - funnel@6045000 { + funnel3: funnel@6045000 { compatible = "arm,coresight-dynamic-funnel", "arm,primecell"; reg = <0x06045000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1113,9 +1117,10 @@ }; }; - replicator@6046000 { + replicator1: replicator@6046000 { compatible = "arm,coresight-dynamic-replicator", "arm,primecell"; reg = <0x06046000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1137,9 +1142,10 @@ }; }; - etf@6047000 { + etf: etf@6047000 { compatible = "arm,coresight-tmc", "arm,primecell"; reg = <0x06047000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1163,9 +1169,10 @@ }; }; - etr@6048000 { + etr: etr@6048000 { compatible = "arm,coresight-tmc", "arm,primecell"; reg = <0x06048000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1181,9 +1188,10 @@ }; }; - etm@7840000 { + etm1: etm@7840000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07840000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1200,9 +1208,10 @@ }; }; - etm@7940000 { + etm2: etm@7940000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07940000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1219,9 +1228,10 @@ }; }; - etm@7a40000 { + etm3: etm@7a40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07a40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1238,9 +1248,10 @@ }; }; - etm@7b40000 { + etm4: etm@7b40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07b40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1257,9 +1268,10 @@ }; }; - funnel@7b60000 { /* APSS Funnel */ + funnel4: funnel@7b60000 { /* APSS Funnel */ compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07b60000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1343,9 +1355,10 @@ }; }; - funnel@7b70000 { + funnel5: funnel@7b70000 { compatible = "arm,coresight-dynamic-funnel", "arm,primecell"; reg = <0x07b70000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1369,9 +1382,10 @@ }; }; - etm@7c40000 { + etm5: etm@7c40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07c40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1385,9 +1399,10 @@ }; }; - etm@7d40000 { + etm6: etm@7d40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07d40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1401,9 +1416,10 @@ }; }; - etm@7e40000 { + etm7: etm@7e40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07e40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; @@ -1417,9 +1433,10 @@ }; }; - etm@7f40000 { + etm8: etm@7f40000 { compatible = "arm,coresight-etm4x", "arm,primecell"; reg = <0x07f40000 0x1000>; + status = "disabled"; clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>; clock-names = "apb_pclk", "atclk"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi @@ -73,6 +73,7 @@ regulator-always-on; regulator-boot-on; regulator-name = "vdd_apc"; + regulator-initial-mode = <1>; regulator-min-microvolt = <1048000>; regulator-max-microvolt = <1384000>; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi @@ -165,6 +165,8 @@ /delete-node/ &venus_mem; /delete-node/ &cdsp_mem; /delete-node/ &cdsp_pas; +/delete-node/ &zap_shader; +/delete-node/ &gpu_mem; /* Increase the size from 120 MB to 128 MB */ &mpss_region { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/sdm845-db845c.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/sdm845-db845c.dts @@ -517,6 +517,8 @@ vdd-1.8-xo-supply = <&vreg_l7a_1p8>; vdd-1.3-rfa-supply = <&vreg_l17a_1p3>; vdd-3.3-ch0-supply = <&vreg_l25a_3p3>; + + qcom,snoc-host-cap-8bit-quirk; }; /* PINCTRL - additions to nodes defined in sdm845.dtsi */ --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -2824,7 +2824,7 @@ qcom,gmu = <&gmu>; - zap-shader { + zap_shader: zap-shader { memory-region = <&gpu_mem>; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/renesas/hihope-common.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/renesas/hihope-common.dtsi @@ -86,7 +86,7 @@ label = "rcar-sound"; - dais = <&rsnd_port0>; + dais = <&rsnd_port>; }; vbus0_usb2: regulator-vbus0-usb2 { @@ -191,7 +191,7 @@ port@2 { reg = <2>; dw_hdmi0_snd_in: endpoint { - remote-endpoint = <&rsnd_endpoint0>; + remote-endpoint = <&rsnd_endpoint>; }; }; }; @@ -327,17 +327,15 @@ /* Single DAI */ #sound-dai-cells = <0>; - ports { - rsnd_port0: port@0 { - rsnd_endpoint0: endpoint { - remote-endpoint = <&dw_hdmi0_snd_in>; - - dai-format = "i2s"; - bitclock-master = <&rsnd_endpoint0>; - frame-master = <&rsnd_endpoint0>; + rsnd_port: port { + rsnd_endpoint: endpoint { + remote-endpoint = <&dw_hdmi0_snd_in>; + + dai-format = "i2s"; + bitclock-master = <&rsnd_endpoint>; + frame-master = <&rsnd_endpoint>; - playback = <&ssi2>; - }; + playback = <&ssi2>; }; }; }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/renesas/r8a774a1.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/renesas/r8a774a1.dtsi @@ -1726,17 +1726,6 @@ "ssi.1", "ssi.0"; status = "disabled"; - ports { - #address-cells = <1>; - #size-cells = <0>; - port@0 { - reg = <0>; - }; - port@1 { - reg = <1>; - }; - }; - rcar_sound,ctu { ctu00: ctu-0 { }; ctu01: ctu-1 { }; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/renesas/r8a77970.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/renesas/r8a77970.dtsi @@ -652,7 +652,7 @@ }; pwm3: pwm@e6e33000 { - compatible = "renesas,pwm-r8a7790", "renesas,pwm-rcar"; + compatible = "renesas,pwm-r8a77970", "renesas,pwm-rcar"; reg = <0 0xe6e33000 0 8>; #pwm-cells = <2>; clocks = <&cpg CPG_MOD 523>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/renesas/r8a77990-ebisu.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/renesas/r8a77990-ebisu.dts @@ -636,7 +636,6 @@ /* audio_clkout0/1/2/3 */ #clock-cells = <1>; clock-frequency = <12288000 11289600>; - clkout-lr-synchronous; status = "okay"; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/rockchip/px30.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/rockchip/px30.dtsi @@ -768,7 +768,7 @@ interrupts = ; clocks = <&cru HCLK_SDMMC>, <&cru SCLK_SDMMC>, <&cru SCLK_SDMMC_DRV>, <&cru SCLK_SDMMC_SAMPLE>; - clock-names = "biu", "ciu", "ciu-drv", "ciu-sample"; + clock-names = "biu", "ciu", "ciu-drive", "ciu-sample"; fifo-depth = <0x100>; max-frequency = <150000000>; pinctrl-names = "default"; @@ -783,7 +783,7 @@ interrupts = ; clocks = <&cru HCLK_SDIO>, <&cru SCLK_SDIO>, <&cru SCLK_SDIO_DRV>, <&cru SCLK_SDIO_SAMPLE>; - clock-names = "biu", "ciu", "ciu-drv", "ciu-sample"; + clock-names = "biu", "ciu", "ciu-drive", "ciu-sample"; fifo-depth = <0x100>; max-frequency = <150000000>; pinctrl-names = "default"; @@ -798,7 +798,7 @@ interrupts = ; clocks = <&cru HCLK_EMMC>, <&cru SCLK_EMMC>, <&cru SCLK_EMMC_DRV>, <&cru SCLK_EMMC_SAMPLE>; - clock-names = "biu", "ciu", "ciu-drv", "ciu-sample"; + clock-names = "biu", "ciu", "ciu-drive", "ciu-sample"; fifo-depth = <0x100>; max-frequency = <150000000>; power-domains = <&power PX30_PD_MMC_NAND>; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/rockchip/rk3399-firefly.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/rockchip/rk3399-firefly.dts @@ -669,9 +669,12 @@ vqmmc-supply = &vcc1v8_s3; /* IO line */ vmmc-supply = &vcc_sdio; /* card's power */ + #address-cells = <1>; + #size-cells = <0>; status = "okay"; brcmf: wifi@1 { + reg = <1>; compatible = "brcm,bcm4329-fmac"; interrupt-parent = <&gpio0>; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi @@ -654,9 +654,12 @@ sd-uhs-sdr104; vqmmc-supply = <&vcc1v8_s3>; vmmc-supply = <&vccio_sd>; + #address-cells = <1>; + #size-cells = <0>; status = "okay"; brcmf: wifi@1 { + reg = <1>; compatible = "brcm,bcm4329-fmac"; interrupt-parent = <&gpio0>; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts @@ -94,33 +94,6 @@ }; }; -&gpu_thermal { - trips { - gpu_warm: gpu_warm { - temperature = <55000>; - hysteresis = <2000>; - type = "active"; - }; - - gpu_hot: gpu_hot { - temperature = <65000>; - hysteresis = <2000>; - type = "active"; - }; - }; - cooling-maps { - map1 { - trip = <&gpu_warm>; - cooling-device = <&fan THERMAL_NO_LIMIT 1>; - }; - - map2 { - trip = <&gpu_hot>; - cooling-device = <&fan 2 THERMAL_NO_LIMIT>; - }; - }; -}; - &pinctrl { ir { ir_rx: ir-rx { --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/rockchip/rk3399-orangepi.dts +++ linux-riscv-5.4.0/arch/arm64/boot/dts/rockchip/rk3399-orangepi.dts @@ -648,9 +648,12 @@ pinctrl-names = "default"; pinctrl-0 = <&sdio0_bus4 &sdio0_cmd &sdio0_clk>; sd-uhs-sdr104; + #address-cells = <1>; + #size-cells = <0>; status = "okay"; brcmf: wifi@1 { + reg = <1>; compatible = "brcm,bcm4329-fmac"; interrupt-parent = <&gpio0>; interrupts = ; --- linux-riscv-5.4.0.orig/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi +++ linux-riscv-5.4.0/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi @@ -43,6 +43,7 @@ smmu0: smmu@36600000 { compatible = "arm,smmu-v3"; reg = <0x0 0x36600000 0x0 0x100000>; + power-domains = <&k3_pds 229 TI_SCI_PD_EXCLUSIVE>; interrupt-parent = <&gic500>; interrupts = , ; --- linux-riscv-5.4.0.orig/arch/arm64/crypto/aes-neonbs-glue.c +++ linux-riscv-5.4.0/arch/arm64/crypto/aes-neonbs-glue.c @@ -384,7 +384,7 @@ goto xts_tail; kernel_neon_end(); - skcipher_walk_done(&walk, nbytes); + err = skcipher_walk_done(&walk, nbytes); } if (err || likely(!tail)) --- linux-riscv-5.4.0.orig/arch/arm64/crypto/ghash-ce-glue.c +++ linux-riscv-5.4.0/arch/arm64/crypto/ghash-ce-glue.c @@ -261,7 +261,7 @@ static struct shash_alg ghash_alg[] = {{ .base.cra_name = "ghash", .base.cra_driver_name = "ghash-neon", - .base.cra_priority = 100, + .base.cra_priority = 150, .base.cra_blocksize = GHASH_BLOCK_SIZE, .base.cra_ctxsize = sizeof(struct ghash_key), .base.cra_module = THIS_MODULE, --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/alternative.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/alternative.h @@ -35,13 +35,16 @@ static inline void apply_alternatives_module(void *start, size_t length) { } #endif -#define ALTINSTR_ENTRY(feature,cb) \ +#define ALTINSTR_ENTRY(feature) \ " .word 661b - .\n" /* label */ \ - " .if " __stringify(cb) " == 0\n" \ " .word 663f - .\n" /* new instruction */ \ - " .else\n" \ + " .hword " __stringify(feature) "\n" /* feature bit */ \ + " .byte 662b-661b\n" /* source len */ \ + " .byte 664f-663f\n" /* replacement len */ + +#define ALTINSTR_ENTRY_CB(feature, cb) \ + " .word 661b - .\n" /* label */ \ " .word " __stringify(cb) "- .\n" /* callback */ \ - " .endif\n" \ " .hword " __stringify(feature) "\n" /* feature bit */ \ " .byte 662b-661b\n" /* source len */ \ " .byte 664f-663f\n" /* replacement len */ @@ -62,15 +65,14 @@ * * Alternatives with callbacks do not generate replacement instructions. */ -#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb) \ +#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled) \ ".if "__stringify(cfg_enabled)" == 1\n" \ "661:\n\t" \ oldinstr "\n" \ "662:\n" \ ".pushsection .altinstructions,\"a\"\n" \ - ALTINSTR_ENTRY(feature,cb) \ + ALTINSTR_ENTRY(feature) \ ".popsection\n" \ - " .if " __stringify(cb) " == 0\n" \ ".pushsection .altinstr_replacement, \"a\"\n" \ "663:\n\t" \ newinstr "\n" \ @@ -78,17 +80,25 @@ ".popsection\n\t" \ ".org . - (664b-663b) + (662b-661b)\n\t" \ ".org . - (662b-661b) + (664b-663b)\n" \ - ".else\n\t" \ + ".endif\n" + +#define __ALTERNATIVE_CFG_CB(oldinstr, feature, cfg_enabled, cb) \ + ".if "__stringify(cfg_enabled)" == 1\n" \ + "661:\n\t" \ + oldinstr "\n" \ + "662:\n" \ + ".pushsection .altinstructions,\"a\"\n" \ + ALTINSTR_ENTRY_CB(feature, cb) \ + ".popsection\n" \ "663:\n\t" \ "664:\n\t" \ - ".endif\n" \ ".endif\n" #define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...) \ - __ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0) + __ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg)) #define ALTERNATIVE_CB(oldinstr, cb) \ - __ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_CB_PATCH, 1, cb) + __ALTERNATIVE_CFG_CB(oldinstr, ARM64_CB_PATCH, 1, cb) #else #include @@ -211,7 +221,7 @@ .macro user_alt, label, oldinstr, newinstr, cond 9999: alternative_insn "\oldinstr", "\newinstr", \cond - _ASM_EXTABLE 9999b, \label + _asm_extable 9999b, \label .endm /* --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/atomic_lse.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/atomic_lse.h @@ -14,6 +14,7 @@ static inline void __lse_atomic_##op(int i, atomic_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " " #asm_op " %w[i], %[v]\n" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v)); \ @@ -30,6 +31,7 @@ static inline int __lse_atomic_fetch_##op##name(int i, atomic_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " " #asm_op #mb " %w[i], %w[i], %[v]" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v) \ @@ -58,6 +60,7 @@ u32 tmp; \ \ asm volatile( \ + __LSE_PREAMBLE \ " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ " add %w[i], %w[i], %w[tmp]" \ : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ @@ -77,6 +80,7 @@ static inline void __lse_atomic_and(int i, atomic_t *v) { asm volatile( + __LSE_PREAMBLE " mvn %w[i], %w[i]\n" " stclr %w[i], %[v]" : [i] "+&r" (i), [v] "+Q" (v->counter) @@ -87,6 +91,7 @@ static inline int __lse_atomic_fetch_and##name(int i, atomic_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " mvn %w[i], %w[i]\n" \ " ldclr" #mb " %w[i], %w[i], %[v]" \ : [i] "+&r" (i), [v] "+Q" (v->counter) \ @@ -106,6 +111,7 @@ static inline void __lse_atomic_sub(int i, atomic_t *v) { asm volatile( + __LSE_PREAMBLE " neg %w[i], %w[i]\n" " stadd %w[i], %[v]" : [i] "+&r" (i), [v] "+Q" (v->counter) @@ -118,6 +124,7 @@ u32 tmp; \ \ asm volatile( \ + __LSE_PREAMBLE \ " neg %w[i], %w[i]\n" \ " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ " add %w[i], %w[i], %w[tmp]" \ @@ -139,6 +146,7 @@ static inline int __lse_atomic_fetch_sub##name(int i, atomic_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " neg %w[i], %w[i]\n" \ " ldadd" #mb " %w[i], %w[i], %[v]" \ : [i] "+&r" (i), [v] "+Q" (v->counter) \ @@ -159,6 +167,7 @@ static inline void __lse_atomic64_##op(s64 i, atomic64_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " " #asm_op " %[i], %[v]\n" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v)); \ @@ -175,6 +184,7 @@ static inline long __lse_atomic64_fetch_##op##name(s64 i, atomic64_t *v)\ { \ asm volatile( \ + __LSE_PREAMBLE \ " " #asm_op #mb " %[i], %[i], %[v]" \ : [i] "+r" (i), [v] "+Q" (v->counter) \ : "r" (v) \ @@ -203,6 +213,7 @@ unsigned long tmp; \ \ asm volatile( \ + __LSE_PREAMBLE \ " ldadd" #mb " %[i], %x[tmp], %[v]\n" \ " add %[i], %[i], %x[tmp]" \ : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ @@ -222,6 +233,7 @@ static inline void __lse_atomic64_and(s64 i, atomic64_t *v) { asm volatile( + __LSE_PREAMBLE " mvn %[i], %[i]\n" " stclr %[i], %[v]" : [i] "+&r" (i), [v] "+Q" (v->counter) @@ -232,6 +244,7 @@ static inline long __lse_atomic64_fetch_and##name(s64 i, atomic64_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " mvn %[i], %[i]\n" \ " ldclr" #mb " %[i], %[i], %[v]" \ : [i] "+&r" (i), [v] "+Q" (v->counter) \ @@ -251,6 +264,7 @@ static inline void __lse_atomic64_sub(s64 i, atomic64_t *v) { asm volatile( + __LSE_PREAMBLE " neg %[i], %[i]\n" " stadd %[i], %[v]" : [i] "+&r" (i), [v] "+Q" (v->counter) @@ -263,6 +277,7 @@ unsigned long tmp; \ \ asm volatile( \ + __LSE_PREAMBLE \ " neg %[i], %[i]\n" \ " ldadd" #mb " %[i], %x[tmp], %[v]\n" \ " add %[i], %[i], %x[tmp]" \ @@ -284,6 +299,7 @@ static inline long __lse_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \ { \ asm volatile( \ + __LSE_PREAMBLE \ " neg %[i], %[i]\n" \ " ldadd" #mb " %[i], %[i], %[v]" \ : [i] "+&r" (i), [v] "+Q" (v->counter) \ @@ -305,6 +321,7 @@ unsigned long tmp; asm volatile( + __LSE_PREAMBLE "1: ldr %x[tmp], %[v]\n" " subs %[ret], %x[tmp], #1\n" " b.lt 2f\n" @@ -332,6 +349,7 @@ unsigned long tmp; \ \ asm volatile( \ + __LSE_PREAMBLE \ " mov %" #w "[tmp], %" #w "[old]\n" \ " cas" #mb #sfx "\t%" #w "[tmp], %" #w "[new], %[v]\n" \ " mov %" #w "[ret], %" #w "[tmp]" \ @@ -379,6 +397,7 @@ register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ \ asm volatile( \ + __LSE_PREAMBLE \ " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ " eor %[old1], %[old1], %[oldval1]\n" \ " eor %[old2], %[old2], %[oldval2]\n" \ --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/daifflags.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/daifflags.h @@ -36,7 +36,7 @@ trace_hardirqs_off(); } -static inline unsigned long local_daif_save(void) +static inline unsigned long local_daif_save_flags(void) { unsigned long flags; @@ -48,6 +48,15 @@ flags |= PSR_I_BIT; } + return flags; +} + +static inline unsigned long local_daif_save(void) +{ + unsigned long flags; + + flags = local_daif_save_flags(); + local_daif_mask(); return flags; --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/kvm_emulate.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/kvm_emulate.h @@ -204,6 +204,38 @@ vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1] = v; } +/* + * The layout of SPSR for an AArch32 state is different when observed from an + * AArch64 SPSR_ELx or an AArch32 SPSR_*. This function generates the AArch32 + * view given an AArch64 view. + * + * In ARM DDI 0487E.a see: + * + * - The AArch64 view (SPSR_EL2) in section C5.2.18, page C5-426 + * - The AArch32 view (SPSR_abt) in section G8.2.126, page G8-6256 + * - The AArch32 view (SPSR_und) in section G8.2.132, page G8-6280 + * + * Which show the following differences: + * + * | Bit | AA64 | AA32 | Notes | + * +-----+------+------+-----------------------------| + * | 24 | DIT | J | J is RES0 in ARMv8 | + * | 21 | SS | DIT | SS doesn't exist in AArch32 | + * + * ... and all other bits are (currently) common. + */ +static inline unsigned long host_spsr_to_spsr32(unsigned long spsr) +{ + const unsigned long overlap = BIT(24) | BIT(21); + unsigned long dit = !!(spsr & PSR_AA32_DIT_BIT); + + spsr &= ~overlap; + + spsr |= dit << 21; + + return spsr; +} + static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu) { u32 mode; @@ -263,6 +295,11 @@ return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SSE); } +static inline bool kvm_vcpu_dabt_issf(const struct kvm_vcpu *vcpu) +{ + return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SF); +} + static inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu) { return (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT; --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/kvm_mmio.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/kvm_mmio.h @@ -10,13 +10,11 @@ #include #include -/* - * This is annoying. The mmio code requires this, even if we don't - * need any decoding. To be fixed. - */ struct kvm_decode { unsigned long rt; bool sign_extend; + /* Witdth of the register accessed by the faulting instruction is 64-bits */ + bool sixty_four; }; void kvm_mmio_write_buf(void *buf, unsigned int len, unsigned long data); --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/lse.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/lse.h @@ -6,6 +6,8 @@ #if defined(CONFIG_AS_LSE) && defined(CONFIG_ARM64_LSE_ATOMICS) +#define __LSE_PREAMBLE ".arch_extension lse\n" + #include #include #include @@ -14,8 +16,6 @@ #include #include -__asm__(".arch_extension lse"); - extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS]; extern struct static_key_false arm64_const_caps_ready; @@ -34,7 +34,7 @@ /* In-line patching at runtime */ #define ARM64_LSE_ATOMIC_INSN(llsc, lse) \ - ALTERNATIVE(llsc, lse, ARM64_HAS_LSE_ATOMICS) + ALTERNATIVE(llsc, __LSE_PREAMBLE lse, ARM64_HAS_LSE_ATOMICS) #else /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */ --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/memory.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/memory.h @@ -219,7 +219,7 @@ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) #define untagged_addr(addr) ({ \ - u64 __addr = (__force u64)addr; \ + u64 __addr = (__force u64)(addr); \ __addr &= __untagged_addr(__addr); \ (__force __typeof__(addr))__addr; \ }) --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/pgtable-prot.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/pgtable-prot.h @@ -85,13 +85,12 @@ #define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_WRITE) #define PAGE_READONLY __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN) #define PAGE_READONLY_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN) -#define PAGE_EXECONLY __pgprot(_PAGE_DEFAULT | PTE_RDONLY | PTE_NG | PTE_PXN) #define __P000 PAGE_NONE #define __P001 PAGE_READONLY #define __P010 PAGE_READONLY #define __P011 PAGE_READONLY -#define __P100 PAGE_EXECONLY +#define __P100 PAGE_READONLY_EXEC #define __P101 PAGE_READONLY_EXEC #define __P110 PAGE_READONLY_EXEC #define __P111 PAGE_READONLY_EXEC @@ -100,7 +99,7 @@ #define __S001 PAGE_READONLY #define __S010 PAGE_SHARED #define __S011 PAGE_SHARED -#define __S100 PAGE_EXECONLY +#define __S100 PAGE_READONLY_EXEC #define __S101 PAGE_READONLY_EXEC #define __S110 PAGE_SHARED_EXEC #define __S111 PAGE_SHARED_EXEC --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/pgtable.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/pgtable.h @@ -96,12 +96,8 @@ #define pte_dirty(pte) (pte_sw_dirty(pte) || pte_hw_dirty(pte)) #define pte_valid(pte) (!!(pte_val(pte) & PTE_VALID)) -/* - * Execute-only user mappings do not have the PTE_USER bit set. All valid - * kernel mappings have the PTE_UXN bit set. - */ #define pte_valid_not_user(pte) \ - ((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == (PTE_VALID | PTE_UXN)) + ((pte_val(pte) & (PTE_VALID | PTE_USER)) == PTE_VALID) #define pte_valid_young(pte) \ ((pte_val(pte) & (PTE_VALID | PTE_AF)) == (PTE_VALID | PTE_AF)) #define pte_valid_user(pte) \ @@ -117,8 +113,8 @@ /* * p??_access_permitted() is true for valid user mappings (subject to the - * write permission check) other than user execute-only which do not have the - * PTE_USER bit set. PROT_NONE mappings do not have the PTE_VALID bit set. + * write permission check). PROT_NONE mappings do not have the PTE_VALID bit + * set. */ #define pte_access_permitted(pte, write) \ (pte_valid_user(pte) && (!(write) || pte_write(pte))) --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/ptrace.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/ptrace.h @@ -62,6 +62,7 @@ #define PSR_AA32_I_BIT 0x00000080 #define PSR_AA32_A_BIT 0x00000100 #define PSR_AA32_E_BIT 0x00000200 +#define PSR_AA32_PAN_BIT 0x00400000 #define PSR_AA32_SSBS_BIT 0x00800000 #define PSR_AA32_DIT_BIT 0x01000000 #define PSR_AA32_Q_BIT 0x08000000 --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/uaccess.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/uaccess.h @@ -62,8 +62,13 @@ { unsigned long ret, limit = current_thread_info()->addr_limit; + /* + * Asynchronous I/O running in a kernel thread does not have the + * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag + * the user address before checking. + */ if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) && - test_thread_flag(TIF_TAGGED_ADDR)) + (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR))) addr = untagged_addr(addr); __chk_user_ptr(addr); --- linux-riscv-5.4.0.orig/arch/arm64/include/asm/unistd.h +++ linux-riscv-5.4.0/arch/arm64/include/asm/unistd.h @@ -25,8 +25,8 @@ #define __NR_compat_gettimeofday 78 #define __NR_compat_sigreturn 119 #define __NR_compat_rt_sigreturn 173 -#define __NR_compat_clock_getres 247 #define __NR_compat_clock_gettime 263 +#define __NR_compat_clock_getres 264 #define __NR_compat_clock_gettime64 403 #define __NR_compat_clock_getres_time64 406 @@ -42,7 +42,6 @@ #endif #define __ARCH_WANT_SYS_CLONE -#define __ARCH_WANT_SYS_CLONE3 #ifndef __COMPAT_SYSCALL_NR #include --- linux-riscv-5.4.0.orig/arch/arm64/include/uapi/asm/ptrace.h +++ linux-riscv-5.4.0/arch/arm64/include/uapi/asm/ptrace.h @@ -49,6 +49,7 @@ #define PSR_SSBS_BIT 0x00001000 #define PSR_PAN_BIT 0x00400000 #define PSR_UAO_BIT 0x00800000 +#define PSR_DIT_BIT 0x01000000 #define PSR_V_BIT 0x10000000 #define PSR_C_BIT 0x20000000 #define PSR_Z_BIT 0x40000000 --- linux-riscv-5.4.0.orig/arch/arm64/include/uapi/asm/unistd.h +++ linux-riscv-5.4.0/arch/arm64/include/uapi/asm/unistd.h @@ -19,5 +19,6 @@ #define __ARCH_WANT_NEW_STAT #define __ARCH_WANT_SET_GET_RLIMIT #define __ARCH_WANT_TIME32_SYSCALLS +#define __ARCH_WANT_SYS_CLONE3 #include --- linux-riscv-5.4.0.orig/arch/arm64/kernel/acpi.c +++ linux-riscv-5.4.0/arch/arm64/kernel/acpi.c @@ -274,7 +274,7 @@ if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES)) return err; - current_flags = arch_local_save_flags(); + current_flags = local_daif_save_flags(); /* * SEA can interrupt SError, mask it and describe this as an NMI so --- linux-riscv-5.4.0.orig/arch/arm64/kernel/cpu_errata.c +++ linux-riscv-5.4.0/arch/arm64/kernel/cpu_errata.c @@ -575,6 +575,7 @@ MIDR_ALL_VERSIONS(MIDR_CORTEX_A53), MIDR_ALL_VERSIONS(MIDR_CORTEX_A55), MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53), + MIDR_ALL_VERSIONS(MIDR_HISI_TSV110), { /* sentinel */ } }; --- linux-riscv-5.4.0.orig/arch/arm64/kernel/cpufeature.c +++ linux-riscv-5.4.0/arch/arm64/kernel/cpufeature.c @@ -32,9 +32,7 @@ #define COMPAT_ELF_HWCAP_DEFAULT \ (COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\ COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\ - COMPAT_HWCAP_TLS|COMPAT_HWCAP_VFP|\ - COMPAT_HWCAP_VFPv3|COMPAT_HWCAP_VFPv4|\ - COMPAT_HWCAP_NEON|COMPAT_HWCAP_IDIV|\ + COMPAT_HWCAP_TLS|COMPAT_HWCAP_IDIV|\ COMPAT_HWCAP_LPAE) unsigned int compat_elf_hwcap __read_mostly = COMPAT_ELF_HWCAP_DEFAULT; unsigned int compat_elf_hwcap2 __read_mostly; @@ -1367,7 +1365,7 @@ { /* FP/SIMD is not implemented */ .capability = ARM64_HAS_NO_FPSIMD, - .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE, .min_field_value = 0, .matches = has_no_fpsimd, }, @@ -1595,6 +1593,12 @@ .match_list = list, \ } +#define HWCAP_CAP_MATCH(match, cap_type, cap) \ + { \ + __HWCAP_CAP(#cap, cap_type, cap) \ + .matches = match, \ + } + #ifdef CONFIG_ARM64_PTR_AUTH static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = { { @@ -1668,8 +1672,35 @@ {}, }; +#ifdef CONFIG_COMPAT +static bool compat_has_neon(const struct arm64_cpu_capabilities *cap, int scope) +{ + /* + * Check that all of MVFR1_EL1.{SIMDSP, SIMDInt, SIMDLS} are available, + * in line with that of arm32 as in vfp_init(). We make sure that the + * check is future proof, by making sure value is non-zero. + */ + u32 mvfr1; + + WARN_ON(scope == SCOPE_LOCAL_CPU && preemptible()); + if (scope == SCOPE_SYSTEM) + mvfr1 = read_sanitised_ftr_reg(SYS_MVFR1_EL1); + else + mvfr1 = read_sysreg_s(SYS_MVFR1_EL1); + + return cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDSP_SHIFT) && + cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDINT_SHIFT) && + cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDLS_SHIFT); +} +#endif + static const struct arm64_cpu_capabilities compat_elf_hwcaps[] = { #ifdef CONFIG_COMPAT + HWCAP_CAP_MATCH(compat_has_neon, CAP_COMPAT_HWCAP, COMPAT_HWCAP_NEON), + HWCAP_CAP(SYS_MVFR1_EL1, MVFR1_SIMDFMAC_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv4), + /* Arm v8 mandates MVFR0.FPDP == {0, 2}. So, piggy back on this for the presence of VFP support */ + HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFP), + HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv3), HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL), HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES), HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1), --- linux-riscv-5.4.0.orig/arch/arm64/kernel/fpsimd.c +++ linux-riscv-5.4.0/arch/arm64/kernel/fpsimd.c @@ -269,6 +269,7 @@ */ static void task_fpsimd_load(void) { + WARN_ON(!system_supports_fpsimd()); WARN_ON(!have_cpu_fpsimd_context()); if (system_supports_sve() && test_thread_flag(TIF_SVE)) @@ -289,6 +290,7 @@ this_cpu_ptr(&fpsimd_last_state); /* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */ + WARN_ON(!system_supports_fpsimd()); WARN_ON(!have_cpu_fpsimd_context()); if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { @@ -1092,6 +1094,7 @@ struct fpsimd_last_state_struct *last = this_cpu_ptr(&fpsimd_last_state); + WARN_ON(!system_supports_fpsimd()); last->st = ¤t->thread.uw.fpsimd_state; last->sve_state = current->thread.sve_state; last->sve_vl = current->thread.sve_vl; @@ -1114,6 +1117,7 @@ struct fpsimd_last_state_struct *last = this_cpu_ptr(&fpsimd_last_state); + WARN_ON(!system_supports_fpsimd()); WARN_ON(!in_softirq() && !irqs_disabled()); last->st = st; @@ -1128,8 +1132,19 @@ */ void fpsimd_restore_current_state(void) { - if (!system_supports_fpsimd()) + /* + * For the tasks that were created before we detected the absence of + * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via fpsimd_thread_switch(), + * e.g, init. This could be then inherited by the children processes. + * If we later detect that the system doesn't support FP/SIMD, + * we must clear the flag for all the tasks to indicate that the + * FPSTATE is clean (as we can't have one) to avoid looping for ever in + * do_notify_resume(). + */ + if (!system_supports_fpsimd()) { + clear_thread_flag(TIF_FOREIGN_FPSTATE); return; + } get_cpu_fpsimd_context(); @@ -1148,7 +1163,7 @@ */ void fpsimd_update_current_state(struct user_fpsimd_state const *state) { - if (!system_supports_fpsimd()) + if (WARN_ON(!system_supports_fpsimd())) return; get_cpu_fpsimd_context(); @@ -1179,7 +1194,13 @@ void fpsimd_flush_task_state(struct task_struct *t) { t->thread.fpsimd_cpu = NR_CPUS; - + /* + * If we don't support fpsimd, bail out after we have + * reset the fpsimd_cpu for this task and clear the + * FPSTATE. + */ + if (!system_supports_fpsimd()) + return; barrier(); set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE); @@ -1193,6 +1214,7 @@ */ static void fpsimd_flush_cpu_state(void) { + WARN_ON(!system_supports_fpsimd()); __this_cpu_write(fpsimd_last_state.st, NULL); set_thread_flag(TIF_FOREIGN_FPSTATE); } @@ -1203,6 +1225,8 @@ */ void fpsimd_save_and_flush_cpu_state(void) { + if (!system_supports_fpsimd()) + return; WARN_ON(preemptible()); __get_cpu_fpsimd_context(); fpsimd_save(); --- linux-riscv-5.4.0.orig/arch/arm64/kernel/process.c +++ linux-riscv-5.4.0/arch/arm64/kernel/process.c @@ -360,8 +360,8 @@ asmlinkage void ret_from_fork(void) asm("ret_from_fork"); -int copy_thread(unsigned long clone_flags, unsigned long stack_start, - unsigned long stk_sz, struct task_struct *p) +int copy_thread_tls(unsigned long clone_flags, unsigned long stack_start, + unsigned long stk_sz, struct task_struct *p, unsigned long tls) { struct pt_regs *childregs = task_pt_regs(p); @@ -394,11 +394,11 @@ } /* - * If a TLS pointer was passed to clone (4th argument), use it - * for the new thread. + * If a TLS pointer was passed to clone, use it for the new + * thread. */ if (clone_flags & CLONE_SETTLS) - p->thread.uw.tp_value = childregs->regs[3]; + p->thread.uw.tp_value = tls; } else { memset(childregs, 0, sizeof(struct pt_regs)); childregs->pstate = PSR_MODE_EL1h; @@ -466,6 +466,13 @@ if (unlikely(next->flags & PF_KTHREAD)) return; + /* + * If all CPUs implement the SSBS extension, then we just need to + * context-switch the PSTATE field. + */ + if (cpu_have_feature(cpu_feature(SSBS))) + return; + /* If the mitigation is enabled, then we leave SSBS clear. */ if ((arm64_get_ssbd_state() == ARM64_SSBD_FORCE_ENABLE) || test_tsk_thread_flag(next, TIF_SSBD)) --- linux-riscv-5.4.0.orig/arch/arm64/kernel/psci.c +++ linux-riscv-5.4.0/arch/arm64/kernel/psci.c @@ -81,7 +81,8 @@ static int cpu_psci_cpu_kill(unsigned int cpu) { - int err, i; + int err; + unsigned long start, end; if (!psci_ops.affinity_info) return 0; @@ -91,16 +92,18 @@ * while it is dying. So, try again a few times. */ - for (i = 0; i < 10; i++) { + start = jiffies; + end = start + msecs_to_jiffies(100); + do { err = psci_ops.affinity_info(cpu_logical_map(cpu), 0); if (err == PSCI_0_2_AFFINITY_LEVEL_OFF) { - pr_info("CPU%d killed.\n", cpu); + pr_info("CPU%d killed (polled %d ms)\n", cpu, + jiffies_to_msecs(jiffies - start)); return 0; } - msleep(10); - pr_info("Retrying again to check for CPU kill\n"); - } + usleep_range(100, 1000); + } while (time_before(jiffies, end)); pr_warn("CPU%d may not have shut down cleanly (AFFINITY_INFO reports %d)\n", cpu, err); --- linux-riscv-5.4.0.orig/arch/arm64/kernel/ptrace.c +++ linux-riscv-5.4.0/arch/arm64/kernel/ptrace.c @@ -615,6 +615,13 @@ return 0; } +static int fpr_active(struct task_struct *target, const struct user_regset *regset) +{ + if (!system_supports_fpsimd()) + return -ENODEV; + return regset->n; +} + /* * TODO: update fp accessors for lazy context switching (sync/flush hwstate) */ @@ -637,6 +644,9 @@ unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { + if (!system_supports_fpsimd()) + return -EINVAL; + if (target == current) fpsimd_preserve_current_state(); @@ -676,6 +686,9 @@ { int ret; + if (!system_supports_fpsimd()) + return -EINVAL; + ret = __fpr_set(target, regset, pos, count, kbuf, ubuf, 0); if (ret) return ret; @@ -1134,6 +1147,7 @@ */ .size = sizeof(u32), .align = sizeof(u32), + .active = fpr_active, .get = fpr_get, .set = fpr_set }, @@ -1348,6 +1362,9 @@ compat_ulong_t fpscr; int ret, vregs_end_pos; + if (!system_supports_fpsimd()) + return -EINVAL; + uregs = &target->thread.uw.fpsimd_state; if (target == current) @@ -1381,6 +1398,9 @@ compat_ulong_t fpscr; int ret, vregs_end_pos; + if (!system_supports_fpsimd()) + return -EINVAL; + uregs = &target->thread.uw.fpsimd_state; vregs_end_pos = VFP_STATE_SIZE - sizeof(compat_ulong_t); @@ -1438,6 +1458,7 @@ .n = VFP_STATE_SIZE / sizeof(compat_ulong_t), .size = sizeof(compat_ulong_t), .align = sizeof(compat_ulong_t), + .active = fpr_active, .get = compat_vfp_get, .set = compat_vfp_set }, --- linux-riscv-5.4.0.orig/arch/arm64/kernel/smp.c +++ linux-riscv-5.4.0/arch/arm64/kernel/smp.c @@ -955,11 +955,22 @@ } #endif +/* + * The number of CPUs online, not counting this CPU (which may not be + * fully online and so not counted in num_online_cpus()). + */ +static inline unsigned int num_other_online_cpus(void) +{ + unsigned int this_cpu_online = cpu_online(smp_processor_id()); + + return num_online_cpus() - this_cpu_online; +} + void smp_send_stop(void) { unsigned long timeout; - if (num_online_cpus() > 1) { + if (num_other_online_cpus()) { cpumask_t mask; cpumask_copy(&mask, cpu_online_mask); @@ -972,10 +983,10 @@ /* Wait up to one second for other CPUs to stop */ timeout = USEC_PER_SEC; - while (num_online_cpus() > 1 && timeout--) + while (num_other_online_cpus() && timeout--) udelay(1); - if (num_online_cpus() > 1) + if (num_other_online_cpus()) pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); @@ -998,7 +1009,11 @@ cpus_stopped = 1; - if (num_online_cpus() == 1) { + /* + * If this cpu is the only one alive at this point in time, online or + * not, there are no stop messages to be sent around, so just back out. + */ + if (num_other_online_cpus() == 0) { sdei_mask_local_cpu(); return; } @@ -1006,7 +1021,7 @@ cpumask_copy(&mask, cpu_online_mask); cpumask_clear_cpu(smp_processor_id(), &mask); - atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); + atomic_set(&waiting_for_crash_ipi, num_other_online_cpus()); pr_crit("SMP: stopping secondary CPUs\n"); smp_cross_call(&mask, IPI_CPU_CRASH_STOP); --- linux-riscv-5.4.0.orig/arch/arm64/kvm/debug.c +++ linux-riscv-5.4.0/arch/arm64/kvm/debug.c @@ -101,7 +101,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) { bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY); - unsigned long mdscr; + unsigned long mdscr, orig_mdcr_el2 = vcpu->arch.mdcr_el2; trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug); @@ -197,6 +197,10 @@ if (vcpu_read_sys_reg(vcpu, MDSCR_EL1) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE)) vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY; + /* Write mdcr_el2 changes since vcpu_load on VHE systems */ + if (has_vhe() && orig_mdcr_el2 != vcpu->arch.mdcr_el2) + write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); + trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2); trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_read_sys_reg(vcpu, MDSCR_EL1)); } --- linux-riscv-5.4.0.orig/arch/arm64/kvm/hyp/switch.c +++ linux-riscv-5.4.0/arch/arm64/kvm/hyp/switch.c @@ -28,7 +28,15 @@ /* Check whether the FP regs were dirtied while in the host-side run loop: */ static bool __hyp_text update_fp_enabled(struct kvm_vcpu *vcpu) { - if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) + /* + * When the system doesn't support FP/SIMD, we cannot rely on + * the _TIF_FOREIGN_FPSTATE flag. However, we always inject an + * abort on the very first access to FP and thus we should never + * see KVM_ARM64_FP_ENABLED. For added safety, make sure we always + * trap the accesses. + */ + if (!system_supports_fpsimd() || + vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) vcpu->arch.flags &= ~(KVM_ARM64_FP_ENABLED | KVM_ARM64_FP_HOST); --- linux-riscv-5.4.0.orig/arch/arm64/kvm/inject_fault.c +++ linux-riscv-5.4.0/arch/arm64/kvm/inject_fault.c @@ -14,9 +14,6 @@ #include #include -#define PSTATE_FAULT_BITS_64 (PSR_MODE_EL1h | PSR_A_BIT | PSR_F_BIT | \ - PSR_I_BIT | PSR_D_BIT) - #define CURRENT_EL_SP_EL0_VECTOR 0x0 #define CURRENT_EL_SP_ELx_VECTOR 0x200 #define LOWER_EL_AArch64_VECTOR 0x400 @@ -50,6 +47,69 @@ return vcpu_read_sys_reg(vcpu, VBAR_EL1) + exc_offset + type; } +/* + * When an exception is taken, most PSTATE fields are left unchanged in the + * handler. However, some are explicitly overridden (e.g. M[4:0]). Luckily all + * of the inherited bits have the same position in the AArch64/AArch32 SPSR_ELx + * layouts, so we don't need to shuffle these for exceptions from AArch32 EL0. + * + * For the SPSR_ELx layout for AArch64, see ARM DDI 0487E.a page C5-429. + * For the SPSR_ELx layout for AArch32, see ARM DDI 0487E.a page C5-426. + * + * Here we manipulate the fields in order of the AArch64 SPSR_ELx layout, from + * MSB to LSB. + */ +static unsigned long get_except64_pstate(struct kvm_vcpu *vcpu) +{ + unsigned long sctlr = vcpu_read_sys_reg(vcpu, SCTLR_EL1); + unsigned long old, new; + + old = *vcpu_cpsr(vcpu); + new = 0; + + new |= (old & PSR_N_BIT); + new |= (old & PSR_Z_BIT); + new |= (old & PSR_C_BIT); + new |= (old & PSR_V_BIT); + + // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests) + + new |= (old & PSR_DIT_BIT); + + // PSTATE.UAO is set to zero upon any exception to AArch64 + // See ARM DDI 0487E.a, page D5-2579. + + // PSTATE.PAN is unchanged unless SCTLR_ELx.SPAN == 0b0 + // SCTLR_ELx.SPAN is RES1 when ARMv8.1-PAN is not implemented + // See ARM DDI 0487E.a, page D5-2578. + new |= (old & PSR_PAN_BIT); + if (!(sctlr & SCTLR_EL1_SPAN)) + new |= PSR_PAN_BIT; + + // PSTATE.SS is set to zero upon any exception to AArch64 + // See ARM DDI 0487E.a, page D2-2452. + + // PSTATE.IL is set to zero upon any exception to AArch64 + // See ARM DDI 0487E.a, page D1-2306. + + // PSTATE.SSBS is set to SCTLR_ELx.DSSBS upon any exception to AArch64 + // See ARM DDI 0487E.a, page D13-3258 + if (sctlr & SCTLR_ELx_DSSBS) + new |= PSR_SSBS_BIT; + + // PSTATE.BTYPE is set to zero upon any exception to AArch64 + // See ARM DDI 0487E.a, pages D1-2293 to D1-2294. + + new |= PSR_D_BIT; + new |= PSR_A_BIT; + new |= PSR_I_BIT; + new |= PSR_F_BIT; + + new |= PSR_MODE_EL1h; + + return new; +} + static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr) { unsigned long cpsr = *vcpu_cpsr(vcpu); @@ -59,7 +119,7 @@ vcpu_write_elr_el1(vcpu, *vcpu_pc(vcpu)); *vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync); - *vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64; + *vcpu_cpsr(vcpu) = get_except64_pstate(vcpu); vcpu_write_spsr(vcpu, cpsr); vcpu_write_sys_reg(vcpu, addr, FAR_EL1); @@ -94,7 +154,7 @@ vcpu_write_elr_el1(vcpu, *vcpu_pc(vcpu)); *vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync); - *vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64; + *vcpu_cpsr(vcpu) = get_except64_pstate(vcpu); vcpu_write_spsr(vcpu, cpsr); /* --- linux-riscv-5.4.0.orig/arch/arm64/kvm/sys_regs.c +++ linux-riscv-5.4.0/arch/arm64/kvm/sys_regs.c @@ -2360,8 +2360,11 @@ if ((id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM64_SYSREG) return NULL; + if (!index_to_params(id, ¶ms)) + return NULL; + table = get_target_table(vcpu->arch.target, true, &num); - r = find_reg_by_id(id, ¶ms, table, num); + r = find_reg(¶ms, table, num); if (!r) r = find_reg(¶ms, sys_reg_descs, ARRAY_SIZE(sys_reg_descs)); --- linux-riscv-5.4.0.orig/arch/arm64/mm/fault.c +++ linux-riscv-5.4.0/arch/arm64/mm/fault.c @@ -454,7 +454,7 @@ const struct fault_info *inf; struct mm_struct *mm = current->mm; vm_fault_t fault, major = 0; - unsigned long vm_flags = VM_READ | VM_WRITE; + unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC; unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; if (kprobe_page_fault(regs, esr)) --- linux-riscv-5.4.0.orig/arch/arm64/mm/mmu.c +++ linux-riscv-5.4.0/arch/arm64/mm/mmu.c @@ -1069,7 +1069,6 @@ { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; /* * FIXME: Cleanup page tables (also in arch_add_memory() in case @@ -1078,7 +1077,6 @@ * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be * unlocked yet. */ - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif --- linux-riscv-5.4.0.orig/arch/csky/Kconfig +++ linux-riscv-5.4.0/arch/csky/Kconfig @@ -36,6 +36,7 @@ select GX6605S_TIMER if CPU_CK610 select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_AUDITSYSCALL + select HAVE_COPY_THREAD_TLS select HAVE_DYNAMIC_FTRACE select HAVE_FUNCTION_TRACER select HAVE_FUNCTION_GRAPH_TRACER @@ -74,7 +75,7 @@ config CPU_HAS_LDSTEX bool help - For SMP, CPU needs "ldex&stex" instrcutions to atomic operations. + For SMP, CPU needs "ldex&stex" instructions for atomic operations. config CPU_NEED_TLBSYNC bool --- linux-riscv-5.4.0.orig/arch/csky/abiv1/inc/abi/entry.h +++ linux-riscv-5.4.0/arch/csky/abiv1/inc/abi/entry.h @@ -16,14 +16,16 @@ #define LSAVE_A4 40 #define LSAVE_A5 44 +#define usp ss1 + .macro USPTOKSP - mtcr sp, ss1 + mtcr sp, usp mfcr sp, ss0 .endm .macro KSPTOUSP mtcr sp, ss0 - mfcr sp, ss1 + mfcr sp, usp .endm .macro SAVE_ALL epc_inc @@ -45,7 +47,13 @@ add lr, r13 stw lr, (sp, 8) + mov lr, sp + addi lr, 32 + addi lr, 32 + addi lr, 16 + bt 2f mfcr lr, ss1 +2: stw lr, (sp, 16) stw a0, (sp, 20) @@ -79,9 +87,10 @@ ldw a0, (sp, 12) mtcr a0, epsr btsti a0, 31 + bt 1f ldw a0, (sp, 16) mtcr a0, ss1 - +1: ldw a0, (sp, 24) ldw a1, (sp, 28) ldw a2, (sp, 32) @@ -102,9 +111,9 @@ addi sp, 32 addi sp, 8 - bt 1f + bt 2f KSPTOUSP -1: +2: rte .endm --- linux-riscv-5.4.0.orig/arch/csky/abiv2/inc/abi/entry.h +++ linux-riscv-5.4.0/arch/csky/abiv2/inc/abi/entry.h @@ -31,7 +31,13 @@ mfcr lr, epsr stw lr, (sp, 12) + btsti lr, 31 + bf 1f + addi lr, sp, 152 + br 2f +1: mfcr lr, usp +2: stw lr, (sp, 16) stw a0, (sp, 20) @@ -64,8 +70,10 @@ mtcr a0, epc ldw a0, (sp, 12) mtcr a0, epsr + btsti a0, 31 ldw a0, (sp, 16) mtcr a0, usp + mtcr a0, ss0 #ifdef CONFIG_CPU_HAS_HILO ldw a0, (sp, 140) @@ -86,6 +94,9 @@ addi sp, 40 ldm r16-r30, (sp) addi sp, 72 + bf 1f + mfcr sp, ss0 +1: rte .endm --- linux-riscv-5.4.0.orig/arch/csky/include/uapi/asm/unistd.h +++ linux-riscv-5.4.0/arch/csky/include/uapi/asm/unistd.h @@ -1,7 +1,10 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ // Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd. +#define __ARCH_WANT_STAT64 +#define __ARCH_WANT_NEW_STAT #define __ARCH_WANT_SYS_CLONE +#define __ARCH_WANT_SYS_CLONE3 #define __ARCH_WANT_SET_GET_RLIMIT #define __ARCH_WANT_TIME32_SYSCALLS #include --- linux-riscv-5.4.0.orig/arch/csky/kernel/atomic.S +++ linux-riscv-5.4.0/arch/csky/kernel/atomic.S @@ -17,10 +17,12 @@ mfcr a3, epc addi a3, TRAP0_SIZE - subi sp, 8 + subi sp, 16 stw a3, (sp, 0) mfcr a3, epsr stw a3, (sp, 4) + mfcr a3, usp + stw a3, (sp, 8) psrset ee #ifdef CONFIG_CPU_HAS_LDSTEX @@ -47,7 +49,9 @@ mtcr a3, epc ldw a3, (sp, 4) mtcr a3, epsr - addi sp, 8 + ldw a3, (sp, 8) + mtcr a3, usp + addi sp, 16 KSPTOUSP rte END(csky_cmpxchg) --- linux-riscv-5.4.0.orig/arch/csky/kernel/process.c +++ linux-riscv-5.4.0/arch/csky/kernel/process.c @@ -34,10 +34,11 @@ return sw->r15; } -int copy_thread(unsigned long clone_flags, +int copy_thread_tls(unsigned long clone_flags, unsigned long usp, unsigned long kthread_arg, - struct task_struct *p) + struct task_struct *p, + unsigned long tls) { struct switch_stack *childstack; struct pt_regs *childregs = task_pt_regs(p); @@ -64,7 +65,7 @@ childregs->usp = usp; if (clone_flags & CLONE_SETTLS) task_thread_info(p)->tp_value = childregs->tls - = childregs->regs[0]; + = tls; childregs->a0 = 0; childstack->r15 = (unsigned long) ret_from_fork; --- linux-riscv-5.4.0.orig/arch/csky/kernel/smp.c +++ linux-riscv-5.4.0/arch/csky/kernel/smp.c @@ -120,7 +120,7 @@ int rc; if (ipi_irq == 0) - panic("%s IRQ mapping failed\n", __func__); + return; rc = request_percpu_irq(ipi_irq, handle_ipi, "IPI Interrupt", &ipi_dummy_dev); --- linux-riscv-5.4.0.orig/arch/csky/mm/Makefile +++ linux-riscv-5.4.0/arch/csky/mm/Makefile @@ -1,8 +1,10 @@ # SPDX-License-Identifier: GPL-2.0-only ifeq ($(CONFIG_CPU_HAS_CACHEV2),y) obj-y += cachev2.o +CFLAGS_REMOVE_cachev2.o = $(CC_FLAGS_FTRACE) else obj-y += cachev1.o +CFLAGS_REMOVE_cachev1.o = $(CC_FLAGS_FTRACE) endif obj-y += dma-mapping.o --- linux-riscv-5.4.0.orig/arch/csky/mm/init.c +++ linux-riscv-5.4.0/arch/csky/mm/init.c @@ -31,6 +31,7 @@ pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss; pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss; +EXPORT_SYMBOL(invalid_pte_table); unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss; EXPORT_SYMBOL(empty_zero_page); --- linux-riscv-5.4.0.orig/arch/hexagon/include/asm/atomic.h +++ linux-riscv-5.4.0/arch/hexagon/include/asm/atomic.h @@ -91,7 +91,7 @@ "1: %0 = memw_locked(%1);\n" \ " %0 = "#op "(%0,%2);\n" \ " memw_locked(%1,P3)=%0;\n" \ - " if !P3 jump 1b;\n" \ + " if (!P3) jump 1b;\n" \ : "=&r" (output) \ : "r" (&v->counter), "r" (i) \ : "memory", "p3" \ @@ -107,7 +107,7 @@ "1: %0 = memw_locked(%1);\n" \ " %0 = "#op "(%0,%2);\n" \ " memw_locked(%1,P3)=%0;\n" \ - " if !P3 jump 1b;\n" \ + " if (!P3) jump 1b;\n" \ : "=&r" (output) \ : "r" (&v->counter), "r" (i) \ : "memory", "p3" \ @@ -124,7 +124,7 @@ "1: %0 = memw_locked(%2);\n" \ " %1 = "#op "(%0,%3);\n" \ " memw_locked(%2,P3)=%1;\n" \ - " if !P3 jump 1b;\n" \ + " if (!P3) jump 1b;\n" \ : "=&r" (output), "=&r" (val) \ : "r" (&v->counter), "r" (i) \ : "memory", "p3" \ @@ -173,7 +173,7 @@ " }" " memw_locked(%2, p3) = %1;" " {" - " if !p3 jump 1b;" + " if (!p3) jump 1b;" " }" "2:" : "=&r" (__oldval), "=&r" (tmp) --- linux-riscv-5.4.0.orig/arch/hexagon/include/asm/bitops.h +++ linux-riscv-5.4.0/arch/hexagon/include/asm/bitops.h @@ -38,7 +38,7 @@ "1: R12 = memw_locked(R10);\n" " { P0 = tstbit(R12,R11); R12 = clrbit(R12,R11); }\n" " memw_locked(R10,P1) = R12;\n" - " {if !P1 jump 1b; %0 = mux(P0,#1,#0);}\n" + " {if (!P1) jump 1b; %0 = mux(P0,#1,#0);}\n" : "=&r" (oldval) : "r" (addr), "r" (nr) : "r10", "r11", "r12", "p0", "p1", "memory" @@ -62,7 +62,7 @@ "1: R12 = memw_locked(R10);\n" " { P0 = tstbit(R12,R11); R12 = setbit(R12,R11); }\n" " memw_locked(R10,P1) = R12;\n" - " {if !P1 jump 1b; %0 = mux(P0,#1,#0);}\n" + " {if (!P1) jump 1b; %0 = mux(P0,#1,#0);}\n" : "=&r" (oldval) : "r" (addr), "r" (nr) : "r10", "r11", "r12", "p0", "p1", "memory" @@ -88,7 +88,7 @@ "1: R12 = memw_locked(R10);\n" " { P0 = tstbit(R12,R11); R12 = togglebit(R12,R11); }\n" " memw_locked(R10,P1) = R12;\n" - " {if !P1 jump 1b; %0 = mux(P0,#1,#0);}\n" + " {if (!P1) jump 1b; %0 = mux(P0,#1,#0);}\n" : "=&r" (oldval) : "r" (addr), "r" (nr) : "r10", "r11", "r12", "p0", "p1", "memory" @@ -223,7 +223,7 @@ int r; asm("{ P0 = cmp.eq(%1,#0); %0 = ct0(%1);}\n" - "{ if P0 %0 = #0; if !P0 %0 = add(%0,#1);}\n" + "{ if (P0) %0 = #0; if (!P0) %0 = add(%0,#1);}\n" : "=&r" (r) : "r" (x) : "p0"); --- linux-riscv-5.4.0.orig/arch/hexagon/include/asm/cmpxchg.h +++ linux-riscv-5.4.0/arch/hexagon/include/asm/cmpxchg.h @@ -30,7 +30,7 @@ __asm__ __volatile__ ( "1: %0 = memw_locked(%1);\n" /* load into retval */ " memw_locked(%1,P0) = %2;\n" /* store into memory */ - " if !P0 jump 1b;\n" + " if (!P0) jump 1b;\n" : "=&r" (retval) : "r" (ptr), "r" (x) : "memory", "p0" --- linux-riscv-5.4.0.orig/arch/hexagon/include/asm/futex.h +++ linux-riscv-5.4.0/arch/hexagon/include/asm/futex.h @@ -16,7 +16,7 @@ /* For example: %1 = %4 */ \ insn \ "2: memw_locked(%3,p2) = %1;\n" \ - " if !p2 jump 1b;\n" \ + " if (!p2) jump 1b;\n" \ " %1 = #0;\n" \ "3:\n" \ ".section .fixup,\"ax\"\n" \ @@ -84,10 +84,10 @@ "1: %1 = memw_locked(%3)\n" " {\n" " p2 = cmp.eq(%1,%4)\n" - " if !p2.new jump:NT 3f\n" + " if (!p2.new) jump:NT 3f\n" " }\n" "2: memw_locked(%3,p2) = %5\n" - " if !p2 jump 1b\n" + " if (!p2) jump 1b\n" "3:\n" ".section .fixup,\"ax\"\n" "4: %0 = #%6\n" --- linux-riscv-5.4.0.orig/arch/hexagon/include/asm/spinlock.h +++ linux-riscv-5.4.0/arch/hexagon/include/asm/spinlock.h @@ -30,9 +30,9 @@ __asm__ __volatile__( "1: R6 = memw_locked(%0);\n" " { P3 = cmp.ge(R6,#0); R6 = add(R6,#1);}\n" - " { if !P3 jump 1b; }\n" + " { if (!P3) jump 1b; }\n" " memw_locked(%0,P3) = R6;\n" - " { if !P3 jump 1b; }\n" + " { if (!P3) jump 1b; }\n" : : "r" (&lock->lock) : "memory", "r6", "p3" @@ -46,7 +46,7 @@ "1: R6 = memw_locked(%0);\n" " R6 = add(R6,#-1);\n" " memw_locked(%0,P3) = R6\n" - " if !P3 jump 1b;\n" + " if (!P3) jump 1b;\n" : : "r" (&lock->lock) : "memory", "r6", "p3" @@ -61,7 +61,7 @@ __asm__ __volatile__( " R6 = memw_locked(%1);\n" " { %0 = #0; P3 = cmp.ge(R6,#0); R6 = add(R6,#1);}\n" - " { if !P3 jump 1f; }\n" + " { if (!P3) jump 1f; }\n" " memw_locked(%1,P3) = R6;\n" " { %0 = P3 }\n" "1:\n" @@ -78,9 +78,9 @@ __asm__ __volatile__( "1: R6 = memw_locked(%0)\n" " { P3 = cmp.eq(R6,#0); R6 = #-1;}\n" - " { if !P3 jump 1b; }\n" + " { if (!P3) jump 1b; }\n" " memw_locked(%0,P3) = R6;\n" - " { if !P3 jump 1b; }\n" + " { if (!P3) jump 1b; }\n" : : "r" (&lock->lock) : "memory", "r6", "p3" @@ -94,7 +94,7 @@ __asm__ __volatile__( " R6 = memw_locked(%1)\n" " { %0 = #0; P3 = cmp.eq(R6,#0); R6 = #-1;}\n" - " { if !P3 jump 1f; }\n" + " { if (!P3) jump 1f; }\n" " memw_locked(%1,P3) = R6;\n" " %0 = P3;\n" "1:\n" @@ -117,9 +117,9 @@ __asm__ __volatile__( "1: R6 = memw_locked(%0);\n" " P3 = cmp.eq(R6,#0);\n" - " { if !P3 jump 1b; R6 = #1; }\n" + " { if (!P3) jump 1b; R6 = #1; }\n" " memw_locked(%0,P3) = R6;\n" - " { if !P3 jump 1b; }\n" + " { if (!P3) jump 1b; }\n" : : "r" (&lock->lock) : "memory", "r6", "p3" @@ -139,7 +139,7 @@ __asm__ __volatile__( " R6 = memw_locked(%1);\n" " P3 = cmp.eq(R6,#0);\n" - " { if !P3 jump 1f; R6 = #1; %0 = #0; }\n" + " { if (!P3) jump 1f; R6 = #1; %0 = #0; }\n" " memw_locked(%1,P3) = R6;\n" " %0 = P3;\n" "1:\n" --- linux-riscv-5.4.0.orig/arch/hexagon/kernel/stacktrace.c +++ linux-riscv-5.4.0/arch/hexagon/kernel/stacktrace.c @@ -11,8 +11,6 @@ #include #include -register unsigned long current_frame_pointer asm("r30"); - struct stackframe { unsigned long fp; unsigned long rets; @@ -30,7 +28,7 @@ low = (unsigned long)task_stack_page(current); high = low + THREAD_SIZE; - fp = current_frame_pointer; + fp = (unsigned long)__builtin_frame_address(0); while (fp >= low && fp <= (high - sizeof(*frame))) { frame = (struct stackframe *)fp; --- linux-riscv-5.4.0.orig/arch/hexagon/kernel/vm_entry.S +++ linux-riscv-5.4.0/arch/hexagon/kernel/vm_entry.S @@ -369,7 +369,7 @@ R26.L = #LO(do_work_pending); R0 = #VM_INT_DISABLE; } - if P0 jump check_work_pending + if (P0) jump check_work_pending { R0 = R25; callr R24 --- linux-riscv-5.4.0.orig/arch/ia64/mm/init.c +++ linux-riscv-5.4.0/arch/ia64/mm/init.c @@ -689,9 +689,7 @@ { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif --- linux-riscv-5.4.0.orig/arch/microblaze/kernel/cpu/cache.c +++ linux-riscv-5.4.0/arch/microblaze/kernel/cpu/cache.c @@ -92,7 +92,8 @@ #define CACHE_LOOP_LIMITS(start, end, cache_line_length, cache_size) \ do { \ int align = ~(cache_line_length - 1); \ - end = min(start + cache_size, end); \ + if (start < UINT_MAX - cache_size) \ + end = min(start + cache_size, end); \ start &= align; \ } while (0) --- linux-riscv-5.4.0.orig/arch/mips/Kconfig +++ linux-riscv-5.4.0/arch/mips/Kconfig @@ -46,7 +46,7 @@ select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE if CPU_SUPPORTS_HUGEPAGES select HAVE_ASM_MODVERSIONS - select HAVE_EBPF_JIT if (!CPU_MICROMIPS) + select HAVE_EBPF_JIT if 64BIT && !CPU_MICROMIPS && TARGET_ISA_REV >= 2 select HAVE_CONTEXT_TRACKING select HAVE_COPY_THREAD_TLS select HAVE_C_RECORDMCOUNT --- linux-riscv-5.4.0.orig/arch/mips/Makefile.postlink +++ linux-riscv-5.4.0/arch/mips/Makefile.postlink @@ -12,7 +12,7 @@ include scripts/Kbuild.include CMD_RELOCS = arch/mips/boot/tools/relocs -quiet_cmd_relocs = RELOCS $@ +quiet_cmd_relocs = RELOCS $@ cmd_relocs = $(CMD_RELOCS) $@ # `@true` prevents complaint when there is nothing to be done --- linux-riscv-5.4.0.orig/arch/mips/boot/Makefile +++ linux-riscv-5.4.0/arch/mips/boot/Makefile @@ -123,7 +123,7 @@ targets += vmlinux.its targets += vmlinux.gz.its targets += vmlinux.bz2.its -targets += vmlinux.lzmo.its +targets += vmlinux.lzma.its targets += vmlinux.lzo.its quiet_cmd_cpp_its_S = ITS $@ --- linux-riscv-5.4.0.orig/arch/mips/boot/compressed/Makefile +++ linux-riscv-5.4.0/arch/mips/boot/compressed/Makefile @@ -29,6 +29,9 @@ -DBOOT_HEAP_SIZE=$(BOOT_HEAP_SIZE) \ -DKERNEL_ENTRY=$(VMLINUX_ENTRY_ADDRESS) +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in. +KCOV_INSTRUMENT := n + # decompressor objects (linked with vmlinuz) vmlinuzobjs-y := $(obj)/head.o $(obj)/decompress.o $(obj)/string.o --- linux-riscv-5.4.0.orig/arch/mips/include/asm/pgtable-64.h +++ linux-riscv-5.4.0/arch/mips/include/asm/pgtable-64.h @@ -18,10 +18,12 @@ #include #define __ARCH_USE_5LEVEL_HACK -#if defined(CONFIG_PAGE_SIZE_64KB) && !defined(CONFIG_MIPS_VA_BITS_48) +#if CONFIG_PGTABLE_LEVELS == 2 #include -#elif !(defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_MIPS_VA_BITS_48)) +#elif CONFIG_PGTABLE_LEVELS == 3 #include +#else +#include #endif /* @@ -216,6 +218,9 @@ return pgd_val(pgd); } +#define pgd_phys(pgd) virt_to_phys((void *)pgd_val(pgd)) +#define pgd_page(pgd) (pfn_to_page(pgd_phys(pgd) >> PAGE_SHIFT)) + static inline pud_t *pud_offset(pgd_t *pgd, unsigned long address) { return (pud_t *)pgd_page_vaddr(*pgd) + pud_index(address); --- linux-riscv-5.4.0.orig/arch/mips/include/asm/thread_info.h +++ linux-riscv-5.4.0/arch/mips/include/asm/thread_info.h @@ -49,8 +49,26 @@ .addr_limit = KERNEL_DS, \ } -/* How to get the thread information struct from C. */ +/* + * A pointer to the struct thread_info for the currently executing thread is + * held in register $28/$gp. + * + * We declare __current_thread_info as a global register variable rather than a + * local register variable within current_thread_info() because clang doesn't + * support explicit local register variables. + * + * When building the VDSO we take care not to declare the global register + * variable because this causes GCC to not preserve the value of $28/$gp in + * functions that change its value (which is common in the PIC VDSO when + * accessing the GOT). Since the VDSO shouldn't be accessing + * __current_thread_info anyway we declare it extern in order to cause a link + * failure if it's referenced. + */ +#ifdef __VDSO__ +extern struct thread_info *__current_thread_info; +#else register struct thread_info *__current_thread_info __asm__("$28"); +#endif static inline struct thread_info *current_thread_info(void) { --- linux-riscv-5.4.0.orig/arch/mips/include/asm/vdso/gettimeofday.h +++ linux-riscv-5.4.0/arch/mips/include/asm/vdso/gettimeofday.h @@ -26,8 +26,6 @@ #define __VDSO_USE_SYSCALL ULLONG_MAX -#ifdef CONFIG_MIPS_CLOCK_VSYSCALL - static __always_inline long gettimeofday_fallback( struct __kernel_old_timeval *_tv, struct timezone *_tz) @@ -48,17 +46,6 @@ return error ? -ret : ret; } -#else - -static __always_inline long gettimeofday_fallback( - struct __kernel_old_timeval *_tv, - struct timezone *_tz) -{ - return -1; -} - -#endif - static __always_inline long clock_gettime_fallback( clockid_t _clkid, struct __kernel_timespec *_ts) --- linux-riscv-5.4.0.orig/arch/mips/kernel/cacheinfo.c +++ linux-riscv-5.4.0/arch/mips/kernel/cacheinfo.c @@ -50,6 +50,25 @@ return 0; } +static void fill_cpumask_siblings(int cpu, cpumask_t *cpu_map) +{ + int cpu1; + + for_each_possible_cpu(cpu1) + if (cpus_are_siblings(cpu, cpu1)) + cpumask_set_cpu(cpu1, cpu_map); +} + +static void fill_cpumask_cluster(int cpu, cpumask_t *cpu_map) +{ + int cpu1; + int cluster = cpu_cluster(&cpu_data[cpu]); + + for_each_possible_cpu(cpu1) + if (cpu_cluster(&cpu_data[cpu1]) == cluster) + cpumask_set_cpu(cpu1, cpu_map); +} + static int __populate_cache_leaves(unsigned int cpu) { struct cpuinfo_mips *c = ¤t_cpu_data; @@ -57,14 +76,20 @@ struct cacheinfo *this_leaf = this_cpu_ci->info_list; if (c->icache.waysize) { + /* L1 caches are per core */ + fill_cpumask_siblings(cpu, &this_leaf->shared_cpu_map); populate_cache(dcache, this_leaf, 1, CACHE_TYPE_DATA); + fill_cpumask_siblings(cpu, &this_leaf->shared_cpu_map); populate_cache(icache, this_leaf, 1, CACHE_TYPE_INST); } else { populate_cache(dcache, this_leaf, 1, CACHE_TYPE_UNIFIED); } - if (c->scache.waysize) + if (c->scache.waysize) { + /* L2 cache is per cluster */ + fill_cpumask_cluster(cpu, &this_leaf->shared_cpu_map); populate_cache(scache, this_leaf, 2, CACHE_TYPE_UNIFIED); + } if (c->tcache.waysize) populate_cache(tcache, this_leaf, 3, CACHE_TYPE_UNIFIED); --- linux-riscv-5.4.0.orig/arch/mips/kernel/syscalls/Makefile +++ linux-riscv-5.4.0/arch/mips/kernel/syscalls/Makefile @@ -18,7 +18,7 @@ '$(syshdr_pfx_$(basetarget))' \ '$(syshdr_offset_$(basetarget))' -quiet_cmd_sysnr = SYSNR $@ +quiet_cmd_sysnr = SYSNR $@ cmd_sysnr = $(CONFIG_SHELL) '$(sysnr)' '$<' '$@' \ '$(sysnr_abis_$(basetarget))' \ '$(sysnr_pfx_$(basetarget))' \ --- linux-riscv-5.4.0.orig/arch/mips/kernel/vpe.c +++ linux-riscv-5.4.0/arch/mips/kernel/vpe.c @@ -134,7 +134,7 @@ { list_del(&v->list); if (v->load_addr) - release_progmem(v); + release_progmem(v->load_addr); kfree(v); } --- linux-riscv-5.4.0.orig/arch/mips/loongson64/loongson-3/platform.c +++ linux-riscv-5.4.0/arch/mips/loongson64/loongson-3/platform.c @@ -27,6 +27,9 @@ continue; pdev = kzalloc(sizeof(struct platform_device), GFP_KERNEL); + if (!pdev) + return -ENOMEM; + pdev->name = loongson_sysconf.sensors[i].name; pdev->id = loongson_sysconf.sensors[i].id; pdev->dev.platform_data = &loongson_sysconf.sensors[i]; --- linux-riscv-5.4.0.orig/arch/mips/net/ebpf_jit.c +++ linux-riscv-5.4.0/arch/mips/net/ebpf_jit.c @@ -604,6 +604,7 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx, int this_idx) { int off, b_off; + int tcc_reg; ctx->flags |= EBPF_SEEN_TC; /* @@ -616,14 +617,14 @@ b_off = b_imm(this_idx + 1, ctx); emit_instr(ctx, bne, MIPS_R_AT, MIPS_R_ZERO, b_off); /* - * if (--TCC < 0) + * if (TCC-- < 0) * goto out; */ /* Delay slot */ - emit_instr(ctx, daddiu, MIPS_R_T5, - (ctx->flags & EBPF_TCC_IN_V1) ? MIPS_R_V1 : MIPS_R_S4, -1); + tcc_reg = (ctx->flags & EBPF_TCC_IN_V1) ? MIPS_R_V1 : MIPS_R_S4; + emit_instr(ctx, daddiu, MIPS_R_T5, tcc_reg, -1); b_off = b_imm(this_idx + 1, ctx); - emit_instr(ctx, bltz, MIPS_R_T5, b_off); + emit_instr(ctx, bltz, tcc_reg, b_off); /* * prog = array->ptrs[index]; * if (prog == NULL) @@ -1803,7 +1804,7 @@ unsigned int image_size; u8 *image_ptr; - if (!prog->jit_requested || MIPS_ISA_REV < 2) + if (!prog->jit_requested) return prog; tmp = bpf_jit_blind_constants(prog); --- linux-riscv-5.4.0.orig/arch/mips/pci/pci-xtalk-bridge.c +++ linux-riscv-5.4.0/arch/mips/pci/pci-xtalk-bridge.c @@ -279,16 +279,15 @@ struct bridge_irq_chip_data *data = d->chip_data; int bit = d->parent_data->hwirq; int pin = d->hwirq; - nasid_t nasid; int ret, cpu; ret = irq_chip_set_affinity_parent(d, mask, force); if (ret >= 0) { cpu = cpumask_first_and(mask, cpu_online_mask); - nasid = COMPACT_TO_NASID_NODEID(cpu_to_node(cpu)); + data->nnasid = COMPACT_TO_NASID_NODEID(cpu_to_node(cpu)); bridge_write(data->bc, b_int_addr[pin].addr, (((data->bc->intr_addr >> 30) & 0x30000) | - bit | (nasid << 8))); + bit | (data->nasid << 8))); bridge_read(data->bc, b_wid_tflush); } return ret; --- linux-riscv-5.4.0.orig/arch/mips/ralink/Kconfig +++ linux-riscv-5.4.0/arch/mips/ralink/Kconfig @@ -51,6 +51,7 @@ select MIPS_GIC select COMMON_CLK select CLKSRC_MIPS_GIC + select HAVE_PCI if PCI_MT7621 endchoice choice --- linux-riscv-5.4.0.orig/arch/mips/sgi-ip27/ip27-irq.c +++ linux-riscv-5.4.0/arch/mips/sgi-ip27/ip27-irq.c @@ -73,6 +73,9 @@ int cpu; cpu = cpumask_first_and(mask, cpu_online_mask); + if (cpu >= nr_cpu_ids) + cpu = cpumask_any(cpu_online_mask); + nasid = COMPACT_TO_NASID_NODEID(cpu_to_node(cpu)); hd->cpu = cpu; if (!cputoslice(cpu)) { @@ -139,6 +142,7 @@ /* use CPU connected to nearest hub */ hub = hub_data(NASID_TO_COMPACT_NODEID(info->nasid)); setup_hub_mask(hd, &hub->h_cpus); + info->nasid = cpu_to_node(hd->cpu); /* Make sure it's not already pending when we connect it. */ REMOTE_HUB_CLR_INTR(info->nasid, swlevel); --- linux-riscv-5.4.0.orig/arch/mips/vdso/vgettimeofday.c +++ linux-riscv-5.4.0/arch/mips/vdso/vgettimeofday.c @@ -17,12 +17,22 @@ return __cvdso_clock_gettime32(clock, ts); } +#ifdef CONFIG_MIPS_CLOCK_VSYSCALL + +/* + * This is behind the ifdef so that we don't provide the symbol when there's no + * possibility of there being a usable clocksource, because there's nothing we + * can do without it. When libc fails the symbol lookup it should fall back on + * the standard syscall path. + */ int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { return __cvdso_gettimeofday(tv, tz); } +#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */ + int __vdso_clock_getres(clockid_t clock_id, struct old_timespec32 *res) { @@ -43,12 +53,22 @@ return __cvdso_clock_gettime(clock, ts); } +#ifdef CONFIG_MIPS_CLOCK_VSYSCALL + +/* + * This is behind the ifdef so that we don't provide the symbol when there's no + * possibility of there being a usable clocksource, because there's nothing we + * can do without it. When libc fails the symbol lookup it should fall back on + * the standard syscall path. + */ int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { return __cvdso_gettimeofday(tv, tz); } +#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */ + int __vdso_clock_getres(clockid_t clock_id, struct __kernel_timespec *res) { --- linux-riscv-5.4.0.orig/arch/nds32/include/asm/cacheflush.h +++ linux-riscv-5.4.0/arch/nds32/include/asm/cacheflush.h @@ -9,7 +9,11 @@ #define PG_dcache_dirty PG_arch_1 void flush_icache_range(unsigned long start, unsigned long end); +#define flush_icache_range flush_icache_range + void flush_icache_page(struct vm_area_struct *vma, struct page *page); +#define flush_icache_page flush_icache_page + #ifdef CONFIG_CPU_CACHE_ALIASING void flush_cache_mm(struct mm_struct *mm); void flush_cache_dup_mm(struct mm_struct *mm); @@ -40,12 +44,11 @@ #define flush_dcache_mmap_unlock(mapping) xa_unlock_irq(&(mapping)->i_pages) #else -#include -#undef flush_icache_range -#undef flush_icache_page -#undef flush_icache_user_range void flush_icache_user_range(struct vm_area_struct *vma, struct page *page, unsigned long addr, int len); +#define flush_icache_user_range flush_icache_user_range + +#include #endif #endif /* __NDS32_CACHEFLUSH_H__ */ --- linux-riscv-5.4.0.orig/arch/parisc/Kconfig +++ linux-riscv-5.4.0/arch/parisc/Kconfig @@ -62,6 +62,7 @@ select HAVE_FTRACE_MCOUNT_RECORD if HAVE_DYNAMIC_FTRACE select HAVE_KPROBES_ON_FTRACE select HAVE_DYNAMIC_FTRACE_WITH_REGS + select HAVE_COPY_THREAD_TLS help The PA-RISC microprocessor is designed by Hewlett-Packard and used --- linux-riscv-5.4.0.orig/arch/parisc/include/asm/cmpxchg.h +++ linux-riscv-5.4.0/arch/parisc/include/asm/cmpxchg.h @@ -44,8 +44,14 @@ ** if (((unsigned long)p & 0xf) == 0) ** return __ldcw(p); */ -#define xchg(ptr, x) \ - ((__typeof__(*(ptr)))__xchg((unsigned long)(x), (ptr), sizeof(*(ptr)))) +#define xchg(ptr, x) \ +({ \ + __typeof__(*(ptr)) __ret; \ + __typeof__(*(ptr)) _x_ = (x); \ + __ret = (__typeof__(*(ptr))) \ + __xchg((unsigned long)_x_, (ptr), sizeof(*(ptr))); \ + __ret; \ +}) /* bug catcher for when unsupported size is used - won't link */ extern void __cmpxchg_called_with_bad_pointer(void); --- linux-riscv-5.4.0.orig/arch/parisc/include/asm/kexec.h +++ linux-riscv-5.4.0/arch/parisc/include/asm/kexec.h @@ -2,8 +2,6 @@ #ifndef _ASM_PARISC_KEXEC_H #define _ASM_PARISC_KEXEC_H -#ifdef CONFIG_KEXEC - /* Maximum physical address we can use pages from */ #define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) /* Maximum address we can reach in physical address mode */ @@ -32,6 +30,4 @@ #endif /* __ASSEMBLY__ */ -#endif /* CONFIG_KEXEC */ - #endif /* _ASM_PARISC_KEXEC_H */ --- linux-riscv-5.4.0.orig/arch/parisc/kernel/Makefile +++ linux-riscv-5.4.0/arch/parisc/kernel/Makefile @@ -37,5 +37,5 @@ obj-$(CONFIG_JUMP_LABEL) += jump_label.o obj-$(CONFIG_KGDB) += kgdb.o obj-$(CONFIG_KPROBES) += kprobes.o -obj-$(CONFIG_KEXEC) += kexec.o relocate_kernel.o +obj-$(CONFIG_KEXEC_CORE) += kexec.o relocate_kernel.o obj-$(CONFIG_KEXEC_FILE) += kexec_file.o --- linux-riscv-5.4.0.orig/arch/parisc/kernel/drivers.c +++ linux-riscv-5.4.0/arch/parisc/kernel/drivers.c @@ -810,7 +810,7 @@ static void walk_native_bus(unsigned long io_io_low, unsigned long io_io_high, struct device *parent); -static void walk_lower_bus(struct parisc_device *dev) +static void __init walk_lower_bus(struct parisc_device *dev) { unsigned long io_io_low, io_io_high; @@ -889,8 +889,8 @@ static int count; print_pa_hwpath(dev, hw_path); - pr_info("%d. %s at 0x%px [%s] { %d, 0x%x, 0x%.3x, 0x%.5x }", - ++count, dev->name, (void*) dev->hpa.start, hw_path, dev->id.hw_type, + pr_info("%d. %s at %pap [%s] { %d, 0x%x, 0x%.3x, 0x%.5x }", + ++count, dev->name, &(dev->hpa.start), hw_path, dev->id.hw_type, dev->id.hversion_rev, dev->id.hversion, dev->id.sversion); if (dev->num_addrs) { --- linux-riscv-5.4.0.orig/arch/parisc/kernel/process.c +++ linux-riscv-5.4.0/arch/parisc/kernel/process.c @@ -208,8 +208,8 @@ * Copy architecture-specific thread state */ int -copy_thread(unsigned long clone_flags, unsigned long usp, - unsigned long kthread_arg, struct task_struct *p) +copy_thread_tls(unsigned long clone_flags, unsigned long usp, + unsigned long kthread_arg, struct task_struct *p, unsigned long tls) { struct pt_regs *cregs = &(p->thread.regs); void *stack = task_stack_page(p); @@ -254,9 +254,9 @@ cregs->ksp = (unsigned long)stack + THREAD_SZ_ALGN + FRAME_SIZE; cregs->kpc = (unsigned long) &child_return; - /* Setup thread TLS area from the 4th parameter in clone */ + /* Setup thread TLS area */ if (clone_flags & CLONE_SETTLS) - cregs->cr27 = cregs->gr[23]; + cregs->cr27 = tls; } return 0; --- linux-riscv-5.4.0.orig/arch/powerpc/Kconfig +++ linux-riscv-5.4.0/arch/powerpc/Kconfig @@ -221,8 +221,7 @@ select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP - select HAVE_RCU_TABLE_FREE if SMP - select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE + select HAVE_RCU_TABLE_FREE select HAVE_MMU_GATHER_PAGE_SIZE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN @@ -237,6 +236,7 @@ select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE select NEED_SG_DMA_LENGTH select OF + select OF_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE select OF_EARLY_FLATTREE select OLD_SIGACTION if PPC32 select OLD_SIGSUSPEND @@ -934,6 +934,28 @@ If unsure, say y. +config PPC_SECURE_BOOT + prompt "Enable secure boot support" + bool + depends on PPC_POWERNV + depends on IMA_ARCH_POLICY + help + Systems with firmware secure boot enabled need to define security + policies to extend secure boot to the OS. This config allows a user + to enable OS secure boot on systems that have firmware support for + it. If in doubt say N. + +config PPC_SECVAR_SYSFS + bool "Enable sysfs interface for POWER secure variables" + default y + depends on PPC_SECURE_BOOT + depends on SYSFS + help + POWER secure variables are managed and controlled by firmware. + These variables are exposed to userspace via sysfs to enable + read/write operations on these variables. Say Y if you have + secure boot enabled and want to expose variables to userspace. + endmenu config ISA_DMA_API --- linux-riscv-5.4.0.orig/arch/powerpc/Kconfig.debug +++ linux-riscv-5.4.0/arch/powerpc/Kconfig.debug @@ -371,7 +371,7 @@ config PPC_DEBUG_WX bool "Warn on W+X mappings at boot" - depends on PPC_PTDUMP + depends on PPC_PTDUMP && STRICT_KERNEL_RWX help Generate a warning if any W+X mappings are found at boot. --- linux-riscv-5.4.0.orig/arch/powerpc/Makefile +++ linux-riscv-5.4.0/arch/powerpc/Makefile @@ -91,11 +91,13 @@ endif ifdef CONFIG_PPC64 +ifndef CONFIG_CC_IS_CLANG cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mcall-aixdesc) aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 endif +endif ifndef CONFIG_CC_IS_CLANG cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align @@ -141,6 +143,7 @@ endif CFLAGS-$(CONFIG_PPC64) := $(call cc-option,-mtraceback=no) +ifndef CONFIG_CC_IS_CLANG ifdef CONFIG_CPU_LITTLE_ENDIAN CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call cc-option,-mcall-aixdesc)) AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2) @@ -149,6 +152,7 @@ CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcall-aixdesc) AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1) endif +endif CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc)) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions) @@ -281,7 +285,7 @@ all: zImage # With make 3.82 we cannot mix normal and wildcard targets -BOOT_TARGETS1 := zImage zImage.initrd uImage +BOOT_TARGETS1 := zImage zImage.initrd uImage vmlinux.strip BOOT_TARGETS2 := zImage% dtbImage% treeImage.% cuImage.% simpleImage.% uImage.% PHONY += $(BOOT_TARGETS1) $(BOOT_TARGETS2) --- linux-riscv-5.4.0.orig/arch/powerpc/Makefile.postlink +++ linux-riscv-5.4.0/arch/powerpc/Makefile.postlink @@ -17,11 +17,11 @@ quiet_cmd_relocs_check = CHKREL $@ ifdef CONFIG_PPC_BOOK3S_64 cmd_relocs_check = \ - $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$@" ; \ + $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@" ; \ $(BASH) $(srctree)/arch/powerpc/tools/unrel_branch_check.sh "$(OBJDUMP)" "$@" else cmd_relocs_check = \ - $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$@" + $(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@" endif # `@true` prevents complaint when there is nothing to be done --- linux-riscv-5.4.0.orig/arch/powerpc/boot/4xx.c +++ linux-riscv-5.4.0/arch/powerpc/boot/4xx.c @@ -228,7 +228,7 @@ dpath = 8; /* 64 bits */ /* get address pins (rows) */ - val = SDRAM0_READ(DDR0_42); + val = SDRAM0_READ(DDR0_42); row = DDR_GET_VAL(val, DDR_APIN, DDR_APIN_SHIFT); if (row > max_row) --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi @@ -63,6 +63,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy0: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi @@ -60,6 +60,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy6: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi @@ -63,6 +63,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy1: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi @@ -60,6 +60,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy7: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy0: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy1: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe5000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy2: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe7000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy3: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe9000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy4: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xeb000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy5: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi @@ -60,6 +60,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy14: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi @@ -60,6 +60,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy15: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy8: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy9: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe5000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy10: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe7000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy11: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe9000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy12: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi +++ linux-riscv-5.4.0/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi @@ -59,6 +59,7 @@ #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xeb000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy13: ethernet-phy@0 { reg = <0x0>; --- linux-riscv-5.4.0.orig/arch/powerpc/boot/libfdt_env.h +++ linux-riscv-5.4.0/arch/powerpc/boot/libfdt_env.h @@ -6,6 +6,8 @@ #include #define INT_MAX ((int)(~0U>>1)) +#define UINT32_MAX ((u32)~0U) +#define INT32_MAX ((s32)(UINT32_MAX >> 1)) #include "of.h" --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/archrandom.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/archrandom.h @@ -28,7 +28,7 @@ unsigned long val; int rc; - rc = arch_get_random_long(&val); + rc = arch_get_random_seed_long(&val); if (rc) *v = val; --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/asm-prototypes.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/asm-prototypes.h @@ -152,9 +152,12 @@ /* Patch sites */ extern s32 patch__call_flush_count_cache; extern s32 patch__flush_count_cache_return; +extern s32 patch__flush_link_stack_return; +extern s32 patch__call_kvm_flush_link_stack; extern s32 patch__memset_nocache, patch__memcpy_nocache; extern long flush_count_cache; +extern long kvm_flush_link_stack; #ifdef CONFIG_PPC_TRANSACTIONAL_MEM void kvmppc_save_tm_hv(struct kvm_vcpu *vcpu, u64 msr, bool preserve_nv); --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/book3s/32/kup.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/book3s/32/kup.h @@ -102,11 +102,13 @@ isync(); /* Context sync required after mtsrin() */ } -static inline void allow_user_access(void __user *to, const void __user *from, u32 size) +static __always_inline void allow_user_access(void __user *to, const void __user *from, + u32 size, unsigned long dir) { u32 addr, end; - if (__builtin_constant_p(to) && to == NULL) + BUILD_BUG_ON(!__builtin_constant_p(dir)); + if (!(dir & KUAP_WRITE)) return; addr = (__force u32)to; @@ -119,11 +121,16 @@ kuap_update_sr(mfsrin(addr) & ~SR_KS, addr, end); /* Clear Ks */ } -static inline void prevent_user_access(void __user *to, const void __user *from, u32 size) +static __always_inline void prevent_user_access(void __user *to, const void __user *from, + u32 size, unsigned long dir) { u32 addr = (__force u32)to; u32 end = min(addr + size, TASK_SIZE); + BUILD_BUG_ON(!__builtin_constant_p(dir)); + if (!(dir & KUAP_WRITE)) + return; + if (!addr || addr >= TASK_SIZE || !size) return; @@ -131,12 +138,17 @@ kuap_update_sr(mfsrin(addr) | SR_KS, addr, end); /* set Ks */ } -static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) +static inline bool +bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write) { + unsigned long begin = regs->kuap & 0xf0000000; + unsigned long end = regs->kuap << 28; + if (!is_write) return false; - return WARN(!regs->kuap, "Bug: write fault blocked by segment registers !"); + return WARN(address < begin || address >= end, + "Bug: write fault blocked by segment registers !"); } #endif /* CONFIG_PPC_KUAP */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -49,7 +49,6 @@ #define get_hugepd_cache_index(x) (x) -#ifdef CONFIG_SMP static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift) { @@ -66,13 +65,6 @@ pgtable_free(table, shift); } -#else -static inline void pgtable_free_tlb(struct mmu_gather *tlb, - void *table, int shift) -{ - pgtable_free(table, shift); -} -#endif static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/book3s/64/kup-radix.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/book3s/64/kup-radix.h @@ -77,25 +77,27 @@ isync(); } -static inline void allow_user_access(void __user *to, const void __user *from, - unsigned long size) +static __always_inline void allow_user_access(void __user *to, const void __user *from, + unsigned long size, unsigned long dir) { // This is written so we can resolve to a single case at build time - if (__builtin_constant_p(to) && to == NULL) + BUILD_BUG_ON(!__builtin_constant_p(dir)); + if (dir == KUAP_READ) set_kuap(AMR_KUAP_BLOCK_WRITE); - else if (__builtin_constant_p(from) && from == NULL) + else if (dir == KUAP_WRITE) set_kuap(AMR_KUAP_BLOCK_READ); else set_kuap(0); } static inline void prevent_user_access(void __user *to, const void __user *from, - unsigned long size) + unsigned long size, unsigned long dir) { set_kuap(AMR_KUAP_BLOCKED); } -static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) +static inline bool +bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write) { return WARN(mmu_has_feature(MMU_FTR_RADIX_KUAP) && (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)), --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -600,8 +600,11 @@ * */ #define MAX_USER_CONTEXT ((ASM_CONST(1) << CONTEXT_BITS) - 2) + +// The + 2 accounts for INVALID_REGION and 1 more to avoid overlap with kernel #define MIN_USER_CONTEXT (MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \ - MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT) + MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT + 2) + /* * For platforms that support on 65bit VA we limit the context bits */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -19,9 +19,7 @@ extern pmd_t *pmd_fragment_alloc(struct mm_struct *, unsigned long); extern void pmd_fragment_free(unsigned long *); extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift); -#ifdef CONFIG_SMP extern void __tlb_remove_table(void *_table); -#endif void pte_frag_destroy(void *pte_frag); static inline pgd_t *radix__pgd_alloc(struct mm_struct *mm) --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/cache.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/cache.h @@ -55,42 +55,48 @@ extern struct ppc64_caches ppc64_caches; -static inline u32 l1_cache_shift(void) +static inline u32 l1_dcache_shift(void) { return ppc64_caches.l1d.log_block_size; } -static inline u32 l1_cache_bytes(void) +static inline u32 l1_dcache_bytes(void) { return ppc64_caches.l1d.block_size; } + +static inline u32 l1_icache_shift(void) +{ + return ppc64_caches.l1i.log_block_size; +} + +static inline u32 l1_icache_bytes(void) +{ + return ppc64_caches.l1i.block_size; +} #else -static inline u32 l1_cache_shift(void) +static inline u32 l1_dcache_shift(void) { return L1_CACHE_SHIFT; } -static inline u32 l1_cache_bytes(void) +static inline u32 l1_dcache_bytes(void) { return L1_CACHE_BYTES; } -#endif -#endif /* ! __ASSEMBLY__ */ -#if defined(__ASSEMBLY__) -/* - * For a snooping icache, we still need a dummy icbi to purge all the - * prefetched instructions from the ifetch buffers. We also need a sync - * before the icbi to order the the actual stores to memory that might - * have modified instructions with the icbi. - */ -#define PURGE_PREFETCHED_INS \ - sync; \ - icbi 0,r3; \ - sync; \ - isync +static inline u32 l1_icache_shift(void) +{ + return L1_CACHE_SHIFT; +} + +static inline u32 l1_icache_bytes(void) +{ + return L1_CACHE_BYTES; +} + +#endif -#else #define __read_mostly __attribute__((__section__(".data..read_mostly"))) #ifdef CONFIG_PPC_BOOK3S_32 @@ -124,6 +130,17 @@ { __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory"); } + +static inline void icbi(void *addr) +{ + asm volatile ("icbi 0, %0" : : "r"(addr) : "memory"); +} + +static inline void iccci(void *addr) +{ + asm volatile ("iccci 0, %0" : : "r"(addr) : "memory"); +} + #endif /* !__ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_CACHE_H */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/cacheflush.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/cacheflush.h @@ -42,29 +42,25 @@ #define flush_dcache_mmap_lock(mapping) do { } while (0) #define flush_dcache_mmap_unlock(mapping) do { } while (0) -extern void flush_icache_range(unsigned long, unsigned long); +void flush_icache_range(unsigned long start, unsigned long stop); extern void flush_icache_user_range(struct vm_area_struct *vma, struct page *page, unsigned long addr, int len); -extern void __flush_dcache_icache(void *page_va); extern void flush_dcache_icache_page(struct page *page); -#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE) -extern void __flush_dcache_icache_phys(unsigned long physaddr); -#else -static inline void __flush_dcache_icache_phys(unsigned long physaddr) -{ - BUG(); -} -#endif +void __flush_dcache_icache(void *page); -/* - * Write any modified data cache blocks out to memory and invalidate them. - * Does not invalidate the corresponding instruction cache blocks. +/** + * flush_dcache_range(): Write any modified data cache blocks out to memory and + * invalidate them. Does not invalidate the corresponding instruction cache + * blocks. + * + * @start: the start address + * @stop: the stop address (exclusive) */ static inline void flush_dcache_range(unsigned long start, unsigned long stop) { - unsigned long shift = l1_cache_shift(); - unsigned long bytes = l1_cache_bytes(); + unsigned long shift = l1_dcache_shift(); + unsigned long bytes = l1_dcache_bytes(); void *addr = (void *)(start & ~(bytes - 1)); unsigned long size = stop - (unsigned long)addr + (bytes - 1); unsigned long i; @@ -89,8 +85,8 @@ */ static inline void clean_dcache_range(unsigned long start, unsigned long stop) { - unsigned long shift = l1_cache_shift(); - unsigned long bytes = l1_cache_bytes(); + unsigned long shift = l1_dcache_shift(); + unsigned long bytes = l1_dcache_bytes(); void *addr = (void *)(start & ~(bytes - 1)); unsigned long size = stop - (unsigned long)addr + (bytes - 1); unsigned long i; @@ -108,8 +104,8 @@ static inline void invalidate_dcache_range(unsigned long start, unsigned long stop) { - unsigned long shift = l1_cache_shift(); - unsigned long bytes = l1_cache_bytes(); + unsigned long shift = l1_dcache_shift(); + unsigned long bytes = l1_dcache_bytes(); void *addr = (void *)(start & ~(bytes - 1)); unsigned long size = stop - (unsigned long)addr + (bytes - 1); unsigned long i; --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/cputhreads.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/cputhreads.h @@ -3,6 +3,7 @@ #define _ASM_POWERPC_CPUTHREADS_H #ifndef __ASSEMBLY__ +#include #include #include --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/fixmap.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/fixmap.h @@ -77,7 +77,12 @@ static inline void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags) { - map_kernel_page(fix_to_virt(idx), phys, flags); + if (__builtin_constant_p(idx)) + BUILD_BUG_ON(idx >= __end_of_fixed_addresses); + else if (WARN_ON(idx >= __end_of_fixed_addresses)) + return; + + map_kernel_page(__fix_to_virt(idx), phys, flags); } #endif /* !__ASSEMBLY__ */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/futex.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/futex.h @@ -35,7 +35,7 @@ { int oldval = 0, ret; - allow_write_to_user(uaddr, sizeof(*uaddr)); + allow_read_write_user(uaddr, uaddr, sizeof(*uaddr)); pagefault_disable(); switch (op) { @@ -62,7 +62,7 @@ *oval = oldval; - prevent_write_to_user(uaddr, sizeof(*uaddr)); + prevent_read_write_user(uaddr, uaddr, sizeof(*uaddr)); return ret; } @@ -76,7 +76,8 @@ if (!access_ok(uaddr, sizeof(u32))) return -EFAULT; - allow_write_to_user(uaddr, sizeof(*uaddr)); + allow_read_write_user(uaddr, uaddr, sizeof(*uaddr)); + __asm__ __volatile__ ( PPC_ATOMIC_ENTRY_BARRIER "1: lwarx %1,0,%3 # futex_atomic_cmpxchg_inatomic\n\ @@ -97,7 +98,8 @@ : "cc", "memory"); *uval = prev; - prevent_write_to_user(uaddr, sizeof(*uaddr)); + prevent_read_write_user(uaddr, uaddr, sizeof(*uaddr)); + return ret; } --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/kup.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/kup.h @@ -2,6 +2,10 @@ #ifndef _ASM_POWERPC_KUP_H_ #define _ASM_POWERPC_KUP_H_ +#define KUAP_READ 1 +#define KUAP_WRITE 2 +#define KUAP_READ_WRITE (KUAP_READ | KUAP_WRITE) + #ifdef CONFIG_PPC64 #include #endif @@ -42,32 +46,48 @@ #else static inline void setup_kuap(bool disabled) { } static inline void allow_user_access(void __user *to, const void __user *from, - unsigned long size) { } + unsigned long size, unsigned long dir) { } static inline void prevent_user_access(void __user *to, const void __user *from, - unsigned long size) { } -static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) { return false; } + unsigned long size, unsigned long dir) { } +static inline bool +bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write) +{ + return false; +} #endif /* CONFIG_PPC_KUAP */ static inline void allow_read_from_user(const void __user *from, unsigned long size) { - allow_user_access(NULL, from, size); + allow_user_access(NULL, from, size, KUAP_READ); } static inline void allow_write_to_user(void __user *to, unsigned long size) { - allow_user_access(to, NULL, size); + allow_user_access(to, NULL, size, KUAP_WRITE); +} + +static inline void allow_read_write_user(void __user *to, const void __user *from, + unsigned long size) +{ + allow_user_access(to, from, size, KUAP_READ_WRITE); } static inline void prevent_read_from_user(const void __user *from, unsigned long size) { - prevent_user_access(NULL, from, size); + prevent_user_access(NULL, from, size, KUAP_READ); } static inline void prevent_write_to_user(void __user *to, unsigned long size) { - prevent_user_access(to, NULL, size); + prevent_user_access(to, NULL, size, KUAP_WRITE); +} + +static inline void prevent_read_write_user(void __user *to, const void __user *from, + unsigned long size) +{ + prevent_user_access(to, from, size, KUAP_READ_WRITE); } #endif /* !__ASSEMBLY__ */ -#endif /* _ASM_POWERPC_KUP_H_ */ +#endif /* _ASM_POWERPC_KUAP_H_ */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/nohash/32/kup-8xx.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/nohash/32/kup-8xx.h @@ -34,18 +34,19 @@ #include static inline void allow_user_access(void __user *to, const void __user *from, - unsigned long size) + unsigned long size, unsigned long dir) { mtspr(SPRN_MD_AP, MD_APG_INIT); } static inline void prevent_user_access(void __user *to, const void __user *from, - unsigned long size) + unsigned long size, unsigned long dir) { mtspr(SPRN_MD_AP, MD_APG_KUAP); } -static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) +static inline bool +bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write) { return WARN(!((regs->kuap ^ MD_APG_KUAP) & 0xf0000000), "Bug: fault blocked by AP register !"); --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/nohash/pgalloc.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/nohash/pgalloc.h @@ -46,7 +46,6 @@ #define get_hugepd_cache_index(x) (x) -#ifdef CONFIG_SMP static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift) { unsigned long pgf = (unsigned long)table; @@ -64,13 +63,6 @@ pgtable_free(table, shift); } -#else -static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift) -{ - pgtable_free(table, shift); -} -#endif - static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) { --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/opal-api.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/opal-api.h @@ -211,7 +211,10 @@ #define OPAL_MPIPL_UPDATE 173 #define OPAL_MPIPL_REGISTER_TAG 174 #define OPAL_MPIPL_QUERY_TAG 175 -#define OPAL_LAST 175 +#define OPAL_SECVAR_GET 176 +#define OPAL_SECVAR_GET_NEXT 177 +#define OPAL_SECVAR_ENQUEUE_UPDATE 178 +#define OPAL_LAST 178 #define QUIESCE_HOLD 1 /* Spin all calls at entry */ #define QUIESCE_REJECT 2 /* Fail all calls with OPAL_BUSY */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/opal.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/opal.h @@ -298,6 +298,13 @@ int opal_sensor_group_enable(u32 group_hndl, int token, bool enable); int opal_nx_coproc_init(uint32_t chip_id, uint32_t ct); +int opal_secvar_get(const char *key, uint64_t key_len, u8 *data, + uint64_t *data_size); +int opal_secvar_get_next(const char *key, uint64_t *key_len, + uint64_t key_buf_size); +int opal_secvar_enqueue_update(const char *key, uint64_t key_len, u8 *data, + uint64_t data_size); + s64 opal_mpipl_update(enum opal_mpipl_ops op, u64 src, u64 dest, u64 size); s64 opal_mpipl_register_tag(enum opal_mpipl_tags tag, u64 addr); s64 opal_mpipl_query_tag(enum opal_mpipl_tags tag, u64 *addr); --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/page.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/page.h @@ -295,8 +295,13 @@ /* * Some number of bits at the level of the page table that points to * a hugepte are used to encode the size. This masks those bits. + * On 8xx, HW assistance requires 4k alignment for the hugepte. */ +#ifdef CONFIG_PPC_8xx +#define HUGEPD_SHIFT_MASK 0xfff +#else #define HUGEPD_SHIFT_MASK 0x3f +#endif #ifndef __ASSEMBLY__ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/pnv-pci.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/pnv-pci.h @@ -15,6 +15,7 @@ #define PCI_SLOT_ID_PREFIX (1UL << 63) #define PCI_SLOT_ID(phb_id, bdfn) \ (PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id)) +#define PCI_PHB_SLOT_ID(phb_id) (phb_id) extern int pnv_pci_get_slot_id(struct device_node *np, uint64_t *id); extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len); --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/sections.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/sections.h @@ -5,8 +5,22 @@ #include #include + +#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed + #include +extern bool init_mem_is_free; + +static inline int arch_is_kernel_initmem_freed(unsigned long addr) +{ + if (!init_mem_is_free) + return 0; + + return addr >= (unsigned long)__init_begin && + addr < (unsigned long)__init_end; +} + extern char __head_end[]; #ifdef __powerpc64__ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/secure_boot.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/secure_boot.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Secure boot definitions + * + * Copyright (C) 2019 IBM Corporation + * Author: Nayna Jain + */ +#ifndef _ASM_POWER_SECURE_BOOT_H +#define _ASM_POWER_SECURE_BOOT_H + +#ifdef CONFIG_PPC_SECURE_BOOT + +bool is_ppc_secureboot_enabled(void); +bool is_ppc_trustedboot_enabled(void); + +#else + +static inline bool is_ppc_secureboot_enabled(void) +{ + return false; +} + +static inline bool is_ppc_trustedboot_enabled(void) +{ + return false; +} + +#endif +#endif --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/security_features.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/security_features.h @@ -9,7 +9,7 @@ #define _ASM_POWERPC_SECURITY_FEATURES_H -extern unsigned long powerpc_security_features; +extern u64 powerpc_security_features; extern bool rfi_flush; /* These are bit flags */ @@ -24,17 +24,17 @@ void do_stf_barrier_fixups(enum stf_barrier_type types); void setup_count_cache_flush(void); -static inline void security_ftr_set(unsigned long feature) +static inline void security_ftr_set(u64 feature) { powerpc_security_features |= feature; } -static inline void security_ftr_clear(unsigned long feature) +static inline void security_ftr_clear(u64 feature) { powerpc_security_features &= ~feature; } -static inline bool security_ftr_enabled(unsigned long feature) +static inline bool security_ftr_enabled(u64 feature) { return !!(powerpc_security_features & feature); } @@ -81,6 +81,9 @@ // Software required to flush count cache on context switch #define SEC_FTR_FLUSH_COUNT_CACHE 0x0000000000000400ull +// Software required to flush link stack on context switch +#define SEC_FTR_FLUSH_LINK_STACK 0x0000000000001000ull + // Features enabled by default #define SEC_FTR_DEFAULT \ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/secvar.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/secvar.h @@ -0,0 +1,35 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2019 IBM Corporation + * Author: Nayna Jain + * + * PowerPC secure variable operations. + */ +#ifndef SECVAR_OPS_H +#define SECVAR_OPS_H + +#include +#include + +extern const struct secvar_operations *secvar_ops; + +struct secvar_operations { + int (*get)(const char *key, uint64_t key_len, u8 *data, + uint64_t *data_size); + int (*get_next)(const char *key, uint64_t *key_len, + uint64_t keybufsize); + int (*set)(const char *key, uint64_t key_len, u8 *data, + uint64_t data_size); +}; + +#ifdef CONFIG_PPC_SECURE_BOOT + +extern void set_secvar_ops(const struct secvar_operations *ops); + +#else + +static inline void set_secvar_ops(const struct secvar_operations *ops) { } + +#endif + +#endif --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/spinlock.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/spinlock.h @@ -15,6 +15,7 @@ * * (the type definitions are in asm/spinlock_types.h) */ +#include #include #ifdef CONFIG_PPC64 #include @@ -36,10 +37,12 @@ #endif #ifdef CONFIG_PPC_PSERIES +DECLARE_STATIC_KEY_FALSE(shared_processor); + #define vcpu_is_preempted vcpu_is_preempted static inline bool vcpu_is_preempted(int cpu) { - if (!firmware_has_feature(FW_FEATURE_SPLPAR)) + if (!static_branch_unlikely(&shared_processor)) return false; return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1); } --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/tlb.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/tlb.h @@ -26,6 +26,17 @@ #define tlb_flush tlb_flush extern void tlb_flush(struct mmu_gather *tlb); +/* + * book3s: + * Hash does not use the linux page-tables, so we can avoid + * the TLB invalidate for page-table freeing, Radix otoh does use the + * page-tables and needs the TLBI. + * + * nohash: + * We still do TLB invalidate in the __pte_free_tlb routine before we + * add the page table pages to mmu gather table batch. + */ +#define tlb_needs_table_invalidate() radix_enabled() /* Get the generic bits... */ #include --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/uaccess.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/uaccess.h @@ -313,9 +313,9 @@ unsigned long ret; barrier_nospec(); - allow_user_access(to, from, n); + allow_read_write_user(to, from, n); ret = __copy_tofrom_user(to, from, n); - prevent_user_access(to, from, n); + prevent_read_write_user(to, from, n); return ret; } #endif /* __powerpc64__ */ @@ -401,7 +401,7 @@ return n; } -extern unsigned long __clear_user(void __user *addr, unsigned long size); +unsigned long __arch_clear_user(void __user *addr, unsigned long size); static inline unsigned long clear_user(void __user *addr, unsigned long size) { @@ -409,12 +409,17 @@ might_fault(); if (likely(access_ok(addr, size))) { allow_write_to_user(addr, size); - ret = __clear_user(addr, size); + ret = __arch_clear_user(addr, size); prevent_write_to_user(addr, size); } return ret; } +static inline unsigned long __clear_user(void __user *addr, unsigned long size) +{ + return clear_user(addr, size); +} + extern long strncpy_from_user(char *dst, const char __user *src, long count); extern __must_check long strnlen_user(const char __user *str, long n); --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/vdso_datapage.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/vdso_datapage.h @@ -82,6 +82,7 @@ __s32 wtom_clock_nsec; /* Wall to monotonic clock nsec */ __s64 wtom_clock_sec; /* Wall to monotonic clock sec */ struct timespec stamp_xtime; /* xtime as at tb_orig_stamp */ + __u32 hrtimer_res; /* hrtimer resolution */ __u32 syscall_map_64[SYSCALL_MAP_SIZE]; /* map of syscalls */ __u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */ }; @@ -103,6 +104,7 @@ __s32 wtom_clock_nsec; struct timespec stamp_xtime; /* xtime as at tb_orig_stamp */ __u32 stamp_sec_fraction; /* fractional seconds of stamp_xtime */ + __u32 hrtimer_res; /* hrtimer resolution */ __u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */ __u32 dcache_block_size; /* L1 d-cache block size */ __u32 icache_block_size; /* L1 i-cache block size */ --- linux-riscv-5.4.0.orig/arch/powerpc/include/asm/xive-regs.h +++ linux-riscv-5.4.0/arch/powerpc/include/asm/xive-regs.h @@ -39,6 +39,7 @@ #define XIVE_ESB_VAL_P 0x2 #define XIVE_ESB_VAL_Q 0x1 +#define XIVE_ESB_INVALID 0xFF /* * Thread Management (aka "TM") registers --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/Makefile +++ linux-riscv-5.4.0/arch/powerpc/kernel/Makefile @@ -5,8 +5,8 @@ CFLAGS_ptrace.o += -DUTS_MACHINE='"$(UTS_MACHINE)"' -# Disable clang warning for using setjmp without setjmp.h header -CFLAGS_crash.o += $(call cc-disable-warning, builtin-requires-header) +# Avoid clang warnings around longjmp/setjmp declarations +CFLAGS_crash.o += -ffreestanding ifdef CONFIG_PPC64 CFLAGS_prom_init.o += $(NO_MINIMAL_TOC) @@ -161,6 +161,9 @@ obj-y += ucall.o endif +obj-$(CONFIG_PPC_SECURE_BOOT) += secure_boot.o ima_arch.o secvar-ops.o +obj-$(CONFIG_PPC_SECVAR_SYSFS) += secvar-sysfs.o + # Disable GCOV, KCOV & sanitizers in odd or sensitive code GCOV_PROFILE_prom_init.o := n KCOV_INSTRUMENT_prom_init.o := n --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/asm-offsets.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/asm-offsets.c @@ -387,6 +387,7 @@ OFFSET(WTOM_CLOCK_NSEC, vdso_data, wtom_clock_nsec); OFFSET(STAMP_XTIME, vdso_data, stamp_xtime); OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction); + OFFSET(CLOCK_HRTIMER_RES, vdso_data, hrtimer_res); OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size); OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size); OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_data, icache_log_block_size); @@ -417,7 +418,6 @@ DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE); DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE); DEFINE(NSEC_PER_SEC, NSEC_PER_SEC); - DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC); #ifdef CONFIG_BUG DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry)); --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/cputable.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/cputable.c @@ -2193,11 +2193,13 @@ * oprofile_cpu_type already has a value, then we are * possibly overriding a real PVR with a logical one, * and, in that case, keep the current value for - * oprofile_cpu_type. + * oprofile_cpu_type. Futhermore, let's ensure that the + * fix for the PMAO bug is enabled on compatibility mode. */ if (old.oprofile_cpu_type != NULL) { t->oprofile_cpu_type = old.oprofile_cpu_type; t->oprofile_type = old.oprofile_type; + t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG; } } --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/eeh_driver.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/eeh_driver.c @@ -541,12 +541,6 @@ pci_iov_remove_virtfn(edev->physfn, pdn->vf_index); edev->pdev = NULL; - - /* - * We have to set the VF PE number to invalid one, which is - * required to plug the VF successfully. - */ - pdn->pe_number = IODA_INVALID_PE; #endif if (rmv_data) list_add(&edev->rmv_entry, &rmv_data->removed_vf_list); @@ -897,12 +891,12 @@ /* Log the event */ if (pe->type & EEH_PE_PHB) { - pr_err("EEH: PHB#%x failure detected, location: %s\n", + pr_err("EEH: Recovering PHB#%x, location: %s\n", pe->phb->global_number, eeh_pe_loc_get(pe)); } else { struct eeh_pe *phb_pe = eeh_phb_pe_get(pe->phb); - pr_err("EEH: Frozen PHB#%x-PE#%x detected\n", + pr_err("EEH: Recovering PHB#%x-PE#%x\n", pe->phb->global_number, pe->addr); pr_err("EEH: PE location: %s, PHB location: %s\n", eeh_pe_loc_get(pe), eeh_pe_loc_get(phb_pe)); @@ -1206,6 +1200,17 @@ eeh_pe_state_mark(pe, EEH_PE_RECOVERING); eeh_handle_normal_event(pe); } else { + eeh_for_each_pe(pe, tmp_pe) + eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev) + edev->mode &= ~EEH_DEV_NO_HANDLER; + + /* Notify all devices to be down */ + eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true); + eeh_set_channel_state(pe, pci_channel_io_perm_failure); + eeh_pe_report( + "error_detected(permanent failure)", pe, + eeh_report_failure, NULL); + pci_lock_rescan_remove(); list_for_each_entry(hose, &hose_list, list_node) { phb_pe = eeh_phb_pe_get(hose); @@ -1214,16 +1219,6 @@ (phb_pe->state & EEH_PE_RECOVERING)) continue; - eeh_for_each_pe(pe, tmp_pe) - eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev) - edev->mode &= ~EEH_DEV_NO_HANDLER; - - /* Notify all devices to be down */ - eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true); - eeh_set_channel_state(pe, pci_channel_io_perm_failure); - eeh_pe_report( - "error_detected(permanent failure)", pe, - eeh_report_failure, NULL); bus = eeh_pe_bus_get(phb_pe); if (!bus) { pr_err("%s: Cannot find PCI bus for " --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/entry_32.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/entry_32.S @@ -179,7 +179,7 @@ 2: /* if from kernel, check interrupted DOZE/NAP mode and * check for stack overflow */ - kuap_save_and_lock r11, r12, r9, r2, r0 + kuap_save_and_lock r11, r12, r9, r2, r6 addi r2, r12, -THREAD lwz r9,KSP_LIMIT(r12) cmplw r1,r9 /* if r1 <= ksp_limit */ @@ -284,6 +284,7 @@ rlwinm r9,r9,0,~MSR_EE lwz r12,_LINK(r11) /* and return to address in LR */ kuap_restore r11, r2, r3, r4, r5 + lwz r2, GPR2(r11) b fast_exception_return #endif @@ -777,7 +778,7 @@ 1: lis r3,exc_exit_restart_end@ha addi r3,r3,exc_exit_restart_end@l cmplw r12,r3 -#if CONFIG_PPC_BOOK3S_601 +#ifdef CONFIG_PPC_BOOK3S_601 bge 2b #else bge 3f @@ -785,7 +786,7 @@ lis r4,exc_exit_restart@ha addi r4,r4,exc_exit_restart@l cmplw r12,r4 -#if CONFIG_PPC_BOOK3S_601 +#ifdef CONFIG_PPC_BOOK3S_601 blt 2b #else blt 3f --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/entry_64.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/entry_64.S @@ -537,6 +537,7 @@ /* Save LR into r9 */ mflr r9 + // Flush the link stack .rept 64 bl .+4 .endr @@ -546,6 +547,11 @@ .balign 32 /* Restore LR */ 1: mtlr r9 + + // If we're just flushing the link stack, return here +3: nop + patch_site 3b patch__flush_link_stack_return + li r9,0x7fff mtctr r9 --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/head_8xx.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/head_8xx.S @@ -289,7 +289,7 @@ * set. All other Linux PTE bits control the behavior * of the MMU. */ - rlwimi r10, r10, 0, 0x0f00 /* Clear bits 20-23 */ + rlwinm r10, r10, 0, ~0x0f00 /* Clear bits 20-23 */ rlwimi r10, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */ ori r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */ mtspr SPRN_MI_RPN, r10 /* Update TLB entry */ --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/head_fsl_booke.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/head_fsl_booke.S @@ -238,6 +238,9 @@ bl early_init +#ifdef CONFIG_KASAN + bl kasan_early_init +#endif #ifdef CONFIG_RELOCATABLE mr r3,r30 mr r4,r31 @@ -264,9 +267,6 @@ /* * Decide what sort of machine this is and initialize the MMU. */ -#ifdef CONFIG_KASAN - bl kasan_early_init -#endif mr r3,r30 mr r4,r31 bl machine_init --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/ima_arch.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/ima_arch.c @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019 IBM Corporation + * Author: Nayna Jain + */ + +#include +#include + +bool arch_ima_get_secureboot(void) +{ + return is_ppc_secureboot_enabled(); +} + +/* + * The "secure_rules" are enabled only on "secureboot" enabled systems. + * These rules verify the file signatures against known good values. + * The "appraise_type=imasig|modsig" option allows the known good signature + * to be stored as an xattr or as an appended signature. + * + * To avoid duplicate signature verification as much as possible, the IMA + * policy rule for module appraisal is added only if CONFIG_MODULE_SIG_FORCE + * is not enabled. + */ +static const char *const secure_rules[] = { + "appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", +#ifndef CONFIG_MODULE_SIG_FORCE + "appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", +#endif + NULL +}; + +/* + * The "trusted_rules" are enabled only on "trustedboot" enabled systems. + * These rules add the kexec kernel image and kernel modules file hashes to + * the IMA measurement list. + */ +static const char *const trusted_rules[] = { + "measure func=KEXEC_KERNEL_CHECK", + "measure func=MODULE_CHECK", + NULL +}; + +/* + * The "secure_and_trusted_rules" contains rules for both the secure boot and + * trusted boot. The "template=ima-modsig" option includes the appended + * signature, when available, in the IMA measurement list. + */ +static const char *const secure_and_trusted_rules[] = { + "measure func=KEXEC_KERNEL_CHECK template=ima-modsig", + "measure func=MODULE_CHECK template=ima-modsig", + "appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", +#ifndef CONFIG_MODULE_SIG + "appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", +#endif + NULL +}; + +/* + * Returns the relevant IMA arch-specific policies based on the system secure + * boot state. + */ +const char *const *arch_get_ima_policy(void) +{ + if (is_ppc_secureboot_enabled()) { + if (IS_ENABLED(CONFIG_MODULE_SIG)) + set_module_sig_enforced(); + + if (is_ppc_trustedboot_enabled()) + return secure_and_trusted_rules; + else + return secure_rules; + } else if (is_ppc_trustedboot_enabled()) { + return trusted_rules; + } + + return NULL; +} --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/irq.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/irq.c @@ -619,8 +619,6 @@ trace_irq_entry(regs); - check_stack_overflow(); - /* * Query the platform PIC for the interrupt & ack it. * @@ -652,6 +650,8 @@ irqsp = hardirq_ctx[raw_smp_processor_id()]; sirqsp = softirq_ctx[raw_smp_processor_id()]; + check_stack_overflow(); + /* Already there ? */ if (unlikely(cursp == irqsp || cursp == sirqsp)) { __do_irq(regs); --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/misc_32.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/misc_32.S @@ -317,126 +317,6 @@ #endif /* CONFIG_PPC_8xx */ /* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * This is a no-op on the 601. - * - * flush_icache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(flush_icache_range) -#if defined(CONFIG_PPC_BOOK3S_601) || defined(CONFIG_E200) - PURGE_PREFETCHED_INS - blr /* for 601 and e200, do nothing */ -#else - rlwinm r3,r3,0,0,31 - L1_CACHE_SHIFT - subf r4,r3,r4 - addi r4,r4,L1_CACHE_BYTES - 1 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - mr r6,r3 -1: dcbst 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ -#ifndef CONFIG_44x - mtctr r4 -2: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 2b -#else - /* Flash invalidate on 44x because we are passed kmapped addresses and - this doesn't work for userspace pages due to the virtually tagged - icache. Sigh. */ - iccci 0, r0 -#endif - sync /* additional sync needed on g4 */ - isync - blr -#endif -_ASM_NOKPROBE_SYMBOL(flush_icache_range) -EXPORT_SYMBOL(flush_icache_range) - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * This is a no-op on the 601 and e200 which have a unified cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -#if defined(CONFIG_PPC_BOOK3S_601) || defined(CONFIG_E200) - PURGE_PREFETCHED_INS - blr -#else - rlwinm r3,r3,0,0,31-PAGE_SHIFT /* Get page base address */ - li r4,PAGE_SIZE/L1_CACHE_BYTES /* Number of lines in a page */ - mtctr r4 - mr r6,r3 -0: dcbst 0,r3 /* Write line to ram */ - addi r3,r3,L1_CACHE_BYTES - bdnz 0b - sync -#ifdef CONFIG_44x - /* We don't flush the icache on 44x. Those have a virtual icache - * and we don't have access to the virtual address here (it's - * not the page vaddr but where it's mapped in user space). The - * flushing of the icache on these is handled elsewhere, when - * a change in the address space occurs, before returning to - * user space - */ -BEGIN_MMU_FTR_SECTION - blr -END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x) -#endif /* CONFIG_44x */ - mtctr r4 -1: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 1b - sync - isync - blr -#endif - -#ifndef CONFIG_BOOKE -/* - * Flush a particular page from the data cache to RAM, identified - * by its physical address. We turn off the MMU so we can just use - * the physical address (this may be a highmem page without a kernel - * mapping). - * - * void __flush_dcache_icache_phys(unsigned long physaddr) - */ -_GLOBAL(__flush_dcache_icache_phys) -#if defined(CONFIG_PPC_BOOK3S_601) || defined(CONFIG_E200) - PURGE_PREFETCHED_INS - blr /* for 601 and e200, do nothing */ -#else - mfmsr r10 - rlwinm r0,r10,0,28,26 /* clear DR */ - mtmsr r0 - isync - rlwinm r3,r3,0,0,31-PAGE_SHIFT /* Get page base address */ - li r4,PAGE_SIZE/L1_CACHE_BYTES /* Number of lines in a page */ - mtctr r4 - mr r6,r3 -0: dcbst 0,r3 /* Write line to ram */ - addi r3,r3,L1_CACHE_BYTES - bdnz 0b - sync - mtctr r4 -1: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 1b - sync - mtmsr r10 /* restore DR */ - isync - blr -#endif -#endif /* CONFIG_BOOKE */ - -/* * Copy a whole page. We use the dcbz instruction on the destination * to reduce memory traffic (it eliminates the unnecessary reads of * the destination into cache). This requires that the destination --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/misc_64.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/misc_64.S @@ -49,108 +49,6 @@ mtlr r0 blr - .section ".toc","aw" -PPC64_CACHES: - .tc ppc64_caches[TC],ppc64_caches - .section ".text" - -/* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * - * flush_icache_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start through stop-1 inclusive - */ - -_GLOBAL_TOC(flush_icache_range) -BEGIN_FTR_SECTION - PURGE_PREFETCHED_INS - blr -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - * and in some cases i-cache and d-cache line sizes differ from - * each other. - */ - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1BLOCKSIZE(r10)/* Get cache block size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGBLOCKSIZE(r10) /* Get log-2 of cache block size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -1: dcbst 0,r6 - add r6,r6,r7 - bdnz 1b - sync - -/* Now invalidate the instruction cache */ - - lwz r7,ICACHEL1BLOCKSIZE(r10) /* Get Icache block size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 - lwz r9,ICACHEL1LOGBLOCKSIZE(r10) /* Get log-2 of Icache block size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -2: icbi 0,r6 - add r6,r6,r7 - bdnz 2b - isync - blr -_ASM_NOKPROBE_SYMBOL(flush_icache_range) -EXPORT_SYMBOL(flush_icache_range) - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - */ - -BEGIN_FTR_SECTION - PURGE_PREFETCHED_INS - blr -END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) - -/* Flush the dcache */ - ld r7,PPC64_CACHES@toc(r2) - clrrdi r3,r3,PAGE_SHIFT /* Page align */ - lwz r4,DCACHEL1BLOCKSPERPAGE(r7) /* Get # dcache blocks per page */ - lwz r5,DCACHEL1BLOCKSIZE(r7) /* Get dcache block size */ - mr r6,r3 - mtctr r4 -0: dcbst 0,r6 - add r6,r6,r5 - bdnz 0b - sync - -/* Now invalidate the icache */ - - lwz r4,ICACHEL1BLOCKSPERPAGE(r7) /* Get # icache blocks per page */ - lwz r5,ICACHEL1BLOCKSIZE(r7) /* Get icache block size */ - mtctr r4 -1: icbi 0,r3 - add r3,r3,r5 - bdnz 1b - isync - blr - _GLOBAL(__bswapdi2) EXPORT_SYMBOL(__bswapdi2) srdi r8,r3,32 --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/pci-common.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/pci-common.c @@ -347,6 +347,7 @@ } return NULL; } +EXPORT_SYMBOL(pci_find_hose_for_OF_device); struct pci_controller *pci_find_controller_for_domain(int domain_nr) { @@ -1577,6 +1578,7 @@ { return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap); } +EXPORT_SYMBOL_GPL(early_find_capability); struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus) { --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/pci_dn.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/pci_dn.c @@ -244,9 +244,22 @@ continue; #ifdef CONFIG_EEH - /* Release EEH device for the VF */ + /* + * Release EEH state for this VF. The PCI core + * has already torn down the pci_dev for this VF, but + * we're responsible to removing the eeh_dev since it + * has the same lifetime as the pci_dn that spawned it. + */ edev = pdn_to_eeh_dev(pdn); if (edev) { + /* + * We allocate pci_dn's for the totalvfs count, + * but only only the vfs that were activated + * have a configured PE. + */ + if (edev->pe) + eeh_rmv_from_parent_pe(edev); + pdn->edev = NULL; kfree(edev); } --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/prom_init.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/prom_init.c @@ -1053,7 +1053,7 @@ .reserved2 = 0, .reserved3 = 0, .subprocessors = 1, - .byte22 = OV5_FEAT(OV5_DRMEM_V2), + .byte22 = OV5_FEAT(OV5_DRMEM_V2) | OV5_FEAT(OV5_DRC_INFO), .intarch = 0, .mmu = 0, .hash_ext = 0, --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/secure_boot.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/secure_boot.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019 IBM Corporation + * Author: Nayna Jain + */ +#include +#include +#include + +static struct device_node *get_ppc_fw_sb_node(void) +{ + static const struct of_device_id ids[] = { + { .compatible = "ibm,secureboot", }, + { .compatible = "ibm,secureboot-v1", }, + { .compatible = "ibm,secureboot-v2", }, + {}, + }; + + return of_find_matching_node(NULL, ids); +} + +bool is_ppc_secureboot_enabled(void) +{ + struct device_node *node; + bool enabled = false; + + node = get_ppc_fw_sb_node(); + enabled = of_property_read_bool(node, "os-secureboot-enforcing"); + + of_node_put(node); + + pr_info("Secure boot mode %s\n", enabled ? "enabled" : "disabled"); + + return enabled; +} + +bool is_ppc_trustedboot_enabled(void) +{ + struct device_node *node; + bool enabled = false; + + node = get_ppc_fw_sb_node(); + enabled = of_property_read_bool(node, "trusted-enabled"); + + of_node_put(node); + + pr_info("Trusted boot mode %s\n", enabled ? "enabled" : "disabled"); + + return enabled; +} --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/security.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/security.c @@ -16,7 +16,7 @@ #include -unsigned long powerpc_security_features __read_mostly = SEC_FTR_DEFAULT; +u64 powerpc_security_features __read_mostly = SEC_FTR_DEFAULT; enum count_cache_flush_type { COUNT_CACHE_FLUSH_NONE = 0x1, @@ -24,6 +24,7 @@ COUNT_CACHE_FLUSH_HW = 0x4, }; static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; +static bool link_stack_flush_enabled; bool barrier_nospec_enabled; static bool no_nospec; @@ -108,7 +109,7 @@ static __init int security_feature_debugfs_init(void) { debugfs_create_x64("security_features", 0400, powerpc_debugfs_root, - (u64 *)&powerpc_security_features); + &powerpc_security_features); return 0; } device_initcall(security_feature_debugfs_init); @@ -141,32 +142,33 @@ thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV); - if (rfi_flush || thread_priv) { + if (rfi_flush) { struct seq_buf s; seq_buf_init(&s, buf, PAGE_SIZE - 1); - seq_buf_printf(&s, "Mitigation: "); - - if (rfi_flush) - seq_buf_printf(&s, "RFI Flush"); - - if (rfi_flush && thread_priv) - seq_buf_printf(&s, ", "); - + seq_buf_printf(&s, "Mitigation: RFI Flush"); if (thread_priv) - seq_buf_printf(&s, "L1D private per thread"); + seq_buf_printf(&s, ", L1D private per thread"); seq_buf_printf(&s, "\n"); return s.len; } + if (thread_priv) + return sprintf(buf, "Vulnerable: L1D private per thread\n"); + if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) return sprintf(buf, "Not affected\n"); return sprintf(buf, "Vulnerable\n"); } + +ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_meltdown(dev, attr, buf); +} #endif ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf) @@ -212,11 +214,19 @@ if (ccd) seq_buf_printf(&s, "Indirect branch cache disabled"); + + if (link_stack_flush_enabled) + seq_buf_printf(&s, ", Software link stack flush"); + } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush"); if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); + + if (link_stack_flush_enabled) + seq_buf_printf(&s, ", Software link stack flush"); + } else if (btb_flush_enabled) { seq_buf_printf(&s, "Mitigation: Branch predictor state flush"); } else { @@ -377,18 +387,49 @@ device_initcall(stf_barrier_debugfs_init); #endif /* CONFIG_DEBUG_FS */ +static void no_count_cache_flush(void) +{ + count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; + pr_info("count-cache-flush: software flush disabled.\n"); +} + static void toggle_count_cache_flush(bool enable) { - if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && + !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) + enable = false; + + if (!enable) { patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP); - count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; - pr_info("count-cache-flush: software flush disabled.\n"); +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + patch_instruction_site(&patch__call_kvm_flush_link_stack, PPC_INST_NOP); +#endif + pr_info("link-stack-flush: software flush disabled.\n"); + link_stack_flush_enabled = false; + no_count_cache_flush(); return; } + // This enables the branch from _switch to flush_count_cache patch_branch_site(&patch__call_flush_count_cache, (u64)&flush_count_cache, BRANCH_SET_LINK); +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + // This enables the branch from guest_exit_cont to kvm_flush_link_stack + patch_branch_site(&patch__call_kvm_flush_link_stack, + (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); +#endif + + pr_info("link-stack-flush: software flush enabled.\n"); + link_stack_flush_enabled = true; + + // If we just need to flush the link stack, patch an early return + if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + patch_instruction_site(&patch__flush_link_stack_return, PPC_INST_BLR); + no_count_cache_flush(); + return; + } + if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { count_cache_flush_type = COUNT_CACHE_FLUSH_SW; pr_info("count-cache-flush: full software flush sequence enabled.\n"); @@ -407,11 +448,20 @@ if (no_spectrev2 || cpu_mitigations_off()) { if (security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED) || security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED)) - pr_warn("Spectre v2 mitigations not under software control, can't disable\n"); + pr_warn("Spectre v2 mitigations not fully under software control, can't disable\n"); enable = false; } + /* + * There's no firmware feature flag/hypervisor bit to tell us we need to + * flush the link stack on context switch. So we set it here if we see + * either of the Spectre v2 mitigations that aim to protect userspace. + */ + if (security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED) || + security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) + security_ftr_set(SEC_FTR_FLUSH_LINK_STACK); + toggle_count_cache_flush(enable); } --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/secvar-ops.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/secvar-ops.c @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019 IBM Corporation + * Author: Nayna Jain + * + * This file initializes secvar operations for PowerPC Secureboot + */ + +#include +#include + +const struct secvar_operations *secvar_ops __ro_after_init; + +void set_secvar_ops(const struct secvar_operations *ops) +{ + secvar_ops = ops; +} --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/secvar-sysfs.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/secvar-sysfs.c @@ -0,0 +1,248 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 IBM Corporation + * + * This code exposes secure variables to user via sysfs + */ + +#define pr_fmt(fmt) "secvar-sysfs: "fmt + +#include +#include +#include +#include +#include + +#define NAME_MAX_SIZE 1024 + +static struct kobject *secvar_kobj; +static struct kset *secvar_kset; + +static ssize_t format_show(struct kobject *kobj, struct kobj_attribute *attr, + char *buf) +{ + ssize_t rc = 0; + struct device_node *node; + const char *format; + + node = of_find_compatible_node(NULL, NULL, "ibm,secvar-backend"); + if (!of_device_is_available(node)) + return -ENODEV; + + rc = of_property_read_string(node, "format", &format); + if (rc) + return rc; + + rc = sprintf(buf, "%s\n", format); + + of_node_put(node); + + return rc; +} + + +static ssize_t size_show(struct kobject *kobj, struct kobj_attribute *attr, + char *buf) +{ + uint64_t dsize; + int rc; + + rc = secvar_ops->get(kobj->name, strlen(kobj->name) + 1, NULL, &dsize); + if (rc) { + pr_err("Error retrieving %s variable size %d\n", kobj->name, + rc); + return rc; + } + + return sprintf(buf, "%llu\n", dsize); +} + +static ssize_t data_read(struct file *filep, struct kobject *kobj, + struct bin_attribute *attr, char *buf, loff_t off, + size_t count) +{ + uint64_t dsize; + char *data; + int rc; + + rc = secvar_ops->get(kobj->name, strlen(kobj->name) + 1, NULL, &dsize); + if (rc) { + pr_err("Error getting %s variable size %d\n", kobj->name, rc); + return rc; + } + pr_debug("dsize is %llu\n", dsize); + + data = kzalloc(dsize, GFP_KERNEL); + if (!data) + return -ENOMEM; + + rc = secvar_ops->get(kobj->name, strlen(kobj->name) + 1, data, &dsize); + if (rc) { + pr_err("Error getting %s variable %d\n", kobj->name, rc); + goto data_fail; + } + + rc = memory_read_from_buffer(buf, count, &off, data, dsize); + +data_fail: + kfree(data); + return rc; +} + +static ssize_t update_write(struct file *filep, struct kobject *kobj, + struct bin_attribute *attr, char *buf, loff_t off, + size_t count) +{ + int rc; + + pr_debug("count is %ld\n", count); + rc = secvar_ops->set(kobj->name, strlen(kobj->name) + 1, buf, count); + if (rc) { + pr_err("Error setting the %s variable %d\n", kobj->name, rc); + return rc; + } + + return count; +} + +static struct kobj_attribute format_attr = __ATTR_RO(format); + +static struct kobj_attribute size_attr = __ATTR_RO(size); + +static struct bin_attribute data_attr = __BIN_ATTR_RO(data, 0); + +static struct bin_attribute update_attr = __BIN_ATTR_WO(update, 0); + +static struct bin_attribute *secvar_bin_attrs[] = { + &data_attr, + &update_attr, + NULL, +}; + +static struct attribute *secvar_attrs[] = { + &size_attr.attr, + NULL, +}; + +static const struct attribute_group secvar_attr_group = { + .attrs = secvar_attrs, + .bin_attrs = secvar_bin_attrs, +}; +__ATTRIBUTE_GROUPS(secvar_attr); + +static struct kobj_type secvar_ktype = { + .sysfs_ops = &kobj_sysfs_ops, + .default_groups = secvar_attr_groups, +}; + +static int update_kobj_size(void) +{ + + struct device_node *node; + u64 varsize; + int rc = 0; + + node = of_find_compatible_node(NULL, NULL, "ibm,secvar-backend"); + if (!of_device_is_available(node)) { + rc = -ENODEV; + goto out; + } + + rc = of_property_read_u64(node, "max-var-size", &varsize); + if (rc) + goto out; + + data_attr.size = varsize; + update_attr.size = varsize; + +out: + of_node_put(node); + + return rc; +} + +static int secvar_sysfs_load(void) +{ + char *name; + uint64_t namesize = 0; + struct kobject *kobj; + int rc; + + name = kzalloc(NAME_MAX_SIZE, GFP_KERNEL); + if (!name) + return -ENOMEM; + + do { + rc = secvar_ops->get_next(name, &namesize, NAME_MAX_SIZE); + if (rc) { + if (rc != -ENOENT) + pr_err("error getting secvar from firmware %d\n", + rc); + break; + } + + kobj = kzalloc(sizeof(*kobj), GFP_KERNEL); + if (!kobj) { + rc = -ENOMEM; + break; + } + + kobject_init(kobj, &secvar_ktype); + + rc = kobject_add(kobj, &secvar_kset->kobj, "%s", name); + if (rc) { + pr_warn("kobject_add error %d for attribute: %s\n", rc, + name); + kobject_put(kobj); + kobj = NULL; + } + + if (kobj) + kobject_uevent(kobj, KOBJ_ADD); + + } while (!rc); + + kfree(name); + return rc; +} + +static int secvar_sysfs_init(void) +{ + int rc; + + if (!secvar_ops) { + pr_warn("secvar: failed to retrieve secvar operations.\n"); + return -ENODEV; + } + + secvar_kobj = kobject_create_and_add("secvar", firmware_kobj); + if (!secvar_kobj) { + pr_err("secvar: Failed to create firmware kobj\n"); + return -ENOMEM; + } + + rc = sysfs_create_file(secvar_kobj, &format_attr.attr); + if (rc) { + kobject_put(secvar_kobj); + return -ENOMEM; + } + + secvar_kset = kset_create_and_add("vars", NULL, secvar_kobj); + if (!secvar_kset) { + pr_err("secvar: sysfs kobject registration failed.\n"); + kobject_put(secvar_kobj); + return -ENOMEM; + } + + rc = update_kobj_size(); + if (rc) { + pr_err("Cannot read the size of the attribute\n"); + return rc; + } + + secvar_sysfs_load(); + + return 0; +} + +late_initcall(secvar_sysfs_init); --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/setup-common.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/setup-common.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -64,6 +65,7 @@ #include #include #include +#include #include "setup.h" @@ -855,6 +857,16 @@ */ initialize_cache_info(); + /* + * Lock down the kernel if booted in secure mode. This is required to + * maintain kernel integrity. + */ + if (IS_ENABLED(CONFIG_LOCK_DOWN_IN_SECURE_BOOT)) { + if (is_ppc_secureboot_enabled()) + security_lock_kernel_down("PowerNV Secure Boot mode", + LOCKDOWN_INTEGRITY_MAX); + } + /* Initialize RTAS if available. */ rtas_initialize(); --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/signal.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/signal.c @@ -200,14 +200,27 @@ * normal/non-checkpointed stack pointer. */ + unsigned long ret = tsk->thread.regs->gpr[1]; + #ifdef CONFIG_PPC_TRANSACTIONAL_MEM BUG_ON(tsk != current); if (MSR_TM_ACTIVE(tsk->thread.regs->msr)) { + preempt_disable(); tm_reclaim_current(TM_CAUSE_SIGNAL); if (MSR_TM_TRANSACTIONAL(tsk->thread.regs->msr)) - return tsk->thread.ckpt_regs.gpr[1]; + ret = tsk->thread.ckpt_regs.gpr[1]; + + /* + * If we treclaim, we must clear the current thread's TM bits + * before re-enabling preemption. Otherwise we might be + * preempted and have the live MSR[TS] changed behind our back + * (tm_recheckpoint_new_task() would recheckpoint). Besides, we + * enter the signal handler in non-transactional state. + */ + tsk->thread.regs->msr &= ~MSR_TS_MASK; + preempt_enable(); } #endif - return tsk->thread.regs->gpr[1]; + return ret; } --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/signal_32.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/signal_32.c @@ -489,19 +489,11 @@ */ static int save_tm_user_regs(struct pt_regs *regs, struct mcontext __user *frame, - struct mcontext __user *tm_frame, int sigret) + struct mcontext __user *tm_frame, int sigret, + unsigned long msr) { - unsigned long msr = regs->msr; - WARN_ON(tm_suspend_disabled); - /* Remove TM bits from thread's MSR. The MSR in the sigcontext - * just indicates to userland that we were doing a transaction, but we - * don't want to return in transactional state. This also ensures - * that flush_fp_to_thread won't set TIF_RESTORE_TM again. - */ - regs->msr &= ~MSR_TS_MASK; - /* Save both sets of general registers */ if (save_general_regs(¤t->thread.ckpt_regs, frame) || save_general_regs(regs, tm_frame)) @@ -912,6 +904,10 @@ int sigret; unsigned long tramp; struct pt_regs *regs = tsk->thread.regs; +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + /* Save the thread's msr before get_tm_stackpointer() changes it */ + unsigned long msr = regs->msr; +#endif BUG_ON(tsk != current); @@ -944,13 +940,13 @@ #ifdef CONFIG_PPC_TRANSACTIONAL_MEM tm_frame = &rt_sf->uc_transact.uc_mcontext; - if (MSR_TM_ACTIVE(regs->msr)) { + if (MSR_TM_ACTIVE(msr)) { if (__put_user((unsigned long)&rt_sf->uc_transact, &rt_sf->uc.uc_link) || __put_user((unsigned long)tm_frame, &rt_sf->uc_transact.uc_regs)) goto badframe; - if (save_tm_user_regs(regs, frame, tm_frame, sigret)) + if (save_tm_user_regs(regs, frame, tm_frame, sigret, msr)) goto badframe; } else @@ -1369,6 +1365,10 @@ int sigret; unsigned long tramp; struct pt_regs *regs = tsk->thread.regs; +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + /* Save the thread's msr before get_tm_stackpointer() changes it */ + unsigned long msr = regs->msr; +#endif BUG_ON(tsk != current); @@ -1402,9 +1402,9 @@ #ifdef CONFIG_PPC_TRANSACTIONAL_MEM tm_mctx = &frame->mctx_transact; - if (MSR_TM_ACTIVE(regs->msr)) { + if (MSR_TM_ACTIVE(msr)) { if (save_tm_user_regs(regs, &frame->mctx, &frame->mctx_transact, - sigret)) + sigret, msr)) goto badframe; } else --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/signal_64.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/signal_64.c @@ -192,7 +192,8 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc, struct sigcontext __user *tm_sc, struct task_struct *tsk, - int signr, sigset_t *set, unsigned long handler) + int signr, sigset_t *set, unsigned long handler, + unsigned long msr) { /* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the * process never used altivec yet (MSR_VEC is zero in pt_regs of @@ -207,12 +208,11 @@ elf_vrreg_t __user *tm_v_regs = sigcontext_vmx_regs(tm_sc); #endif struct pt_regs *regs = tsk->thread.regs; - unsigned long msr = tsk->thread.regs->msr; long err = 0; BUG_ON(tsk != current); - BUG_ON(!MSR_TM_ACTIVE(regs->msr)); + BUG_ON(!MSR_TM_ACTIVE(msr)); WARN_ON(tm_suspend_disabled); @@ -222,13 +222,6 @@ */ msr |= tsk->thread.ckpt_regs.msr & (MSR_FP | MSR_VEC | MSR_VSX); - /* Remove TM bits from thread's MSR. The MSR in the sigcontext - * just indicates to userland that we were doing a transaction, but we - * don't want to return in transactional state. This also ensures - * that flush_fp_to_thread won't set TIF_RESTORE_TM again. - */ - regs->msr &= ~MSR_TS_MASK; - #ifdef CONFIG_ALTIVEC err |= __put_user(v_regs, &sc->v_regs); err |= __put_user(tm_v_regs, &tm_sc->v_regs); @@ -824,6 +817,10 @@ unsigned long newsp = 0; long err = 0; struct pt_regs *regs = tsk->thread.regs; +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + /* Save the thread's msr before get_tm_stackpointer() changes it */ + unsigned long msr = regs->msr; +#endif BUG_ON(tsk != current); @@ -841,7 +838,7 @@ err |= __put_user(0, &frame->uc.uc_flags); err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]); #ifdef CONFIG_PPC_TRANSACTIONAL_MEM - if (MSR_TM_ACTIVE(regs->msr)) { + if (MSR_TM_ACTIVE(msr)) { /* The ucontext_t passed to userland points to the second * ucontext_t (for transactional state) with its uc_link ptr. */ @@ -849,7 +846,8 @@ err |= setup_tm_sigcontexts(&frame->uc.uc_mcontext, &frame->uc_transact.uc_mcontext, tsk, ksig->sig, NULL, - (unsigned long)ksig->ka.sa.sa_handler); + (unsigned long)ksig->ka.sa.sa_handler, + msr); } else #endif { --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/time.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/time.c @@ -232,7 +232,7 @@ * Accumulate stolen time by scanning the dispatch trace log. * Called on entry from user mode. */ -void accumulate_stolen_time(void) +void notrace accumulate_stolen_time(void) { u64 sst, ust; unsigned long save_irq_soft_mask = irq_soft_mask_return(); @@ -959,6 +959,7 @@ vdso_data->wtom_clock_nsec = tk->wall_to_monotonic.tv_nsec; vdso_data->stamp_xtime = xt; vdso_data->stamp_sec_fraction = frac_sec; + vdso_data->hrtimer_res = hrtimer_resolution; smp_wmb(); ++(vdso_data->tb_update_count); } --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/traps.c +++ linux-riscv-5.4.0/arch/powerpc/kernel/traps.c @@ -250,15 +250,22 @@ } NOKPROBE_SYMBOL(oops_end); +static char *get_mmu_str(void) +{ + if (early_radix_enabled()) + return " MMU=Radix"; + if (early_mmu_has_feature(MMU_FTR_HPTE_TABLE)) + return " MMU=Hash"; + return ""; +} + static int __die(const char *str, struct pt_regs *regs, long err) { printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter); - printk("%s PAGE_SIZE=%luK%s%s%s%s%s%s%s %s\n", + printk("%s PAGE_SIZE=%luK%s%s%s%s%s%s %s\n", IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN) ? "LE" : "BE", - PAGE_SIZE / 1024, - early_radix_enabled() ? " MMU=Radix" : "", - early_mmu_has_feature(MMU_FTR_HPTE_TABLE) ? " MMU=Hash" : "", + PAGE_SIZE / 1024, get_mmu_str(), IS_ENABLED(CONFIG_PREEMPT) ? " PREEMPT" : "", IS_ENABLED(CONFIG_SMP) ? " SMP" : "", IS_ENABLED(CONFIG_SMP) ? (" NR_CPUS=" __stringify(NR_CPUS)) : "", --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/vdso32/gettimeofday.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -156,12 +156,15 @@ cror cr0*4+eq,cr0*4+eq,cr1*4+eq bne cr0,99f + mflr r12 + .cfi_register lr,r12 + bl __get_datapage@local /* get data page */ + lwz r5, CLOCK_HRTIMER_RES(r3) + mtlr r12 li r3,0 cmpli cr0,r4,0 crclr cr0*4+so beqlr - lis r5,CLOCK_REALTIME_RES@h - ori r5,r5,CLOCK_REALTIME_RES@l stw r3,TSPC32_TV_SEC(r4) stw r5,TSPC32_TV_NSEC(r4) blr --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/vdso64/cacheflush.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/vdso64/cacheflush.S @@ -35,7 +35,7 @@ subf r8,r6,r4 /* compute length */ add r8,r8,r5 /* ensure we get enough */ lwz r9,CFG_DCACHE_LOGBLOCKSZ(r10) - srw. r8,r8,r9 /* compute line count */ + srd. r8,r8,r9 /* compute line count */ crclr cr0*4+so beqlr /* nothing to do? */ mtctr r8 @@ -52,7 +52,7 @@ subf r8,r6,r4 /* compute length */ add r8,r8,r5 lwz r9,CFG_ICACHE_LOGBLOCKSZ(r10) - srw. r8,r8,r9 /* compute line count */ + srd. r8,r8,r9 /* compute line count */ crclr cr0*4+so beqlr /* nothing to do? */ mtctr r8 --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/vdso64/gettimeofday.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/vdso64/gettimeofday.S @@ -186,12 +186,15 @@ cror cr0*4+eq,cr0*4+eq,cr1*4+eq bne cr0,99f + mflr r12 + .cfi_register lr,r12 + bl V_LOCAL_FUNC(__get_datapage) + lwz r5, CLOCK_HRTIMER_RES(r3) + mtlr r12 li r3,0 cmpldi cr0,r4,0 crclr cr0*4+so beqlr - lis r5,CLOCK_REALTIME_RES@h - ori r5,r5,CLOCK_REALTIME_RES@l std r3,TSPC64_TV_SEC(r4) std r5,TSPC64_TV_NSEC(r4) blr --- linux-riscv-5.4.0.orig/arch/powerpc/kernel/vmlinux.lds.S +++ linux-riscv-5.4.0/arch/powerpc/kernel/vmlinux.lds.S @@ -326,6 +326,12 @@ *(.branch_lt) } +#ifdef CONFIG_DEBUG_INFO_BTF + .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) { + *(.BTF) + } +#endif + .opd : AT(ADDR(.opd) - LOAD_OFFSET) { __start_opd = .; KEEP(*(.opd)) --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/book3s_hv.c +++ linux-riscv-5.4.0/arch/powerpc/kvm/book3s_hv.c @@ -2354,7 +2354,7 @@ mutex_unlock(&kvm->lock); if (!vcore) - goto free_vcpu; + goto uninit_vcpu; spin_lock(&vcore->lock); ++vcore->num_threads; @@ -2371,6 +2371,8 @@ return vcpu; +uninit_vcpu: + kvm_vcpu_uninit(vcpu); free_vcpu: kmem_cache_free(kvm_vcpu_cache, vcpu); out: --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ linux-riscv-5.4.0/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -11,6 +11,7 @@ */ #include +#include #include #include #include @@ -1116,7 +1117,7 @@ ld r7, VCPU_GPR(R7)(r4) bne ret_to_ultra - lwz r0, VCPU_CR(r4) + ld r0, VCPU_CR(r4) mtcr r0 ld r0, VCPU_GPR(R0)(r4) @@ -1136,7 +1137,7 @@ * R3 = UV_RETURN */ ret_to_ultra: - lwz r0, VCPU_CR(r4) + ld r0, VCPU_CR(r4) mtcr r0 ld r0, VCPU_GPR(R3)(r4) @@ -1487,6 +1488,13 @@ 1: #endif /* CONFIG_KVM_XICS */ + /* + * Possibly flush the link stack here, before we do a blr in + * guest_exit_short_path. + */ +1: nop + patch_site 1b patch__call_kvm_flush_link_stack + /* If we came in through the P9 short path, go back out to C now */ lwz r0, STACK_SLOT_SHORT_PATH(r1) cmpwi r0, 0 @@ -1963,6 +1971,28 @@ mtlr r0 blr +.balign 32 +.global kvm_flush_link_stack +kvm_flush_link_stack: + /* Save LR into r0 */ + mflr r0 + + /* Flush the link stack. On Power8 it's up to 32 entries in size. */ + .rept 32 + bl .+4 + .endr + + /* And on Power9 it's up to 64. */ +BEGIN_FTR_SECTION + .rept 32 + bl .+4 + .endr +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) + + /* Restore LR */ + mtlr r0 + blr + kvmppc_guest_external: /* External interrupt, first check for host_ipi. If this is * set, we know the host wants us out so let's do it now --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/book3s_pr.c +++ linux-riscv-5.4.0/arch/powerpc/kvm/book3s_pr.c @@ -1769,10 +1769,12 @@ err = kvmppc_mmu_init(vcpu); if (err < 0) - goto uninit_vcpu; + goto free_shared_page; return vcpu; +free_shared_page: + free_page((unsigned long)vcpu->arch.shared); uninit_vcpu: kvm_vcpu_uninit(vcpu); free_shadow_vcpu: --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/book3s_xive.c +++ linux-riscv-5.4.0/arch/powerpc/kvm/book3s_xive.c @@ -2005,6 +2005,10 @@ pr_devel("Creating xive for partition\n"); + /* Already there ? */ + if (kvm->arch.xive) + return -EEXIST; + xive = kvmppc_xive_get_device(kvm, type); if (!xive) return -ENOMEM; @@ -2014,12 +2018,6 @@ xive->kvm = kvm; mutex_init(&xive->lock); - /* Already there ? */ - if (kvm->arch.xive) - ret = -EEXIST; - else - kvm->arch.xive = xive; - /* We use the default queue size set by the host */ xive->q_order = xive_native_default_eq_shift(); if (xive->q_order < PAGE_SHIFT) @@ -2039,6 +2037,7 @@ if (ret) return ret; + kvm->arch.xive = xive; return 0; } --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/book3s_xive_native.c +++ linux-riscv-5.4.0/arch/powerpc/kvm/book3s_xive_native.c @@ -50,6 +50,24 @@ } } +static int kvmppc_xive_native_configure_queue(u32 vp_id, struct xive_q *q, + u8 prio, __be32 *qpage, + u32 order, bool can_escalate) +{ + int rc; + __be32 *qpage_prev = q->qpage; + + rc = xive_native_configure_queue(vp_id, q, prio, qpage, order, + can_escalate); + if (rc) + return rc; + + if (qpage_prev) + put_page(virt_to_page(qpage_prev)); + + return rc; +} + void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu; @@ -582,19 +600,14 @@ q->guest_qaddr = 0; q->guest_qshift = 0; - rc = xive_native_configure_queue(xc->vp_id, q, priority, - NULL, 0, true); + rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority, + NULL, 0, true); if (rc) { pr_err("Failed to reset queue %d for VCPU %d: %d\n", priority, xc->server_num, rc); return rc; } - if (q->qpage) { - put_page(virt_to_page(q->qpage)); - q->qpage = NULL; - } - return 0; } @@ -624,17 +637,18 @@ srcu_idx = srcu_read_lock(&kvm->srcu); gfn = gpa_to_gfn(kvm_eq.qaddr); - page = gfn_to_page(kvm, gfn); - if (is_error_page(page)) { + + page_size = kvm_host_page_size(vcpu, gfn); + if (1ull << kvm_eq.qshift > page_size) { srcu_read_unlock(&kvm->srcu, srcu_idx); - pr_err("Couldn't get queue page %llx!\n", kvm_eq.qaddr); + pr_warn("Incompatible host page size %lx!\n", page_size); return -EINVAL; } - page_size = kvm_host_page_size(kvm, gfn); - if (1ull << kvm_eq.qshift > page_size) { + page = gfn_to_page(kvm, gfn); + if (is_error_page(page)) { srcu_read_unlock(&kvm->srcu, srcu_idx); - pr_warn("Incompatible host page size %lx!\n", page_size); + pr_err("Couldn't get queue page %llx!\n", kvm_eq.qaddr); return -EINVAL; } @@ -653,8 +667,8 @@ * OPAL level because the use of END ESBs is not supported by * Linux. */ - rc = xive_native_configure_queue(xc->vp_id, q, priority, - (__be32 *) qaddr, kvm_eq.qshift, true); + rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority, + (__be32 *) qaddr, kvm_eq.qshift, true); if (rc) { pr_err("Failed to configure queue %d for VCPU %d: %d\n", priority, xc->server_num, rc); @@ -1081,7 +1095,6 @@ dev->private = xive; xive->dev = dev; xive->kvm = kvm; - kvm->arch.xive = xive; mutex_init(&xive->mapping_lock); mutex_init(&xive->lock); @@ -1102,6 +1115,7 @@ if (ret) return ret; + kvm->arch.xive = xive; return 0; } --- linux-riscv-5.4.0.orig/arch/powerpc/kvm/emulate_loadstore.c +++ linux-riscv-5.4.0/arch/powerpc/kvm/emulate_loadstore.c @@ -73,7 +73,6 @@ { struct kvm_run *run = vcpu->run; u32 inst; - int ra, rs, rt; enum emulation_result emulated = EMULATE_FAIL; int advance = 1; struct instruction_op op; @@ -85,10 +84,6 @@ if (emulated != EMULATE_DONE) return emulated; - ra = get_ra(inst); - rs = get_rs(inst); - rt = get_rt(inst); - vcpu->arch.mmio_vsx_copy_nums = 0; vcpu->arch.mmio_vsx_offset = 0; vcpu->arch.mmio_copy_type = KVMPPC_VSX_COPY_NONE; --- linux-riscv-5.4.0.orig/arch/powerpc/lib/sstep.c +++ linux-riscv-5.4.0/arch/powerpc/lib/sstep.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include --- linux-riscv-5.4.0.orig/arch/powerpc/lib/string_32.S +++ linux-riscv-5.4.0/arch/powerpc/lib/string_32.S @@ -17,7 +17,7 @@ LG_CACHELINE_BYTES = L1_CACHE_SHIFT CACHELINE_MASK = (L1_CACHE_BYTES-1) -_GLOBAL(__clear_user) +_GLOBAL(__arch_clear_user) /* * Use dcbz on the complete cache lines in the destination * to set them to zero. This requires that the destination @@ -87,4 +87,4 @@ EX_TABLE(8b, 91b) EX_TABLE(9b, 91b) -EXPORT_SYMBOL(__clear_user) +EXPORT_SYMBOL(__arch_clear_user) --- linux-riscv-5.4.0.orig/arch/powerpc/lib/string_64.S +++ linux-riscv-5.4.0/arch/powerpc/lib/string_64.S @@ -17,7 +17,7 @@ .section ".text" /** - * __clear_user: - Zero a block of memory in user space, with less checking. + * __arch_clear_user: - Zero a block of memory in user space, with less checking. * @to: Destination address, in user space. * @n: Number of bytes to zero. * @@ -58,7 +58,7 @@ mr r3,r4 blr -_GLOBAL_TOC(__clear_user) +_GLOBAL_TOC(__arch_clear_user) cmpdi r4,32 neg r6,r3 li r0,0 @@ -181,4 +181,4 @@ cmpdi r4,32 blt .Lshort_clear b .Lmedium_clear -EXPORT_SYMBOL(__clear_user) +EXPORT_SYMBOL(__arch_clear_user) --- linux-riscv-5.4.0.orig/arch/powerpc/mm/book3s64/hash_utils.c +++ linux-riscv-5.4.0/arch/powerpc/mm/book3s64/hash_utils.c @@ -294,10 +294,18 @@ ret = mmu_hash_ops.hpte_insert(hpteg, vpn, paddr, tprot, HPTE_V_BOLTED, psize, psize, ssize); - + if (ret == -1) { + /* Try to remove a non bolted entry */ + ret = mmu_hash_ops.hpte_remove(hpteg); + if (ret != -1) + ret = mmu_hash_ops.hpte_insert(hpteg, vpn, paddr, tprot, + HPTE_V_BOLTED, psize, psize, + ssize); + } if (ret < 0) break; + cond_resched(); #ifdef CONFIG_DEBUG_PAGEALLOC if (debug_pagealloc_enabled() && (paddr >> PAGE_SHIFT) < linear_map_hash_count) --- linux-riscv-5.4.0.orig/arch/powerpc/mm/book3s64/pgtable.c +++ linux-riscv-5.4.0/arch/powerpc/mm/book3s64/pgtable.c @@ -378,7 +378,6 @@ } } -#ifdef CONFIG_SMP void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index) { unsigned long pgf = (unsigned long)table; @@ -395,12 +394,6 @@ return pgtable_free(table, index); } -#else -void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index) -{ - return pgtable_free(table, index); -} -#endif #ifdef CONFIG_PROC_FS atomic_long_t direct_pages_count[MMU_PAGE_COUNT]; --- linux-riscv-5.4.0.orig/arch/powerpc/mm/fault.c +++ linux-riscv-5.4.0/arch/powerpc/mm/fault.c @@ -233,7 +233,7 @@ // Read/write fault in a valid region (the exception table search passed // above), but blocked by KUAP is bad, it can never succeed. - if (bad_kuap_fault(regs, is_write)) + if (bad_kuap_fault(regs, address, is_write)) return true; // What's left? Kernel fault on user in well defined regions (extable @@ -354,6 +354,9 @@ * Userspace trying to access kernel address, we get PROTFAULT for that. */ if (is_user && address >= TASK_SIZE) { + if ((long)address == -1) + return; + pr_crit_ratelimited("%s[%d]: User access of kernel address (%lx) - exploit attempt? (uid: %d)\n", current->comm, current->pid, address, from_kuid(&init_user_ns, current_uid())); --- linux-riscv-5.4.0.orig/arch/powerpc/mm/hugetlbpage.c +++ linux-riscv-5.4.0/arch/powerpc/mm/hugetlbpage.c @@ -53,20 +53,24 @@ if (pshift >= pdshift) { cachep = PGT_CACHE(PTE_T_ORDER); num_hugepd = 1 << (pshift - pdshift); + new = NULL; } else if (IS_ENABLED(CONFIG_PPC_8xx)) { - cachep = PGT_CACHE(PTE_INDEX_SIZE); + cachep = NULL; num_hugepd = 1; + new = pte_alloc_one(mm); } else { cachep = PGT_CACHE(pdshift - pshift); num_hugepd = 1; + new = NULL; } - if (!cachep) { + if (!cachep && !new) { WARN_ONCE(1, "No page table cache created for hugetlb tables"); return -ENOMEM; } - new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); + if (cachep) + new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); BUG_ON(pshift > HUGEPD_SHIFT_MASK); BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK); @@ -97,7 +101,10 @@ if (i < num_hugepd) { for (i = i - 1 ; i >= 0; i--, hpdp--) *hpdp = __hugepd(0); - kmem_cache_free(cachep, new); + if (cachep) + kmem_cache_free(cachep, new); + else + pte_free(mm, new); } else { kmemleak_ignore(new); } @@ -324,8 +331,7 @@ if (shift >= pdshift) hugepd_free(tlb, hugepte); else if (IS_ENABLED(CONFIG_PPC_8xx)) - pgtable_free_tlb(tlb, hugepte, - get_hugepd_cache_index(PTE_INDEX_SIZE)); + pgtable_free_tlb(tlb, hugepte, 0); else pgtable_free_tlb(tlb, hugepte, get_hugepd_cache_index(pdshift - shift)); @@ -639,12 +645,13 @@ * if we have pdshift and shift value same, we don't * use pgt cache for hugepd. */ - if (pdshift > shift && IS_ENABLED(CONFIG_PPC_8xx)) - pgtable_cache_add(PTE_INDEX_SIZE); - else if (pdshift > shift) - pgtable_cache_add(pdshift - shift); - else if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || IS_ENABLED(CONFIG_PPC_8xx)) + if (pdshift > shift) { + if (!IS_ENABLED(CONFIG_PPC_8xx)) + pgtable_cache_add(pdshift - shift); + } else if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || + IS_ENABLED(CONFIG_PPC_8xx)) { pgtable_cache_add(PTE_T_ORDER); + } configured = true; } --- linux-riscv-5.4.0.orig/arch/powerpc/mm/mem.c +++ linux-riscv-5.4.0/arch/powerpc/mm/mem.c @@ -104,6 +104,27 @@ return -ENODEV; } +#define FLUSH_CHUNK_SIZE SZ_1G +/** + * flush_dcache_range_chunked(): Write any modified data cache blocks out to + * memory and invalidate them, in chunks of up to FLUSH_CHUNK_SIZE + * Does not invalidate the corresponding instruction cache blocks. + * + * @start: the start address + * @stop: the stop address (exclusive) + * @chunk: the max size of the chunks + */ +static void flush_dcache_range_chunked(unsigned long start, unsigned long stop, + unsigned long chunk) +{ + unsigned long i; + + for (i = start; i < stop; i += chunk) { + flush_dcache_range(i, min(stop, i + chunk)); + cond_resched(); + } +} + int __ref arch_add_memory(int nid, u64 start, u64 size, struct mhp_restrictions *restrictions) { @@ -120,7 +141,8 @@ start, start + size, rc); return -EFAULT; } - flush_dcache_range(start, start + size); + + flush_dcache_range_chunked(start, start + size, FLUSH_CHUNK_SIZE); return __add_pages(nid, start_pfn, nr_pages, restrictions); } @@ -130,14 +152,14 @@ { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap); int ret; - __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); /* Remove htab bolted mappings for this section of memory */ start = (unsigned long)__va(start); - flush_dcache_range(start, start + size); + flush_dcache_range_chunked(start, start + size, FLUSH_CHUNK_SIZE); + ret = remove_section_mapping(start, start + size); WARN_ON_ONCE(ret); @@ -260,6 +282,14 @@ BUILD_BUG_ON(MMU_PAGE_COUNT > 16); #ifdef CONFIG_SWIOTLB + /* + * Some platforms (e.g. 85xx) limit DMA-able memory way below + * 4G. We force memblock to bottom-up mode to ensure that the + * memory allocated in swiotlb_init() is DMA-able. + * As it's the last memblock allocation, no need to reset it + * back to to-down. + */ + memblock_set_bottom_up(true); swiotlb_init(0); #endif @@ -318,6 +348,122 @@ free_initmem_default(POISON_FREE_INITMEM); } +/** + * flush_coherent_icache() - if a CPU has a coherent icache, flush it + * @addr: The base address to use (can be any valid address, the whole cache will be flushed) + * Return true if the cache was flushed, false otherwise + */ +static inline bool flush_coherent_icache(unsigned long addr) +{ + /* + * For a snooping icache, we still need a dummy icbi to purge all the + * prefetched instructions from the ifetch buffers. We also need a sync + * before the icbi to order the the actual stores to memory that might + * have modified instructions with the icbi. + */ + if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) { + mb(); /* sync */ + allow_read_from_user((const void __user *)addr, L1_CACHE_BYTES); + icbi((void *)addr); + prevent_read_from_user((const void __user *)addr, L1_CACHE_BYTES); + mb(); /* sync */ + isync(); + return true; + } + + return false; +} + +/** + * invalidate_icache_range() - Flush the icache by issuing icbi across an address range + * @start: the start address + * @stop: the stop address (exclusive) + */ +static void invalidate_icache_range(unsigned long start, unsigned long stop) +{ + unsigned long shift = l1_icache_shift(); + unsigned long bytes = l1_icache_bytes(); + char *addr = (char *)(start & ~(bytes - 1)); + unsigned long size = stop - (unsigned long)addr + (bytes - 1); + unsigned long i; + + for (i = 0; i < size >> shift; i++, addr += bytes) + icbi(addr); + + mb(); /* sync */ + isync(); +} + +/** + * flush_icache_range: Write any modified data cache blocks out to memory + * and invalidate the corresponding blocks in the instruction cache + * + * Generic code will call this after writing memory, before executing from it. + * + * @start: the start address + * @stop: the stop address (exclusive) + */ +void flush_icache_range(unsigned long start, unsigned long stop) +{ + if (flush_coherent_icache(start)) + return; + + clean_dcache_range(start, stop); + + if (IS_ENABLED(CONFIG_44x)) { + /* + * Flash invalidate on 44x because we are passed kmapped + * addresses and this doesn't work for userspace pages due to + * the virtually tagged icache. + */ + iccci((void *)start); + mb(); /* sync */ + isync(); + } else + invalidate_icache_range(start, stop); +} +EXPORT_SYMBOL(flush_icache_range); + +#if !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC64) +/** + * flush_dcache_icache_phys() - Flush a page by it's physical address + * @physaddr: the physical address of the page + */ +static void flush_dcache_icache_phys(unsigned long physaddr) +{ + unsigned long bytes = l1_dcache_bytes(); + unsigned long nb = PAGE_SIZE / bytes; + unsigned long addr = physaddr & PAGE_MASK; + unsigned long msr, msr0; + unsigned long loop1 = addr, loop2 = addr; + + msr0 = mfmsr(); + msr = msr0 & ~MSR_DR; + /* + * This must remain as ASM to prevent potential memory accesses + * while the data MMU is disabled + */ + asm volatile( + " mtctr %2;\n" + " mtmsr %3;\n" + " isync;\n" + "0: dcbst 0, %0;\n" + " addi %0, %0, %4;\n" + " bdnz 0b;\n" + " sync;\n" + " mtctr %2;\n" + "1: icbi 0, %1;\n" + " addi %1, %1, %4;\n" + " bdnz 1b;\n" + " sync;\n" + " mtmsr %5;\n" + " isync;\n" + : "+&r" (loop1), "+&r" (loop2) + : "r" (nb), "r" (msr), "i" (bytes), "r" (msr0) + : "ctr", "memory"); +} +#endif // !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC64) + /* * This is called when a page has been modified by the kernel. * It just marks the page as not i-cache clean. We do the i-cache @@ -350,12 +496,46 @@ __flush_dcache_icache(start); kunmap_atomic(start); } else { - __flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT); + unsigned long addr = page_to_pfn(page) << PAGE_SHIFT; + + if (flush_coherent_icache(addr)) + return; + flush_dcache_icache_phys(addr); } #endif } EXPORT_SYMBOL(flush_dcache_icache_page); +/** + * __flush_dcache_icache(): Flush a particular page from the data cache to RAM. + * Note: this is necessary because the instruction cache does *not* + * snoop from the data cache. + * + * @page: the address of the page to flush + */ +void __flush_dcache_icache(void *p) +{ + unsigned long addr = (unsigned long)p; + + if (flush_coherent_icache(addr)) + return; + + clean_dcache_range(addr, addr + PAGE_SIZE); + + /* + * We don't flush the icache on 44x. Those have a virtual icache and we + * don't have access to the virtual address here (it's not the page + * vaddr but where it's mapped in user space). The flushing of the + * icache on these is handled elsewhere, when a change in the address + * space occurs, before returning to user space. + */ + + if (cpu_has_feature(MMU_FTR_TYPE_44x)) + return; + + invalidate_icache_range(addr, addr + PAGE_SIZE); +} + void clear_user_page(void *page, unsigned long vaddr, struct page *pg) { clear_page(page); --- linux-riscv-5.4.0.orig/arch/powerpc/mm/pgtable_32.c +++ linux-riscv-5.4.0/arch/powerpc/mm/pgtable_32.c @@ -221,6 +221,7 @@ if (v_block_mapped((unsigned long)_sinittext)) { mmu_mark_rodata_ro(); + ptdump_check_wx(); return; } --- linux-riscv-5.4.0.orig/arch/powerpc/mm/ptdump/ptdump.c +++ linux-riscv-5.4.0/arch/powerpc/mm/ptdump/ptdump.c @@ -173,10 +173,12 @@ static void note_prot_wx(struct pg_state *st, unsigned long addr) { + pte_t pte = __pte(st->current_flags); + if (!IS_ENABLED(CONFIG_PPC_DEBUG_WX) || !st->check_wx) return; - if (!((st->current_flags & pgprot_val(PAGE_KERNEL_X)) == pgprot_val(PAGE_KERNEL_X))) + if (!pte_write(pte) || !pte_exec(pte)) return; WARN_ONCE(1, "powerpc/mm: Found insecure W+X mapping at address %p/%pS\n", --- linux-riscv-5.4.0.orig/arch/powerpc/mm/slice.c +++ linux-riscv-5.4.0/arch/powerpc/mm/slice.c @@ -50,7 +50,7 @@ #endif -static inline bool slice_addr_is_low(unsigned long addr) +static inline notrace bool slice_addr_is_low(unsigned long addr) { u64 tmp = (u64)addr; @@ -659,7 +659,7 @@ mm_ctx_user_psize(¤t->mm->context), 1); } -unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr) +unsigned int notrace get_slice_psize(struct mm_struct *mm, unsigned long addr) { unsigned char *psizes; int index, mask_index; --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/Makefile +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o vas-debug.o obj-$(CONFIG_OCXL_BASE) += ocxl.o obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o +obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/opal-call.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/opal-call.c @@ -290,3 +290,6 @@ OPAL_CALL(opal_mpipl_update, OPAL_MPIPL_UPDATE); OPAL_CALL(opal_mpipl_register_tag, OPAL_MPIPL_REGISTER_TAG); OPAL_CALL(opal_mpipl_query_tag, OPAL_MPIPL_QUERY_TAG); +OPAL_CALL(opal_secvar_get, OPAL_SECVAR_GET); +OPAL_CALL(opal_secvar_get_next, OPAL_SECVAR_GET_NEXT); +OPAL_CALL(opal_secvar_enqueue_update, OPAL_SECVAR_ENQUEUE_UPDATE); --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/opal-imc.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/opal-imc.c @@ -59,10 +59,6 @@ imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root); - /* - * Return here, either because 'imc' directory already exists, - * Or failed to create a new one. - */ if (!imc_debugfs_parent) return; @@ -135,7 +131,6 @@ } pmu_ptr->imc_counter_mmaped = true; - export_imc_mode_and_cmd(node, pmu_ptr); kfree(base_addr_arr); kfree(chipid_arr); return 0; @@ -151,7 +146,7 @@ * and domain as the inputs. * Allocates memory for the struct imc_pmu, sets up its domain, size and offsets */ -static int imc_pmu_create(struct device_node *parent, int pmu_index, int domain) +static struct imc_pmu *imc_pmu_create(struct device_node *parent, int pmu_index, int domain) { int ret = 0; struct imc_pmu *pmu_ptr; @@ -159,27 +154,23 @@ /* Return for unknown domain */ if (domain < 0) - return -EINVAL; + return NULL; /* memory for pmu */ pmu_ptr = kzalloc(sizeof(*pmu_ptr), GFP_KERNEL); if (!pmu_ptr) - return -ENOMEM; + return NULL; /* Set the domain */ pmu_ptr->domain = domain; ret = of_property_read_u32(parent, "size", &pmu_ptr->counter_mem_size); - if (ret) { - ret = -EINVAL; + if (ret) goto free_pmu; - } if (!of_property_read_u32(parent, "offset", &offset)) { - if (imc_get_mem_addr_nest(parent, pmu_ptr, offset)) { - ret = -EINVAL; + if (imc_get_mem_addr_nest(parent, pmu_ptr, offset)) goto free_pmu; - } } /* Function to register IMC pmu */ @@ -190,14 +181,14 @@ if (pmu_ptr->domain == IMC_DOMAIN_NEST) kfree(pmu_ptr->mem_info); kfree(pmu_ptr); - return ret; + return NULL; } - return 0; + return pmu_ptr; free_pmu: kfree(pmu_ptr); - return ret; + return NULL; } static void disable_nest_pmu_counters(void) @@ -254,6 +245,7 @@ static int opal_imc_counters_probe(struct platform_device *pdev) { struct device_node *imc_dev = pdev->dev.of_node; + struct imc_pmu *pmu; int pmu_count = 0, domain; bool core_imc_reg = false, thread_imc_reg = false; u32 type; @@ -269,6 +261,7 @@ } for_each_compatible_node(imc_dev, NULL, IMC_DTB_UNIT_COMPAT) { + pmu = NULL; if (of_property_read_u32(imc_dev, "type", &type)) { pr_warn("IMC Device without type property\n"); continue; @@ -285,7 +278,14 @@ domain = IMC_DOMAIN_THREAD; break; case IMC_TYPE_TRACE: - domain = IMC_DOMAIN_TRACE; + /* + * FIXME. Using trace_imc events to monitor application + * or KVM thread performance can cause a checkstop + * (system crash). + * Disable it for now. + */ + pr_info_once("IMC: disabling trace_imc PMU\n"); + domain = -1; break; default: pr_warn("IMC Unknown Device type \n"); @@ -293,9 +293,13 @@ break; } - if (!imc_pmu_create(imc_dev, pmu_count, domain)) { - if (domain == IMC_DOMAIN_NEST) + pmu = imc_pmu_create(imc_dev, pmu_count, domain); + if (pmu != NULL) { + if (domain == IMC_DOMAIN_NEST) { + if (!imc_debugfs_parent) + export_imc_mode_and_cmd(imc_dev, pmu); pmu_count++; + } if (domain == IMC_DOMAIN_CORE) core_imc_reg = true; if (domain == IMC_DOMAIN_THREAD) @@ -303,10 +307,6 @@ } } - /* If none of the nest units are registered, remove debugfs interface */ - if (pmu_count == 0) - debugfs_remove_recursive(imc_debugfs_parent); - /* If core imc is not registered, unregister thread-imc */ if (!core_imc_reg && thread_imc_reg) unregister_thread_imc(); --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/opal-secvar.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/opal-secvar.c @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * PowerNV code for secure variables + * + * Copyright (C) 2019 IBM Corporation + * Author: Claudio Carvalho + * Nayna Jain + * + * APIs to access secure variables managed by OPAL. + */ + +#define pr_fmt(fmt) "secvar: "fmt + +#include +#include +#include +#include +#include +#include + +static int opal_status_to_err(int rc) +{ + int err; + + switch (rc) { + case OPAL_SUCCESS: + err = 0; + break; + case OPAL_UNSUPPORTED: + err = -ENXIO; + break; + case OPAL_PARAMETER: + err = -EINVAL; + break; + case OPAL_RESOURCE: + err = -ENOSPC; + break; + case OPAL_HARDWARE: + err = -EIO; + break; + case OPAL_NO_MEM: + err = -ENOMEM; + break; + case OPAL_EMPTY: + err = -ENOENT; + break; + case OPAL_PARTIAL: + err = -EFBIG; + break; + default: + err = -EINVAL; + } + + return err; +} + +static int opal_get_variable(const char *key, uint64_t ksize, + u8 *data, uint64_t *dsize) +{ + int rc; + + if (!key || !dsize) + return -EINVAL; + + *dsize = cpu_to_be64(*dsize); + + rc = opal_secvar_get(key, ksize, data, dsize); + + *dsize = be64_to_cpu(*dsize); + + return opal_status_to_err(rc); +} + +static int opal_get_next_variable(const char *key, uint64_t *keylen, + uint64_t keybufsize) +{ + int rc; + + if (!key || !keylen) + return -EINVAL; + + *keylen = cpu_to_be64(*keylen); + + rc = opal_secvar_get_next(key, keylen, keybufsize); + + *keylen = be64_to_cpu(*keylen); + + return opal_status_to_err(rc); +} + +static int opal_set_variable(const char *key, uint64_t ksize, u8 *data, + uint64_t dsize) +{ + int rc; + + if (!key || !data) + return -EINVAL; + + rc = opal_secvar_enqueue_update(key, ksize, data, dsize); + + return opal_status_to_err(rc); +} + +static const struct secvar_operations opal_secvar_ops = { + .get = opal_get_variable, + .get_next = opal_get_next_variable, + .set = opal_set_variable, +}; + +static int opal_secvar_probe(struct platform_device *pdev) +{ + if (!opal_check_token(OPAL_SECVAR_GET) + || !opal_check_token(OPAL_SECVAR_GET_NEXT) + || !opal_check_token(OPAL_SECVAR_ENQUEUE_UPDATE)) { + pr_err("OPAL doesn't support secure variables\n"); + return -ENODEV; + } + + set_secvar_ops(&opal_secvar_ops); + + return 0; +} + +static const struct of_device_id opal_secvar_match[] = { + { .compatible = "ibm,secvar-backend",}, + {}, +}; + +static struct platform_driver opal_secvar_driver = { + .driver = { + .name = "secvar", + .of_match_table = opal_secvar_match, + }, +}; + +static int __init opal_secvar_init(void) +{ + return platform_driver_probe(&opal_secvar_driver, opal_secvar_probe); +} +device_initcall(opal_secvar_init); --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/opal.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/opal.c @@ -1002,6 +1002,9 @@ /* Initialise OPAL Power control interface */ opal_power_control_init(); + /* Initialize OPAL secure variables */ + opal_pdev_init("ibm,secvar-backend"); + return 0; } machine_subsys_initcall(powernv, opal_init); --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/pci-ioda.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/pci-ioda.c @@ -188,7 +188,7 @@ unsigned int pe_num = pe->pe_number; WARN_ON(pe->pdev); - WARN_ON(pe->npucomp); /* NPUs are not supposed to be freed */ + WARN_ON(pe->npucomp); /* NPUs for nvlink are not supposed to be freed */ kfree(pe->npucomp); memset(pe, 0, sizeof(struct pnv_ioda_pe)); clear_bit(pe_num, phb->ioda.pe_alloc); @@ -777,6 +777,34 @@ return 0; } +static void pnv_ioda_unset_peltv(struct pnv_phb *phb, + struct pnv_ioda_pe *pe, + struct pci_dev *parent) +{ + int64_t rc; + + while (parent) { + struct pci_dn *pdn = pci_get_pdn(parent); + + if (pdn && pdn->pe_number != IODA_INVALID_PE) { + rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number, + pe->pe_number, + OPAL_REMOVE_PE_FROM_DOMAIN); + /* XXX What to do in case of error ? */ + } + parent = parent->bus->self; + } + + opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number, + OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); + + /* Disassociate PE in PELT */ + rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number, + pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN); + if (rc) + pe_warn(pe, "OPAL error %lld remove self from PELTV\n", rc); +} + static int pnv_ioda_deconfigure_pe(struct pnv_phb *phb, struct pnv_ioda_pe *pe) { struct pci_dev *parent; @@ -827,25 +855,13 @@ for (rid = pe->rid; rid < rid_end; rid++) phb->ioda.pe_rmap[rid] = IODA_INVALID_PE; - /* Release from all parents PELT-V */ - while (parent) { - struct pci_dn *pdn = pci_get_pdn(parent); - if (pdn && pdn->pe_number != IODA_INVALID_PE) { - rc = opal_pci_set_peltv(phb->opal_id, pdn->pe_number, - pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN); - /* XXX What to do in case of error ? */ - } - parent = parent->bus->self; - } - - opal_pci_eeh_freeze_clear(phb->opal_id, pe->pe_number, - OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); + /* + * Release from all parents PELT-V. NPUs don't have a PELTV + * table + */ + if (phb->type != PNV_PHB_NPU_NVLINK && phb->type != PNV_PHB_NPU_OCAPI) + pnv_ioda_unset_peltv(phb, pe, parent); - /* Disassociate PE in PELT */ - rc = opal_pci_set_peltv(phb->opal_id, pe->pe_number, - pe->pe_number, OPAL_REMOVE_PE_FROM_DOMAIN); - if (rc) - pe_warn(pe, "OPAL error %lld remove self from PELTV\n", rc); rc = opal_pci_set_pe(phb->opal_id, pe->pe_number, pe->rid, bcomp, dcomp, fcomp, OPAL_UNMAP_PE); if (rc) @@ -1062,20 +1078,20 @@ return NULL; } - /* NOTE: We get only one ref to the pci_dev for the pdn, not for the - * pointer in the PE data structure, both should be destroyed at the - * same time. However, this needs to be looked at more closely again - * once we actually start removing things (Hotplug, SR-IOV, ...) + /* NOTE: We don't get a reference for the pointer in the PE + * data structure, both the device and PE structures should be + * destroyed at the same time. However, removing nvlink + * devices will need some work. * * At some point we want to remove the PDN completely anyways */ - pci_dev_get(dev); pdn->pe_number = pe->pe_number; pe->flags = PNV_IODA_PE_DEV; pe->pdev = dev; pe->pbus = NULL; pe->mve_number = -1; pe->rid = dev->bus->number << 8 | pdn->devfn; + pe->device_count++; pe_info(pe, "Associated device to PE\n"); @@ -1084,13 +1100,13 @@ pnv_ioda_free_pe(pe); pdn->pe_number = IODA_INVALID_PE; pe->pdev = NULL; - pci_dev_put(dev); return NULL; } /* Put PE to the list */ + mutex_lock(&phb->ioda.pe_list_mutex); list_add_tail(&pe->list, &phb->ioda.pe_list); - + mutex_unlock(&phb->ioda.pe_list_mutex); return pe; } @@ -1206,6 +1222,14 @@ struct pnv_phb *phb = hose->private_data; /* + * Intentionally leak a reference on the npu device (for + * nvlink only; this is not an opencapi path) to make sure it + * never goes away, as it's been the case all along and some + * work is needed otherwise. + */ + pci_dev_get(npu_pdev); + + /* * Due to a hardware errata PE#0 on the NPU is reserved for * error handling. This means we only have three PEs remaining * which need to be assigned to four links, implying some @@ -1228,11 +1252,11 @@ */ dev_info(&npu_pdev->dev, "Associating to existing PE %x\n", pe_num); - pci_dev_get(npu_pdev); npu_pdn = pci_get_pdn(npu_pdev); rid = npu_pdev->bus->number << 8 | npu_pdn->devfn; npu_pdn->pe_number = pe_num; phb->ioda.pe_rmap[rid] = pe->pe_number; + pe->device_count++; /* Map the PE to this link */ rc = opal_pci_set_pe(phb->opal_id, pe_num, rid, @@ -1268,8 +1292,6 @@ { struct pci_controller *hose; struct pnv_phb *phb; - struct pci_bus *bus; - struct pci_dev *pdev; struct pnv_ioda_pe *pe; list_for_each_entry(hose, &hose_list, list_node) { @@ -1281,11 +1303,6 @@ if (phb->model == PNV_PHB_MODEL_NPU2) WARN_ON_ONCE(pnv_npu2_init(hose)); } - if (phb->type == PNV_PHB_NPU_OCAPI) { - bus = hose->bus; - list_for_each_entry(pdev, &bus->devices, bus_list) - pnv_ioda_setup_dev_PE(pdev); - } } list_for_each_entry(hose, &hose_list, list_node) { phb = hose->private_data; @@ -1558,6 +1575,10 @@ /* Reserve PE for each VF */ for (vf_index = 0; vf_index < num_vfs; vf_index++) { + int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index); + int vf_bus = pci_iov_virtfn_bus(pdev, vf_index); + struct pci_dn *vf_pdn; + if (pdn->m64_single_mode) pe_num = pdn->pe_num_map[vf_index]; else @@ -1570,13 +1591,11 @@ pe->pbus = NULL; pe->parent_dev = pdev; pe->mve_number = -1; - pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) | - pci_iov_virtfn_devfn(pdev, vf_index); + pe->rid = (vf_bus << 8) | vf_devfn; pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n", hose->global_number, pdev->bus->number, - PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)), - PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num); + PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num); if (pnv_ioda_configure_pe(phb, pe)) { /* XXX What do we do here ? */ @@ -1590,6 +1609,15 @@ list_add_tail(&pe->list, &phb->ioda.pe_list); mutex_unlock(&phb->ioda.pe_list_mutex); + /* associate this pe to it's pdn */ + list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) { + if (vf_pdn->busno == vf_bus && + vf_pdn->devfn == vf_devfn) { + vf_pdn->pe_number = pe_num; + break; + } + } + pnv_pci_ioda2_setup_dma_pe(phb, pe); #ifdef CONFIG_IOMMU_API iommu_register_group(&pe->table_group, @@ -2889,9 +2917,6 @@ struct pci_dn *pdn; int mul, total_vfs; - if (!pdev->is_physfn || pci_dev_is_added(pdev)) - return; - pdn = pci_get_pdn(pdev); pdn->vfs_expanded = 0; pdn->m64_single_mode = false; @@ -2966,6 +2991,30 @@ res->end = res->start - 1; } } + +static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev) +{ + if (WARN_ON(pci_dev_is_added(pdev))) + return; + + if (pdev->is_virtfn) { + struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev); + + /* + * VF PEs are single-device PEs so their pdev pointer needs to + * be set. The pdev doesn't exist when the PE is allocated (in + * (pcibios_sriov_enable()) so we fix it up here. + */ + pe->pdev = pdev; + WARN_ON(!(pe->flags & PNV_IODA_PE_VF)); + } else if (pdev->is_physfn) { + /* + * For PFs adjust their allocated IOV resources to match what + * the PHB can support using it's M64 BAR table. + */ + pnv_pci_ioda_fixup_iov_resources(pdev); + } +} #endif /* CONFIG_PCI_IOV */ static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe, @@ -3383,6 +3432,28 @@ return true; } +static bool pnv_ocapi_enable_device_hook(struct pci_dev *dev) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + struct pnv_phb *phb = hose->private_data; + struct pci_dn *pdn; + struct pnv_ioda_pe *pe; + + if (!phb->initialized) + return true; + + pdn = pci_get_pdn(dev); + if (!pdn) + return false; + + if (pdn->pe_number == IODA_INVALID_PE) { + pe = pnv_ioda_setup_dev_PE(dev); + if (!pe) + return false; + } + return true; +} + static long pnv_pci_ioda1_unset_window(struct iommu_table_group *table_group, int num) { @@ -3512,7 +3583,10 @@ struct pnv_phb *phb = pe->phb; struct pnv_ioda_pe *slave, *tmp; + mutex_lock(&phb->ioda.pe_list_mutex); list_del(&pe->list); + mutex_unlock(&phb->ioda.pe_list_mutex); + switch (phb->type) { case PNV_PHB_IODA1: pnv_pci_ioda1_release_pe_dma(pe); @@ -3520,6 +3594,8 @@ case PNV_PHB_IODA2: pnv_pci_ioda2_release_pe_dma(pe); break; + case PNV_PHB_NPU_OCAPI: + break; default: WARN_ON(1); } @@ -3620,7 +3696,8 @@ }; static const struct pci_controller_ops pnv_npu_ocapi_ioda_controller_ops = { - .enable_device_hook = pnv_pci_enable_device_hook, + .enable_device_hook = pnv_ocapi_enable_device_hook, + .release_device = pnv_pci_release_device, .window_alignment = pnv_pci_window_alignment, .reset_secondary_bus = pnv_pci_reset_secondary_bus, .shutdown = pnv_pci_ioda_shutdown, @@ -3862,7 +3939,7 @@ ppc_md.pcibios_default_alignment = pnv_pci_default_alignment; #ifdef CONFIG_PCI_IOV - ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov_resources; + ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov; ppc_md.pcibios_iov_resource_alignment = pnv_pci_iov_resource_alignment; ppc_md.pcibios_sriov_enable = pnv_pcibios_sriov_enable; ppc_md.pcibios_sriov_disable = pnv_pcibios_sriov_disable; --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/powernv/pci.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/powernv/pci.c @@ -38,7 +38,7 @@ int pnv_pci_get_slot_id(struct device_node *np, uint64_t *id) { - struct device_node *parent = np; + struct device_node *node = np; u32 bdfn; u64 phbid; int ret; @@ -48,25 +48,29 @@ return -ENXIO; bdfn = ((bdfn & 0x00ffff00) >> 8); - while ((parent = of_get_parent(parent))) { - if (!PCI_DN(parent)) { - of_node_put(parent); + for (node = np; node; node = of_get_parent(node)) { + if (!PCI_DN(node)) { + of_node_put(node); break; } - if (!of_device_is_compatible(parent, "ibm,ioda2-phb") && - !of_device_is_compatible(parent, "ibm,ioda3-phb")) { - of_node_put(parent); + if (!of_device_is_compatible(node, "ibm,ioda2-phb") && + !of_device_is_compatible(node, "ibm,ioda3-phb") && + !of_device_is_compatible(node, "ibm,ioda2-npu2-opencapi-phb")) { + of_node_put(node); continue; } - ret = of_property_read_u64(parent, "ibm,opal-phbid", &phbid); + ret = of_property_read_u64(node, "ibm,opal-phbid", &phbid); if (ret) { - of_node_put(parent); + of_node_put(node); return -ENXIO; } - *id = PCI_SLOT_ID(phbid, bdfn); + if (of_device_is_compatible(node, "ibm,ioda2-npu2-opencapi-phb")) + *id = PCI_PHB_SLOT_ID(phbid); + else + *id = PCI_SLOT_ID(phbid, bdfn); return 0; } @@ -814,24 +818,6 @@ { struct pci_controller *hose = pci_bus_to_host(pdev->bus); struct pnv_phb *phb = hose->private_data; -#ifdef CONFIG_PCI_IOV - struct pnv_ioda_pe *pe; - struct pci_dn *pdn; - - /* Fix the VF pdn PE number */ - if (pdev->is_virtfn) { - pdn = pci_get_pdn(pdev); - WARN_ON(pdn->pe_number != IODA_INVALID_PE); - list_for_each_entry(pe, &phb->ioda.pe_list, list) { - if (pe->rid == ((pdev->bus->number << 8) | - (pdev->devfn & 0xff))) { - pdn->pe_number = pe->pe_number; - pe->pdev = pdev; - break; - } - } - } -#endif /* CONFIG_PCI_IOV */ if (phb && phb->dma_dev_setup) phb->dma_dev_setup(phb, pdev); @@ -945,6 +931,23 @@ if (!firmware_has_feature(FW_FEATURE_OPAL)) return; +#ifdef CONFIG_PCIEPORTBUS + /* + * On PowerNV PCIe devices are (currently) managed in cooperation + * with firmware. This isn't *strictly* required, but there's enough + * assumptions baked into both firmware and the platform code that + * it's unwise to allow the portbus services to be used. + * + * We need to fix this eventually, but for now set this flag to disable + * the portbus driver. The AER service isn't required since that AER + * events are handled via EEH. The pciehp hotplug driver can't work + * without kernel changes (and portbus binding breaks pnv_php). The + * other services also require some thinking about how we're going + * to integrate them. + */ + pcie_ports_disabled = true; +#endif + /* Look for IODA IO-Hubs. */ for_each_compatible_node(np, NULL, "ibm,ioda-hub") { pnv_pci_init_ioda_hub(np); --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/cmm.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/cmm.c @@ -411,6 +411,10 @@ .dev_name = "cmm", }; +static void cmm_release_device(struct device *dev) +{ +} + /** * cmm_sysfs_register - Register with sysfs * @@ -426,6 +430,7 @@ dev->id = 0; dev->bus = &cmm_subsys; + dev->release = cmm_release_device; if ((rc = device_register(dev))) goto subsys_unregister; --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/hotplug-memory.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -360,8 +360,10 @@ for (i = 0; i < scns_per_block; i++) { pfn = PFN_DOWN(phys_addr); - if (!pfn_present(pfn)) + if (!pfn_present(pfn)) { + phys_addr += MIN_MEMORY_BLOCK_SIZE; continue; + } rc &= is_mem_section_removable(pfn, PAGES_PER_SECTION); phys_addr += MIN_MEMORY_BLOCK_SIZE; --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/iommu.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/iommu.c @@ -36,7 +36,6 @@ #include #include #include -#include #include "pseries.h" @@ -133,10 +132,10 @@ return be64_to_cpu(*tcep); } -static void tce_free_pSeriesLP(struct iommu_table*, long, long); +static void tce_free_pSeriesLP(unsigned long liobn, long, long); static void tce_freemulti_pSeriesLP(struct iommu_table*, long, long); -static int tce_build_pSeriesLP(struct iommu_table *tbl, long tcenum, +static int tce_build_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, long npages, unsigned long uaddr, enum dma_data_direction direction, unsigned long attrs) @@ -147,25 +146,25 @@ int ret = 0; long tcenum_start = tcenum, npages_start = npages; - rpn = __pa(uaddr) >> TCE_SHIFT; + rpn = __pa(uaddr) >> tceshift; proto_tce = TCE_PCI_READ; if (direction != DMA_TO_DEVICE) proto_tce |= TCE_PCI_WRITE; while (npages--) { - tce = proto_tce | (rpn & TCE_RPN_MASK) << TCE_RPN_SHIFT; - rc = plpar_tce_put((u64)tbl->it_index, (u64)tcenum << 12, tce); + tce = proto_tce | (rpn & TCE_RPN_MASK) << tceshift; + rc = plpar_tce_put((u64)liobn, (u64)tcenum << tceshift, tce); if (unlikely(rc == H_NOT_ENOUGH_RESOURCES)) { ret = (int)rc; - tce_free_pSeriesLP(tbl, tcenum_start, + tce_free_pSeriesLP(liobn, tcenum_start, (npages_start - (npages + 1))); break; } if (rc && printk_ratelimit()) { printk("tce_build_pSeriesLP: plpar_tce_put failed. rc=%lld\n", rc); - printk("\tindex = 0x%llx\n", (u64)tbl->it_index); + printk("\tindex = 0x%llx\n", (u64)liobn); printk("\ttcenum = 0x%llx\n", (u64)tcenum); printk("\ttce val = 0x%llx\n", tce ); dump_stack(); @@ -194,7 +193,8 @@ unsigned long flags; if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE)) { - return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr, + return tce_build_pSeriesLP(tbl->it_index, tcenum, + tbl->it_page_shift, npages, uaddr, direction, attrs); } @@ -210,8 +210,9 @@ /* If allocation fails, fall back to the loop implementation */ if (!tcep) { local_irq_restore(flags); - return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr, - direction, attrs); + return tce_build_pSeriesLP(tbl->it_index, tcenum, + tbl->it_page_shift, + npages, uaddr, direction, attrs); } __this_cpu_write(tce_page, tcep); } @@ -262,16 +263,16 @@ return ret; } -static void tce_free_pSeriesLP(struct iommu_table *tbl, long tcenum, long npages) +static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long npages) { u64 rc; while (npages--) { - rc = plpar_tce_put((u64)tbl->it_index, (u64)tcenum << 12, 0); + rc = plpar_tce_put((u64)liobn, (u64)tcenum << 12, 0); if (rc && printk_ratelimit()) { printk("tce_free_pSeriesLP: plpar_tce_put failed. rc=%lld\n", rc); - printk("\tindex = 0x%llx\n", (u64)tbl->it_index); + printk("\tindex = 0x%llx\n", (u64)liobn); printk("\ttcenum = 0x%llx\n", (u64)tcenum); dump_stack(); } @@ -286,7 +287,7 @@ u64 rc; if (!firmware_has_feature(FW_FEATURE_MULTITCE)) - return tce_free_pSeriesLP(tbl, tcenum, npages); + return tce_free_pSeriesLP(tbl->it_index, tcenum, npages); rc = plpar_tce_stuff((u64)tbl->it_index, (u64)tcenum << 12, 0, npages); @@ -401,6 +402,19 @@ u64 rc = 0; long l, limit; + if (!firmware_has_feature(FW_FEATURE_MULTITCE)) { + unsigned long tceshift = be32_to_cpu(maprange->tce_shift); + unsigned long dmastart = (start_pfn << PAGE_SHIFT) + + be64_to_cpu(maprange->dma_base); + unsigned long tcenum = dmastart >> tceshift; + unsigned long npages = num_pfn << PAGE_SHIFT >> tceshift; + void *uaddr = __va(start_pfn << PAGE_SHIFT); + + return tce_build_pSeriesLP(be32_to_cpu(maprange->liobn), + tcenum, tceshift, npages, (unsigned long) uaddr, + DMA_BIDIRECTIONAL, 0); + } + local_irq_disable(); /* to protect tcep and the page behind it */ tcep = __this_cpu_read(tce_page); @@ -1320,15 +1334,7 @@ of_reconfig_notifier_register(&iommu_reconfig_nb); register_memory_notifier(&iommu_mem_nb); - /* - * Secure guest memory is inacessible to devices so regular DMA isn't - * possible. - * - * In that case keep devices' dma_map_ops as NULL so that the generic - * DMA code path will use SWIOTLB to bounce buffers for DMA. - */ - if (!is_secure_guest()) - set_pci_dma_ops(&dma_iommu_ops); + set_pci_dma_ops(&dma_iommu_ops); } static int __init disable_multitce(char *str) --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/lparcfg.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/lparcfg.c @@ -435,10 +435,10 @@ { unsigned long maxmem = 0; - maxmem += drmem_info->n_lmbs * drmem_info->lmb_size; + maxmem += (unsigned long)drmem_info->n_lmbs * drmem_info->lmb_size; maxmem += hugetlb_total_pages() * PAGE_SIZE; - seq_printf(m, "MaxMem=%ld\n", maxmem); + seq_printf(m, "MaxMem=%lu\n", maxmem); } static int pseries_lparcfg_data(struct seq_file *m, void *v) --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/papr_scm.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/papr_scm.c @@ -152,7 +152,7 @@ int len, read; int64_t ret; - if ((hdr->in_offset + hdr->in_length) >= p->metadata_size) + if ((hdr->in_offset + hdr->in_length) > p->metadata_size) return -EINVAL; for (len = hdr->in_length; len; len -= read) { @@ -206,7 +206,7 @@ __be64 data_be; int64_t ret; - if ((hdr->in_offset + hdr->in_length) >= p->metadata_size) + if ((hdr->in_offset + hdr->in_length) > p->metadata_size) return -EINVAL; for (len = hdr->in_length; len; len -= wrote) { @@ -342,6 +342,7 @@ p->bus = nvdimm_bus_register(NULL, &p->bus_desc); if (!p->bus) { dev_err(dev, "Error creating nvdimm bus %pOF\n", p->dn); + kfree(p->bus_desc.provider_name); return -ENXIO; } @@ -498,6 +499,7 @@ nvdimm_bus_unregister(p->bus); drc_pmem_unbind(p); + kfree(p->bus_desc.provider_name); kfree(p); return 0; --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/setup.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/setup.c @@ -74,6 +74,9 @@ #include "pseries.h" #include "../../../../drivers/pci/pci.h" +DEFINE_STATIC_KEY_FALSE(shared_processor); +EXPORT_SYMBOL_GPL(shared_processor); + int CMO_PrPSP = -1; int CMO_SecPSP = -1; unsigned long CMO_PageSize = (ASM_CONST(1) << IOMMU_PAGE_SHIFT_4K); @@ -758,6 +761,10 @@ if (firmware_has_feature(FW_FEATURE_LPAR)) { vpa_init(boot_cpuid); + + if (lppaca_shared_proc(get_lppaca())) + static_branch_enable(&shared_processor); + ppc_md.power_save = pseries_lpar_idle; ppc_md.enable_pmcs = pseries_lpar_enable_pmcs; #ifdef CONFIG_PCI_IOV --- linux-riscv-5.4.0.orig/arch/powerpc/platforms/pseries/vio.c +++ linux-riscv-5.4.0/arch/powerpc/platforms/pseries/vio.c @@ -36,7 +36,6 @@ .name = "vio", .type = "", .dev.init_name = "vio", - .dev.bus = &vio_bus_type, }; #ifdef CONFIG_PPC_SMLPAR @@ -1176,6 +1175,8 @@ if (tbl == NULL) return NULL; + kref_init(&tbl->it_kref); + of_parse_dma_window(dev->dev.of_node, dma_window, &tbl->it_index, &offset, &size); --- linux-riscv-5.4.0.orig/arch/powerpc/sysdev/xive/common.c +++ linux-riscv-5.4.0/arch/powerpc/sysdev/xive/common.c @@ -972,12 +972,21 @@ enum irqchip_irq_state which, bool *state) { struct xive_irq_data *xd = irq_data_get_irq_handler_data(data); + u8 pq; switch (which) { case IRQCHIP_STATE_ACTIVE: - *state = !xd->stale_p && - (xd->saved_p || - !!(xive_esb_read(xd, XIVE_ESB_GET) & XIVE_ESB_VAL_P)); + pq = xive_esb_read(xd, XIVE_ESB_GET); + + /* + * The esb value being all 1's means we couldn't get + * the PQ state of the interrupt through mmio. It may + * happen, for example when querying a PHB interrupt + * while the PHB is in an error state. We consider the + * interrupt to be inactive in that case. + */ + *state = (pq != XIVE_ESB_INVALID) && !xd->stale_p && + (xd->saved_p || !!(pq & XIVE_ESB_VAL_P)); return 0; default: return -EINVAL; @@ -1035,6 +1044,15 @@ xd->target = XIVE_INVALID_TARGET; irq_set_handler_data(virq, xd); + /* + * Turn OFF by default the interrupt being mapped. A side + * effect of this check is the mapping the ESB page of the + * interrupt in the Linux address space. This prevents page + * fault issues in the crash handler which masks all + * interrupts. + */ + xive_esb_read(xd, XIVE_ESB_SET_PQ_01); + return 0; } --- linux-riscv-5.4.0.orig/arch/powerpc/sysdev/xive/spapr.c +++ linux-riscv-5.4.0/arch/powerpc/sysdev/xive/spapr.c @@ -392,20 +392,28 @@ data->esb_shift = esb_shift; data->trig_page = trig_page; + data->hw_irq = hw_irq; + /* * No chip-id for the sPAPR backend. This has an impact how we * pick a target. See xive_pick_irq_target(). */ data->src_chip = XIVE_INVALID_CHIP_ID; + /* + * When the H_INT_ESB flag is set, the H_INT_ESB hcall should + * be used for interrupt management. Skip the remapping of the + * ESB pages which are not available. + */ + if (data->flags & XIVE_IRQ_FLAG_H_INT_ESB) + return 0; + data->eoi_mmio = ioremap(data->eoi_page, 1u << data->esb_shift); if (!data->eoi_mmio) { pr_err("Failed to map EOI page for irq 0x%x\n", hw_irq); return -ENOMEM; } - data->hw_irq = hw_irq; - /* Full function page supports trigger */ if (flags & XIVE_SRC_TRIGGER) { data->trig_mmio = data->eoi_mmio; --- linux-riscv-5.4.0.orig/arch/powerpc/tools/relocs_check.sh +++ linux-riscv-5.4.0/arch/powerpc/tools/relocs_check.sh @@ -10,24 +10,29 @@ # based on relocs_check.pl # Copyright © 2009 IBM Corporation -if [ $# -lt 2 ]; then - echo "$0 [path to objdump] [path to vmlinux]" 1>&2 +if [ $# -lt 3 ]; then + echo "$0 [path to objdump] [path to nm] [path to vmlinux]" 1>&2 exit 1 fi -# Have Kbuild supply the path to objdump so we handle cross compilation. +# Have Kbuild supply the path to objdump and nm so we handle cross compilation. objdump="$1" -vmlinux="$2" +nm="$2" +vmlinux="$3" + +# Remove from the bad relocations those that match an undefined weak symbol +# which will result in an absolute relocation to 0. +# Weak unresolved symbols are of that form in nm output: +# " w _binary__btf_vmlinux_bin_end" +undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }') bad_relocs=$( -"$objdump" -R "$vmlinux" | +$objdump -R "$vmlinux" | # Only look at relocation lines. grep -E '\ - # R_PPC64_ADDR64 __crc_ # On PPC: # R_PPC_RELATIVE, R_PPC_ADDR16_HI, # R_PPC_ADDR16_HA,R_PPC_ADDR16_LO, @@ -39,8 +44,7 @@ R_PPC_ADDR16_HA R_PPC_RELATIVE R_PPC_NONE' | - grep -E -v '\:' | awk '{print $1}' ) BRANCHES=$( -"$objdump" -R "$vmlinux" -D --start-address=0xc000000000000000 \ +$objdump -R "$vmlinux" -D --start-address=0xc000000000000000 \ --stop-address=${end_intr} | grep -e "^c[0-9a-f]*:[[:space:]]*\([0-9a-f][0-9a-f][[:space:]]\)\{4\}[[:space:]]*b" | grep -v '\<__start_initialization_multiplatform>' | --- linux-riscv-5.4.0.orig/arch/powerpc/xmon/Makefile +++ linux-riscv-5.4.0/arch/powerpc/xmon/Makefile @@ -1,8 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 # Makefile for xmon -# Disable clang warning for using setjmp without setjmp.h header -subdir-ccflags-y := $(call cc-disable-warning, builtin-requires-header) +# Avoid clang warnings around longjmp/setjmp declarations +subdir-ccflags-y := -ffreestanding GCOV_PROFILE := n KCOV_INSTRUMENT := n --- linux-riscv-5.4.0.orig/arch/powerpc/xmon/xmon.c +++ linux-riscv-5.4.0/arch/powerpc/xmon/xmon.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -187,6 +188,8 @@ static void dump_tlb_book3e(void); #endif +static void clear_all_bpt(void); + #ifdef CONFIG_PPC64 #define REG "%.16lx" #else @@ -283,10 +286,38 @@ " U show uptime information\n" " ? help\n" " # n limit output to n lines per page (for dp, dpa, dl)\n" -" zr reboot\n\ - zh halt\n" +" zr reboot\n" +" zh halt\n" ; +#ifdef CONFIG_SECURITY +static bool xmon_is_locked_down(void) +{ + static bool lockdown; + + if (!lockdown) { + lockdown = !!security_locked_down(LOCKDOWN_XMON_RW); + if (lockdown) { + printf("xmon: Disabled due to kernel lockdown\n"); + xmon_is_ro = true; + } + } + + if (!xmon_is_ro) { + xmon_is_ro = !!security_locked_down(LOCKDOWN_XMON_WR); + if (xmon_is_ro) + printf("xmon: Read-only due to kernel lockdown\n"); + } + + return lockdown; +} +#else /* CONFIG_SECURITY */ +static inline bool xmon_is_locked_down(void) +{ + return false; +} +#endif + static struct pt_regs *xmon_regs; static inline void sync(void) @@ -438,7 +469,10 @@ return false; } -#endif /* CONFIG_SMP */ +#else /* CONFIG_SMP */ +static inline void get_output_lock(void) {} +static inline void release_output_lock(void) {} +#endif static inline int unrecoverable_excp(struct pt_regs *regs) { @@ -455,6 +489,7 @@ int cmd = 0; struct bpt *bp; long recurse_jmp[JMP_BUF_LEN]; + bool locked_down; unsigned long offset; unsigned long flags; #ifdef CONFIG_SMP @@ -465,6 +500,8 @@ local_irq_save(flags); hard_irq_disable(); + locked_down = xmon_is_locked_down(); + if (!fromipi) { tracing_enabled = tracing_is_on(); tracing_off(); @@ -518,7 +555,8 @@ if (!fromipi) { get_output_lock(); - excprint(regs); + if (!locked_down) + excprint(regs); if (bp) { printf("cpu 0x%x stopped at breakpoint 0x%tx (", cpu, BP_NUM(bp)); @@ -570,10 +608,14 @@ } remove_bpts(); disable_surveillance(); - /* for breakpoint or single step, print the current instr. */ - if (bp || TRAP(regs) == 0xd00) - ppc_inst_dump(regs->nip, 1, 0); - printf("enter ? for help\n"); + + if (!locked_down) { + /* for breakpoint or single step, print curr insn */ + if (bp || TRAP(regs) == 0xd00) + ppc_inst_dump(regs->nip, 1, 0); + printf("enter ? for help\n"); + } + mb(); xmon_gate = 1; barrier(); @@ -597,8 +639,9 @@ spin_cpu_relax(); touch_nmi_watchdog(); } else { - cmd = cmds(regs); - if (cmd != 0) { + if (!locked_down) + cmd = cmds(regs); + if (locked_down || cmd != 0) { /* exiting xmon */ insert_bpts(); xmon_gate = 0; @@ -635,13 +678,16 @@ "can't continue\n"); remove_bpts(); disable_surveillance(); - /* for breakpoint or single step, print the current instr. */ - if (bp || TRAP(regs) == 0xd00) - ppc_inst_dump(regs->nip, 1, 0); - printf("enter ? for help\n"); + if (!locked_down) { + /* for breakpoint or single step, print current insn */ + if (bp || TRAP(regs) == 0xd00) + ppc_inst_dump(regs->nip, 1, 0); + printf("enter ? for help\n"); + } } - cmd = cmds(regs); + if (!locked_down) + cmd = cmds(regs); insert_bpts(); in_xmon = 0; @@ -670,7 +716,10 @@ } } #endif - insert_cpu_bpts(); + if (locked_down) + clear_all_bpt(); + else + insert_cpu_bpts(); touch_nmi_watchdog(); local_irq_restore(flags); @@ -1894,15 +1943,14 @@ printf("pidr = %.16lx tidr = %.16lx\n", mfspr(SPRN_PID), mfspr(SPRN_TIDR)); - printf("asdr = %.16lx psscr = %.16lx\n", - mfspr(SPRN_ASDR), hv ? mfspr(SPRN_PSSCR) - : mfspr(SPRN_PSSCR_PR)); + printf("psscr = %.16lx\n", + hv ? mfspr(SPRN_PSSCR) : mfspr(SPRN_PSSCR_PR)); if (!hv) return; - printf("ptcr = %.16lx\n", - mfspr(SPRN_PTCR)); + printf("ptcr = %.16lx asdr = %.16lx\n", + mfspr(SPRN_PTCR), mfspr(SPRN_ASDR)); #endif } @@ -3762,6 +3810,11 @@ #ifdef CONFIG_MAGIC_SYSRQ static void sysrq_handle_xmon(int key) { + if (xmon_is_locked_down()) { + clear_all_bpt(); + xmon_init(0); + return; + } /* ensure xmon is enabled */ xmon_init(1); debugger(get_irq_regs()); @@ -3783,7 +3836,6 @@ device_initcall(setup_xmon_sysrq); #endif /* CONFIG_MAGIC_SYSRQ */ -#ifdef CONFIG_DEBUG_FS static void clear_all_bpt(void) { int i; @@ -3801,18 +3853,22 @@ iabr = NULL; dabr.enabled = 0; } - - printf("xmon: All breakpoints cleared\n"); } +#ifdef CONFIG_DEBUG_FS static int xmon_dbgfs_set(void *data, u64 val) { xmon_on = !!val; xmon_init(xmon_on); /* make sure all breakpoints removed when disabling */ - if (!xmon_on) + if (!xmon_on) { clear_all_bpt(); + get_output_lock(); + printf("xmon: All breakpoints cleared\n"); + release_output_lock(); + } + return 0; } @@ -3838,7 +3894,11 @@ static int __init early_parse_xmon(char *p) { - if (!p || strncmp(p, "early", 5) == 0) { + if (xmon_is_locked_down()) { + xmon_init(0); + xmon_early = 0; + xmon_on = 0; + } else if (!p || strncmp(p, "early", 5) == 0) { /* just "xmon" is equivalent to "xmon=early" */ xmon_init(1); xmon_early = 1; --- linux-riscv-5.4.0.orig/arch/riscv/Kconfig +++ linux-riscv-5.4.0/arch/riscv/Kconfig @@ -31,6 +31,7 @@ select GENERIC_SMP_IDLE_THREAD select GENERIC_ATOMIC64 if !64BIT select HAVE_ARCH_AUDITSYSCALL + select HAVE_ARCH_SECCOMP_FILTER select HAVE_ASM_MODVERSIONS select HAVE_MEMBLOCK_NODE_MAP select HAVE_DMA_CONTIGUOUS @@ -61,6 +62,7 @@ select SPARSEMEM_STATIC if 32BIT select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU select HAVE_ARCH_MMAP_RND_BITS + select HAVE_COPY_THREAD_TLS config ARCH_MMAP_RND_BITS_MIN default 18 if 64BIT @@ -100,6 +102,7 @@ config ARCH_SPARSEMEM_ENABLE def_bool y + depends on MMU select SPARSEMEM_VMEMMAP_ENABLE config ARCH_SELECT_MEMORY_MODEL @@ -272,6 +275,19 @@ source "kernel/Kconfig.hz" +config SECCOMP + bool "Enable seccomp to safely compute untrusted bytecode" + help + This kernel feature is useful for number crunching applications + that may need to compute untrusted bytecode during their + execution. By using pipes or other transports made available to + the process as file descriptors supporting the read/write + syscalls, it's possible to isolate those applications in + their own address space using seccomp. Once seccomp is + enabled via prctl(PR_SET_SECCOMP), it cannot be disabled + and the task is only allowed to execute a few safe syscalls + defined by each seccomp mode. + endmenu menu "Boot options" --- linux-riscv-5.4.0.orig/arch/riscv/include/asm/seccomp.h +++ linux-riscv-5.4.0/arch/riscv/include/asm/seccomp.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _ASM_SECCOMP_H +#define _ASM_SECCOMP_H + +#include + +#include + +#endif /* _ASM_SECCOMP_H */ --- linux-riscv-5.4.0.orig/arch/riscv/include/asm/thread_info.h +++ linux-riscv-5.4.0/arch/riscv/include/asm/thread_info.h @@ -75,6 +75,7 @@ #define TIF_MEMDIE 5 /* is terminating due to OOM killer */ #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ #define TIF_SYSCALL_AUDIT 7 /* syscall auditing */ +#define TIF_SECCOMP 8 /* syscall secure computing */ #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) @@ -82,11 +83,13 @@ #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) +#define _TIF_SECCOMP (1 << TIF_SECCOMP) #define _TIF_WORK_MASK \ (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED) #define _TIF_SYSCALL_WORK \ - (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT) + (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT | \ + _TIF_SECCOMP) #endif /* _ASM_RISCV_THREAD_INFO_H */ --- linux-riscv-5.4.0.orig/arch/riscv/kernel/entry.S +++ linux-riscv-5.4.0/arch/riscv/kernel/entry.S @@ -226,8 +226,26 @@ /* Check to make sure we don't jump to a bogus syscall number. */ li t0, __NR_syscalls la s0, sys_ni_syscall - /* Syscall number held in a7 */ - bgeu a7, t0, 1f + /* + * The tracer can change syscall number to valid/invalid value. + * We use syscall_set_nr helper in syscall_trace_enter thus we + * cannot trust the current value in a7 and have to reload from + * the current task pt_regs. + */ + REG_L a7, PT_A7(sp) + /* + * Syscall number held in a7. + * If syscall number is above allowed value, redirect to ni_syscall. + */ + bge a7, t0, 1f + /* + * Check if syscall is rejected by tracer or seccomp, i.e., a7 == -1. + * If yes, we pretend it was executed. + */ + li t1, -1 + beq a7, t1, ret_from_syscall_rejected + blt a7, t1, 1f + /* Call syscall */ la s0, sys_call_table slli t0, a7, RISCV_LGPTR add s0, s0, t0 @@ -238,6 +256,12 @@ ret_from_syscall: /* Set user a0 to kernel a0 */ REG_S a0, PT_A0(sp) + /* + * We didn't execute the actual syscall. + * Seccomp already set return value for the current task pt_regs. + * (If it was configured with SECCOMP_RET_ERRNO/TRACE) + */ +ret_from_syscall_rejected: /* Trace syscalls, but only if requested by the user. */ REG_L t0, TASK_TI_FLAGS(tp) andi t0, t0, _TIF_SYSCALL_WORK --- linux-riscv-5.4.0.orig/arch/riscv/kernel/ftrace.c +++ linux-riscv-5.4.0/arch/riscv/kernel/ftrace.c @@ -142,7 +142,7 @@ */ old = *parent; - if (function_graph_enter(old, self_addr, frame_pointer, parent)) + if (!function_graph_enter(old, self_addr, frame_pointer, parent)) *parent = return_hooker; } --- linux-riscv-5.4.0.orig/arch/riscv/kernel/module.c +++ linux-riscv-5.4.0/arch/riscv/kernel/module.c @@ -8,6 +8,10 @@ #include #include #include +#include +#include +#include +#include static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v) { @@ -386,3 +390,15 @@ return 0; } + +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT) +#define VMALLOC_MODULE_START \ + max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START) +void *module_alloc(unsigned long size) +{ + return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START, + VMALLOC_END, GFP_KERNEL, + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, + __builtin_return_address(0)); +} +#endif --- linux-riscv-5.4.0.orig/arch/riscv/kernel/process.c +++ linux-riscv-5.4.0/arch/riscv/kernel/process.c @@ -99,8 +99,8 @@ return 0; } -int copy_thread(unsigned long clone_flags, unsigned long usp, - unsigned long arg, struct task_struct *p) +int copy_thread_tls(unsigned long clone_flags, unsigned long usp, + unsigned long arg, struct task_struct *p, unsigned long tls) { struct pt_regs *childregs = task_pt_regs(p); @@ -120,7 +120,7 @@ if (usp) /* User fork */ childregs->sp = usp; if (clone_flags & CLONE_SETTLS) - childregs->tp = childregs->a5; + childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ p->thread.ra = (unsigned long)ret_from_fork; } --- linux-riscv-5.4.0.orig/arch/riscv/kernel/ptrace.c +++ linux-riscv-5.4.0/arch/riscv/kernel/ptrace.c @@ -154,6 +154,16 @@ if (tracehook_report_syscall_entry(regs)) syscall_set_nr(current, regs, -1); + /* + * Do the secure computing after ptrace; failures should be fast. + * If this fails we might have return value in a0 from seccomp + * (via SECCOMP_RET_ERRNO/TRACE). + */ + if (secure_computing(NULL) == -1) { + syscall_set_nr(current, regs, -1); + return; + } + #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) trace_sys_enter(regs, syscall_get_nr(current, regs)); --- linux-riscv-5.4.0.orig/arch/riscv/kernel/vdso/Makefile +++ linux-riscv-5.4.0/arch/riscv/kernel/vdso/Makefile @@ -58,7 +58,8 @@ cmd_vdsold = $(CC) $(KBUILD_CFLAGS) $(call cc-option, -no-pie) -nostdlib -nostartfiles $(SYSCFLAGS_$(@F)) \ -Wl,-T,$(filter-out FORCE,$^) -o $@.tmp && \ $(CROSS_COMPILE)objcopy \ - $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ + $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ + rm $@.tmp # install commands for the unstripped file quiet_cmd_vdso_install = INSTALL $@ --- linux-riscv-5.4.0.orig/arch/riscv/mm/cacheflush.c +++ linux-riscv-5.4.0/arch/riscv/mm/cacheflush.c @@ -14,6 +14,7 @@ { sbi_remote_fence_i(NULL); } +EXPORT_SYMBOL(flush_icache_all); /* * Performs an icache flush for the given MM context. RISC-V has no direct --- linux-riscv-5.4.0.orig/arch/riscv/mm/init.c +++ linux-riscv-5.4.0/arch/riscv/mm/init.c @@ -98,7 +98,7 @@ for_each_memblock(memory, reg) { phys_addr_t end = reg->base + reg->size; - if (reg->base <= vmlinux_end && vmlinux_end <= end) { + if (reg->base <= vmlinux_start && vmlinux_end <= end) { mem_size = min(reg->size, (phys_addr_t)-PAGE_OFFSET); /* --- linux-riscv-5.4.0.orig/arch/riscv/mm/tlbflush.c +++ linux-riscv-5.4.0/arch/riscv/mm/tlbflush.c @@ -2,6 +2,7 @@ #include #include +#include #include void flush_tlb_all(void) @@ -9,13 +10,33 @@ sbi_remote_sfence_vma(NULL, 0, -1); } +/* + * This function must not be called with cmask being null. + * Kernel may panic if cmask is NULL. + */ static void __sbi_tlb_flush_range(struct cpumask *cmask, unsigned long start, unsigned long size) { struct cpumask hmask; + unsigned int cpuid; - riscv_cpuid_to_hartid_mask(cmask, &hmask); - sbi_remote_sfence_vma(hmask.bits, start, size); + if (cpumask_empty(cmask)) + return; + + cpuid = get_cpu(); + + if (cpumask_any_but(cmask, cpuid) >= nr_cpu_ids) { + /* local cpu is the only cpu present in cpumask */ + if (size <= PAGE_SIZE) + local_flush_tlb_page(start); + else + local_flush_tlb_all(); + } else { + riscv_cpuid_to_hartid_mask(cmask, &hmask); + sbi_remote_sfence_vma(cpumask_bits(&hmask), start, size); + } + + put_cpu(); } void flush_tlb_mm(struct mm_struct *mm) --- linux-riscv-5.4.0.orig/arch/riscv/net/bpf_jit_comp.c +++ linux-riscv-5.4.0/arch/riscv/net/bpf_jit_comp.c @@ -120,6 +120,11 @@ return false; } +static void mark_fp(struct rv_jit_context *ctx) +{ + __set_bit(RV_CTX_F_SEEN_S5, &ctx->flags); +} + static void mark_call(struct rv_jit_context *ctx) { __set_bit(RV_CTX_F_SEEN_CALL, &ctx->flags); @@ -596,7 +601,8 @@ emit(rv_addi(RV_REG_SP, RV_REG_SP, stack_adjust), ctx); /* Set return value. */ - emit(rv_addi(RV_REG_A0, RV_REG_A5, 0), ctx); + if (reg == RV_REG_RA) + emit(rv_addi(RV_REG_A0, RV_REG_A5, 0), ctx); emit(rv_jalr(RV_REG_ZERO, reg, 0), ctx); } @@ -631,14 +637,14 @@ return -1; emit(rv_bgeu(RV_REG_A2, RV_REG_T1, off >> 1), ctx); - /* if (--TCC < 0) + /* if (TCC-- < 0) * goto out; */ emit(rv_addi(RV_REG_T1, tcc, -1), ctx); off = (tc_ninsn - (ctx->ninsns - start_insn)) << 2; if (is_13b_check(off, insn)) return -1; - emit(rv_blt(RV_REG_T1, RV_REG_ZERO, off >> 1), ctx); + emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx); /* prog = array->ptrs[index]; * if (!prog) @@ -1426,6 +1432,10 @@ { int stack_adjust = 0, store_offset, bpf_stack_adjust; + bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16); + if (bpf_stack_adjust) + mark_fp(ctx); + if (seen_reg(RV_REG_RA, ctx)) stack_adjust += 8; stack_adjust += 8; /* RV_REG_FP */ @@ -1443,7 +1453,6 @@ stack_adjust += 8; stack_adjust = round_up(stack_adjust, 16); - bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16); stack_adjust += bpf_stack_adjust; store_offset = stack_adjust - 8; --- linux-riscv-5.4.0.orig/arch/s390/Kconfig +++ linux-riscv-5.4.0/arch/s390/Kconfig @@ -155,6 +155,7 @@ select HAVE_KERNEL_UNCOMPRESSED select HAVE_KERNEL_XZ select HAVE_KPROBES + select HAVE_KPROBES_ON_FTRACE select HAVE_KRETPROBES select HAVE_KVM select HAVE_LIVEPATCH @@ -1006,3 +1007,11 @@ the KVM hypervisor. endmenu + +config KMSG_IDS + def_bool y + prompt "Kernel message numbers" + help + Select this option if you want to include a message number to the + prefix for kernel messages issued by the s390 architecture and + driver code. See "Documentation/s390/kmsg.txt" for more details. --- linux-riscv-5.4.0.orig/arch/s390/Makefile +++ linux-riscv-5.4.0/arch/s390/Makefile @@ -69,7 +69,7 @@ # cflags-$(CONFIG_FRAME_POINTER) += -fno-optimize-sibling-calls -ifeq ($(call cc-option-yn,-mpacked-stack),y) +ifeq ($(call cc-option-yn,-mpacked-stack -mbackchain -msoft-float),y) cflags-$(CONFIG_PACK_STACK) += -mpacked-stack -D__PACK_STACK aflags-$(CONFIG_PACK_STACK) += -D__PACK_STACK endif @@ -146,7 +146,7 @@ #KBUILD_IMAGE is necessary for packaging targets like rpm-pkg, deb-pkg... KBUILD_IMAGE := $(boot)/bzImage -install: vmlinux +install: $(Q)$(MAKE) $(build)=$(boot) $@ bzImage: vmlinux --- linux-riscv-5.4.0.orig/arch/s390/boot/Makefile +++ linux-riscv-5.4.0/arch/s390/boot/Makefile @@ -37,7 +37,7 @@ obj-y := head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o obj-y += string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o obj-y += version.o pgm_check_info.o ctype.o text_dma.o -obj-$(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) += uv.o +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) += uv.o obj-$(CONFIG_RELOCATABLE) += machine_kexec_reloc.o obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o targets := bzImage startup.a section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y) @@ -70,7 +70,7 @@ $(obj)/startup.a: $(OBJECTS) FORCE $(call if_changed,ar) -install: $(CONFIGURE) $(obj)/bzImage +install: sh -x $(srctree)/$(obj)/install.sh $(KERNELRELEASE) $(obj)/bzImage \ System.map "$(INSTALL_PATH)" --- linux-riscv-5.4.0.orig/arch/s390/boot/compressed/decompressor.c +++ linux-riscv-5.4.0/arch/s390/boot/compressed/decompressor.c @@ -30,13 +30,13 @@ extern unsigned char _compressed_end[]; #ifdef CONFIG_HAVE_KERNEL_BZIP2 -#define HEAP_SIZE 0x400000 +#define BOOT_HEAP_SIZE 0x400000 #else -#define HEAP_SIZE 0x10000 +#define BOOT_HEAP_SIZE 0x10000 #endif static unsigned long free_mem_ptr = (unsigned long) _end; -static unsigned long free_mem_end_ptr = (unsigned long) _end + HEAP_SIZE; +static unsigned long free_mem_end_ptr = (unsigned long) _end + BOOT_HEAP_SIZE; #ifdef CONFIG_KERNEL_GZIP #include "../../../../lib/decompress_inflate.c" @@ -62,7 +62,7 @@ #include "../../../../lib/decompress_unxz.c" #endif -#define decompress_offset ALIGN((unsigned long)_end + HEAP_SIZE, PAGE_SIZE) +#define decompress_offset ALIGN((unsigned long)_end + BOOT_HEAP_SIZE, PAGE_SIZE) unsigned long mem_safe_offset(void) { --- linux-riscv-5.4.0.orig/arch/s390/boot/ipl_parm.c +++ linux-riscv-5.4.0/arch/s390/boot/ipl_parm.c @@ -14,6 +14,7 @@ char __bootdata(early_command_line)[COMMAND_LINE_SIZE]; struct ipl_parameter_block __bootdata_preserved(ipl_block); int __bootdata_preserved(ipl_block_valid); +unsigned int __bootdata_preserved(zlib_dfltcc_support) = ZLIB_DFLTCC_FULL; unsigned long __bootdata(vmalloc_size) = VMALLOC_DEFAULT_SIZE; unsigned long __bootdata(memory_end); @@ -229,6 +230,19 @@ if (!strcmp(param, "vmalloc") && val) vmalloc_size = round_up(memparse(val, NULL), PAGE_SIZE); + if (!strcmp(param, "dfltcc")) { + if (!strcmp(val, "off")) + zlib_dfltcc_support = ZLIB_DFLTCC_DISABLED; + else if (!strcmp(val, "on")) + zlib_dfltcc_support = ZLIB_DFLTCC_FULL; + else if (!strcmp(val, "def_only")) + zlib_dfltcc_support = ZLIB_DFLTCC_DEFLATE_ONLY; + else if (!strcmp(val, "inf_only")) + zlib_dfltcc_support = ZLIB_DFLTCC_INFLATE_ONLY; + else if (!strcmp(val, "always")) + zlib_dfltcc_support = ZLIB_DFLTCC_FULL_DEBUG; + } + if (!strcmp(param, "noexec")) { rc = kstrtobool(val, &enabled); if (!rc && !enabled) --- linux-riscv-5.4.0.orig/arch/s390/boot/kaslr.c +++ linux-riscv-5.4.0/arch/s390/boot/kaslr.c @@ -75,7 +75,7 @@ *(unsigned long *) prng.parm_block ^= seed; for (i = 0; i < 16; i++) { cpacf_kmc(CPACF_KMC_PRNG, prng.parm_block, - (char *) entropy, (char *) entropy, + (u8 *) entropy, (u8 *) entropy, sizeof(entropy)); memcpy(prng.parm_block, entropy, sizeof(entropy)); } --- linux-riscv-5.4.0.orig/arch/s390/boot/startup.c +++ linux-riscv-5.4.0/arch/s390/boot/startup.c @@ -170,6 +170,11 @@ handle_relocs(__kaslr_offset); if (__kaslr_offset) { + /* + * Save KASLR offset for early dumps, before vmcore_info is set. + * Mark as uneven to distinguish from real vmcore_info pointer. + */ + S390_lowcore.vmcore_info = __kaslr_offset | 0x1UL; /* Clear non-relocated kernel */ if (IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED)) memset(img, 0, vmlinux.image_size); --- linux-riscv-5.4.0.orig/arch/s390/boot/uv.c +++ linux-riscv-5.4.0/arch/s390/boot/uv.c @@ -3,7 +3,13 @@ #include #include +/* will be used in arch/s390/kernel/uv.c */ +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST int __bootdata_preserved(prot_virt_guest); +#endif +#if IS_ENABLED(CONFIG_KVM) +struct uv_info __bootdata_preserved(uv_info); +#endif void uv_query_info(void) { @@ -15,10 +21,25 @@ if (!test_facility(158)) return; - if (uv_call(0, (uint64_t)&uvcb)) + /* rc==0x100 means that there is additional data we do not process */ + if (uv_call(0, (uint64_t)&uvcb) && uvcb.header.rc != 0x100) return; + if (IS_ENABLED(CONFIG_KVM)) { + memcpy(uv_info.inst_calls_list, uvcb.inst_calls_list, sizeof(uv_info.inst_calls_list)); + uv_info.uv_base_stor_len = uvcb.uv_base_stor_len; + uv_info.guest_base_stor_len = uvcb.conf_base_phys_stor_len; + uv_info.guest_virt_base_stor_len = uvcb.conf_base_virt_stor_len; + uv_info.guest_virt_var_stor_len = uvcb.conf_virt_var_stor_len; + uv_info.guest_cpu_stor_len = uvcb.cpu_stor_len; + uv_info.max_sec_stor_addr = ALIGN(uvcb.max_guest_stor_addr, PAGE_SIZE); + uv_info.max_num_sec_conf = uvcb.max_num_sec_conf; + uv_info.max_guest_cpus = uvcb.max_guest_cpus; + } + +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) && test_bit_inv(BIT_UVC_CMD_REMOVE_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list)) prot_virt_guest = 1; +#endif } --- linux-riscv-5.4.0.orig/arch/s390/crypto/paes_s390.c +++ linux-riscv-5.4.0/arch/s390/crypto/paes_s390.c @@ -5,7 +5,7 @@ * s390 implementation of the AES Cipher Algorithm with protected keys. * * s390 Version: - * Copyright IBM Corp. 2017,2019 + * Copyright IBM Corp. 2017,2020 * Author(s): Martin Schwidefsky * Harald Freudenberger */ @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -31,11 +32,11 @@ * is called. As paes can handle different kinds of key blobs * and padding is also possible, the limits need to be generous. */ -#define PAES_MIN_KEYSIZE 64 -#define PAES_MAX_KEYSIZE 256 +#define PAES_MIN_KEYSIZE 16 +#define PAES_MAX_KEYSIZE 320 static u8 *ctrblk; -static DEFINE_SPINLOCK(ctrblk_lock); +static DEFINE_MUTEX(ctrblk_lock); static cpacf_mask_t km_functions, kmc_functions, kmctr_functions; @@ -52,19 +53,46 @@ unsigned int keylen; }; -static inline int _copy_key_to_kb(struct key_blob *kb, - const u8 *key, - unsigned int keylen) -{ - if (keylen <= sizeof(kb->keybuf)) +static inline int _key_to_kb(struct key_blob *kb, + const u8 *key, + unsigned int keylen) +{ + struct clearkey_header { + u8 type; + u8 res0[3]; + u8 version; + u8 res1[3]; + u32 keytype; + u32 len; + } __packed * h; + + switch (keylen) { + case 16: + case 24: + case 32: + /* clear key value, prepare pkey clear key token in keybuf */ + memset(kb->keybuf, 0, sizeof(kb->keybuf)); + h = (struct clearkey_header *) kb->keybuf; + h->version = 0x02; /* TOKVER_CLEAR_KEY */ + h->keytype = (keylen - 8) >> 3; + h->len = keylen; + memcpy(kb->keybuf + sizeof(*h), key, keylen); + kb->keylen = sizeof(*h) + keylen; kb->key = kb->keybuf; - else { - kb->key = kmalloc(keylen, GFP_KERNEL); - if (!kb->key) - return -ENOMEM; + break; + default: + /* other key material, let pkey handle this */ + if (keylen <= sizeof(kb->keybuf)) + kb->key = kb->keybuf; + else { + kb->key = kmalloc(keylen, GFP_KERNEL); + if (!kb->key) + return -ENOMEM; + } + memcpy(kb->key, key, keylen); + kb->keylen = keylen; + break; } - memcpy(kb->key, key, keylen); - kb->keylen = keylen; return 0; } @@ -81,16 +109,18 @@ struct s390_paes_ctx { struct key_blob kb; struct pkey_protkey pk; + spinlock_t pk_lock; unsigned long fc; }; struct s390_pxts_ctx { struct key_blob kb[2]; struct pkey_protkey pk[2]; + spinlock_t pk_lock; unsigned long fc; }; -static inline int __paes_convert_key(struct key_blob *kb, +static inline int __paes_keyblob2pkey(struct key_blob *kb, struct pkey_protkey *pk) { int i, ret; @@ -105,22 +135,18 @@ return ret; } -static int __paes_set_key(struct s390_paes_ctx *ctx) +static inline int __paes_convert_key(struct s390_paes_ctx *ctx) { - unsigned long fc; + struct pkey_protkey pkey; - if (__paes_convert_key(&ctx->kb, &ctx->pk)) + if (__paes_keyblob2pkey(&ctx->kb, &pkey)) return -EINVAL; - /* Pick the correct function code based on the protected key type */ - fc = (ctx->pk.type == PKEY_KEYTYPE_AES_128) ? CPACF_KM_PAES_128 : - (ctx->pk.type == PKEY_KEYTYPE_AES_192) ? CPACF_KM_PAES_192 : - (ctx->pk.type == PKEY_KEYTYPE_AES_256) ? CPACF_KM_PAES_256 : 0; - - /* Check if the function code is available */ - ctx->fc = (fc && cpacf_test_func(&km_functions, fc)) ? fc : 0; + spin_lock_bh(&ctx->pk_lock); + memcpy(&ctx->pk, &pkey, sizeof(pkey)); + spin_unlock_bh(&ctx->pk_lock); - return ctx->fc ? 0 : -EINVAL; + return 0; } static int ecb_paes_init(struct crypto_tfm *tfm) @@ -128,6 +154,7 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); ctx->kb.key = NULL; + spin_lock_init(&ctx->pk_lock); return 0; } @@ -139,6 +166,24 @@ _free_kb_keybuf(&ctx->kb); } +static inline int __ecb_paes_set_key(struct s390_paes_ctx *ctx) +{ + unsigned long fc; + + if (__paes_convert_key(ctx)) + return -EINVAL; + + /* Pick the correct function code based on the protected key type */ + fc = (ctx->pk.type == PKEY_KEYTYPE_AES_128) ? CPACF_KM_PAES_128 : + (ctx->pk.type == PKEY_KEYTYPE_AES_192) ? CPACF_KM_PAES_192 : + (ctx->pk.type == PKEY_KEYTYPE_AES_256) ? CPACF_KM_PAES_256 : 0; + + /* Check if the function code is available */ + ctx->fc = (fc && cpacf_test_func(&km_functions, fc)) ? fc : 0; + + return ctx->fc ? 0 : -EINVAL; +} + static int ecb_paes_set_key(struct crypto_tfm *tfm, const u8 *in_key, unsigned int key_len) { @@ -146,13 +191,14 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); _free_kb_keybuf(&ctx->kb); - rc = _copy_key_to_kb(&ctx->kb, in_key, key_len); + rc = _key_to_kb(&ctx->kb, in_key, key_len); if (rc) return rc; - if (__paes_set_key(ctx)) { + if (__ecb_paes_set_key(ctx)) { tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN; return -EINVAL; + } return 0; } @@ -164,18 +210,31 @@ struct s390_paes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm); unsigned int nbytes, n, k; int ret; + struct { + u8 key[MAXPROTKEYSIZE]; + } param; ret = blkcipher_walk_virt(desc, walk); + if (ret) + return ret; + + spin_lock_bh(&ctx->pk_lock); + memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); + while ((nbytes = walk->nbytes) >= AES_BLOCK_SIZE) { /* only use complete blocks */ n = nbytes & ~(AES_BLOCK_SIZE - 1); - k = cpacf_km(ctx->fc | modifier, ctx->pk.protkey, + k = cpacf_km(ctx->fc | modifier, ¶m, walk->dst.virt.addr, walk->src.virt.addr, n); if (k) ret = blkcipher_walk_done(desc, walk, nbytes - k); if (k < n) { - if (__paes_set_key(ctx) != 0) + if (__paes_convert_key(ctx)) return blkcipher_walk_done(desc, walk, -EIO); + spin_lock_bh(&ctx->pk_lock); + memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); } } return ret; @@ -229,6 +288,7 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); ctx->kb.key = NULL; + spin_lock_init(&ctx->pk_lock); return 0; } @@ -240,11 +300,11 @@ _free_kb_keybuf(&ctx->kb); } -static int __cbc_paes_set_key(struct s390_paes_ctx *ctx) +static inline int __cbc_paes_set_key(struct s390_paes_ctx *ctx) { unsigned long fc; - if (__paes_convert_key(&ctx->kb, &ctx->pk)) + if (__paes_convert_key(ctx)) return -EINVAL; /* Pick the correct function code based on the protected key type */ @@ -265,7 +325,7 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); _free_kb_keybuf(&ctx->kb); - rc = _copy_key_to_kb(&ctx->kb, in_key, key_len); + rc = _key_to_kb(&ctx->kb, in_key, key_len); if (rc) return rc; @@ -288,22 +348,31 @@ } param; ret = blkcipher_walk_virt(desc, walk); + if (ret) + return ret; + memcpy(param.iv, walk->iv, AES_BLOCK_SIZE); + spin_lock_bh(&ctx->pk_lock); memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); + while ((nbytes = walk->nbytes) >= AES_BLOCK_SIZE) { /* only use complete blocks */ n = nbytes & ~(AES_BLOCK_SIZE - 1); k = cpacf_kmc(ctx->fc | modifier, ¶m, walk->dst.virt.addr, walk->src.virt.addr, n); - if (k) + if (k) { + memcpy(walk->iv, param.iv, AES_BLOCK_SIZE); ret = blkcipher_walk_done(desc, walk, nbytes - k); + } if (k < n) { - if (__cbc_paes_set_key(ctx) != 0) + if (__paes_convert_key(ctx)) return blkcipher_walk_done(desc, walk, -EIO); + spin_lock_bh(&ctx->pk_lock); memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); } } - memcpy(walk->iv, param.iv, AES_BLOCK_SIZE); return ret; } @@ -357,6 +426,7 @@ ctx->kb[0].key = NULL; ctx->kb[1].key = NULL; + spin_lock_init(&ctx->pk_lock); return 0; } @@ -369,12 +439,27 @@ _free_kb_keybuf(&ctx->kb[1]); } -static int __xts_paes_set_key(struct s390_pxts_ctx *ctx) +static inline int __xts_paes_convert_key(struct s390_pxts_ctx *ctx) +{ + struct pkey_protkey pkey0, pkey1; + + if (__paes_keyblob2pkey(&ctx->kb[0], &pkey0) || + __paes_keyblob2pkey(&ctx->kb[1], &pkey1)) + return -EINVAL; + + spin_lock_bh(&ctx->pk_lock); + memcpy(&ctx->pk[0], &pkey0, sizeof(pkey0)); + memcpy(&ctx->pk[1], &pkey1, sizeof(pkey1)); + spin_unlock_bh(&ctx->pk_lock); + + return 0; +} + +static inline int __xts_paes_set_key(struct s390_pxts_ctx *ctx) { unsigned long fc; - if (__paes_convert_key(&ctx->kb[0], &ctx->pk[0]) || - __paes_convert_key(&ctx->kb[1], &ctx->pk[1])) + if (__xts_paes_convert_key(ctx)) return -EINVAL; if (ctx->pk[0].type != ctx->pk[1].type) @@ -406,10 +491,10 @@ _free_kb_keybuf(&ctx->kb[0]); _free_kb_keybuf(&ctx->kb[1]); - rc = _copy_key_to_kb(&ctx->kb[0], in_key, key_len); + rc = _key_to_kb(&ctx->kb[0], in_key, key_len); if (rc) return rc; - rc = _copy_key_to_kb(&ctx->kb[1], in_key + key_len, key_len); + rc = _key_to_kb(&ctx->kb[1], in_key + key_len, key_len); if (rc) return rc; @@ -449,15 +534,19 @@ } xts_param; ret = blkcipher_walk_virt(desc, walk); + if (ret) + return ret; + keylen = (ctx->pk[0].type == PKEY_KEYTYPE_AES_128) ? 48 : 64; offset = (ctx->pk[0].type == PKEY_KEYTYPE_AES_128) ? 16 : 0; -retry: + memset(&pcc_param, 0, sizeof(pcc_param)); memcpy(pcc_param.tweak, walk->iv, sizeof(pcc_param.tweak)); + spin_lock_bh(&ctx->pk_lock); memcpy(pcc_param.key + offset, ctx->pk[1].protkey, keylen); - cpacf_pcc(ctx->fc, pcc_param.key + offset); - memcpy(xts_param.key + offset, ctx->pk[0].protkey, keylen); + spin_unlock_bh(&ctx->pk_lock); + cpacf_pcc(ctx->fc, pcc_param.key + offset); memcpy(xts_param.init, pcc_param.xts, 16); while ((nbytes = walk->nbytes) >= AES_BLOCK_SIZE) { @@ -468,11 +557,15 @@ if (k) ret = blkcipher_walk_done(desc, walk, nbytes - k); if (k < n) { - if (__xts_paes_set_key(ctx) != 0) + if (__xts_paes_convert_key(ctx)) return blkcipher_walk_done(desc, walk, -EIO); - goto retry; + spin_lock_bh(&ctx->pk_lock); + memcpy(xts_param.key + offset, + ctx->pk[0].protkey, keylen); + spin_unlock_bh(&ctx->pk_lock); } } + return ret; } @@ -525,6 +618,7 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); ctx->kb.key = NULL; + spin_lock_init(&ctx->pk_lock); return 0; } @@ -536,11 +630,11 @@ _free_kb_keybuf(&ctx->kb); } -static int __ctr_paes_set_key(struct s390_paes_ctx *ctx) +static inline int __ctr_paes_set_key(struct s390_paes_ctx *ctx) { unsigned long fc; - if (__paes_convert_key(&ctx->kb, &ctx->pk)) + if (__paes_convert_key(ctx)) return -EINVAL; /* Pick the correct function code based on the protected key type */ @@ -562,7 +656,7 @@ struct s390_paes_ctx *ctx = crypto_tfm_ctx(tfm); _free_kb_keybuf(&ctx->kb); - rc = _copy_key_to_kb(&ctx->kb, in_key, key_len); + rc = _key_to_kb(&ctx->kb, in_key, key_len); if (rc) return rc; @@ -595,47 +689,62 @@ u8 buf[AES_BLOCK_SIZE], *ctrptr; unsigned int nbytes, n, k; int ret, locked; - - locked = spin_trylock(&ctrblk_lock); + struct { + u8 key[MAXPROTKEYSIZE]; + } param; ret = blkcipher_walk_virt_block(desc, walk, AES_BLOCK_SIZE); + if (ret) + return ret; + + spin_lock_bh(&ctx->pk_lock); + memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); + + locked = mutex_trylock(&ctrblk_lock); + + while ((nbytes = walk->nbytes) >= AES_BLOCK_SIZE) { n = AES_BLOCK_SIZE; if (nbytes >= 2*AES_BLOCK_SIZE && locked) n = __ctrblk_init(ctrblk, walk->iv, nbytes); ctrptr = (n > AES_BLOCK_SIZE) ? ctrblk : walk->iv; - k = cpacf_kmctr(ctx->fc | modifier, ctx->pk.protkey, - walk->dst.virt.addr, walk->src.virt.addr, - n, ctrptr); + k = cpacf_kmctr(ctx->fc | modifier, ¶m, walk->dst.virt.addr, + walk->src.virt.addr, n, ctrptr); if (k) { if (ctrptr == ctrblk) memcpy(walk->iv, ctrptr + k - AES_BLOCK_SIZE, AES_BLOCK_SIZE); crypto_inc(walk->iv, AES_BLOCK_SIZE); - ret = blkcipher_walk_done(desc, walk, nbytes - n); + ret = blkcipher_walk_done(desc, walk, nbytes - k); } if (k < n) { - if (__ctr_paes_set_key(ctx) != 0) { + if (__paes_convert_key(ctx)) { if (locked) - spin_unlock(&ctrblk_lock); + mutex_unlock(&ctrblk_lock); return blkcipher_walk_done(desc, walk, -EIO); } + spin_lock_bh(&ctx->pk_lock); + memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); } } if (locked) - spin_unlock(&ctrblk_lock); + mutex_unlock(&ctrblk_lock); /* * final block may be < AES_BLOCK_SIZE, copy only nbytes */ if (nbytes) { while (1) { - if (cpacf_kmctr(ctx->fc | modifier, - ctx->pk.protkey, buf, + if (cpacf_kmctr(ctx->fc | modifier, ¶m, buf, walk->src.virt.addr, AES_BLOCK_SIZE, walk->iv) == AES_BLOCK_SIZE) break; - if (__ctr_paes_set_key(ctx) != 0) + if (__paes_convert_key(ctx)) return blkcipher_walk_done(desc, walk, -EIO); + spin_lock_bh(&ctx->pk_lock); + memcpy(param.key, ctx->pk.protkey, MAXPROTKEYSIZE); + spin_unlock_bh(&ctx->pk_lock); } memcpy(walk->dst.virt.addr, buf, nbytes); crypto_inc(walk->iv, AES_BLOCK_SIZE); @@ -697,12 +806,12 @@ static void paes_s390_fini(void) { - if (ctrblk) - free_page((unsigned long) ctrblk); __crypto_unregister_alg(&ctr_paes_alg); __crypto_unregister_alg(&xts_paes_alg); __crypto_unregister_alg(&cbc_paes_alg); __crypto_unregister_alg(&ecb_paes_alg); + if (ctrblk) + free_page((unsigned long) ctrblk); } static int __init paes_s390_init(void) @@ -740,14 +849,14 @@ if (cpacf_test_func(&kmctr_functions, CPACF_KMCTR_PAES_128) || cpacf_test_func(&kmctr_functions, CPACF_KMCTR_PAES_192) || cpacf_test_func(&kmctr_functions, CPACF_KMCTR_PAES_256)) { - ret = crypto_register_alg(&ctr_paes_alg); - if (ret) - goto out_err; ctrblk = (u8 *) __get_free_page(GFP_KERNEL); if (!ctrblk) { ret = -ENOMEM; goto out_err; } + ret = crypto_register_alg(&ctr_paes_alg); + if (ret) + goto out_err; } return 0; --- linux-riscv-5.4.0.orig/arch/s390/crypto/sha_common.c +++ linux-riscv-5.4.0/arch/s390/crypto/sha_common.c @@ -74,14 +74,17 @@ struct s390_sha_ctx *ctx = shash_desc_ctx(desc); unsigned int bsize = crypto_shash_blocksize(desc->tfm); u64 bits; - unsigned int n, mbl_offset; + unsigned int n; + int mbl_offset; n = ctx->count % bsize; bits = ctx->count * 8; - mbl_offset = s390_crypto_shash_parmsize(ctx->func) / sizeof(u32); + mbl_offset = s390_crypto_shash_parmsize(ctx->func); if (mbl_offset < 0) return -EINVAL; + mbl_offset = mbl_offset / sizeof(u32); + /* set total msg bit length (mbl) in CPACF parmblock */ switch (ctx->func) { case CPACF_KLMD_SHA_1: --- linux-riscv-5.4.0.orig/arch/s390/include/asm/gmap.h +++ linux-riscv-5.4.0/arch/s390/include/asm/gmap.h @@ -9,6 +9,7 @@ #ifndef _ASM_S390_GMAP_H #define _ASM_S390_GMAP_H +#include #include /* Generic bits for GMAP notification on DAT table entry changes. */ @@ -31,6 +32,7 @@ * @table: pointer to the page directory * @asce: address space control element for gmap page table * @pfault_enabled: defines if pfaults are applicable for the guest + * @guest_handle: protected virtual machine handle for the ultravisor * @host_to_rmap: radix tree with gmap_rmap lists * @children: list of shadow gmap structures * @pt_list: list of all page tables used in the shadow guest address space @@ -54,6 +56,8 @@ unsigned long asce_end; void *private; bool pfault_enabled; + /* only set for protected virtual machines */ + unsigned long guest_handle; /* Additional data for shadow guest address spaces */ struct radix_tree_root host_to_rmap; struct list_head children; @@ -144,4 +148,6 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4], unsigned long gaddr, unsigned long vmaddr); +int gmap_mark_unmergeable(void); +void s390_reset_acc(struct mm_struct *mm); #endif /* _ASM_S390_GMAP_H */ --- linux-riscv-5.4.0.orig/arch/s390/include/asm/ipl.h +++ linux-riscv-5.4.0/arch/s390/include/asm/ipl.h @@ -109,6 +109,7 @@ unsigned char flags, unsigned short cert); int ipl_report_add_certificate(struct ipl_report *report, void *key, unsigned long addr, unsigned long len); +bool ipl_get_secureboot(void); /* * DIAG 308 support --- linux-riscv-5.4.0.orig/arch/s390/include/asm/kprobes.h +++ linux-riscv-5.4.0/arch/s390/include/asm/kprobes.h @@ -54,7 +54,6 @@ struct arch_specific_insn { /* copy of original instruction */ kprobe_opcode_t *insn; - unsigned int is_ftrace_insn : 1; }; struct prev_kprobe { --- linux-riscv-5.4.0.orig/arch/s390/include/asm/kvm_host.h +++ linux-riscv-5.4.0/arch/s390/include/asm/kvm_host.h @@ -122,6 +122,17 @@ __u32 reserved; }; +#define CR0_INITIAL_MASK (CR0_UNUSED_56 | CR0_INTERRUPT_KEY_SUBMASK | \ + CR0_MEASUREMENT_ALERT_SUBMASK) +#define CR14_INITIAL_MASK (CR14_UNUSED_32 | CR14_UNUSED_33 | \ + CR14_EXTERNAL_DAMAGE_SUBMASK) + +#define SIDAD_SIZE_MASK 0xff +#define sida_origin(sie_block) \ + ((sie_block)->sidad & PAGE_MASK) +#define sida_size(sie_block) \ + ((((sie_block)->sidad & SIDAD_SIZE_MASK) + 1) * PAGE_SIZE) + #define CPUSTAT_STOPPED 0x80000000 #define CPUSTAT_WAIT 0x10000000 #define CPUSTAT_ECALL_PEND 0x08000000 @@ -155,7 +166,13 @@ __u8 reserved08[4]; /* 0x0008 */ #define PROG_IN_SIE (1<<0) __u32 prog0c; /* 0x000c */ - __u8 reserved10[16]; /* 0x0010 */ + union { + __u8 reserved10[16]; /* 0x0010 */ + struct { + __u64 pv_handle_cpu; + __u64 pv_handle_config; + }; + }; #define PROG_BLOCK_SIE (1<<0) #define PROG_REQUEST (1<<1) atomic_t prog20; /* 0x0020 */ @@ -204,10 +221,23 @@ #define ICPT_PARTEXEC 0x38 #define ICPT_IOINST 0x40 #define ICPT_KSS 0x5c +#define ICPT_MCHKREQ 0x60 +#define ICPT_INT_ENABLE 0x64 +#define ICPT_PV_INSTR 0x68 +#define ICPT_PV_NOTIFY 0x6c +#define ICPT_PV_PREF 0x70 __u8 icptcode; /* 0x0050 */ __u8 icptstatus; /* 0x0051 */ __u16 ihcpu; /* 0x0052 */ - __u8 reserved54[2]; /* 0x0054 */ + __u8 reserved54; /* 0x0054 */ +#define IICTL_CODE_NONE 0x00 +#define IICTL_CODE_MCHK 0x01 +#define IICTL_CODE_EXT 0x02 +#define IICTL_CODE_IO 0x03 +#define IICTL_CODE_RESTART 0x04 +#define IICTL_CODE_SPECIFICATION 0x10 +#define IICTL_CODE_OPERAND 0x11 + __u8 iictl; /* 0x0055 */ __u16 ipa; /* 0x0056 */ __u32 ipb; /* 0x0058 */ __u32 scaoh; /* 0x005c */ @@ -228,7 +258,7 @@ #define ECB3_RI 0x01 __u8 ecb3; /* 0x0063 */ __u32 scaol; /* 0x0064 */ - __u8 reserved68; /* 0x0068 */ + __u8 sdf; /* 0x0068 */ __u8 epdx; /* 0x0069 */ __u8 reserved6a[2]; /* 0x006a */ __u32 todpr; /* 0x006c */ @@ -244,31 +274,58 @@ #define HPID_KVM 0x4 #define HPID_VSIE 0x5 __u8 hpid; /* 0x00b8 */ - __u8 reservedb9[11]; /* 0x00b9 */ - __u16 extcpuaddr; /* 0x00c4 */ - __u16 eic; /* 0x00c6 */ + __u8 reservedb9[7]; /* 0x00b9 */ + union { + struct { + __u32 eiparams; /* 0x00c0 */ + __u16 extcpuaddr; /* 0x00c4 */ + __u16 eic; /* 0x00c6 */ + }; + __u64 mcic; /* 0x00c0 */ + } __packed; __u32 reservedc8; /* 0x00c8 */ - __u16 pgmilc; /* 0x00cc */ - __u16 iprcc; /* 0x00ce */ - __u32 dxc; /* 0x00d0 */ - __u16 mcn; /* 0x00d4 */ - __u8 perc; /* 0x00d6 */ - __u8 peratmid; /* 0x00d7 */ + union { + struct { + __u16 pgmilc; /* 0x00cc */ + __u16 iprcc; /* 0x00ce */ + }; + __u32 edc; /* 0x00cc */ + } __packed; + union { + struct { + __u32 dxc; /* 0x00d0 */ + __u16 mcn; /* 0x00d4 */ + __u8 perc; /* 0x00d6 */ + __u8 peratmid; /* 0x00d7 */ + }; + __u64 faddr; /* 0x00d0 */ + } __packed; __u64 peraddr; /* 0x00d8 */ __u8 eai; /* 0x00e0 */ __u8 peraid; /* 0x00e1 */ __u8 oai; /* 0x00e2 */ __u8 armid; /* 0x00e3 */ __u8 reservede4[4]; /* 0x00e4 */ - __u64 tecmc; /* 0x00e8 */ - __u8 reservedf0[12]; /* 0x00f0 */ + union { + __u64 tecmc; /* 0x00e8 */ + struct { + __u16 subchannel_id; /* 0x00e8 */ + __u16 subchannel_nr; /* 0x00ea */ + __u32 io_int_parm; /* 0x00ec */ + __u32 io_int_word; /* 0x00f0 */ + }; + } __packed; + __u8 reservedf4[8]; /* 0x00f4 */ #define CRYCB_FORMAT_MASK 0x00000003 #define CRYCB_FORMAT0 0x00000000 #define CRYCB_FORMAT1 0x00000001 #define CRYCB_FORMAT2 0x00000003 __u32 crycbd; /* 0x00fc */ __u64 gcr[16]; /* 0x0100 */ - __u64 gbea; /* 0x0180 */ + union { + __u64 gbea; /* 0x0180 */ + __u64 sidad; + }; __u8 reserved188[8]; /* 0x0188 */ __u64 sdnxo; /* 0x0190 */ __u8 reserved198[8]; /* 0x0198 */ @@ -296,7 +353,9 @@ struct sie_page { struct kvm_s390_sie_block sie_block; struct mcck_volatile_info mcck_info; /* 0x0200 */ - __u8 reserved218[1000]; /* 0x0218 */ + __u8 reserved218[360]; /* 0x0218 */ + __u64 pv_grregs[16]; /* 0x0380 */ + __u8 reserved400[512]; /* 0x0400 */ struct kvm_s390_itdb itdb; /* 0x0600 */ __u8 reserved700[2304]; /* 0x0700 */ }; @@ -470,6 +529,7 @@ IRQ_PEND_PFAULT_INIT, IRQ_PEND_EXT_HOST, IRQ_PEND_EXT_SERVICE, + IRQ_PEND_EXT_SERVICE_EV, IRQ_PEND_EXT_TIMING, IRQ_PEND_EXT_CPU_TIMER, IRQ_PEND_EXT_CLOCK_COMP, @@ -514,6 +574,7 @@ (1UL << IRQ_PEND_EXT_TIMING) | \ (1UL << IRQ_PEND_EXT_HOST) | \ (1UL << IRQ_PEND_EXT_SERVICE) | \ + (1UL << IRQ_PEND_EXT_SERVICE_EV) | \ (1UL << IRQ_PEND_VIRTIO) | \ (1UL << IRQ_PEND_PFAULT_INIT) | \ (1UL << IRQ_PEND_PFAULT_DONE)) @@ -530,6 +591,13 @@ #define IRQ_PEND_MCHK_MASK ((1UL << IRQ_PEND_MCHK_REP) | \ (1UL << IRQ_PEND_MCHK_EX)) +#define IRQ_PEND_EXT_II_MASK ((1UL << IRQ_PEND_EXT_CPU_TIMER) | \ + (1UL << IRQ_PEND_EXT_CLOCK_COMP) | \ + (1UL << IRQ_PEND_EXT_EMERGENCY) | \ + (1UL << IRQ_PEND_EXT_EXTERNAL) | \ + (1UL << IRQ_PEND_EXT_SERVICE) | \ + (1UL << IRQ_PEND_EXT_SERVICE_EV)) + struct kvm_s390_interrupt_info { struct list_head list; u64 type; @@ -588,6 +656,7 @@ struct kvm_s390_float_interrupt { unsigned long pending_irqs; + unsigned long masked_irqs; spinlock_t lock; struct list_head lists[FIRQ_LIST_COUNT]; int counters[FIRQ_MAX_COUNT]; @@ -639,6 +708,11 @@ unsigned long last_bp; }; +struct kvm_s390_pv_vcpu { + u64 handle; + unsigned long stor_base; +}; + struct kvm_vcpu_arch { struct kvm_s390_sie_block *sie_block; /* if vsie is active, currently executed shadow sie control block */ @@ -667,6 +741,7 @@ __u64 cputm_start; bool gs_enabled; bool skey_enabled; + struct kvm_s390_pv_vcpu pv; }; struct kvm_vm_stat { @@ -695,9 +770,6 @@ bool masked; bool swap; bool suppressible; - struct rw_semaphore maps_lock; - struct list_head maps; - atomic_t nr_maps; }; #define MAX_S390_IO_ADAPTERS ((MAX_ISC + 1) * 8) @@ -840,6 +912,13 @@ DECLARE_BITMAP(kicked_mask, KVM_MAX_VCPUS); }; +struct kvm_s390_pv { + u64 handle; + u64 guest_len; + unsigned long stor_base; + void *stor_var; +}; + struct kvm_arch{ void *sca; int use_esca; @@ -875,6 +954,7 @@ DECLARE_BITMAP(cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS); DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS); struct kvm_s390_gisa_interrupt gisa_int; + struct kvm_s390_pv pv; }; #define KVM_HVA_ERR_BAD (-1UL) --- linux-riscv-5.4.0.orig/arch/s390/include/asm/mmu.h +++ linux-riscv-5.4.0/arch/s390/include/asm/mmu.h @@ -16,6 +16,8 @@ unsigned long asce; unsigned long asce_limit; unsigned long vdso_base; + /* The mmu context belongs to a secure guest. */ + atomic_t is_protected; /* * The following bitfields need a down_write on the mm * semaphore when they are written to. As they are only --- linux-riscv-5.4.0.orig/arch/s390/include/asm/mmu_context.h +++ linux-riscv-5.4.0/arch/s390/include/asm/mmu_context.h @@ -23,6 +23,7 @@ INIT_LIST_HEAD(&mm->context.gmap_list); cpumask_clear(&mm->context.cpu_attach_mask); atomic_set(&mm->context.flush_count, 0); + atomic_set(&mm->context.is_protected, 0); mm->context.gmap_asce = 0; mm->context.flush_mm = 0; mm->context.compat_mm = test_thread_flag(TIF_31BIT); --- linux-riscv-5.4.0.orig/arch/s390/include/asm/page.h +++ linux-riscv-5.4.0/arch/s390/include/asm/page.h @@ -33,6 +33,8 @@ #define ARCH_HAS_PREPARE_HUGEPAGE #define ARCH_HAS_HUGEPAGE_CLEAR_FLUSH +#define HAVE_ARCH_HUGETLB_UNMAPPED_AREA + #include #ifndef __ASSEMBLY__ @@ -40,7 +42,7 @@ static inline void storage_key_init_range(unsigned long start, unsigned long end) { - if (PAGE_DEFAULT_KEY) + if (PAGE_DEFAULT_KEY != 0) __storage_key_init_range(start, end); } @@ -151,6 +153,11 @@ #define HAVE_ARCH_FREE_PAGE #define HAVE_ARCH_ALLOC_PAGE +#if IS_ENABLED(CONFIG_PGSTE) +int arch_make_page_accessible(struct page *page); +#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE +#endif + #endif /* !__ASSEMBLY__ */ #define __PAGE_OFFSET 0x0UL --- linux-riscv-5.4.0.orig/arch/s390/include/asm/pci.h +++ linux-riscv-5.4.0/arch/s390/include/asm/pci.h @@ -183,7 +183,7 @@ /* CLP */ int clp_scan_pci_devices(void); int clp_rescan_pci_devices(void); -int clp_rescan_pci_devices_simple(void); +int clp_rescan_pci_devices_simple(u32 *fid); int clp_add_pci_device(u32, u32, int); int clp_enable_fh(struct zpci_dev *, u8); int clp_disable_fh(struct zpci_dev *); --- linux-riscv-5.4.0.orig/arch/s390/include/asm/pgalloc.h +++ linux-riscv-5.4.0/arch/s390/include/asm/pgalloc.h @@ -56,7 +56,12 @@ crst_table_init(table, _REGION2_ENTRY_EMPTY); return (p4d_t *) table; } -#define p4d_free(mm, p4d) crst_table_free(mm, (unsigned long *) p4d) + +static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d) +{ + if (!mm_p4d_folded(mm)) + crst_table_free(mm, (unsigned long *) p4d); +} static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address) { @@ -65,7 +70,12 @@ crst_table_init(table, _REGION3_ENTRY_EMPTY); return (pud_t *) table; } -#define pud_free(mm, pud) crst_table_free(mm, (unsigned long *) pud) + +static inline void pud_free(struct mm_struct *mm, pud_t *pud) +{ + if (!mm_pud_folded(mm)) + crst_table_free(mm, (unsigned long *) pud); +} static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long vmaddr) { @@ -83,6 +93,8 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd) { + if (mm_pmd_folded(mm)) + return; pgtable_pmd_page_dtor(virt_to_page(pmd)); crst_table_free(mm, (unsigned long *) pmd); } --- linux-riscv-5.4.0.orig/arch/s390/include/asm/pgtable.h +++ linux-riscv-5.4.0/arch/s390/include/asm/pgtable.h @@ -19,6 +19,7 @@ #include #include #include +#include extern pgd_t swapper_pg_dir[]; extern void paging_init(void); @@ -522,6 +523,15 @@ return 0; } +static inline int mm_is_protected(struct mm_struct *mm) +{ +#ifdef CONFIG_PGSTE + if (unlikely(atomic_read(&mm->context.is_protected))) + return 1; +#endif + return 0; +} + static inline int mm_alloc_pgste(struct mm_struct *mm) { #ifdef CONFIG_PGSTE @@ -756,6 +766,12 @@ return (pmd_val(pmd) & _SEGMENT_ENTRY_WRITE) != 0; } +#define pud_write pud_write +static inline int pud_write(pud_t pud) +{ + return (pud_val(pud) & _REGION3_ENTRY_WRITE) != 0; +} + static inline int pmd_dirty(pmd_t pmd) { int dirty = 1; @@ -1071,7 +1087,12 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - return ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID)); + pte_t res; + + res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID)); + if (mm_is_protected(mm) && pte_present(res)) + uv_convert_from_secure(pte_val(res) & PAGE_MASK); + return res; } #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION @@ -1083,7 +1104,12 @@ static inline pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { - return ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID)); + pte_t res; + + res = ptep_xchg_direct(vma->vm_mm, addr, ptep, __pte(_PAGE_INVALID)); + if (mm_is_protected(vma->vm_mm) && pte_present(res)) + uv_convert_from_secure(pte_val(res) & PAGE_MASK); + return res; } /* @@ -1098,12 +1124,17 @@ unsigned long addr, pte_t *ptep, int full) { + pte_t res; + if (full) { - pte_t pte = *ptep; + res = *ptep; *ptep = __pte(_PAGE_INVALID); - return pte; + } else { + res = ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID)); } - return ptep_xchg_lazy(mm, addr, ptep, __pte(_PAGE_INVALID)); + if (mm_is_protected(mm) && pte_present(res)) + uv_convert_from_secure(pte_val(res) & PAGE_MASK); + return res; } #define __HAVE_ARCH_PTEP_SET_WRPROTECT @@ -1173,8 +1204,6 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t entry) { - if (!MACHINE_HAS_NX) - pte_val(entry) &= ~_PAGE_NOEXEC; if (pte_present(entry)) pte_val(entry) &= ~_PAGE_UNUSED; if (mm_has_pgste(mm)) @@ -1191,6 +1220,8 @@ { pte_t __pte; pte_val(__pte) = physpage + pgprot_val(pgprot); + if (!MACHINE_HAS_NX) + pte_val(__pte) &= ~_PAGE_NOEXEC; return pte_mkyoung(__pte); } --- linux-riscv-5.4.0.orig/arch/s390/include/asm/qdio.h +++ linux-riscv-5.4.0/arch/s390/include/asm/qdio.h @@ -227,7 +227,7 @@ * @sbal: absolute SBAL address */ struct sl_element { - unsigned long sbal; + u64 sbal; } __attribute__ ((packed)); /** --- linux-riscv-5.4.0.orig/arch/s390/include/asm/setup.h +++ linux-riscv-5.4.0/arch/s390/include/asm/setup.h @@ -80,6 +80,13 @@ char command_line[ARCH_COMMAND_LINE_SIZE]; /* 0x10480 */ }; +extern unsigned int zlib_dfltcc_support; +#define ZLIB_DFLTCC_DISABLED 0 +#define ZLIB_DFLTCC_FULL 1 +#define ZLIB_DFLTCC_DEFLATE_ONLY 2 +#define ZLIB_DFLTCC_INFLATE_ONLY 3 +#define ZLIB_DFLTCC_FULL_DEBUG 4 + extern int noexec_disabled; extern int memory_end_set; extern unsigned long memory_end; --- linux-riscv-5.4.0.orig/arch/s390/include/asm/timex.h +++ linux-riscv-5.4.0/arch/s390/include/asm/timex.h @@ -10,8 +10,9 @@ #ifndef _ASM_S390_TIMEX_H #define _ASM_S390_TIMEX_H -#include +#include #include +#include /* The value of the TOD clock for 1.1.1970. */ #define TOD_UNIX_EPOCH 0x7d91048bca000000ULL @@ -154,7 +155,7 @@ static inline unsigned long long get_tod_clock(void) { - unsigned char clk[STORE_CLOCK_EXT_SIZE]; + char clk[STORE_CLOCK_EXT_SIZE]; get_tod_clock_ext(clk); return *((unsigned long long *)&clk[1]); @@ -186,15 +187,18 @@ /** * get_clock_monotonic - returns current time in clock rate units * - * The caller must ensure that preemption is disabled. * The clock and tod_clock_base get changed via stop_machine. - * Therefore preemption must be disabled when calling this - * function, otherwise the returned value is not guaranteed to - * be monotonic. + * Therefore preemption must be disabled, otherwise the returned + * value is not guaranteed to be monotonic. */ static inline unsigned long long get_tod_clock_monotonic(void) { - return get_tod_clock() - *(unsigned long long *) &tod_clock_base[1]; + unsigned long long tod; + + preempt_disable_notrace(); + tod = get_tod_clock() - *(unsigned long long *) &tod_clock_base[1]; + preempt_enable_notrace(); + return tod; } /** --- linux-riscv-5.4.0.orig/arch/s390/include/asm/topology.h +++ linux-riscv-5.4.0/arch/s390/include/asm/topology.h @@ -68,11 +68,8 @@ #ifdef CONFIG_NUMA -#define cpu_to_node cpu_to_node -static inline int cpu_to_node(int cpu) -{ - return cpu_topology[cpu].node_id; -} +extern int __cpu_to_node(int cpu); +#define cpu_to_node __cpu_to_node /* Returns a pointer to the cpumask of CPUs on node 'node'. */ #define cpumask_of_node cpumask_of_node --- linux-riscv-5.4.0.orig/arch/s390/include/asm/uv.h +++ linux-riscv-5.4.0/arch/s390/include/asm/uv.h @@ -14,23 +14,62 @@ #include #include #include +#include #include +#include #define UVC_RC_EXECUTED 0x0001 #define UVC_RC_INV_CMD 0x0002 #define UVC_RC_INV_STATE 0x0003 #define UVC_RC_INV_LEN 0x0005 #define UVC_RC_NO_RESUME 0x0007 +#define UVC_RC_NEED_DESTROY 0x8000 #define UVC_CMD_QUI 0x0001 +#define UVC_CMD_INIT_UV 0x000f +#define UVC_CMD_CREATE_SEC_CONF 0x0100 +#define UVC_CMD_DESTROY_SEC_CONF 0x0101 +#define UVC_CMD_CREATE_SEC_CPU 0x0120 +#define UVC_CMD_DESTROY_SEC_CPU 0x0121 +#define UVC_CMD_CONV_TO_SEC_STOR 0x0200 +#define UVC_CMD_CONV_FROM_SEC_STOR 0x0201 +#define UVC_CMD_SET_SEC_CONF_PARAMS 0x0300 +#define UVC_CMD_UNPACK_IMG 0x0301 +#define UVC_CMD_VERIFY_IMG 0x0302 +#define UVC_CMD_CPU_RESET 0x0310 +#define UVC_CMD_CPU_RESET_INITIAL 0x0311 +#define UVC_CMD_PREPARE_RESET 0x0320 +#define UVC_CMD_CPU_RESET_CLEAR 0x0321 +#define UVC_CMD_CPU_SET_STATE 0x0330 +#define UVC_CMD_SET_UNSHARE_ALL 0x0340 +#define UVC_CMD_PIN_PAGE_SHARED 0x0341 +#define UVC_CMD_UNPIN_PAGE_SHARED 0x0342 #define UVC_CMD_SET_SHARED_ACCESS 0x1000 #define UVC_CMD_REMOVE_SHARED_ACCESS 0x1001 /* Bits in installed uv calls */ enum uv_cmds_inst { BIT_UVC_CMD_QUI = 0, + BIT_UVC_CMD_INIT_UV = 1, + BIT_UVC_CMD_CREATE_SEC_CONF = 2, + BIT_UVC_CMD_DESTROY_SEC_CONF = 3, + BIT_UVC_CMD_CREATE_SEC_CPU = 4, + BIT_UVC_CMD_DESTROY_SEC_CPU = 5, + BIT_UVC_CMD_CONV_TO_SEC_STOR = 6, + BIT_UVC_CMD_CONV_FROM_SEC_STOR = 7, BIT_UVC_CMD_SET_SHARED_ACCESS = 8, BIT_UVC_CMD_REMOVE_SHARED_ACCESS = 9, + BIT_UVC_CMD_SET_SEC_PARMS = 11, + BIT_UVC_CMD_UNPACK_IMG = 13, + BIT_UVC_CMD_VERIFY_IMG = 14, + BIT_UVC_CMD_CPU_RESET = 15, + BIT_UVC_CMD_CPU_RESET_INITIAL = 16, + BIT_UVC_CMD_CPU_SET_STATE = 17, + BIT_UVC_CMD_PREPARE_RESET = 18, + BIT_UVC_CMD_CPU_PERFORM_CLEAR_RESET = 19, + BIT_UVC_CMD_UNSHARE_ALL = 20, + BIT_UVC_CMD_PIN_PAGE_SHARED = 21, + BIT_UVC_CMD_UNPIN_PAGE_SHARED = 22, }; struct uv_cb_header { @@ -40,13 +79,127 @@ u16 rrc; /* Return Reason Code */ } __packed __aligned(8); +/* Query Ultravisor Information */ struct uv_cb_qui { struct uv_cb_header header; u64 reserved08; u64 inst_calls_list[4]; - u64 reserved30[15]; + u64 reserved30[2]; + u64 uv_base_stor_len; + u64 reserved48; + u64 conf_base_phys_stor_len; + u64 conf_base_virt_stor_len; + u64 conf_virt_var_stor_len; + u64 cpu_stor_len; + u32 reserved70[3]; + u32 max_num_sec_conf; + u64 max_guest_stor_addr; + u8 reserved88[158 - 136]; + u16 max_guest_cpus; + u8 reserveda0[200 - 160]; } __packed __aligned(8); +/* Initialize Ultravisor */ +struct uv_cb_init { + struct uv_cb_header header; + u64 reserved08[2]; + u64 stor_origin; + u64 stor_len; + u64 reserved28[4]; +} __packed __aligned(8); + +/* Create Guest Configuration */ +struct uv_cb_cgc { + struct uv_cb_header header; + u64 reserved08[2]; + u64 guest_handle; + u64 conf_base_stor_origin; + u64 conf_virt_stor_origin; + u64 reserved30; + u64 guest_stor_origin; + u64 guest_stor_len; + u64 guest_sca; + u64 guest_asce; + u64 reserved58[5]; +} __packed __aligned(8); + +/* Create Secure CPU */ +struct uv_cb_csc { + struct uv_cb_header header; + u64 reserved08[2]; + u64 cpu_handle; + u64 guest_handle; + u64 stor_origin; + u8 reserved30[6]; + u16 num; + u64 state_origin; + u64 reserved40[4]; +} __packed __aligned(8); + +/* Convert to Secure */ +struct uv_cb_cts { + struct uv_cb_header header; + u64 reserved08[2]; + u64 guest_handle; + u64 gaddr; +} __packed __aligned(8); + +/* Convert from Secure / Pin Page Shared */ +struct uv_cb_cfs { + struct uv_cb_header header; + u64 reserved08[2]; + u64 paddr; +} __packed __aligned(8); + +/* Set Secure Config Parameter */ +struct uv_cb_ssc { + struct uv_cb_header header; + u64 reserved08[2]; + u64 guest_handle; + u64 sec_header_origin; + u32 sec_header_len; + u32 reserved2c; + u64 reserved30[4]; +} __packed __aligned(8); + +/* Unpack */ +struct uv_cb_unp { + struct uv_cb_header header; + u64 reserved08[2]; + u64 guest_handle; + u64 gaddr; + u64 tweak[2]; + u64 reserved38[3]; +} __packed __aligned(8); + +#define PV_CPU_STATE_OPR 1 +#define PV_CPU_STATE_STP 2 +#define PV_CPU_STATE_CHKSTP 3 +#define PV_CPU_STATE_OPR_LOAD 5 + +struct uv_cb_cpu_set_state { + struct uv_cb_header header; + u64 reserved08[2]; + u64 cpu_handle; + u8 reserved20[7]; + u8 state; + u64 reserved28[5]; +}; + +/* + * A common UV call struct for calls that take no payload + * Examples: + * Destroy cpu/config + * Verify + */ +struct uv_cb_nodata { + struct uv_cb_header header; + u64 reserved08[2]; + u64 handle; + u64 reserved20[4]; +} __packed __aligned(8); + +/* Set Shared Access */ struct uv_cb_share { struct uv_cb_header header; u64 reserved08[3]; @@ -54,21 +207,76 @@ u64 reserved28; } __packed __aligned(8); -static inline int uv_call(unsigned long r1, unsigned long r2) +static inline int __uv_call(unsigned long r1, unsigned long r2) { int cc; asm volatile( - "0: .insn rrf,0xB9A40000,%[r1],%[r2],0,0\n" - " brc 3,0b\n" - " ipm %[cc]\n" - " srl %[cc],28\n" + " .insn rrf,0xB9A40000,%[r1],%[r2],0,0\n" + " ipm %[cc]\n" + " srl %[cc],28\n" : [cc] "=d" (cc) : [r1] "a" (r1), [r2] "a" (r2) : "memory", "cc"); return cc; } +static inline int uv_call(unsigned long r1, unsigned long r2) +{ + int cc; + + do { + cc = __uv_call(r1, r2); + } while (cc > 1); + return cc; +} + +/* Low level uv_call that avoids stalls for long running busy conditions */ +static inline int uv_call_sched(unsigned long r1, unsigned long r2) +{ + int cc; + + do { + cc = __uv_call(r1, r2); + cond_resched(); + } while (cc > 1); + return cc; +} + +/* + * special variant of uv_call that only transports the cpu or guest + * handle and the command, like destroy or verify. + */ +static inline int uv_cmd_nodata(u64 handle, u16 cmd, u16 *rc, u16 *rrc) +{ + struct uv_cb_nodata uvcb = { + .header.cmd = cmd, + .header.len = sizeof(uvcb), + .handle = handle, + }; + int cc; + + WARN(!handle, "No handle provided to Ultravisor call cmd %x\n", cmd); + cc = uv_call_sched(0, (u64)&uvcb); + *rc = uvcb.header.rc; + *rrc = uvcb.header.rrc; + return cc ? -EINVAL : 0; +} + +struct uv_info { + unsigned long inst_calls_list[4]; + unsigned long uv_base_stor_len; + unsigned long guest_base_stor_len; + unsigned long guest_virt_base_stor_len; + unsigned long guest_virt_var_stor_len; + unsigned long guest_cpu_stor_len; + unsigned long max_sec_stor_addr; + unsigned int max_num_sec_conf; + unsigned short max_guest_cpus; +}; + +extern struct uv_info uv_info; + #ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST extern int prot_virt_guest; @@ -121,11 +329,40 @@ return share(addr, UVC_CMD_REMOVE_SHARED_ACCESS); } -void uv_query_info(void); #else #define is_prot_virt_guest() 0 static inline int uv_set_shared(unsigned long addr) { return 0; } static inline int uv_remove_shared(unsigned long addr) { return 0; } +#endif + +#if IS_ENABLED(CONFIG_KVM) +extern int prot_virt_host; + +static inline int is_prot_virt_host(void) +{ + return prot_virt_host; +} + +int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb); +int uv_convert_from_secure(unsigned long paddr); +int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr); + +void setup_uv(void); +void adjust_to_uv_max(unsigned long *vmax); +#else +#define is_prot_virt_host() 0 +static inline void setup_uv(void) {} +static inline void adjust_to_uv_max(unsigned long *vmax) {} + +static inline int uv_convert_from_secure(unsigned long paddr) +{ + return 0; +} +#endif + +#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM) +void uv_query_info(void); +#else static inline void uv_query_info(void) {} #endif --- linux-riscv-5.4.0.orig/arch/s390/include/uapi/asm/pkey.h +++ linux-riscv-5.4.0/arch/s390/include/uapi/asm/pkey.h @@ -25,10 +25,11 @@ #define MAXPROTKEYSIZE 64 /* a protected key blob may be up to 64 bytes */ #define MAXCLRKEYSIZE 32 /* a clear key value may be up to 32 bytes */ #define MAXAESCIPHERKEYSIZE 136 /* our aes cipher keys have always 136 bytes */ +#define MINEP11AESKEYBLOBSIZE 256 /* min EP11 AES key blob size */ +#define MAXEP11AESKEYBLOBSIZE 320 /* max EP11 AES key blob size */ -/* Minimum and maximum size of a key blob */ +/* Minimum size of a key blob */ #define MINKEYBLOBSIZE SECKEYBLOBSIZE -#define MAXKEYBLOBSIZE MAXAESCIPHERKEYSIZE /* defines for the type field within the pkey_protkey struct */ #define PKEY_KEYTYPE_AES_128 1 @@ -39,6 +40,7 @@ enum pkey_key_type { PKEY_TYPE_CCA_DATA = (__u32) 1, PKEY_TYPE_CCA_CIPHER = (__u32) 2, + PKEY_TYPE_EP11 = (__u32) 3, }; /* the newer ioctls use a pkey_key_size enum for key size information */ @@ -200,7 +202,7 @@ /* * Generate secure key, version 2. - * Generate either a CCA AES secure key or a CCA AES cipher key. + * Generate CCA AES secure key, CCA AES cipher key or EP11 AES secure key. * There needs to be a list of apqns given with at least one entry in there. * All apqns in the list need to be exact apqns, 0xFFFF as ANY card or domain * is not supported. The implementation walks through the list of apqns and @@ -210,10 +212,13 @@ * (return -1 with errno ENODEV). You may use the PKEY_APQNS4KT ioctl to * generate a list of apqns based on the key type to generate. * The keygenflags argument is passed to the low level generation functions - * individual for the key type and has a key type specific meaning. Currently - * only CCA AES cipher keys react to this parameter: Use one or more of the - * PKEY_KEYGEN_* flags to widen the export possibilities. By default a cipher - * key is only exportable for CPACF (PKEY_KEYGEN_XPRT_CPAC). + * individual for the key type and has a key type specific meaning. When + * generating CCA cipher keys you can use one or more of the PKEY_KEYGEN_* + * flags to widen the export possibilities. By default a cipher key is + * only exportable for CPACF (PKEY_KEYGEN_XPRT_CPAC). + * The keygenflag argument for generating an EP11 AES key should either be 0 + * to use the defaults which are XCP_BLOB_ENCRYPT, XCP_BLOB_DECRYPT and + * XCP_BLOB_PROTKEY_EXTRACTABLE or a valid combination of XCP_BLOB_* flags. */ struct pkey_genseck2 { struct pkey_apqn __user *apqns; /* in: ptr to list of apqn targets*/ @@ -229,8 +234,8 @@ /* * Generate secure key from clear key value, version 2. - * Construct a CCA AES secure key or CCA AES cipher key from a given clear key - * value. + * Construct an CCA AES secure key, CCA AES cipher key or EP11 AES secure + * key from a given clear key value. * There needs to be a list of apqns given with at least one entry in there. * All apqns in the list need to be exact apqns, 0xFFFF as ANY card or domain * is not supported. The implementation walks through the list of apqns and @@ -240,10 +245,13 @@ * (return -1 with errno ENODEV). You may use the PKEY_APQNS4KT ioctl to * generate a list of apqns based on the key type to generate. * The keygenflags argument is passed to the low level generation functions - * individual for the key type and has a key type specific meaning. Currently - * only CCA AES cipher keys react to this parameter: Use one or more of the - * PKEY_KEYGEN_* flags to widen the export possibilities. By default a cipher - * key is only exportable for CPACF (PKEY_KEYGEN_XPRT_CPAC). + * individual for the key type and has a key type specific meaning. When + * generating CCA cipher keys you can use one or more of the PKEY_KEYGEN_* + * flags to widen the export possibilities. By default a cipher key is + * only exportable for CPACF (PKEY_KEYGEN_XPRT_CPAC). + * The keygenflag argument for generating an EP11 AES key should either be 0 + * to use the defaults which are XCP_BLOB_ENCRYPT, XCP_BLOB_DECRYPT and + * XCP_BLOB_PROTKEY_EXTRACTABLE or a valid combination of XCP_BLOB_* flags. */ struct pkey_clr2seck2 { struct pkey_apqn __user *apqns; /* in: ptr to list of apqn targets */ @@ -266,14 +274,19 @@ * with one apqn able to handle this key. * The function also checks for the master key verification patterns * of the key matching to the current or alternate mkvp of the apqn. - * Currently CCA AES secure keys and CCA AES cipher keys are supported. - * The flags field is updated with some additional info about the apqn mkvp + * For CCA AES secure keys and CCA AES cipher keys this means to check + * the key's mkvp against the current or old mkvp of the apqns. The flags + * field is updated with some additional info about the apqn mkvp * match: If the current mkvp matches to the key's mkvp then the * PKEY_FLAGS_MATCH_CUR_MKVP bit is set, if the alternate mkvp matches to * the key's mkvp the PKEY_FLAGS_MATCH_ALT_MKVP is set. For CCA keys the * alternate mkvp is the old master key verification pattern. * CCA AES secure keys are also checked to have the CPACF export allowed * bit enabled (XPRTCPAC) in the kmf1 field. + * EP11 keys are also supported and the wkvp of the key is checked against + * the current wkvp of the apqns. There is no alternate for this type of + * key and so on a match the flag PKEY_FLAGS_MATCH_CUR_MKVP always is set. + * EP11 keys are also checked to have XCP_BLOB_PROTKEY_EXTRACTABLE set. * The ioctl returns 0 as long as the given or found apqn matches to * matches with the current or alternate mkvp to the key's mkvp. If the given * apqn does not match or there is no such apqn found, -1 with errno @@ -313,16 +326,20 @@ /* * Build a list of APQNs based on a key blob given. * Is able to find out which type of secure key is given (CCA AES secure - * key or CCA AES cipher key) and tries to find all matching crypto cards - * based on the MKVP and maybe other criterias (like CCA AES cipher keys - * need a CEX5C or higher). The list of APQNs is further filtered by the key's - * mkvp which needs to match to either the current mkvp or the alternate mkvp - * (which is the old mkvp on CCA adapters) of the apqns. The flags argument may - * be used to limit the matching apqns. If the PKEY_FLAGS_MATCH_CUR_MKVP is - * given, only the current mkvp of each apqn is compared. Likewise with the - * PKEY_FLAGS_MATCH_ALT_MKVP. If both are given, it is assumed to - * return apqns where either the current or the alternate mkvp + * key, CCA AES cipher key or EP11 AES key) and tries to find all matching + * crypto cards based on the MKVP and maybe other criterias (like CCA AES + * cipher keys need a CEX5C or higher, EP11 keys with BLOB_PKEY_EXTRACTABLE + * need a CEX7 and EP11 api version 4). The list of APQNs is further filtered + * by the key's mkvp which needs to match to either the current mkvp (CCA and + * EP11) or the alternate mkvp (old mkvp, CCA adapters only) of the apqns. The + * flags argument may be used to limit the matching apqns. If the + * PKEY_FLAGS_MATCH_CUR_MKVP is given, only the current mkvp of each apqn is + * compared. Likewise with the PKEY_FLAGS_MATCH_ALT_MKVP. If both are given, it + * is assumed to return apqns where either the current or the alternate mkvp * matches. At least one of the matching flags needs to be given. + * The flags argument for EP11 keys has no further action and is currently + * ignored (but needs to be given as PKEY_FLAGS_MATCH_CUR_MKVP) as there is only + * the wkvp from the key to match against the apqn's wkvp. * The list of matching apqns is stored into the space given by the apqns * argument and the number of stored entries goes into apqn_entries. If the list * is empty (apqn_entries is 0) the apqn_entries field is updated to the number @@ -356,6 +373,10 @@ * If both are given, it is assumed to return apqns where either the * current or the alternate mkvp matches. If no match flag is given * (flags is 0) the mkvp values are ignored for the match process. + * For EP11 keys there is only the current wkvp. So if the apqns should also + * match to a given wkvp, then the PKEY_FLAGS_MATCH_CUR_MKVP flag should be + * set. The wkvp value is 32 bytes but only the leftmost 16 bytes are compared + * against the leftmost 16 byte of the wkvp of the apqn. * The list of matching apqns is stored into the space given by the apqns * argument and the number of stored entries goes into apqn_entries. If the list * is empty (apqn_entries is 0) the apqn_entries field is updated to the number --- linux-riscv-5.4.0.orig/arch/s390/include/uapi/asm/zcrypt.h +++ linux-riscv-5.4.0/arch/s390/include/uapi/asm/zcrypt.h @@ -161,17 +161,17 @@ * @payload_len: Payload length */ struct ep11_cprb { - __u16 cprb_len; - unsigned char cprb_ver_id; - unsigned char pad_000[2]; - unsigned char flags; - unsigned char func_id[2]; - __u32 source_id; - __u32 target_id; - __u32 ret_code; - __u32 reserved1; - __u32 reserved2; - __u32 payload_len; + __u16 cprb_len; + __u8 cprb_ver_id; + __u8 pad_000[2]; + __u8 flags; + __u8 func_id[2]; + __u32 source_id; + __u32 target_id; + __u32 ret_code; + __u32 reserved1; + __u32 reserved2; + __u32 payload_len; } __attribute__((packed)); /** @@ -197,13 +197,13 @@ */ struct ep11_urb { __u16 targets_num; - __u64 targets; + __u8 __user *targets; __u64 weight; __u64 req_no; __u64 req_len; - __u64 req; + __u8 __user *req; __u64 resp_len; - __u64 resp; + __u8 __user *resp; } __attribute__((packed)); /** @@ -237,7 +237,9 @@ struct zcrypt_device_status_ext device[MAX_ZDEV_ENTRIES_EXT]; }; -#define AUTOSELECT 0xFFFFFFFF +#define AUTOSELECT 0xFFFFFFFF +#define AUTOSEL_AP ((__u16) 0xFFFF) +#define AUTOSEL_DOM ((__u16) 0xFFFF) #define ZCRYPT_IOCTL_MAGIC 'z' --- linux-riscv-5.4.0.orig/arch/s390/kernel/Makefile +++ linux-riscv-5.4.0/arch/s390/kernel/Makefile @@ -78,7 +78,11 @@ obj-$(CONFIG_PERF_EVENTS) += perf_cpum_cf_diag.o obj-$(CONFIG_TRACEPOINTS) += trace.o +obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) += uv.o # vdso obj-y += vdso64/ obj-$(CONFIG_COMPAT_VDSO) += vdso32/ + +# kernel message catalog +obj-$(CONFIG_KMSG_IDS) += kmsg.o --- linux-riscv-5.4.0.orig/arch/s390/kernel/dis.c +++ linux-riscv-5.4.0/arch/s390/kernel/dis.c @@ -461,10 +461,11 @@ ptr += sprintf(ptr, "%%c%i", value); else if (operand->flags & OPERAND_VR) ptr += sprintf(ptr, "%%v%i", value); - else if (operand->flags & OPERAND_PCREL) - ptr += sprintf(ptr, "%lx", (signed int) value - + addr); - else if (operand->flags & OPERAND_SIGNED) + else if (operand->flags & OPERAND_PCREL) { + void *pcrel = (void *)((int)value + addr); + + ptr += sprintf(ptr, "%px", pcrel); + } else if (operand->flags & OPERAND_SIGNED) ptr += sprintf(ptr, "%i", value); else ptr += sprintf(ptr, "%u", value); @@ -536,7 +537,7 @@ else *ptr++ = ' '; addr = regs->psw.addr + start - 32; - ptr += sprintf(ptr, "%016lx: ", addr); + ptr += sprintf(ptr, "%px: ", (void *)addr); if (start + opsize >= end) break; for (i = 0; i < opsize; i++) @@ -564,7 +565,7 @@ opsize = insn_length(*code); if (opsize > len) break; - ptr += sprintf(ptr, "%p: ", code); + ptr += sprintf(ptr, "%px: ", code); for (i = 0; i < opsize; i++) ptr += sprintf(ptr, "%02x", code[i]); *ptr++ = '\t'; --- linux-riscv-5.4.0.orig/arch/s390/kernel/entry.h +++ linux-riscv-5.4.0/arch/s390/kernel/entry.h @@ -24,6 +24,8 @@ void do_protection_exception(struct pt_regs *regs); void do_dat_exception(struct pt_regs *regs); +void do_secure_storage_access(struct pt_regs *regs); +void do_non_secure_storage_access(struct pt_regs *regs); void addressing_exception(struct pt_regs *regs); void data_exception(struct pt_regs *regs); --- linux-riscv-5.4.0.orig/arch/s390/kernel/ftrace.c +++ linux-riscv-5.4.0/arch/s390/kernel/ftrace.c @@ -72,15 +72,6 @@ #endif } -static inline int is_kprobe_on_ftrace(struct ftrace_insn *insn) -{ -#ifdef CONFIG_KPROBES - if (insn->opc == BREAKPOINT_INSTRUCTION) - return 1; -#endif - return 0; -} - static inline void ftrace_generate_kprobe_nop_insn(struct ftrace_insn *insn) { #ifdef CONFIG_KPROBES @@ -114,16 +105,6 @@ /* Initial code replacement */ ftrace_generate_orig_insn(&orig); ftrace_generate_nop_insn(&new); - } else if (is_kprobe_on_ftrace(&old)) { - /* - * If we find a breakpoint instruction, a kprobe has been - * placed at the beginning of the function. We write the - * constant KPROBE_ON_FTRACE_NOP into the remaining four - * bytes of the original instruction so that the kprobes - * handler can execute a nop, if it reaches this breakpoint. - */ - ftrace_generate_kprobe_call_insn(&orig); - ftrace_generate_kprobe_nop_insn(&new); } else { /* Replace ftrace call with a nop. */ ftrace_generate_call_insn(&orig, rec->ip); @@ -142,21 +123,10 @@ if (probe_kernel_read(&old, (void *) rec->ip, sizeof(old))) return -EFAULT; - if (is_kprobe_on_ftrace(&old)) { - /* - * If we find a breakpoint instruction, a kprobe has been - * placed at the beginning of the function. We write the - * constant KPROBE_ON_FTRACE_CALL into the remaining four - * bytes of the original instruction so that the kprobes - * handler can execute a brasl if it reaches this breakpoint. - */ - ftrace_generate_kprobe_nop_insn(&orig); - ftrace_generate_kprobe_call_insn(&new); - } else { - /* Replace nop with an ftrace call. */ - ftrace_generate_nop_insn(&orig); - ftrace_generate_call_insn(&new, rec->ip); - } + /* Replace nop with an ftrace call. */ + ftrace_generate_nop_insn(&orig); + ftrace_generate_call_insn(&new, rec->ip); + /* Verify that the to be replaced code matches what we expect. */ if (memcmp(&orig, &old, sizeof(old))) return -EINVAL; @@ -241,3 +211,45 @@ } #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ + +#ifdef CONFIG_KPROBES_ON_FTRACE +void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, + struct ftrace_ops *ops, struct pt_regs *regs) +{ + struct kprobe_ctlblk *kcb; + struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip); + + if (unlikely(!p) || kprobe_disabled(p)) + return; + + if (kprobe_running()) { + kprobes_inc_nmissed_count(p); + return; + } + + __this_cpu_write(current_kprobe, p); + + kcb = get_kprobe_ctlblk(); + kcb->kprobe_status = KPROBE_HIT_ACTIVE; + + instruction_pointer_set(regs, ip); + + if (!p->pre_handler || !p->pre_handler(p, regs)) { + + instruction_pointer_set(regs, ip + MCOUNT_INSN_SIZE); + + if (unlikely(p->post_handler)) { + kcb->kprobe_status = KPROBE_HIT_SSDONE; + p->post_handler(p, regs, 0); + } + } + __this_cpu_write(current_kprobe, NULL); +} +NOKPROBE_SYMBOL(kprobe_ftrace_handler); + +int arch_prepare_kprobe_ftrace(struct kprobe *p) +{ + p->ainsn.insn = NULL; + return 0; +} +#endif --- linux-riscv-5.4.0.orig/arch/s390/kernel/ipl.c +++ linux-riscv-5.4.0/arch/s390/kernel/ipl.c @@ -1842,3 +1842,8 @@ } #endif + +bool ipl_get_secureboot(void) +{ + return !!ipl_secure_flag; +} --- linux-riscv-5.4.0.orig/arch/s390/kernel/kmsg.c +++ linux-riscv-5.4.0/arch/s390/kernel/kmsg.c @@ -0,0 +1,114 @@ +/* + * Message printing with message catalog prefixes. + * + * Copyright IBM Corp. 2012 + */ + +#include +#include +#include +#include +#include + +static inline u32 __printk_jhash(const void *key, u32 length) +{ + u32 a, b, c, len; + const u8 *k; + u8 zk[12]; + + a = b = 0x9e3779b9; + c = 0; + for (len = length + 12, k = key; len >= 12; len -= 12, k += 12) { + if (len >= 24) { + a += k[0] | k[1] << 8 | k[2] << 16 | k[3] << 24; + b += k[4] | k[5] << 8 | k[6] << 16 | k[7] << 24; + c += k[8] | k[9] << 8 | k[10] << 16 | k[11] << 24; + } else { + memset(zk, 0, 12); + memcpy(zk, k, len - 12); + a += zk[0] | zk[1] << 8 | zk[2] << 16 | zk[3] << 24; + b += zk[4] | zk[5] << 8 | zk[6] << 16 | zk[7] << 24; + c += (u32) zk[8] << 8; + c += (u32) zk[9] << 16; + c += (u32) zk[10] << 24; + c += length; + } + a -= b + c; a ^= (c>>13); + b -= a + c; b ^= (a<<8); + c -= a + b; c ^= (b>>13); + a -= b + c; a ^= (c>>12); + b -= a + c; b ^= (a<<16); + c -= a + b; c ^= (b>>5); + a -= b + c; a ^= (c>>3); + b -= a + c; b ^= (a<<10); + c -= a + b; c ^= (b>>15); + } + return c; +} + +/** + * __jhash_string - calculate the six digit jhash of a string + * @str: string to calculate the jhash + */ +unsigned long long __jhash_string(const char *str) +{ + return __printk_jhash(str, strlen(str)) & 0xffffff; +} +EXPORT_SYMBOL(__jhash_string); + +static int __dev_printk_hash(const char *level, const struct device *dev, + struct va_format *vaf) +{ + if (!dev) + return printk("%s(NULL device *): %pV", level, vaf); + + return printk("%s%s.%06x: %pV", level, dev_driver_string(dev), + __printk_jhash(vaf->fmt, strlen(vaf->fmt)) & 0xffffff, + vaf); +} + +int dev_printk_hash(const char *level, const struct device *dev, + const char *fmt, ...) +{ + struct va_format vaf; + va_list args; + int r; + + va_start(args, fmt); + + vaf.fmt = fmt; + vaf.va = &args; + + r = __dev_printk_hash(level, dev, &vaf); + va_end(args); + + return r; +} +EXPORT_SYMBOL(dev_printk_hash); + +#define define_dev_printk_hash_level(func, kern_level) \ +int func(const struct device *dev, const char *fmt, ...) \ +{ \ + struct va_format vaf; \ + va_list args; \ + int r; \ + \ + va_start(args, fmt); \ + \ + vaf.fmt = fmt; \ + vaf.va = &args; \ + \ + r = __dev_printk_hash(kern_level, dev, &vaf); \ + va_end(args); \ + \ + return r; \ +} \ +EXPORT_SYMBOL(func); + +define_dev_printk_hash_level(dev_emerg_hash, KERN_EMERG); +define_dev_printk_hash_level(dev_alert_hash, KERN_ALERT); +define_dev_printk_hash_level(dev_crit_hash, KERN_CRIT); +define_dev_printk_hash_level(dev_err_hash, KERN_ERR); +define_dev_printk_hash_level(dev_warn_hash, KERN_WARNING); +define_dev_printk_hash_level(dev_notice_hash, KERN_NOTICE); +define_dev_printk_hash_level(_dev_info_hash, KERN_INFO); --- linux-riscv-5.4.0.orig/arch/s390/kernel/kprobes.c +++ linux-riscv-5.4.0/arch/s390/kernel/kprobes.c @@ -56,21 +56,10 @@ static void copy_instruction(struct kprobe *p) { - unsigned long ip = (unsigned long) p->addr; s64 disp, new_disp; u64 addr, new_addr; - if (ftrace_location(ip) == ip) { - /* - * If kprobes patches the instruction that is morphed by - * ftrace make sure that kprobes always sees the branch - * "jg .+24" that skips the mcount block or the "brcl 0,0" - * in case of hotpatch. - */ - ftrace_generate_nop_insn((struct ftrace_insn *)p->ainsn.insn); - p->ainsn.is_ftrace_insn = 1; - } else - memcpy(p->ainsn.insn, p->addr, insn_length(*p->addr >> 8)); + memcpy(p->ainsn.insn, p->addr, insn_length(*p->addr >> 8)); p->opcode = p->ainsn.insn[0]; if (!probe_is_insn_relative_long(p->ainsn.insn)) return; @@ -136,11 +125,6 @@ } NOKPROBE_SYMBOL(arch_prepare_kprobe); -int arch_check_ftrace_location(struct kprobe *p) -{ - return 0; -} - struct swap_insn_args { struct kprobe *p; unsigned int arm_kprobe : 1; @@ -149,28 +133,11 @@ static int swap_instruction(void *data) { struct swap_insn_args *args = data; - struct ftrace_insn new_insn, *insn; struct kprobe *p = args->p; - size_t len; + u16 opc; - new_insn.opc = args->arm_kprobe ? BREAKPOINT_INSTRUCTION : p->opcode; - len = sizeof(new_insn.opc); - if (!p->ainsn.is_ftrace_insn) - goto skip_ftrace; - len = sizeof(new_insn); - insn = (struct ftrace_insn *) p->addr; - if (args->arm_kprobe) { - if (is_ftrace_nop(insn)) - new_insn.disp = KPROBE_ON_FTRACE_NOP; - else - new_insn.disp = KPROBE_ON_FTRACE_CALL; - } else { - ftrace_generate_call_insn(&new_insn, (unsigned long)p->addr); - if (insn->disp == KPROBE_ON_FTRACE_NOP) - ftrace_generate_nop_insn(&new_insn); - } -skip_ftrace: - s390_kernel_write(p->addr, &new_insn, len); + opc = args->arm_kprobe ? BREAKPOINT_INSTRUCTION : p->opcode; + s390_kernel_write(p->addr, &opc, sizeof(opc)); return 0; } NOKPROBE_SYMBOL(swap_instruction); @@ -464,24 +431,6 @@ unsigned long ip = regs->psw.addr; int fixup = probe_get_fixup_type(p->ainsn.insn); - /* Check if the kprobes location is an enabled ftrace caller */ - if (p->ainsn.is_ftrace_insn) { - struct ftrace_insn *insn = (struct ftrace_insn *) p->addr; - struct ftrace_insn call_insn; - - ftrace_generate_call_insn(&call_insn, (unsigned long) p->addr); - /* - * A kprobe on an enabled ftrace call site actually single - * stepped an unconditional branch (ftrace nop equivalent). - * Now we need to fixup things and pretend that a brasl r0,... - * was executed instead. - */ - if (insn->disp == KPROBE_ON_FTRACE_CALL) { - ip += call_insn.disp * 2 - MCOUNT_INSN_SIZE; - regs->gprs[0] = (unsigned long)p->addr + sizeof(*insn); - } - } - if (fixup & FIXUP_PSW_NORMAL) ip += (unsigned long) p->addr - (unsigned long) p->ainsn.insn; --- linux-riscv-5.4.0.orig/arch/s390/kernel/machine_kexec.c +++ linux-riscv-5.4.0/arch/s390/kernel/machine_kexec.c @@ -164,7 +164,9 @@ #ifdef CONFIG_CRASH_DUMP int rc; + preempt_disable(); rc = CALL_ON_STACK(do_start_kdump, S390_lowcore.nodat_stack, 1, image); + preempt_enable(); return rc == 0; #else return false; @@ -254,10 +256,10 @@ VMCOREINFO_SYMBOL(lowcore_ptr); VMCOREINFO_SYMBOL(high_memory); VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS); - mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note()); vmcoreinfo_append_str("SDMA=%lx\n", __sdma); vmcoreinfo_append_str("EDMA=%lx\n", __edma); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); + mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note()); } void machine_shutdown(void) --- linux-riscv-5.4.0.orig/arch/s390/kernel/mcount.S +++ linux-riscv-5.4.0/arch/s390/kernel/mcount.S @@ -26,6 +26,12 @@ #define STACK_PTREGS (STACK_FRAME_OVERHEAD) #define STACK_PTREGS_GPRS (STACK_PTREGS + __PT_GPRS) #define STACK_PTREGS_PSW (STACK_PTREGS + __PT_PSW) +#ifdef __PACK_STACK +/* allocate just enough for r14, r15 and backchain */ +#define TRACED_FUNC_FRAME_SIZE 24 +#else +#define TRACED_FUNC_FRAME_SIZE STACK_FRAME_OVERHEAD +#endif ENTRY(_mcount) BR_EX %r14 @@ -35,13 +41,27 @@ ENTRY(ftrace_caller) .globl ftrace_regs_caller .set ftrace_regs_caller,ftrace_caller + stg %r14,(__SF_GPRS+8*8)(%r15) # save traced function caller + lghi %r14,0 # save condition code + ipm %r14 # don't put any instructions + sllg %r14,%r14,16 # clobbering CC before this point lgr %r1,%r15 #if !(defined(CC_USING_HOTPATCH) || defined(CC_USING_NOP_MCOUNT)) aghi %r0,MCOUNT_RETURN_FIXUP #endif - aghi %r15,-STACK_FRAME_SIZE + # allocate stack frame for ftrace_caller to contain traced function + aghi %r15,-TRACED_FUNC_FRAME_SIZE stg %r1,__SF_BACKCHAIN(%r15) + stg %r0,(__SF_GPRS+8*8)(%r15) + stg %r15,(__SF_GPRS+9*8)(%r15) + # allocate pt_regs and stack frame for ftrace_trace_function + aghi %r15,-STACK_FRAME_SIZE stg %r1,(STACK_PTREGS_GPRS+15*8)(%r15) + stg %r14,(STACK_PTREGS_PSW)(%r15) + lg %r14,(__SF_GPRS+8*8)(%r1) # restore original return address + stosm (STACK_PTREGS_PSW)(%r15),0 + aghi %r1,-TRACED_FUNC_FRAME_SIZE + stg %r1,__SF_BACKCHAIN(%r15) stg %r0,(STACK_PTREGS_PSW+8)(%r15) stmg %r2,%r14,(STACK_PTREGS_GPRS+2*8)(%r15) #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES --- linux-riscv-5.4.0.orig/arch/s390/kernel/perf_cpum_cf.c +++ linux-riscv-5.4.0/arch/s390/kernel/perf_cpum_cf.c @@ -199,7 +199,7 @@ [PERF_COUNT_HW_BUS_CYCLES] = -1, }; -static int __hw_perf_event_init(struct perf_event *event) +static int __hw_perf_event_init(struct perf_event *event, unsigned int type) { struct perf_event_attr *attr = &event->attr; struct hw_perf_event *hwc = &event->hw; @@ -207,7 +207,7 @@ int err = 0; u64 ev; - switch (attr->type) { + switch (type) { case PERF_TYPE_RAW: /* Raw events are used to access counters directly, * hence do not permit excludes */ @@ -294,17 +294,16 @@ static int cpumf_pmu_event_init(struct perf_event *event) { + unsigned int type = event->attr.type; int err; - switch (event->attr.type) { - case PERF_TYPE_HARDWARE: - case PERF_TYPE_HW_CACHE: - case PERF_TYPE_RAW: - err = __hw_perf_event_init(event); - break; - default: + if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_RAW) + err = __hw_perf_event_init(event, type); + else if (event->pmu->type == type) + /* Registered as unknown PMU */ + err = __hw_perf_event_init(event, PERF_TYPE_RAW); + else return -ENOENT; - } if (unlikely(err) && event->destroy) event->destroy(event); @@ -553,7 +552,7 @@ return -ENODEV; cpumf_pmu.attr_groups = cpumf_cf_event_group(); - rc = perf_pmu_register(&cpumf_pmu, "cpum_cf", PERF_TYPE_RAW); + rc = perf_pmu_register(&cpumf_pmu, "cpum_cf", -1); if (rc) pr_err("Registering the cpum_cf PMU failed with rc=%i\n", rc); return rc; --- linux-riscv-5.4.0.orig/arch/s390/kernel/perf_cpum_cf_diag.c +++ linux-riscv-5.4.0/arch/s390/kernel/perf_cpum_cf_diag.c @@ -243,13 +243,13 @@ int err = -ENOENT; debug_sprintf_event(cf_diag_dbg, 5, - "%s event %p cpu %d config %#llx " + "%s event %p cpu %d config %#llx type:%u " "sample_type %#llx cf_diag_events %d\n", __func__, - event, event->cpu, attr->config, attr->sample_type, - atomic_read(&cf_diag_events)); + event, event->cpu, attr->config, event->pmu->type, + attr->sample_type, atomic_read(&cf_diag_events)); if (event->attr.config != PERF_EVENT_CPUM_CF_DIAG || - event->attr.type != PERF_TYPE_RAW) + event->attr.type != event->pmu->type) goto out; /* Raw events are used to access counters directly, @@ -693,7 +693,7 @@ } debug_register_view(cf_diag_dbg, &debug_sprintf_view); - rc = perf_pmu_register(&cf_diag, "cpum_cf_diag", PERF_TYPE_RAW); + rc = perf_pmu_register(&cf_diag, "cpum_cf_diag", -1); if (rc) { debug_unregister_view(cf_diag_dbg, &debug_sprintf_view); debug_unregister(cf_diag_dbg); --- linux-riscv-5.4.0.orig/arch/s390/kernel/perf_cpum_sf.c +++ linux-riscv-5.4.0/arch/s390/kernel/perf_cpum_sf.c @@ -193,7 +193,7 @@ unsigned long num_sdb, gfp_t gfp_flags) { int i, rc; - unsigned long *new, *tail; + unsigned long *new, *tail, *tail_prev = NULL; if (!sfb->sdbt || !sfb->tail) return -EINVAL; @@ -232,6 +232,7 @@ sfb->num_sdbt++; /* Link current page to tail of chain */ *tail = (unsigned long)(void *) new + 1; + tail_prev = tail; tail = new; } @@ -241,10 +242,22 @@ * issue, a new realloc call (if required) might succeed. */ rc = alloc_sample_data_block(tail, gfp_flags); - if (rc) + if (rc) { + /* Undo last SDBT. An SDBT with no SDB at its first + * entry but with an SDBT entry instead can not be + * handled by the interrupt handler code. + * Avoid this situation. + */ + if (tail_prev) { + sfb->num_sdbt--; + free_page((unsigned long) new); + tail = tail_prev; + } break; + } sfb->num_sdb++; tail++; + tail_prev = new = NULL; /* Allocated at least one SBD */ } /* Link sampling buffer to its origin */ @@ -1300,18 +1313,28 @@ */ if (flush_all && done) break; - - /* If an event overflow happened, discard samples by - * processing any remaining sample-data-blocks. - */ - if (event_overflow) - flush_all = 1; } /* Account sample overflows in the event hardware structure */ if (sampl_overflow) OVERFLOW_REG(hwc) = DIV_ROUND_UP(OVERFLOW_REG(hwc) + sampl_overflow, 1 + num_sdb); + + /* Perf_event_overflow() and perf_event_account_interrupt() limit + * the interrupt rate to an upper limit. Roughly 1000 samples per + * task tick. + * Hitting this limit results in a large number + * of throttled REF_REPORT_THROTTLE entries and the samples + * are dropped. + * Slightly increase the interval to avoid hitting this limit. + */ + if (event_overflow) { + SAMPL_RATE(hwc) += DIV_ROUND_UP(SAMPL_RATE(hwc), 10); + debug_sprintf_event(sfdbg, 1, "%s: rate adjustment %ld\n", + __func__, + DIV_ROUND_UP(SAMPL_RATE(hwc), 10)); + } + if (sampl_overflow || event_overflow) debug_sprintf_event(sfdbg, 4, "hw_perf_event_update: " "overflow stats: sample=%llu event=%llu\n", --- linux-riscv-5.4.0.orig/arch/s390/kernel/perf_event.c +++ linux-riscv-5.4.0/arch/s390/kernel/perf_event.c @@ -224,9 +224,13 @@ struct pt_regs *regs) { struct unwind_state state; + unsigned long addr; - unwind_for_each_frame(&state, current, regs, 0) - perf_callchain_store(entry, state.ip); + unwind_for_each_frame(&state, current, regs, 0) { + addr = unwind_get_return_address(&state); + if (!addr || perf_callchain_store(entry, addr)) + return; + } } /* Perf definitions for PMU event attributes in sysfs */ --- linux-riscv-5.4.0.orig/arch/s390/kernel/pgm_check.S +++ linux-riscv-5.4.0/arch/s390/kernel/pgm_check.S @@ -78,8 +78,8 @@ PGM_CHECK(do_dat_exception) /* 3a */ PGM_CHECK(do_dat_exception) /* 3b */ PGM_CHECK_DEFAULT /* 3c */ -PGM_CHECK_DEFAULT /* 3d */ -PGM_CHECK_DEFAULT /* 3e */ +PGM_CHECK(do_secure_storage_access) /* 3d */ +PGM_CHECK(do_non_secure_storage_access) /* 3e */ PGM_CHECK_DEFAULT /* 3f */ PGM_CHECK_DEFAULT /* 40 */ PGM_CHECK_DEFAULT /* 41 */ --- linux-riscv-5.4.0.orig/arch/s390/kernel/setup.c +++ linux-riscv-5.4.0/arch/s390/kernel/setup.c @@ -49,6 +49,7 @@ #include #include #include +#include #include #include @@ -92,10 +93,6 @@ unsigned long int_hwcap = 0; -#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST -int __bootdata_preserved(prot_virt_guest); -#endif - int __bootdata(noexec_disabled); int __bootdata(memory_end_set); unsigned long __bootdata(memory_end); @@ -111,6 +108,8 @@ unsigned long __bootdata_preserved(__sdma); unsigned long __bootdata_preserved(__edma); unsigned long __bootdata_preserved(__kaslr_offset); +unsigned int __bootdata_preserved(zlib_dfltcc_support); +EXPORT_SYMBOL(zlib_dfltcc_support); unsigned long VMALLOC_START; EXPORT_SYMBOL(VMALLOC_START); @@ -571,6 +570,9 @@ vmax = _REGION1_SIZE; /* 4-level kernel page table */ } + if (is_prot_virt_host()) + adjust_to_uv_max(&vmax); + /* module area is at the end of the kernel address space. */ MODULES_END = vmax; MODULES_VADDR = MODULES_END - MODULES_LEN; @@ -1059,7 +1061,7 @@ if (!early_ipl_comp_list_addr) return; - if (ipl_block.hdr.flags & IPL_PL_FLAG_IPLSR) + if (ipl_block.hdr.flags & IPL_PL_FLAG_SIPL) pr_info("Linux is running with Secure-IPL enabled\n"); else pr_info("Linux is running with Secure-IPL disabled\n"); @@ -1103,6 +1105,12 @@ log_component_list(); +#ifdef CONFIG_LOCK_DOWN_IN_SECURE_BOOT + if (ipl_get_secureboot()) + security_lock_kernel_down("Secure IPL", + LOCKDOWN_INTEGRITY_MAX); +#endif + /* Have one command line that is parsed and saved in /proc/cmdline */ /* boot_command_line has been already set up in early.c */ *cmdline_p = boot_command_line; @@ -1151,6 +1159,8 @@ */ memblock_trim_memory(1UL << (MAX_ORDER - 1 + PAGE_SHIFT)); + if (is_prot_virt_host()) + setup_uv(); setup_memory_end(); setup_memory(); dma_contiguous_reserve(memory_end); --- linux-riscv-5.4.0.orig/arch/s390/kernel/smp.c +++ linux-riscv-5.4.0/arch/s390/kernel/smp.c @@ -262,10 +262,13 @@ lc->spinlock_index = 0; lc->percpu_offset = __per_cpu_offset[cpu]; lc->kernel_asce = S390_lowcore.kernel_asce; + lc->user_asce = S390_lowcore.kernel_asce; lc->machine_flags = S390_lowcore.machine_flags; lc->user_timer = lc->system_timer = lc->steal_timer = lc->avg_steal_timer = 0; __ctl_store(lc->cregs_save_area, 0, 15); + lc->cregs_save_area[1] = lc->kernel_asce; + lc->cregs_save_area[7] = lc->vdso_asce; save_access_regs((unsigned int *) lc->access_regs_save_area); memcpy(lc->stfle_fac_list, S390_lowcore.stfle_fac_list, sizeof(lc->stfle_fac_list)); @@ -724,39 +727,67 @@ static int smp_add_present_cpu(int cpu); -static int __smp_rescan_cpus(struct sclp_core_info *info, int sysfs_add) +static int smp_add_core(struct sclp_core_entry *core, cpumask_t *avail, + bool configured, bool early) { struct pcpu *pcpu; - cpumask_t avail; - int cpu, nr, i, j; + int cpu, nr, i; u16 address; nr = 0; - cpumask_xor(&avail, cpu_possible_mask, cpu_present_mask); - cpu = cpumask_first(&avail); - for (i = 0; (i < info->combined) && (cpu < nr_cpu_ids); i++) { - if (sclp.has_core_type && info->core[i].type != boot_core_type) + if (sclp.has_core_type && core->type != boot_core_type) + return nr; + cpu = cpumask_first(avail); + address = core->core_id << smp_cpu_mt_shift; + for (i = 0; (i <= smp_cpu_mtid) && (cpu < nr_cpu_ids); i++) { + if (pcpu_find_address(cpu_present_mask, address + i)) continue; - address = info->core[i].core_id << smp_cpu_mt_shift; - for (j = 0; j <= smp_cpu_mtid; j++) { - if (pcpu_find_address(cpu_present_mask, address + j)) - continue; - pcpu = pcpu_devices + cpu; - pcpu->address = address + j; - pcpu->state = - (cpu >= info->configured*(smp_cpu_mtid + 1)) ? - CPU_STATE_STANDBY : CPU_STATE_CONFIGURED; - smp_cpu_set_polarization(cpu, POLARIZATION_UNKNOWN); - set_cpu_present(cpu, true); - if (sysfs_add && smp_add_present_cpu(cpu) != 0) - set_cpu_present(cpu, false); - else - nr++; - cpu = cpumask_next(cpu, &avail); - if (cpu >= nr_cpu_ids) + pcpu = pcpu_devices + cpu; + pcpu->address = address + i; + if (configured) + pcpu->state = CPU_STATE_CONFIGURED; + else + pcpu->state = CPU_STATE_STANDBY; + smp_cpu_set_polarization(cpu, POLARIZATION_UNKNOWN); + set_cpu_present(cpu, true); + if (!early && smp_add_present_cpu(cpu) != 0) + set_cpu_present(cpu, false); + else + nr++; + cpumask_clear_cpu(cpu, avail); + cpu = cpumask_next(cpu, avail); + } + return nr; +} + +static int __smp_rescan_cpus(struct sclp_core_info *info, bool early) +{ + struct sclp_core_entry *core; + cpumask_t avail; + bool configured; + u16 core_id; + int nr, i; + + nr = 0; + cpumask_xor(&avail, cpu_possible_mask, cpu_present_mask); + /* + * Add IPL core first (which got logical CPU number 0) to make sure + * that all SMT threads get subsequent logical CPU numbers. + */ + if (early) { + core_id = pcpu_devices[0].address >> smp_cpu_mt_shift; + for (i = 0; i < info->configured; i++) { + core = &info->core[i]; + if (core->core_id == core_id) { + nr += smp_add_core(core, &avail, true, early); break; + } } } + for (i = 0; i < info->combined; i++) { + configured = i < info->configured; + nr += smp_add_core(&info->core[i], &avail, configured, early); + } return nr; } @@ -805,7 +836,7 @@ /* Add CPUs present at boot */ get_online_cpus(); - __smp_rescan_cpus(info, 0); + __smp_rescan_cpus(info, true); put_online_cpus(); memblock_free_early((unsigned long)info, sizeof(*info)); } @@ -816,6 +847,8 @@ S390_lowcore.last_update_clock = get_tod_clock(); restore_access_regs(S390_lowcore.access_regs_save_area); + set_cpu_flag(CIF_ASCE_PRIMARY); + set_cpu_flag(CIF_ASCE_SECONDARY); cpu_init(); preempt_disable(); init_cpu_timer(); @@ -1148,7 +1181,7 @@ smp_get_core_info(info, 0); get_online_cpus(); mutex_lock(&smp_cpu_state_mutex); - nr = __smp_rescan_cpus(info, 1); + nr = __smp_rescan_cpus(info, false); mutex_unlock(&smp_cpu_state_mutex); put_online_cpus(); kfree(info); --- linux-riscv-5.4.0.orig/arch/s390/kernel/topology.c +++ linux-riscv-5.4.0/arch/s390/kernel/topology.c @@ -65,6 +65,13 @@ cpumask_t cpus_with_topology; +int __cpu_to_node(int cpu) +{ + return cpu_topology[cpu].node_id; +} + +EXPORT_SYMBOL(__cpu_to_node); + static cpumask_t cpu_group_map(struct mask_info *info, unsigned int cpu) { cpumask_t mask; --- linux-riscv-5.4.0.orig/arch/s390/kernel/unwind_bc.c +++ linux-riscv-5.4.0/arch/s390/kernel/unwind_bc.c @@ -60,6 +60,11 @@ ip = READ_ONCE_NOCHECK(sf->gprs[8]); reliable = false; regs = NULL; + if (!__kernel_text_address(ip)) { + /* skip bogus %r14 */ + state->regs = NULL; + return unwind_next_frame(state); + } } else { sf = (struct stack_frame *) state->sp; sp = READ_ONCE_NOCHECK(sf->back_chain); --- linux-riscv-5.4.0.orig/arch/s390/kernel/uv.c +++ linux-riscv-5.4.0/arch/s390/kernel/uv.c @@ -0,0 +1,414 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Common Ultravisor functions and initialization + * + * Copyright IBM Corp. 2019, 2020 + */ +#define KMSG_COMPONENT "prot_virt" +#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */ +#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST +int __bootdata_preserved(prot_virt_guest); +#endif + +#if IS_ENABLED(CONFIG_KVM) +int prot_virt_host; +EXPORT_SYMBOL(prot_virt_host); +struct uv_info __bootdata_preserved(uv_info); +EXPORT_SYMBOL(uv_info); + +static int __init prot_virt_setup(char *val) +{ + bool enabled; + int rc; + + rc = kstrtobool(val, &enabled); + if (!rc && enabled) + prot_virt_host = 1; + + if (is_prot_virt_guest() && prot_virt_host) { + prot_virt_host = 0; + pr_warn("Protected virtualization not available in protected guests."); + } + + if (prot_virt_host && !test_facility(158)) { + prot_virt_host = 0; + pr_warn("Protected virtualization not supported by the hardware."); + } + + return rc; +} +early_param("prot_virt", prot_virt_setup); + +static int __init uv_init(unsigned long stor_base, unsigned long stor_len) +{ + struct uv_cb_init uvcb = { + .header.cmd = UVC_CMD_INIT_UV, + .header.len = sizeof(uvcb), + .stor_origin = stor_base, + .stor_len = stor_len, + }; + + if (uv_call(0, (uint64_t)&uvcb)) { + pr_err("Ultravisor init failed with rc: 0x%x rrc: 0%x\n", + uvcb.header.rc, uvcb.header.rrc); + return -1; + } + return 0; +} + +void __init setup_uv(void) +{ + unsigned long uv_stor_base; + + uv_stor_base = (unsigned long)memblock_alloc_try_nid( + uv_info.uv_base_stor_len, SZ_1M, SZ_2G, + MEMBLOCK_ALLOC_ACCESSIBLE, NUMA_NO_NODE); + if (!uv_stor_base) { + pr_warn("Failed to reserve %lu bytes for ultravisor base storage\n", + uv_info.uv_base_stor_len); + goto fail; + } + + if (uv_init(uv_stor_base, uv_info.uv_base_stor_len)) { + memblock_free(uv_stor_base, uv_info.uv_base_stor_len); + goto fail; + } + + pr_info("Reserving %luMB as ultravisor base storage\n", + uv_info.uv_base_stor_len >> 20); + return; +fail: + pr_info("Disabling support for protected virtualization"); + prot_virt_host = 0; +} + +void adjust_to_uv_max(unsigned long *vmax) +{ + *vmax = min_t(unsigned long, *vmax, uv_info.max_sec_stor_addr); +} + +/* + * Requests the Ultravisor to pin the page in the shared state. This will + * cause an intercept when the guest attempts to unshare the pinned page. + */ +static int uv_pin_shared(unsigned long paddr) +{ + struct uv_cb_cfs uvcb = { + .header.cmd = UVC_CMD_PIN_PAGE_SHARED, + .header.len = sizeof(uvcb), + .paddr = paddr, + }; + + if (uv_call(0, (u64)&uvcb)) + return -EINVAL; + return 0; +} + +/* + * Requests the Ultravisor to encrypt a guest page and make it + * accessible to the host for paging (export). + * + * @paddr: Absolute host address of page to be exported + */ +int uv_convert_from_secure(unsigned long paddr) +{ + struct uv_cb_cfs uvcb = { + .header.cmd = UVC_CMD_CONV_FROM_SEC_STOR, + .header.len = sizeof(uvcb), + .paddr = paddr + }; + + if (uv_call(0, (u64)&uvcb)) + return -EINVAL; + return 0; +} + +/* + * Calculate the expected ref_count for a page that would otherwise have no + * further pins. This was cribbed from similar functions in other places in + * the kernel, but with some slight modifications. We know that a secure + * page can not be a huge page for example. + */ +static int expected_page_refs(struct page *page) +{ + int res; + + res = page_mapcount(page); + if (PageSwapCache(page)) { + res++; + } else if (page_mapping(page)) { + res++; + if (page_has_private(page)) + res++; + } + return res; +} + +static int make_secure_pte(pte_t *ptep, unsigned long addr, + struct page *exp_page, struct uv_cb_header *uvcb) +{ + pte_t entry = READ_ONCE(*ptep); + struct page *page; + int expected, rc = 0; + + if (!pte_present(entry)) + return -ENXIO; + if (pte_val(entry) & _PAGE_INVALID) + return -ENXIO; + + page = pte_page(entry); + if (page != exp_page) + return -ENXIO; + if (PageWriteback(page)) + return -EAGAIN; + expected = expected_page_refs(page); + if (!page_ref_freeze(page, expected)) + return -EBUSY; + set_bit(PG_arch_1, &page->flags); + rc = uv_call(0, (u64)uvcb); + page_ref_unfreeze(page, expected); + /* Return -ENXIO if the page was not mapped, -EINVAL otherwise */ + if (rc) + rc = uvcb->rc == 0x10a ? -ENXIO : -EINVAL; + return rc; +} + +/* + * Requests the Ultravisor to make a page accessible to a guest. + * If it's brought in the first time, it will be cleared. If + * it has been exported before, it will be decrypted and integrity + * checked. + */ +int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb) +{ + struct vm_area_struct *vma; + bool local_drain = false; + spinlock_t *ptelock; + unsigned long uaddr; + struct page *page; + pte_t *ptep; + int rc; + +again: + rc = -EFAULT; + down_read(&gmap->mm->mmap_sem); + + uaddr = __gmap_translate(gmap, gaddr); + if (IS_ERR_VALUE(uaddr)) + goto out; + vma = find_vma(gmap->mm, uaddr); + if (!vma) + goto out; + /* + * Secure pages cannot be huge and userspace should not combine both. + * In case userspace does it anyway this will result in an -EFAULT for + * the unpack. The guest is thus never reaching secure mode. If + * userspace is playing dirty tricky with mapping huge pages later + * on this will result in a segmentation fault. + */ + if (is_vm_hugetlb_page(vma)) + goto out; + + rc = -ENXIO; + page = follow_page(vma, uaddr, FOLL_WRITE); + if (IS_ERR_OR_NULL(page)) + goto out; + + lock_page(page); + ptep = get_locked_pte(gmap->mm, uaddr, &ptelock); + rc = make_secure_pte(ptep, uaddr, page, uvcb); + pte_unmap_unlock(ptep, ptelock); + unlock_page(page); +out: + up_read(&gmap->mm->mmap_sem); + + if (rc == -EAGAIN) { + wait_on_page_writeback(page); + } else if (rc == -EBUSY) { + /* + * If we have tried a local drain and the page refcount + * still does not match our expected safe value, try with a + * system wide drain. This is needed if the pagevecs holding + * the page are on a different CPU. + */ + if (local_drain) { + lru_add_drain_all(); + /* We give up here, and let the caller try again */ + return -EAGAIN; + } + /* + * We are here if the page refcount does not match the + * expected safe value. The main culprits are usually + * pagevecs. With lru_add_drain() we drain the pagevecs + * on the local CPU so that hopefully the refcount will + * reach the expected safe value. + */ + lru_add_drain(); + local_drain = true; + /* And now we try again immediately after draining */ + goto again; + } else if (rc == -ENXIO) { + if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE)) + return -EFAULT; + return -EAGAIN; + } + return rc; +} +EXPORT_SYMBOL_GPL(gmap_make_secure); + +int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr) +{ + struct uv_cb_cts uvcb = { + .header.cmd = UVC_CMD_CONV_TO_SEC_STOR, + .header.len = sizeof(uvcb), + .guest_handle = gmap->guest_handle, + .gaddr = gaddr, + }; + + return gmap_make_secure(gmap, gaddr, &uvcb); +} +EXPORT_SYMBOL_GPL(gmap_convert_to_secure); + +/* + * To be called with the page locked or with an extra reference! This will + * prevent gmap_make_secure from touching the page concurrently. Having 2 + * parallel make_page_accessible is fine, as the UV calls will become a + * no-op if the page is already exported. + */ +int arch_make_page_accessible(struct page *page) +{ + int rc = 0; + + /* Hugepage cannot be protected, so nothing to do */ + if (PageHuge(page)) + return 0; + + /* + * PG_arch_1 is used in 3 places: + * 1. for kernel page tables during early boot + * 2. for storage keys of huge pages and KVM + * 3. As an indication that this page might be secure. This can + * overindicate, e.g. we set the bit before calling + * convert_to_secure. + * As secure pages are never huge, all 3 variants can co-exists. + */ + if (!test_bit(PG_arch_1, &page->flags)) + return 0; + + rc = uv_pin_shared(page_to_phys(page)); + if (!rc) { + clear_bit(PG_arch_1, &page->flags); + return 0; + } + + rc = uv_convert_from_secure(page_to_phys(page)); + if (!rc) { + clear_bit(PG_arch_1, &page->flags); + return 0; + } + + return rc; +} +EXPORT_SYMBOL_GPL(arch_make_page_accessible); + +#endif + +#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM) +static ssize_t uv_query_facilities(struct kobject *kobj, + struct kobj_attribute *attr, char *page) +{ + return snprintf(page, PAGE_SIZE, "%lx\n%lx\n%lx\n%lx\n", + uv_info.inst_calls_list[0], + uv_info.inst_calls_list[1], + uv_info.inst_calls_list[2], + uv_info.inst_calls_list[3]); +} + +static struct kobj_attribute uv_query_facilities_attr = + __ATTR(facilities, 0444, uv_query_facilities, NULL); + +static ssize_t uv_query_max_guest_cpus(struct kobject *kobj, + struct kobj_attribute *attr, char *page) +{ + return snprintf(page, PAGE_SIZE, "%d\n", + uv_info.max_guest_cpus); +} + +static struct kobj_attribute uv_query_max_guest_cpus_attr = + __ATTR(max_cpus, 0444, uv_query_max_guest_cpus, NULL); + +static ssize_t uv_query_max_guest_vms(struct kobject *kobj, + struct kobj_attribute *attr, char *page) +{ + return snprintf(page, PAGE_SIZE, "%d\n", + uv_info.max_num_sec_conf); +} + +static struct kobj_attribute uv_query_max_guest_vms_attr = + __ATTR(max_guests, 0444, uv_query_max_guest_vms, NULL); + +static ssize_t uv_query_max_guest_addr(struct kobject *kobj, + struct kobj_attribute *attr, char *page) +{ + return snprintf(page, PAGE_SIZE, "%lx\n", + uv_info.max_sec_stor_addr); +} + +static struct kobj_attribute uv_query_max_guest_addr_attr = + __ATTR(max_address, 0444, uv_query_max_guest_addr, NULL); + +static struct attribute *uv_query_attrs[] = { + &uv_query_facilities_attr.attr, + &uv_query_max_guest_cpus_attr.attr, + &uv_query_max_guest_vms_attr.attr, + &uv_query_max_guest_addr_attr.attr, + NULL, +}; + +static struct attribute_group uv_query_attr_group = { + .attrs = uv_query_attrs, +}; + +static struct kset *uv_query_kset; +static struct kobject *uv_kobj; + +static int __init uv_info_init(void) +{ + int rc = -ENOMEM; + + if (!test_facility(158)) + return 0; + + uv_kobj = kobject_create_and_add("uv", firmware_kobj); + if (!uv_kobj) + return -ENOMEM; + + uv_query_kset = kset_create_and_add("query", NULL, uv_kobj); + if (!uv_query_kset) + goto out_kobj; + + rc = sysfs_create_group(&uv_query_kset->kobj, &uv_query_attr_group); + if (!rc) + return 0; + + kset_unregister(uv_query_kset); +out_kobj: + kobject_del(uv_kobj); + kobject_put(uv_kobj); + return rc; +} +device_initcall(uv_info_init); +#endif --- linux-riscv-5.4.0.orig/arch/s390/kvm/Makefile +++ linux-riscv-5.4.0/arch/s390/kvm/Makefile @@ -9,6 +9,6 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o -kvm-objs += diag.o gaccess.o guestdbg.o vsie.o +kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o obj-$(CONFIG_KVM) += kvm.o --- linux-riscv-5.4.0.orig/arch/s390/kvm/diag.c +++ linux-riscv-5.4.0/arch/s390/kvm/diag.c @@ -2,7 +2,7 @@ /* * handling diagnose instructions * - * Copyright IBM Corp. 2008, 2011 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte * Christian Borntraeger @@ -187,6 +187,10 @@ return -EOPNOTSUPP; } + /* + * no need to check the return value of vcpu_stop as it can only have + * an error for protvirt, but protvirt means user cpu state + */ if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) kvm_s390_vcpu_stop(vcpu); vcpu->run->s390_reset_flags |= KVM_S390_RESET_SUBSYSTEM; --- linux-riscv-5.4.0.orig/arch/s390/kvm/intercept.c +++ linux-riscv-5.4.0/arch/s390/kvm/intercept.c @@ -2,7 +2,7 @@ /* * in-kernel handling for sie intercepts * - * Copyright IBM Corp. 2008, 2014 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte * Christian Borntraeger @@ -16,6 +16,7 @@ #include #include #include +#include #include "kvm-s390.h" #include "gaccess.h" @@ -79,6 +80,10 @@ return rc; } + /* + * no need to check the return value of vcpu_stop as it can only have + * an error for protvirt, but protvirt means user cpu state + */ if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) kvm_s390_vcpu_stop(vcpu); return -EOPNOTSUPP; @@ -231,6 +236,13 @@ vcpu->stat.exit_program_interruption++; + /* + * Intercept 8 indicates a loop of specification exceptions + * for protected guests. + */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) + return -EOPNOTSUPP; + if (guestdbg_enabled(vcpu) && per_event(vcpu)) { rc = kvm_s390_handle_per_event(vcpu); if (rc) @@ -384,7 +396,7 @@ goto out; } - if (addr & ~PAGE_MASK) + if (!kvm_s390_pv_cpu_is_protected(vcpu) && (addr & ~PAGE_MASK)) return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); sctns = (void *)get_zeroed_page(GFP_KERNEL); @@ -395,10 +407,15 @@ out: if (!cc) { - r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE); - if (r) { - free_page((unsigned long)sctns); - return kvm_s390_inject_prog_cond(vcpu, r); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + memcpy((void *)(sida_origin(vcpu->arch.sie_block)), + sctns, PAGE_SIZE); + } else { + r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE); + if (r) { + free_page((unsigned long)sctns); + return kvm_s390_inject_prog_cond(vcpu, r); + } } } @@ -444,6 +461,77 @@ return kvm_s390_inject_program_int(vcpu, PGM_OPERATION); } +static int handle_pv_spx(struct kvm_vcpu *vcpu) +{ + u32 pref = *(u32 *)vcpu->arch.sie_block->sidad; + + kvm_s390_set_prefix(vcpu, pref); + trace_kvm_s390_handle_prefix(vcpu, 1, pref); + return 0; +} + +static int handle_pv_sclp(struct kvm_vcpu *vcpu) +{ + struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int; + + spin_lock(&fi->lock); + /* + * 2 cases: + * a: an sccb answering interrupt was already pending or in flight. + * As the sccb value is not known we can simply set some value to + * trigger delivery of a saved SCCB. UV will then use its saved + * copy of the SCCB value. + * b: an error SCCB interrupt needs to be injected so we also inject + * a fake SCCB address. Firmware will use the proper one. + * This makes sure, that both errors and real sccb returns will only + * be delivered after a notification intercept (instruction has + * finished) but not after others. + */ + fi->srv_signal.ext_params |= 0x43000; + set_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs); + clear_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs); + spin_unlock(&fi->lock); + return 0; +} + +static int handle_pv_uvc(struct kvm_vcpu *vcpu) +{ + struct uv_cb_share *guest_uvcb = (void *)vcpu->arch.sie_block->sidad; + struct uv_cb_cts uvcb = { + .header.cmd = UVC_CMD_UNPIN_PAGE_SHARED, + .header.len = sizeof(uvcb), + .guest_handle = kvm_s390_pv_get_handle(vcpu->kvm), + .gaddr = guest_uvcb->paddr, + }; + int rc; + + if (guest_uvcb->header.cmd != UVC_CMD_REMOVE_SHARED_ACCESS) { + WARN_ONCE(1, "Unexpected notification intercept for UVC 0x%x\n", + guest_uvcb->header.cmd); + return 0; + } + rc = gmap_make_secure(vcpu->arch.gmap, uvcb.gaddr, &uvcb); + /* + * If the unpin did not succeed, the guest will exit again for the UVC + * and we will retry the unpin. + */ + if (rc == -EINVAL) + return 0; + return rc; +} + +static int handle_pv_notification(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.sie_block->ipa == 0xb210) + return handle_pv_spx(vcpu); + if (vcpu->arch.sie_block->ipa == 0xb220) + return handle_pv_sclp(vcpu); + if (vcpu->arch.sie_block->ipa == 0xb9a4) + return handle_pv_uvc(vcpu); + + return handle_instruction(vcpu); +} + int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu) { int rc, per_rc = 0; @@ -480,6 +568,28 @@ case ICPT_KSS: rc = kvm_s390_skey_check_enable(vcpu); break; + case ICPT_MCHKREQ: + case ICPT_INT_ENABLE: + /* + * PSW bit 13 or a CR (0, 6, 14) changed and we might + * now be able to deliver interrupts. The pre-run code + * will take care of this. + */ + rc = 0; + break; + case ICPT_PV_INSTR: + rc = handle_instruction(vcpu); + break; + case ICPT_PV_NOTIFY: + rc = handle_pv_notification(vcpu); + break; + case ICPT_PV_PREF: + rc = 0; + gmap_convert_to_secure(vcpu->arch.gmap, + kvm_s390_get_prefix(vcpu)); + gmap_convert_to_secure(vcpu->arch.gmap, + kvm_s390_get_prefix(vcpu) + PAGE_SIZE); + break; default: return -EOPNOTSUPP; } --- linux-riscv-5.4.0.orig/arch/s390/kvm/interrupt.c +++ linux-riscv-5.4.0/arch/s390/kvm/interrupt.c @@ -2,7 +2,7 @@ /* * handling kvm guest interrupts * - * Copyright IBM Corp. 2008, 2015 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte */ @@ -324,8 +324,11 @@ static inline unsigned long pending_irqs_no_gisa(struct kvm_vcpu *vcpu) { - return vcpu->kvm->arch.float_int.pending_irqs | - vcpu->arch.local_int.pending_irqs; + unsigned long pending = vcpu->kvm->arch.float_int.pending_irqs | + vcpu->arch.local_int.pending_irqs; + + pending &= ~vcpu->kvm->arch.float_int.masked_irqs; + return pending; } static inline unsigned long pending_irqs(struct kvm_vcpu *vcpu) @@ -383,10 +386,18 @@ __clear_bit(IRQ_PEND_EXT_CLOCK_COMP, &active_mask); if (!(vcpu->arch.sie_block->gcr[0] & CR0_CPU_TIMER_SUBMASK)) __clear_bit(IRQ_PEND_EXT_CPU_TIMER, &active_mask); - if (!(vcpu->arch.sie_block->gcr[0] & CR0_SERVICE_SIGNAL_SUBMASK)) + if (!(vcpu->arch.sie_block->gcr[0] & CR0_SERVICE_SIGNAL_SUBMASK)) { __clear_bit(IRQ_PEND_EXT_SERVICE, &active_mask); + __clear_bit(IRQ_PEND_EXT_SERVICE_EV, &active_mask); + } if (psw_mchk_disabled(vcpu)) active_mask &= ~IRQ_PEND_MCHK_MASK; + /* PV guest cpus can have a single interruption injected at a time. */ + if (kvm_s390_pv_cpu_is_protected(vcpu) && + vcpu->arch.sie_block->iictl != IICTL_CODE_NONE) + active_mask &= ~(IRQ_PEND_EXT_II_MASK | + IRQ_PEND_IO_MASK | + IRQ_PEND_MCHK_MASK); /* * Check both floating and local interrupt's cr14 because * bit IRQ_PEND_MCHK_REP could be set in both cases. @@ -479,19 +490,23 @@ static int __must_check __deliver_cpu_timer(struct kvm_vcpu *vcpu) { struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; - int rc; + int rc = 0; vcpu->stat.deliver_cputm++; trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CPU_TIMER, 0, 0); - - rc = put_guest_lc(vcpu, EXT_IRQ_CPU_TIMER, - (u16 *)__LC_EXT_INT_CODE); - rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); - rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); - rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_EXT; + vcpu->arch.sie_block->eic = EXT_IRQ_CPU_TIMER; + } else { + rc = put_guest_lc(vcpu, EXT_IRQ_CPU_TIMER, + (u16 *)__LC_EXT_INT_CODE); + rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); + rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + } clear_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs); return rc ? -EFAULT : 0; } @@ -499,19 +514,23 @@ static int __must_check __deliver_ckc(struct kvm_vcpu *vcpu) { struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; - int rc; + int rc = 0; vcpu->stat.deliver_ckc++; trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_CLOCK_COMP, 0, 0); - - rc = put_guest_lc(vcpu, EXT_IRQ_CLK_COMP, - (u16 __user *)__LC_EXT_INT_CODE); - rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); - rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); - rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_EXT; + vcpu->arch.sie_block->eic = EXT_IRQ_CLK_COMP; + } else { + rc = put_guest_lc(vcpu, EXT_IRQ_CLK_COMP, + (u16 __user *)__LC_EXT_INT_CODE); + rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); + rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + } clear_bit(IRQ_PEND_EXT_CLOCK_COMP, &li->pending_irqs); return rc ? -EFAULT : 0; } @@ -553,6 +572,20 @@ union mci mci; int rc; + /* + * All other possible payload for a machine check (e.g. the register + * contents in the save area) will be handled by the ultravisor, as + * the hypervisor does not not have the needed information for + * protected guests. + */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_MCHK; + vcpu->arch.sie_block->mcic = mchk->mcic; + vcpu->arch.sie_block->faddr = mchk->failing_storage_address; + vcpu->arch.sie_block->edc = mchk->ext_damage_code; + return 0; + } + mci.val = mchk->mcic; /* take care of lazy register loading */ save_fpu_regs(); @@ -696,17 +729,21 @@ static int __must_check __deliver_restart(struct kvm_vcpu *vcpu) { struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; - int rc; + int rc = 0; VCPU_EVENT(vcpu, 3, "%s", "deliver: cpu restart"); vcpu->stat.deliver_restart_signal++; trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_RESTART, 0, 0); - rc = write_guest_lc(vcpu, - offsetof(struct lowcore, restart_old_psw), - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); - rc |= read_guest_lc(vcpu, offsetof(struct lowcore, restart_psw), - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_RESTART; + } else { + rc = write_guest_lc(vcpu, + offsetof(struct lowcore, restart_old_psw), + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + rc |= read_guest_lc(vcpu, offsetof(struct lowcore, restart_psw), + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + } clear_bit(IRQ_PEND_RESTART, &li->pending_irqs); return rc ? -EFAULT : 0; } @@ -748,6 +785,12 @@ vcpu->stat.deliver_emergency_signal++; trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_EMERGENCY, cpu_addr, 0); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_EXT; + vcpu->arch.sie_block->eic = EXT_IRQ_EMERGENCY_SIG; + vcpu->arch.sie_block->extcpuaddr = cpu_addr; + return 0; + } rc = put_guest_lc(vcpu, EXT_IRQ_EMERGENCY_SIG, (u16 *)__LC_EXT_INT_CODE); @@ -776,6 +819,12 @@ trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_EXTERNAL_CALL, extcall.code, 0); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_EXT; + vcpu->arch.sie_block->eic = EXT_IRQ_EXTERNAL_CALL; + vcpu->arch.sie_block->extcpuaddr = extcall.code; + return 0; + } rc = put_guest_lc(vcpu, EXT_IRQ_EXTERNAL_CALL, (u16 *)__LC_EXT_INT_CODE); @@ -787,6 +836,21 @@ return rc ? -EFAULT : 0; } +static int __deliver_prog_pv(struct kvm_vcpu *vcpu, u16 code) +{ + switch (code) { + case PGM_SPECIFICATION: + vcpu->arch.sie_block->iictl = IICTL_CODE_SPECIFICATION; + break; + case PGM_OPERAND: + vcpu->arch.sie_block->iictl = IICTL_CODE_OPERAND; + break; + default: + return -EINVAL; + } + return 0; +} + static int __must_check __deliver_prog(struct kvm_vcpu *vcpu) { struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int; @@ -807,6 +871,10 @@ trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_PROGRAM_INT, pgm_info.code, 0); + /* PER is handled by the ultravisor */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) + return __deliver_prog_pv(vcpu, pgm_info.code & ~PGM_PER); + switch (pgm_info.code & ~PGM_PER) { case PGM_AFX_TRANSLATION: case PGM_ASX_TRANSLATION: @@ -902,20 +970,49 @@ return rc ? -EFAULT : 0; } +#define SCCB_MASK 0xFFFFFFF8 +#define SCCB_EVENT_PENDING 0x3 + +static int write_sclp(struct kvm_vcpu *vcpu, u32 parm) +{ + int rc; + + if (kvm_s390_pv_cpu_get_handle(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_EXT; + vcpu->arch.sie_block->eic = EXT_IRQ_SERVICE_SIG; + vcpu->arch.sie_block->eiparams = parm; + return 0; + } + + rc = put_guest_lc(vcpu, EXT_IRQ_SERVICE_SIG, (u16 *)__LC_EXT_INT_CODE); + rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); + rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, + &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); + rc |= put_guest_lc(vcpu, parm, + (u32 *)__LC_EXT_PARAMS); + + return rc ? -EFAULT : 0; +} + static int __must_check __deliver_service(struct kvm_vcpu *vcpu) { struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int; struct kvm_s390_ext_info ext; - int rc = 0; spin_lock(&fi->lock); - if (!(test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs))) { + if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs) || + !(test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs))) { spin_unlock(&fi->lock); return 0; } ext = fi->srv_signal; memset(&fi->srv_signal, 0, sizeof(ext)); clear_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs); + clear_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs); + if (kvm_s390_pv_cpu_is_protected(vcpu)) + set_bit(IRQ_PEND_EXT_SERVICE, &fi->masked_irqs); spin_unlock(&fi->lock); VCPU_EVENT(vcpu, 4, "deliver: sclp parameter 0x%x", @@ -924,16 +1021,31 @@ trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_SERVICE, ext.ext_params, 0); - rc = put_guest_lc(vcpu, EXT_IRQ_SERVICE_SIG, (u16 *)__LC_EXT_INT_CODE); - rc |= put_guest_lc(vcpu, 0, (u16 *)__LC_EXT_CPU_ADDR); - rc |= write_guest_lc(vcpu, __LC_EXT_OLD_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); - rc |= read_guest_lc(vcpu, __LC_EXT_NEW_PSW, - &vcpu->arch.sie_block->gpsw, sizeof(psw_t)); - rc |= put_guest_lc(vcpu, ext.ext_params, - (u32 *)__LC_EXT_PARAMS); + return write_sclp(vcpu, ext.ext_params); +} - return rc ? -EFAULT : 0; +static int __must_check __deliver_service_ev(struct kvm_vcpu *vcpu) +{ + struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int; + struct kvm_s390_ext_info ext; + + spin_lock(&fi->lock); + if (!(test_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs))) { + spin_unlock(&fi->lock); + return 0; + } + ext = fi->srv_signal; + /* only clear the event bit */ + fi->srv_signal.ext_params &= ~SCCB_EVENT_PENDING; + clear_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs); + spin_unlock(&fi->lock); + + VCPU_EVENT(vcpu, 4, "%s", "deliver: sclp parameter event"); + vcpu->stat.deliver_service_signal++; + trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, KVM_S390_INT_SERVICE, + ext.ext_params, 0); + + return write_sclp(vcpu, SCCB_EVENT_PENDING); } static int __must_check __deliver_pfault_done(struct kvm_vcpu *vcpu) @@ -1028,6 +1140,15 @@ { int rc; + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->iictl = IICTL_CODE_IO; + vcpu->arch.sie_block->subchannel_id = io->subchannel_id; + vcpu->arch.sie_block->subchannel_nr = io->subchannel_nr; + vcpu->arch.sie_block->io_int_parm = io->io_int_parm; + vcpu->arch.sie_block->io_int_word = io->io_int_word; + return 0; + } + rc = put_guest_lc(vcpu, io->subchannel_id, (u16 *)__LC_SUBCHANNEL_ID); rc |= put_guest_lc(vcpu, io->subchannel_nr, (u16 *)__LC_SUBCHANNEL_NR); rc |= put_guest_lc(vcpu, io->io_int_parm, (u32 *)__LC_IO_INT_PARM); @@ -1329,6 +1450,9 @@ case IRQ_PEND_EXT_SERVICE: rc = __deliver_service(vcpu); break; + case IRQ_PEND_EXT_SERVICE_EV: + rc = __deliver_service_ev(vcpu); + break; case IRQ_PEND_PFAULT_DONE: rc = __deliver_pfault_done(vcpu); break; @@ -1421,7 +1545,7 @@ if (kvm_get_vcpu_by_id(vcpu->kvm, src_id) == NULL) return -EINVAL; - if (sclp.has_sigpif) + if (sclp.has_sigpif && !kvm_s390_pv_cpu_get_handle(vcpu)) return sca_inject_ext_call(vcpu, src_id); if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs)) @@ -1682,9 +1806,6 @@ return inti; } -#define SCCB_MASK 0xFFFFFFF8 -#define SCCB_EVENT_PENDING 0x3 - static int __inject_service(struct kvm *kvm, struct kvm_s390_interrupt_info *inti) { @@ -1693,6 +1814,11 @@ kvm->stat.inject_service_signal++; spin_lock(&fi->lock); fi->srv_signal.ext_params |= inti->ext.ext_params & SCCB_EVENT_PENDING; + + /* We always allow events, track them separately from the sccb ints */ + if (fi->srv_signal.ext_params & SCCB_EVENT_PENDING) + set_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs); + /* * Early versions of the QEMU s390 bios will inject several * service interrupts after another without handling a @@ -1774,7 +1900,14 @@ kvm->stat.inject_io++; isc = int_word_to_isc(inti->io.io_int_word); - if (gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) { + /* + * Do not make use of gisa in protected mode. We do not use the lock + * checking variant as this is just a performance optimization and we + * do not hold the lock here. This is ok as the code will pick + * interrupts from both "lists" for delivery. + */ + if (!kvm_s390_pv_get_handle(kvm) && + gi->origin && inti->type & KVM_S390_INT_IO_AI_MASK) { VM_EVENT(kvm, 4, "%s isc %1u", "inject: I/O (AI/gisa)", isc); gisa_set_ipm_gisc(gi->origin, isc); kfree(inti); @@ -1835,7 +1968,8 @@ break; case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX: if (!(type & KVM_S390_INT_IO_AI_MASK && - kvm->arch.gisa_int.origin)) + kvm->arch.gisa_int.origin) || + kvm_s390_pv_cpu_get_handle(dst_vcpu)) kvm_s390_set_cpuflags(dst_vcpu, CPUSTAT_IO_INT); break; default: @@ -2081,6 +2215,10 @@ struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int; int i; + mutex_lock(&kvm->lock); + if (!kvm_s390_pv_is_protected(kvm)) + fi->masked_irqs = 0; + mutex_unlock(&kvm->lock); spin_lock(&fi->lock); fi->pending_irqs = 0; memset(&fi->srv_signal, 0, sizeof(fi->srv_signal)); @@ -2147,7 +2285,8 @@ n++; } } - if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs)) { + if (test_bit(IRQ_PEND_EXT_SERVICE, &fi->pending_irqs) || + test_bit(IRQ_PEND_EXT_SERVICE_EV, &fi->pending_irqs)) { if (n == max_irqs) { /* signal userspace to try again */ ret = -ENOMEM; @@ -2191,7 +2330,7 @@ return -EINVAL; if (!test_kvm_facility(kvm, 72)) - return -ENOTSUPP; + return -EOPNOTSUPP; mutex_lock(&fi->ais_lock); ais.simm = fi->simm; @@ -2328,9 +2467,6 @@ if (!adapter) return -ENOMEM; - INIT_LIST_HEAD(&adapter->maps); - init_rwsem(&adapter->maps_lock); - atomic_set(&adapter->nr_maps, 0); adapter->id = adapter_info.id; adapter->isc = adapter_info.isc; adapter->maskable = adapter_info.maskable; @@ -2355,87 +2491,12 @@ return ret; } -static int kvm_s390_adapter_map(struct kvm *kvm, unsigned int id, __u64 addr) -{ - struct s390_io_adapter *adapter = get_io_adapter(kvm, id); - struct s390_map_info *map; - int ret; - - if (!adapter || !addr) - return -EINVAL; - - map = kzalloc(sizeof(*map), GFP_KERNEL); - if (!map) { - ret = -ENOMEM; - goto out; - } - INIT_LIST_HEAD(&map->list); - map->guest_addr = addr; - map->addr = gmap_translate(kvm->arch.gmap, addr); - if (map->addr == -EFAULT) { - ret = -EFAULT; - goto out; - } - ret = get_user_pages_fast(map->addr, 1, FOLL_WRITE, &map->page); - if (ret < 0) - goto out; - BUG_ON(ret != 1); - down_write(&adapter->maps_lock); - if (atomic_inc_return(&adapter->nr_maps) < MAX_S390_ADAPTER_MAPS) { - list_add_tail(&map->list, &adapter->maps); - ret = 0; - } else { - put_page(map->page); - ret = -EINVAL; - } - up_write(&adapter->maps_lock); -out: - if (ret) - kfree(map); - return ret; -} - -static int kvm_s390_adapter_unmap(struct kvm *kvm, unsigned int id, __u64 addr) -{ - struct s390_io_adapter *adapter = get_io_adapter(kvm, id); - struct s390_map_info *map, *tmp; - int found = 0; - - if (!adapter || !addr) - return -EINVAL; - - down_write(&adapter->maps_lock); - list_for_each_entry_safe(map, tmp, &adapter->maps, list) { - if (map->guest_addr == addr) { - found = 1; - atomic_dec(&adapter->nr_maps); - list_del(&map->list); - put_page(map->page); - kfree(map); - break; - } - } - up_write(&adapter->maps_lock); - - return found ? 0 : -EINVAL; -} - void kvm_s390_destroy_adapters(struct kvm *kvm) { int i; - struct s390_map_info *map, *tmp; - for (i = 0; i < MAX_S390_IO_ADAPTERS; i++) { - if (!kvm->arch.adapters[i]) - continue; - list_for_each_entry_safe(map, tmp, - &kvm->arch.adapters[i]->maps, list) { - list_del(&map->list); - put_page(map->page); - kfree(map); - } + for (i = 0; i < MAX_S390_IO_ADAPTERS; i++) kfree(kvm->arch.adapters[i]); - } } static int modify_io_adapter(struct kvm_device *dev, @@ -2457,11 +2518,14 @@ if (ret > 0) ret = 0; break; + /* + * The following operations are no longer needed and therefore no-ops. + * The gpa to hva translation is done when an IRQ route is set up. The + * set_irq code uses get_user_pages_remote() to do the actual write. + */ case KVM_S390_IO_ADAPTER_MAP: - ret = kvm_s390_adapter_map(dev->kvm, req.id, req.addr); - break; case KVM_S390_IO_ADAPTER_UNMAP: - ret = kvm_s390_adapter_unmap(dev->kvm, req.id, req.addr); + ret = 0; break; default: ret = -EINVAL; @@ -2500,7 +2564,7 @@ int ret = 0; if (!test_kvm_facility(kvm, 72)) - return -ENOTSUPP; + return -EOPNOTSUPP; if (copy_from_user(&req, (void __user *)attr->addr, sizeof(req))) return -EFAULT; @@ -2580,7 +2644,7 @@ struct kvm_s390_ais_all ais; if (!test_kvm_facility(kvm, 72)) - return -ENOTSUPP; + return -EOPNOTSUPP; if (copy_from_user(&ais, (void __user *)attr->addr, sizeof(ais))) return -EFAULT; @@ -2700,19 +2764,15 @@ return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit; } -static struct s390_map_info *get_map_info(struct s390_io_adapter *adapter, - u64 addr) +static struct page *get_map_page(struct kvm *kvm, u64 uaddr) { - struct s390_map_info *map; + struct page *page = NULL; - if (!adapter) - return NULL; - - list_for_each_entry(map, &adapter->maps, list) { - if (map->guest_addr == addr) - return map; - } - return NULL; + down_read(&kvm->mm->mmap_sem); + get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE, + &page, NULL, NULL); + up_read(&kvm->mm->mmap_sem); + return page; } static int adapter_indicators_set(struct kvm *kvm, @@ -2721,30 +2781,35 @@ { unsigned long bit; int summary_set, idx; - struct s390_map_info *info; + struct page *ind_page, *summary_page; void *map; - info = get_map_info(adapter, adapter_int->ind_addr); - if (!info) + ind_page = get_map_page(kvm, adapter_int->ind_addr); + if (!ind_page) return -1; - map = page_address(info->page); - bit = get_ind_bit(info->addr, adapter_int->ind_offset, adapter->swap); - set_bit(bit, map); - idx = srcu_read_lock(&kvm->srcu); - mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT); - set_page_dirty_lock(info->page); - info = get_map_info(adapter, adapter_int->summary_addr); - if (!info) { - srcu_read_unlock(&kvm->srcu, idx); + summary_page = get_map_page(kvm, adapter_int->summary_addr); + if (!summary_page) { + put_page(ind_page); return -1; } - map = page_address(info->page); - bit = get_ind_bit(info->addr, adapter_int->summary_offset, - adapter->swap); + + idx = srcu_read_lock(&kvm->srcu); + map = page_address(ind_page); + bit = get_ind_bit(adapter_int->ind_addr, + adapter_int->ind_offset, adapter->swap); + set_bit(bit, map); + mark_page_dirty(kvm, adapter_int->ind_addr >> PAGE_SHIFT); + set_page_dirty_lock(ind_page); + map = page_address(summary_page); + bit = get_ind_bit(adapter_int->summary_addr, + adapter_int->summary_offset, adapter->swap); summary_set = test_and_set_bit(bit, map); - mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT); - set_page_dirty_lock(info->page); + mark_page_dirty(kvm, adapter_int->summary_addr >> PAGE_SHIFT); + set_page_dirty_lock(summary_page); srcu_read_unlock(&kvm->srcu, idx); + + put_page(ind_page); + put_page(summary_page); return summary_set ? 0 : 1; } @@ -2766,9 +2831,7 @@ adapter = get_io_adapter(kvm, e->adapter.adapter_id); if (!adapter) return -1; - down_read(&adapter->maps_lock); ret = adapter_indicators_set(kvm, adapter, &e->adapter); - up_read(&adapter->maps_lock); if ((ret > 0) && !adapter->masked) { ret = kvm_s390_inject_airq(kvm, adapter); if (ret == 0) @@ -2819,23 +2882,27 @@ struct kvm_kernel_irq_routing_entry *e, const struct kvm_irq_routing_entry *ue) { - int ret; + u64 uaddr; switch (ue->type) { + /* we store the userspace addresses instead of the guest addresses */ case KVM_IRQ_ROUTING_S390_ADAPTER: e->set = set_adapter_int; - e->adapter.summary_addr = ue->u.adapter.summary_addr; - e->adapter.ind_addr = ue->u.adapter.ind_addr; + uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr); + if (uaddr == -EFAULT) + return -EFAULT; + e->adapter.summary_addr = uaddr; + uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr); + if (uaddr == -EFAULT) + return -EFAULT; + e->adapter.ind_addr = uaddr; e->adapter.summary_offset = ue->u.adapter.summary_offset; e->adapter.ind_offset = ue->u.adapter.ind_offset; e->adapter.adapter_id = ue->u.adapter.adapter_id; - ret = 0; - break; + return 0; default: - ret = -EINVAL; + return -EINVAL; } - - return ret; } int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, struct kvm *kvm, --- linux-riscv-5.4.0.orig/arch/s390/kvm/kvm-s390.c +++ linux-riscv-5.4.0/arch/s390/kvm/kvm-s390.c @@ -2,7 +2,7 @@ /* * hosting IBM Z kernel virtual machines (s390x) * - * Copyright IBM Corp. 2008, 2018 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte * Christian Borntraeger @@ -44,6 +44,7 @@ #include #include #include +#include #include "kvm-s390.h" #include "gaccess.h" @@ -219,6 +220,7 @@ static struct gmap_notifier gmap_notifier; static struct gmap_notifier vsie_gmap_notifier; debug_info_t *kvm_s390_dbf; +debug_info_t *kvm_s390_dbf_uv; /* Section: not file related */ int kvm_arch_hardware_enable(void) @@ -232,8 +234,10 @@ return 0; } +/* forward declarations */ static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start, unsigned long end); +static int sca_switch_to_extended(struct kvm *kvm); static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta) { @@ -453,16 +457,19 @@ int kvm_arch_init(void *opaque) { - int rc; + int rc = -ENOMEM; kvm_s390_dbf = debug_register("kvm-trace", 32, 1, 7 * sizeof(long)); if (!kvm_s390_dbf) return -ENOMEM; - if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view)) { - rc = -ENOMEM; - goto out_debug_unreg; - } + kvm_s390_dbf_uv = debug_register("kvm-uv", 32, 1, 7 * sizeof(long)); + if (!kvm_s390_dbf_uv) + goto out; + + if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view) || + debug_register_view(kvm_s390_dbf_uv, &debug_sprintf_view)) + goto out; kvm_s390_cpu_feat_init(); @@ -470,19 +477,17 @@ rc = kvm_register_device_ops(&kvm_flic_ops, KVM_DEV_TYPE_FLIC); if (rc) { pr_err("A FLIC registration call failed with rc=%d\n", rc); - goto out_debug_unreg; + goto out; } rc = kvm_s390_gib_init(GAL_ISC); if (rc) - goto out_gib_destroy; + goto out; return 0; -out_gib_destroy: - kvm_s390_gib_destroy(); -out_debug_unreg: - debug_unregister(kvm_s390_dbf); +out: + kvm_arch_exit(); return rc; } @@ -490,6 +495,7 @@ { kvm_s390_gib_destroy(); debug_unregister(kvm_s390_dbf); + debug_unregister(kvm_s390_dbf_uv); } /* Section: device related */ @@ -532,6 +538,7 @@ case KVM_CAP_S390_CMMA_MIGRATION: case KVM_CAP_S390_AIS: case KVM_CAP_S390_AIS_MIGRATION: + case KVM_CAP_S390_VCPU_RESETS: r = 1; break; case KVM_CAP_S390_HPAGE_1M: @@ -566,6 +573,9 @@ case KVM_CAP_S390_BPB: r = test_facility(82); break; + case KVM_CAP_S390_PROTECTED: + r = is_prot_virt_host(); + break; default: r = 0; } @@ -2160,6 +2170,194 @@ return r; } +static int kvm_s390_cpus_from_pv(struct kvm *kvm, u16 *rcp, u16 *rrcp) +{ + struct kvm_vcpu *vcpu; + u16 rc, rrc; + int ret = 0; + int i; + + /* + * We ignore failures and try to destroy as many CPUs as possible. + * At the same time we must not free the assigned resources when + * this fails, as the ultravisor has still access to that memory. + * So kvm_s390_pv_destroy_cpu can leave a "wanted" memory leak + * behind. + * We want to return the first failure rc and rrc, though. + */ + kvm_for_each_vcpu(i, vcpu, kvm) { + mutex_lock(&vcpu->mutex); + if (kvm_s390_pv_destroy_cpu(vcpu, &rc, &rrc) && !ret) { + *rcp = rc; + *rrcp = rrc; + ret = -EIO; + } + mutex_unlock(&vcpu->mutex); + } + return ret; +} + +static int kvm_s390_cpus_to_pv(struct kvm *kvm, u16 *rc, u16 *rrc) +{ + int i, r = 0; + u16 dummy; + + struct kvm_vcpu *vcpu; + + kvm_for_each_vcpu(i, vcpu, kvm) { + mutex_lock(&vcpu->mutex); + r = kvm_s390_pv_create_cpu(vcpu, rc, rrc); + mutex_unlock(&vcpu->mutex); + if (r) + break; + } + if (r) + kvm_s390_cpus_from_pv(kvm, &dummy, &dummy); + return r; +} + +static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd) +{ + int r = 0; + u16 dummy; + void __user *argp = (void __user *)cmd->data; + + switch (cmd->cmd) { + case KVM_PV_ENABLE: { + r = -EINVAL; + if (kvm_s390_pv_is_protected(kvm)) + break; + + /* + * FMT 4 SIE needs esca. As we never switch back to bsca from + * esca, we need no cleanup in the error cases below + */ + r = sca_switch_to_extended(kvm); + if (r) + break; + + down_write(¤t->mm->mmap_sem); + r = gmap_mark_unmergeable(); + up_write(¤t->mm->mmap_sem); + if (r) + break; + + r = kvm_s390_pv_init_vm(kvm, &cmd->rc, &cmd->rrc); + if (r) + break; + + r = kvm_s390_cpus_to_pv(kvm, &cmd->rc, &cmd->rrc); + if (r) + kvm_s390_pv_deinit_vm(kvm, &dummy, &dummy); + + /* we need to block service interrupts from now on */ + set_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs); + break; + } + case KVM_PV_DISABLE: { + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = kvm_s390_cpus_from_pv(kvm, &cmd->rc, &cmd->rrc); + /* + * If a CPU could not be destroyed, destroy VM will also fail. + * There is no point in trying to destroy it. Instead return + * the rc and rrc from the first CPU that failed destroying. + */ + if (r) + break; + r = kvm_s390_pv_deinit_vm(kvm, &cmd->rc, &cmd->rrc); + + /* no need to block service interrupts any more */ + clear_bit(IRQ_PEND_EXT_SERVICE, &kvm->arch.float_int.masked_irqs); + break; + } + case KVM_PV_SET_SEC_PARMS: { + struct kvm_s390_pv_sec_parm parms = {}; + void *hdr; + + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = -EFAULT; + if (copy_from_user(&parms, argp, sizeof(parms))) + break; + + /* Currently restricted to 8KB */ + r = -EINVAL; + if (parms.length > PAGE_SIZE * 2) + break; + + r = -ENOMEM; + hdr = vmalloc(parms.length); + if (!hdr) + break; + + r = -EFAULT; + if (!copy_from_user(hdr, (void __user *)parms.origin, + parms.length)) + r = kvm_s390_pv_set_sec_parms(kvm, hdr, parms.length, + &cmd->rc, &cmd->rrc); + + vfree(hdr); + break; + } + case KVM_PV_UNPACK: { + struct kvm_s390_pv_unp unp = {}; + + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = -EFAULT; + if (copy_from_user(&unp, argp, sizeof(unp))) + break; + + r = kvm_s390_pv_unpack(kvm, unp.addr, unp.size, unp.tweak, + &cmd->rc, &cmd->rrc); + break; + } + case KVM_PV_VERIFY: { + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = uv_cmd_nodata(kvm_s390_pv_get_handle(kvm), + UVC_CMD_VERIFY_IMG, &cmd->rc, &cmd->rrc); + KVM_UV_EVENT(kvm, 3, "PROTVIRT VERIFY: rc %x rrc %x", cmd->rc, + cmd->rrc); + break; + } + case KVM_PV_PREP_RESET: { + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = uv_cmd_nodata(kvm_s390_pv_get_handle(kvm), + UVC_CMD_PREPARE_RESET, &cmd->rc, &cmd->rrc); + KVM_UV_EVENT(kvm, 3, "PROTVIRT PREP RESET: rc %x rrc %x", + cmd->rc, cmd->rrc); + break; + } + case KVM_PV_UNSHARE_ALL: { + r = -EINVAL; + if (!kvm_s390_pv_is_protected(kvm)) + break; + + r = uv_cmd_nodata(kvm_s390_pv_get_handle(kvm), + UVC_CMD_SET_UNSHARE_ALL, &cmd->rc, &cmd->rrc); + KVM_UV_EVENT(kvm, 3, "PROTVIRT UNSHARE: rc %x rrc %x", + cmd->rc, cmd->rrc); + break; + } + default: + r = -ENOTTY; + } + return r; +} + long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { @@ -2257,6 +2455,33 @@ mutex_unlock(&kvm->slots_lock); break; } + case KVM_S390_PV_COMMAND: { + struct kvm_pv_cmd args; + + /* protvirt means user sigp */ + kvm->arch.user_cpu_state_ctrl = 1; + r = 0; + if (!is_prot_virt_host()) { + r = -EINVAL; + break; + } + if (copy_from_user(&args, argp, sizeof(args))) { + r = -EFAULT; + break; + } + if (args.flags) { + r = -EINVAL; + break; + } + mutex_lock(&kvm->lock); + r = kvm_s390_handle_pv(kvm, &args); + mutex_unlock(&kvm->lock); + if (copy_to_user(argp, &args, sizeof(args))) { + r = -EFAULT; + break; + } + break; + } default: r = -ENOTTY; } @@ -2520,6 +2745,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) { + u16 rc, rrc; + VCPU_EVENT(vcpu, 3, "%s", "free cpu"); trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id); kvm_s390_clear_local_irqs(vcpu); @@ -2532,6 +2759,9 @@ if (vcpu->kvm->arch.use_cmma) kvm_s390_vcpu_unsetup_cmma(vcpu); + /* We can not hold the vcpu mutex here, we are already dying */ + if (kvm_s390_pv_cpu_get_handle(vcpu)) + kvm_s390_pv_destroy_cpu(vcpu, &rc, &rrc); free_page((unsigned long)(vcpu->arch.sie_block)); kvm_vcpu_uninit(vcpu); @@ -2556,10 +2786,20 @@ void kvm_arch_destroy_vm(struct kvm *kvm) { + u16 rc, rrc; + kvm_free_vcpus(kvm); sca_dispose(kvm); - debug_unregister(kvm->arch.dbf); kvm_s390_gisa_destroy(kvm); + /* + * We are already at the end of life and kvm->lock is not taken. + * This is ok as the file descriptor is closed by now and nobody + * can mess with the pv state. To avoid lockdep_assert_held from + * complaining we do not use kvm_s390_pv_is_protected. + */ + if (kvm_s390_pv_get_handle(kvm)) + kvm_s390_pv_deinit_vm(kvm, &rc, &rrc); + debug_unregister(kvm->arch.dbf); free_page((unsigned long)kvm->arch.sie_page2); if (!kvm_is_ucontrol(kvm)) gmap_remove(kvm->arch.gmap); @@ -2655,6 +2895,9 @@ unsigned int vcpu_idx; u32 scaol, scaoh; + if (kvm->arch.use_esca) + return 0; + new_sca = alloc_pages_exact(sizeof(*new_sca), GFP_KERNEL|__GFP_ZERO); if (!new_sca) return -ENOMEM; @@ -2847,35 +3090,6 @@ } -static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu) -{ - /* this equals initial cpu reset in pop, but we don't switch to ESA */ - vcpu->arch.sie_block->gpsw.mask = 0UL; - vcpu->arch.sie_block->gpsw.addr = 0UL; - kvm_s390_set_prefix(vcpu, 0); - kvm_s390_set_cpu_timer(vcpu, 0); - vcpu->arch.sie_block->ckc = 0UL; - vcpu->arch.sie_block->todpr = 0; - memset(vcpu->arch.sie_block->gcr, 0, 16 * sizeof(__u64)); - vcpu->arch.sie_block->gcr[0] = CR0_UNUSED_56 | - CR0_INTERRUPT_KEY_SUBMASK | - CR0_MEASUREMENT_ALERT_SUBMASK; - vcpu->arch.sie_block->gcr[14] = CR14_UNUSED_32 | - CR14_UNUSED_33 | - CR14_EXTERNAL_DAMAGE_SUBMASK; - /* make sure the new fpc will be lazily loaded */ - save_fpu_regs(); - current->thread.fpu.fpc = 0; - vcpu->arch.sie_block->gbea = 1; - vcpu->arch.sie_block->pp = 0; - vcpu->arch.sie_block->fpf &= ~FPF_BPBC; - vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID; - kvm_clear_async_pf_completion_queue(vcpu); - if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) - kvm_s390_vcpu_stop(vcpu); - kvm_s390_clear_local_irqs(vcpu); -} - void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) { mutex_lock(&vcpu->kvm->lock); @@ -2968,6 +3182,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { int rc = 0; + u16 uvrc, uvrrc; atomic_set(&vcpu->arch.sie_block->cpuflags, CPUSTAT_ZARCH | CPUSTAT_SM | @@ -3035,6 +3250,14 @@ kvm_s390_vcpu_crypto_setup(vcpu); + mutex_lock(&vcpu->kvm->lock); + if (kvm_s390_pv_is_protected(vcpu->kvm)) { + rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc); + if (rc) + kvm_s390_vcpu_unsetup_cmma(vcpu); + } + mutex_unlock(&vcpu->kvm->lock); + return rc; } @@ -3290,10 +3513,60 @@ return r; } -static int kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu) +static void kvm_arch_vcpu_ioctl_normal_reset(struct kvm_vcpu *vcpu) { - kvm_s390_vcpu_initial_reset(vcpu); - return 0; + vcpu->arch.sie_block->gpsw.mask &= ~PSW_MASK_RI; + vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID; + memset(vcpu->run->s.regs.riccb, 0, sizeof(vcpu->run->s.regs.riccb)); + + kvm_clear_async_pf_completion_queue(vcpu); + if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) + kvm_s390_vcpu_stop(vcpu); + kvm_s390_clear_local_irqs(vcpu); +} + +static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu) +{ + /* Initial reset is a superset of the normal reset */ + kvm_arch_vcpu_ioctl_normal_reset(vcpu); + + /* this equals initial cpu reset in pop, but we don't switch to ESA */ + vcpu->arch.sie_block->gpsw.mask = 0; + vcpu->arch.sie_block->gpsw.addr = 0; + kvm_s390_set_prefix(vcpu, 0); + kvm_s390_set_cpu_timer(vcpu, 0); + vcpu->arch.sie_block->ckc = 0; + memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr)); + vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK; + vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK; + vcpu->run->s.regs.fpc = 0; + /* + * Do not reset these registers in the protected case, as some of + * them are overlayed and they are not accessible in this case + * anyway. + */ + if (!kvm_s390_pv_cpu_is_protected(vcpu)) { + vcpu->arch.sie_block->gbea = 1; + vcpu->arch.sie_block->pp = 0; + vcpu->arch.sie_block->fpf &= ~FPF_BPBC; + vcpu->arch.sie_block->todpr = 0; + } +} + +static void kvm_arch_vcpu_ioctl_clear_reset(struct kvm_vcpu *vcpu) +{ + struct kvm_sync_regs *regs = &vcpu->run->s.regs; + + /* Clear reset is a superset of the initial reset */ + kvm_arch_vcpu_ioctl_initial_reset(vcpu); + + memset(®s->gprs, 0, sizeof(regs->gprs)); + memset(®s->vrs, 0, sizeof(regs->vrs)); + memset(®s->acrs, 0, sizeof(regs->acrs)); + memset(®s->gscb, 0, sizeof(regs->gscb)); + + regs->etoken = 0; + regs->etoken_extension = 0; } int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) @@ -3467,12 +3740,18 @@ switch (mp_state->mp_state) { case KVM_MP_STATE_STOPPED: - kvm_s390_vcpu_stop(vcpu); + rc = kvm_s390_vcpu_stop(vcpu); break; case KVM_MP_STATE_OPERATING: - kvm_s390_vcpu_start(vcpu); + rc = kvm_s390_vcpu_start(vcpu); break; case KVM_MP_STATE_LOAD: + if (!kvm_s390_pv_cpu_is_protected(vcpu)) { + rc = -ENXIO; + break; + } + rc = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR_LOAD); + break; case KVM_MP_STATE_CHECK_STOP: /* fall through - CHECK_STOP and LOAD are not supported yet */ default: @@ -3824,9 +4103,11 @@ return vcpu_post_run_fault_in_sie(vcpu); } +#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK) static int __vcpu_run(struct kvm_vcpu *vcpu) { int rc, exit_reason; + struct sie_page *sie_page = (struct sie_page *)vcpu->arch.sie_block; /* * We try to hold kvm->srcu during most of vcpu_run (except when run- @@ -3848,8 +4129,28 @@ guest_enter_irqoff(); __disable_cpu_timer_accounting(vcpu); local_irq_enable(); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + memcpy(sie_page->pv_grregs, + vcpu->run->s.regs.gprs, + sizeof(sie_page->pv_grregs)); + } exit_reason = sie64a(vcpu->arch.sie_block, vcpu->run->s.regs.gprs); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + memcpy(vcpu->run->s.regs.gprs, + sie_page->pv_grregs, + sizeof(sie_page->pv_grregs)); + /* + * We're not allowed to inject interrupts on intercepts + * that leave the guest state in an "in-between" state + * where the next SIE entry will do a continuation. + * Fence interrupts in our "internal" PSW. + */ + if (vcpu->arch.sie_block->icptcode == ICPT_PV_INSTR || + vcpu->arch.sie_block->icptcode == ICPT_PV_PREF) { + vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK; + } + } local_irq_disable(); __enable_cpu_timer_accounting(vcpu); guest_exit_irqoff(); @@ -3863,7 +4164,7 @@ return rc; } -static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +static void sync_regs_fmt2(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { struct runtime_instr_cb *riccb; struct gs_cb *gscb; @@ -3872,16 +4173,7 @@ gscb = (struct gs_cb *) &kvm_run->s.regs.gscb; vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask; vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr; - if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX) - kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix); - if (kvm_run->kvm_dirty_regs & KVM_SYNC_CRS) { - memcpy(&vcpu->arch.sie_block->gcr, &kvm_run->s.regs.crs, 128); - /* some control register changes require a tlb flush */ - kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); - } if (kvm_run->kvm_dirty_regs & KVM_SYNC_ARCH0) { - kvm_s390_set_cpu_timer(vcpu, kvm_run->s.regs.cputm); - vcpu->arch.sie_block->ckc = kvm_run->s.regs.ckc; vcpu->arch.sie_block->todpr = kvm_run->s.regs.todpr; vcpu->arch.sie_block->pp = kvm_run->s.regs.pp; vcpu->arch.sie_block->gbea = kvm_run->s.regs.gbea; @@ -3922,6 +4214,36 @@ vcpu->arch.sie_block->fpf &= ~FPF_BPBC; vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0; } + if (MACHINE_HAS_GS) { + preempt_disable(); + __ctl_set_bit(2, 4); + if (current->thread.gs_cb) { + vcpu->arch.host_gscb = current->thread.gs_cb; + save_gs_cb(vcpu->arch.host_gscb); + } + if (vcpu->arch.gs_enabled) { + current->thread.gs_cb = (struct gs_cb *) + &vcpu->run->s.regs.gscb; + restore_gs_cb(current->thread.gs_cb); + } + preempt_enable(); + } + /* SIE will load etoken directly from SDNX and therefore kvm_run */ +} + +static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +{ + if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX) + kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix); + if (kvm_run->kvm_dirty_regs & KVM_SYNC_CRS) { + memcpy(&vcpu->arch.sie_block->gcr, &kvm_run->s.regs.crs, 128); + /* some control register changes require a tlb flush */ + kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); + } + if (kvm_run->kvm_dirty_regs & KVM_SYNC_ARCH0) { + kvm_s390_set_cpu_timer(vcpu, kvm_run->s.regs.cputm); + vcpu->arch.sie_block->ckc = kvm_run->s.regs.ckc; + } save_access_regs(vcpu->arch.host_acrs); restore_access_regs(vcpu->run->s.regs.acrs); /* save host (userspace) fprs/vrs */ @@ -3936,23 +4258,47 @@ if (test_fp_ctl(current->thread.fpu.fpc)) /* User space provided an invalid FPC, let's clear it */ current->thread.fpu.fpc = 0; + + /* Sync fmt2 only data */ + if (likely(!kvm_s390_pv_cpu_is_protected(vcpu))) { + sync_regs_fmt2(vcpu, kvm_run); + } else { + /* + * In several places we have to modify our internal view to + * not do things that are disallowed by the ultravisor. For + * example we must not inject interrupts after specific exits + * (e.g. 112 prefix page not secure). We do this by turning + * off the machine check, external and I/O interrupt bits + * of our PSW copy. To avoid getting validity intercepts, we + * do only accept the condition code from userspace. + */ + vcpu->arch.sie_block->gpsw.mask &= ~PSW_MASK_CC; + vcpu->arch.sie_block->gpsw.mask |= kvm_run->psw_mask & + PSW_MASK_CC; + } + + kvm_run->kvm_dirty_regs = 0; +} + +static void store_regs_fmt2(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +{ + kvm_run->s.regs.todpr = vcpu->arch.sie_block->todpr; + kvm_run->s.regs.pp = vcpu->arch.sie_block->pp; + kvm_run->s.regs.gbea = vcpu->arch.sie_block->gbea; + kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC; if (MACHINE_HAS_GS) { - preempt_disable(); __ctl_set_bit(2, 4); - if (current->thread.gs_cb) { - vcpu->arch.host_gscb = current->thread.gs_cb; - save_gs_cb(vcpu->arch.host_gscb); - } - if (vcpu->arch.gs_enabled) { - current->thread.gs_cb = (struct gs_cb *) - &vcpu->run->s.regs.gscb; - restore_gs_cb(current->thread.gs_cb); - } + if (vcpu->arch.gs_enabled) + save_gs_cb(current->thread.gs_cb); + preempt_disable(); + current->thread.gs_cb = vcpu->arch.host_gscb; + restore_gs_cb(vcpu->arch.host_gscb); preempt_enable(); + if (!vcpu->arch.host_gscb) + __ctl_clear_bit(2, 4); + vcpu->arch.host_gscb = NULL; } - /* SIE will load etoken directly from SDNX and therefore kvm_run */ - - kvm_run->kvm_dirty_regs = 0; + /* SIE will save etoken directly into SDNX and therefore kvm_run */ } static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) @@ -3963,13 +4309,9 @@ memcpy(&kvm_run->s.regs.crs, &vcpu->arch.sie_block->gcr, 128); kvm_run->s.regs.cputm = kvm_s390_get_cpu_timer(vcpu); kvm_run->s.regs.ckc = vcpu->arch.sie_block->ckc; - kvm_run->s.regs.todpr = vcpu->arch.sie_block->todpr; - kvm_run->s.regs.pp = vcpu->arch.sie_block->pp; - kvm_run->s.regs.gbea = vcpu->arch.sie_block->gbea; kvm_run->s.regs.pft = vcpu->arch.pfault_token; kvm_run->s.regs.pfs = vcpu->arch.pfault_select; kvm_run->s.regs.pfc = vcpu->arch.pfault_compare; - kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC; save_access_regs(vcpu->run->s.regs.acrs); restore_access_regs(vcpu->arch.host_acrs); /* Save guest register state */ @@ -3978,19 +4320,8 @@ /* Restore will be done lazily at return */ current->thread.fpu.fpc = vcpu->arch.host_fpregs.fpc; current->thread.fpu.regs = vcpu->arch.host_fpregs.regs; - if (MACHINE_HAS_GS) { - __ctl_set_bit(2, 4); - if (vcpu->arch.gs_enabled) - save_gs_cb(current->thread.gs_cb); - preempt_disable(); - current->thread.gs_cb = vcpu->arch.host_gscb; - restore_gs_cb(vcpu->arch.host_gscb); - preempt_enable(); - if (!vcpu->arch.host_gscb) - __ctl_clear_bit(2, 4); - vcpu->arch.host_gscb = NULL; - } - /* SIE will save etoken directly into SDNX and therefore kvm_run */ + if (likely(!kvm_s390_pv_cpu_is_protected(vcpu))) + store_regs_fmt2(vcpu, kvm_run); } int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) @@ -4014,6 +4345,10 @@ kvm_sigset_activate(vcpu); + /* + * no need to check the return value of vcpu_start as it can only have + * an error for protvirt, but protvirt means user cpu state + */ if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) { kvm_s390_vcpu_start(vcpu); } else if (is_vcpu_stopped(vcpu)) { @@ -4151,18 +4486,27 @@ kvm_s390_sync_request(KVM_REQ_ENABLE_IBS, vcpu); } -void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu) +int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu) { - int i, online_vcpus, started_vcpus = 0; + int i, online_vcpus, r = 0, started_vcpus = 0; if (!is_vcpu_stopped(vcpu)) - return; + return 0; trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 1); /* Only one cpu at a time may enter/leave the STOPPED state. */ spin_lock(&vcpu->kvm->arch.start_stop_lock); online_vcpus = atomic_read(&vcpu->kvm->online_vcpus); + /* Let's tell the UV that we want to change into the operating state */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_OPR); + if (r) { + spin_unlock(&vcpu->kvm->arch.start_stop_lock); + return r; + } + } + for (i = 0; i < online_vcpus; i++) { if (!is_vcpu_stopped(vcpu->kvm->vcpus[i])) started_vcpus++; @@ -4182,27 +4526,43 @@ kvm_s390_clear_cpuflags(vcpu, CPUSTAT_STOPPED); /* + * The real PSW might have changed due to a RESTART interpreted by the + * ultravisor. We block all interrupts and let the next sie exit + * refresh our view. + */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) + vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK; + /* * Another VCPU might have used IBS while we were offline. * Let's play safe and flush the VCPU at startup. */ kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); spin_unlock(&vcpu->kvm->arch.start_stop_lock); - return; + return 0; } -void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu) +int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu) { - int i, online_vcpus, started_vcpus = 0; + int i, online_vcpus, r = 0, started_vcpus = 0; struct kvm_vcpu *started_vcpu = NULL; if (is_vcpu_stopped(vcpu)) - return; + return 0; trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 0); /* Only one cpu at a time may enter/leave the STOPPED state. */ spin_lock(&vcpu->kvm->arch.start_stop_lock); online_vcpus = atomic_read(&vcpu->kvm->online_vcpus); + /* Let's tell the UV that we want to change into the stopped state */ + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + r = kvm_s390_pv_set_cpu_state(vcpu, PV_CPU_STATE_STP); + if (r) { + spin_unlock(&vcpu->kvm->arch.start_stop_lock); + return r; + } + } + /* SIGP STOP and SIGP STOP AND STORE STATUS has been fully processed */ kvm_s390_clear_stop_irq(vcpu); @@ -4225,7 +4585,7 @@ } spin_unlock(&vcpu->kvm->arch.start_stop_lock); - return; + return 0; } static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, @@ -4252,12 +4612,40 @@ return r; } +static long kvm_s390_guest_sida_op(struct kvm_vcpu *vcpu, + struct kvm_s390_mem_op *mop) +{ + void __user *uaddr = (void __user *)mop->buf; + int r = 0; + + if (mop->flags || !mop->size) + return -EINVAL; + if (mop->size + mop->sida_offset < mop->size) + return -EINVAL; + if (mop->size + mop->sida_offset > sida_size(vcpu->arch.sie_block)) + return -E2BIG; + + switch (mop->op) { + case KVM_S390_MEMOP_SIDA_READ: + if (copy_to_user(uaddr, (void *)(sida_origin(vcpu->arch.sie_block) + + mop->sida_offset), mop->size)) + r = -EFAULT; + + break; + case KVM_S390_MEMOP_SIDA_WRITE: + if (copy_from_user((void *)(sida_origin(vcpu->arch.sie_block) + + mop->sida_offset), uaddr, mop->size)) + r = -EFAULT; + break; + } + return r; +} static long kvm_s390_guest_mem_op(struct kvm_vcpu *vcpu, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; void *tmpbuf = NULL; - int r, srcu_idx; + int r = 0; const u64 supported_flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION | KVM_S390_MEMOP_F_CHECK_ONLY; @@ -4267,14 +4655,15 @@ if (mop->size > MEM_OP_MAX_SIZE) return -E2BIG; + if (kvm_s390_pv_cpu_is_protected(vcpu)) + return -EINVAL; + if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf) return -ENOMEM; } - srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); - switch (mop->op) { case KVM_S390_MEMOP_LOGICAL_READ: if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { @@ -4300,12 +4689,8 @@ } r = write_guest(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size); break; - default: - r = -EINVAL; } - srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx); - if (r > 0 && (mop->flags & KVM_S390_MEMOP_F_INJECT_EXCEPTION) != 0) kvm_s390_inject_prog_irq(vcpu, &vcpu->arch.pgm); @@ -4313,6 +4698,31 @@ return r; } +static long kvm_s390_guest_memsida_op(struct kvm_vcpu *vcpu, + struct kvm_s390_mem_op *mop) +{ + int r, srcu_idx; + + srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); + + switch (mop->op) { + case KVM_S390_MEMOP_LOGICAL_READ: + case KVM_S390_MEMOP_LOGICAL_WRITE: + r = kvm_s390_guest_mem_op(vcpu, mop); + break; + case KVM_S390_MEMOP_SIDA_READ: + case KVM_S390_MEMOP_SIDA_WRITE: + /* we are locked against sida going away by the vcpu->mutex */ + r = kvm_s390_guest_sida_op(vcpu, mop); + break; + default: + r = -EINVAL; + } + + srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx); + return r; +} + long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { @@ -4348,13 +4758,14 @@ void __user *argp = (void __user *)arg; int idx; long r; + u16 rc, rrc; vcpu_load(vcpu); switch (ioctl) { case KVM_S390_STORE_STATUS: idx = srcu_read_lock(&vcpu->kvm->srcu); - r = kvm_s390_vcpu_store_status(vcpu, arg); + r = kvm_s390_store_status_unloaded(vcpu, arg); srcu_read_unlock(&vcpu->kvm->srcu, idx); break; case KVM_S390_SET_INITIAL_PSW: { @@ -4366,12 +4777,43 @@ r = kvm_arch_vcpu_ioctl_set_initial_psw(vcpu, psw); break; } + case KVM_S390_CLEAR_RESET: + r = 0; + kvm_arch_vcpu_ioctl_clear_reset(vcpu); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + r = uv_cmd_nodata(kvm_s390_pv_cpu_get_handle(vcpu), + UVC_CMD_CPU_RESET_CLEAR, &rc, &rrc); + VCPU_EVENT(vcpu, 3, "PROTVIRT RESET CLEAR VCPU: rc %x rrc %x", + rc, rrc); + } + break; case KVM_S390_INITIAL_RESET: - r = kvm_arch_vcpu_ioctl_initial_reset(vcpu); + r = 0; + kvm_arch_vcpu_ioctl_initial_reset(vcpu); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + r = uv_cmd_nodata(kvm_s390_pv_cpu_get_handle(vcpu), + UVC_CMD_CPU_RESET_INITIAL, + &rc, &rrc); + VCPU_EVENT(vcpu, 3, "PROTVIRT RESET INITIAL VCPU: rc %x rrc %x", + rc, rrc); + } + break; + case KVM_S390_NORMAL_RESET: + r = 0; + kvm_arch_vcpu_ioctl_normal_reset(vcpu); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + r = uv_cmd_nodata(kvm_s390_pv_cpu_get_handle(vcpu), + UVC_CMD_CPU_RESET, &rc, &rrc); + VCPU_EVENT(vcpu, 3, "PROTVIRT RESET NORMAL VCPU: rc %x rrc %x", + rc, rrc); + } break; case KVM_SET_ONE_REG: case KVM_GET_ONE_REG: { struct kvm_one_reg reg; + r = -EINVAL; + if (kvm_s390_pv_cpu_is_protected(vcpu)) + break; r = -EFAULT; if (copy_from_user(®, argp, sizeof(reg))) break; @@ -4434,7 +4876,7 @@ struct kvm_s390_mem_op mem_op; if (copy_from_user(&mem_op, argp, sizeof(mem_op)) == 0) - r = kvm_s390_guest_mem_op(vcpu, &mem_op); + r = kvm_s390_guest_memsida_op(vcpu, &mem_op); else r = -EFAULT; break; @@ -4520,6 +4962,9 @@ if (mem->guest_phys_addr + mem->memory_size > kvm->arch.mem_limit) return -EINVAL; + /* When we are protected, we should not change the memory slots */ + if (kvm_s390_pv_get_handle(kvm)) + return -EINVAL; return 0; } --- linux-riscv-5.4.0.orig/arch/s390/kvm/kvm-s390.h +++ linux-riscv-5.4.0/arch/s390/kvm/kvm-s390.h @@ -2,7 +2,7 @@ /* * definition for kvm on s390 * - * Copyright IBM Corp. 2008, 2009 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte * Christian Borntraeger @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -25,6 +26,17 @@ #define IS_ITDB_VALID(vcpu) ((*(char *)vcpu->arch.sie_block->itdba == TDB_FORMAT1)) extern debug_info_t *kvm_s390_dbf; +extern debug_info_t *kvm_s390_dbf_uv; + +#define KVM_UV_EVENT(d_kvm, d_loglevel, d_string, d_args...)\ +do { \ + debug_sprintf_event((d_kvm)->arch.dbf, d_loglevel, d_string "\n", \ + d_args); \ + debug_sprintf_event(kvm_s390_dbf_uv, d_loglevel, \ + "%d: " d_string "\n", (d_kvm)->userspace_pid, \ + d_args); \ +} while (0) + #define KVM_EVENT(d_loglevel, d_string, d_args...)\ do { \ debug_sprintf_event(kvm_s390_dbf, d_loglevel, d_string "\n", \ @@ -196,6 +208,39 @@ return kvm->arch.user_cpu_state_ctrl != 0; } +/* implemented in pv.c */ +int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc); +int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc); +int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc); +int kvm_s390_pv_init_vm(struct kvm *kvm, u16 *rc, u16 *rrc); +int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc, + u16 *rrc); +int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size, + unsigned long tweak, u16 *rc, u16 *rrc); +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state); + +static inline u64 kvm_s390_pv_get_handle(struct kvm *kvm) +{ + return kvm->arch.pv.handle; +} + +static inline u64 kvm_s390_pv_cpu_get_handle(struct kvm_vcpu *vcpu) +{ + return vcpu->arch.pv.handle; +} + +static inline bool kvm_s390_pv_is_protected(struct kvm *kvm) +{ + lockdep_assert_held(&kvm->lock); + return !!kvm_s390_pv_get_handle(kvm); +} + +static inline bool kvm_s390_pv_cpu_is_protected(struct kvm_vcpu *vcpu) +{ + lockdep_assert_held(&vcpu->mutex); + return !!kvm_s390_pv_cpu_get_handle(vcpu); +} + /* implemented in interrupt.c */ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu); @@ -286,8 +331,8 @@ long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable); int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr); int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr); -void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu); -void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu); +int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu); +int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu); void kvm_s390_vcpu_unblock(struct kvm_vcpu *vcpu); bool kvm_s390_vcpu_sie_inhibited(struct kvm_vcpu *vcpu); --- linux-riscv-5.4.0.orig/arch/s390/kvm/priv.c +++ linux-riscv-5.4.0/arch/s390/kvm/priv.c @@ -2,7 +2,7 @@ /* * handling privileged instructions * - * Copyright IBM Corp. 2008, 2018 + * Copyright IBM Corp. 2008, 2020 * * Author(s): Carsten Otte * Christian Borntraeger @@ -872,7 +872,7 @@ operand2 = kvm_s390_get_base_disp_s(vcpu, &ar); - if (operand2 & 0xfff) + if (!kvm_s390_pv_cpu_is_protected(vcpu) && (operand2 & 0xfff)) return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); switch (fc) { @@ -893,8 +893,13 @@ handle_stsi_3_2_2(vcpu, (void *) mem); break; } - - rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE); + if (kvm_s390_pv_cpu_is_protected(vcpu)) { + memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem, + PAGE_SIZE); + rc = 0; + } else { + rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE); + } if (rc) { rc = kvm_s390_inject_prog_cond(vcpu, rc); goto out; --- linux-riscv-5.4.0.orig/arch/s390/kvm/pv.c +++ linux-riscv-5.4.0/arch/s390/kvm/pv.c @@ -0,0 +1,303 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Hosting Protected Virtual Machines + * + * Copyright IBM Corp. 2019, 2020 + * Author(s): Janosch Frank + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include "kvm-s390.h" + +int kvm_s390_pv_destroy_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc) +{ + int cc = 0; + + if (kvm_s390_pv_cpu_get_handle(vcpu)) { + cc = uv_cmd_nodata(kvm_s390_pv_cpu_get_handle(vcpu), + UVC_CMD_DESTROY_SEC_CPU, rc, rrc); + + KVM_UV_EVENT(vcpu->kvm, 3, + "PROTVIRT DESTROY VCPU %d: rc %x rrc %x", + vcpu->vcpu_id, *rc, *rrc); + WARN_ONCE(cc, "protvirt destroy cpu failed rc %x rrc %x", + *rc, *rrc); + } + /* Intended memory leak for something that should never happen. */ + if (!cc) + free_pages(vcpu->arch.pv.stor_base, + get_order(uv_info.guest_cpu_stor_len)); + + free_page(sida_origin(vcpu->arch.sie_block)); + vcpu->arch.sie_block->pv_handle_cpu = 0; + vcpu->arch.sie_block->pv_handle_config = 0; + memset(&vcpu->arch.pv, 0, sizeof(vcpu->arch.pv)); + vcpu->arch.sie_block->sdf = 0; + /* + * The sidad field (for sdf == 2) is now the gbea field (for sdf == 0). + * Use the reset value of gbea to avoid leaking the kernel pointer of + * the just freed sida. + */ + vcpu->arch.sie_block->gbea = 1; + kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); + + return cc ? EIO : 0; +} + +int kvm_s390_pv_create_cpu(struct kvm_vcpu *vcpu, u16 *rc, u16 *rrc) +{ + struct uv_cb_csc uvcb = { + .header.cmd = UVC_CMD_CREATE_SEC_CPU, + .header.len = sizeof(uvcb), + }; + int cc; + + if (kvm_s390_pv_cpu_get_handle(vcpu)) + return -EINVAL; + + vcpu->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, + get_order(uv_info.guest_cpu_stor_len)); + if (!vcpu->arch.pv.stor_base) + return -ENOMEM; + + /* Input */ + uvcb.guest_handle = kvm_s390_pv_get_handle(vcpu->kvm); + uvcb.num = vcpu->arch.sie_block->icpua; + uvcb.state_origin = (u64)vcpu->arch.sie_block; + uvcb.stor_origin = (u64)vcpu->arch.pv.stor_base; + + /* Alloc Secure Instruction Data Area Designation */ + vcpu->arch.sie_block->sidad = __get_free_page(GFP_KERNEL | __GFP_ZERO); + if (!vcpu->arch.sie_block->sidad) { + free_pages(vcpu->arch.pv.stor_base, + get_order(uv_info.guest_cpu_stor_len)); + return -ENOMEM; + } + + cc = uv_call(0, (u64)&uvcb); + *rc = uvcb.header.rc; + *rrc = uvcb.header.rrc; + KVM_UV_EVENT(vcpu->kvm, 3, + "PROTVIRT CREATE VCPU: cpu %d handle %llx rc %x rrc %x", + vcpu->vcpu_id, uvcb.cpu_handle, uvcb.header.rc, + uvcb.header.rrc); + + if (cc) { + u16 dummy; + + kvm_s390_pv_destroy_cpu(vcpu, &dummy, &dummy); + return -EIO; + } + + /* Output */ + vcpu->arch.pv.handle = uvcb.cpu_handle; + vcpu->arch.sie_block->pv_handle_cpu = uvcb.cpu_handle; + vcpu->arch.sie_block->pv_handle_config = kvm_s390_pv_get_handle(vcpu->kvm); + vcpu->arch.sie_block->sdf = 2; + kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); + return 0; +} + +/* only free resources when the destroy was successful */ +static void kvm_s390_pv_dealloc_vm(struct kvm *kvm) +{ + vfree(kvm->arch.pv.stor_var); + free_pages(kvm->arch.pv.stor_base, + get_order(uv_info.guest_base_stor_len)); + memset(&kvm->arch.pv, 0, sizeof(kvm->arch.pv)); +} + +static int kvm_s390_pv_alloc_vm(struct kvm *kvm) +{ + unsigned long base = uv_info.guest_base_stor_len; + unsigned long virt = uv_info.guest_virt_var_stor_len; + unsigned long npages = 0, vlen = 0; + struct kvm_memory_slot *memslot; + + kvm->arch.pv.stor_var = NULL; + kvm->arch.pv.stor_base = __get_free_pages(GFP_KERNEL, get_order(base)); + if (!kvm->arch.pv.stor_base) + return -ENOMEM; + + /* + * Calculate current guest storage for allocation of the + * variable storage, which is based on the length in MB. + * + * Slots are sorted by GFN + */ + mutex_lock(&kvm->slots_lock); + memslot = kvm_memslots(kvm)->memslots; + npages = memslot->base_gfn + memslot->npages; + mutex_unlock(&kvm->slots_lock); + + kvm->arch.pv.guest_len = npages * PAGE_SIZE; + + /* Allocate variable storage */ + vlen = ALIGN(virt * ((npages * PAGE_SIZE) / HPAGE_SIZE), PAGE_SIZE); + vlen += uv_info.guest_virt_base_stor_len; + kvm->arch.pv.stor_var = vzalloc(vlen); + if (!kvm->arch.pv.stor_var) + goto out_err; + return 0; + +out_err: + kvm_s390_pv_dealloc_vm(kvm); + return -ENOMEM; +} + +/* this should not fail, but if it does, we must not free the donated memory */ +int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc) +{ + int cc; + + /* make all pages accessible before destroying the guest */ + s390_reset_acc(kvm->mm); + + cc = uv_cmd_nodata(kvm_s390_pv_get_handle(kvm), + UVC_CMD_DESTROY_SEC_CONF, rc, rrc); + WRITE_ONCE(kvm->arch.gmap->guest_handle, 0); + atomic_set(&kvm->mm->context.is_protected, 0); + KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc); + WARN_ONCE(cc, "protvirt destroy vm failed rc %x rrc %x", *rc, *rrc); + /* Inteded memory leak on "impossible" error */ + if (!cc) + kvm_s390_pv_dealloc_vm(kvm); + return cc ? -EIO : 0; +} + +int kvm_s390_pv_init_vm(struct kvm *kvm, u16 *rc, u16 *rrc) +{ + struct uv_cb_cgc uvcb = { + .header.cmd = UVC_CMD_CREATE_SEC_CONF, + .header.len = sizeof(uvcb) + }; + int cc, ret; + u16 dummy; + + ret = kvm_s390_pv_alloc_vm(kvm); + if (ret) + return ret; + + /* Inputs */ + uvcb.guest_stor_origin = 0; /* MSO is 0 for KVM */ + uvcb.guest_stor_len = kvm->arch.pv.guest_len; + uvcb.guest_asce = kvm->arch.gmap->asce; + uvcb.guest_sca = (unsigned long)kvm->arch.sca; + uvcb.conf_base_stor_origin = (u64)kvm->arch.pv.stor_base; + uvcb.conf_virt_stor_origin = (u64)kvm->arch.pv.stor_var; + + cc = uv_call(0, (u64)&uvcb); + *rc = uvcb.header.rc; + *rrc = uvcb.header.rrc; + KVM_UV_EVENT(kvm, 3, "PROTVIRT CREATE VM: handle %llx len %llx rc %x rrc %x", + uvcb.guest_handle, uvcb.guest_stor_len, *rc, *rrc); + + /* Outputs */ + kvm->arch.pv.handle = uvcb.guest_handle; + + if (cc) { + if (uvcb.header.rc & UVC_RC_NEED_DESTROY) + kvm_s390_pv_deinit_vm(kvm, &dummy, &dummy); + else + kvm_s390_pv_dealloc_vm(kvm); + return -EIO; + } + kvm->arch.gmap->guest_handle = uvcb.guest_handle; + atomic_set(&kvm->mm->context.is_protected, 1); + return 0; +} + +int kvm_s390_pv_set_sec_parms(struct kvm *kvm, void *hdr, u64 length, u16 *rc, + u16 *rrc) +{ + struct uv_cb_ssc uvcb = { + .header.cmd = UVC_CMD_SET_SEC_CONF_PARAMS, + .header.len = sizeof(uvcb), + .sec_header_origin = (u64)hdr, + .sec_header_len = length, + .guest_handle = kvm_s390_pv_get_handle(kvm), + }; + int cc = uv_call(0, (u64)&uvcb); + + *rc = uvcb.header.rc; + *rrc = uvcb.header.rrc; + KVM_UV_EVENT(kvm, 3, "PROTVIRT VM SET PARMS: rc %x rrc %x", + *rc, *rrc); + return cc ? -EINVAL : 0; +} + +static int unpack_one(struct kvm *kvm, unsigned long addr, u64 tweak, + u64 offset, u16 *rc, u16 *rrc) +{ + struct uv_cb_unp uvcb = { + .header.cmd = UVC_CMD_UNPACK_IMG, + .header.len = sizeof(uvcb), + .guest_handle = kvm_s390_pv_get_handle(kvm), + .gaddr = addr, + .tweak[0] = tweak, + .tweak[1] = offset, + }; + int ret = gmap_make_secure(kvm->arch.gmap, addr, &uvcb); + + *rc = uvcb.header.rc; + *rrc = uvcb.header.rrc; + + if (ret && ret != -EAGAIN) + KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: failed addr %llx with rc %x rrc %x", + uvcb.gaddr, *rc, *rrc); + return ret; +} + +int kvm_s390_pv_unpack(struct kvm *kvm, unsigned long addr, unsigned long size, + unsigned long tweak, u16 *rc, u16 *rrc) +{ + u64 offset = 0; + int ret = 0; + + if (addr & ~PAGE_MASK || !size || size & ~PAGE_MASK) + return -EINVAL; + + KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: start addr %lx size %lx", + addr, size); + + while (offset < size) { + ret = unpack_one(kvm, addr, tweak, offset, rc, rrc); + if (ret == -EAGAIN) { + cond_resched(); + if (fatal_signal_pending(current)) + break; + continue; + } + if (ret) + break; + addr += PAGE_SIZE; + offset += PAGE_SIZE; + } + if (!ret) + KVM_UV_EVENT(kvm, 3, "%s", "PROTVIRT VM UNPACK: successful"); + return ret; +} + +int kvm_s390_pv_set_cpu_state(struct kvm_vcpu *vcpu, u8 state) +{ + struct uv_cb_cpu_set_state uvcb = { + .header.cmd = UVC_CMD_CPU_SET_STATE, + .header.len = sizeof(uvcb), + .cpu_handle = kvm_s390_pv_cpu_get_handle(vcpu), + .state = state, + }; + int cc; + + cc = uv_call(0, (u64)&uvcb); + KVM_UV_EVENT(vcpu->kvm, 3, "PROTVIRT SET CPU %d STATE %d rc %x rrc %x", + vcpu->vcpu_id, state, uvcb.header.rc, uvcb.header.rrc); + if (cc) + return -EINVAL; + return 0; +} --- linux-riscv-5.4.0.orig/arch/s390/mm/fault.c +++ linux-riscv-5.4.0/arch/s390/mm/fault.c @@ -38,6 +38,7 @@ #include #include #include +#include #include "../kernel/entry.h" #define __FAIL_ADDR_MASK -4096L @@ -816,3 +817,80 @@ early_initcall(pfault_irq_init); #endif /* CONFIG_PFAULT */ + +#if IS_ENABLED(CONFIG_PGSTE) +void do_secure_storage_access(struct pt_regs *regs) +{ + unsigned long addr = regs->int_parm_long & __FAIL_ADDR_MASK; + struct vm_area_struct *vma; + struct mm_struct *mm; + struct page *page; + int rc; + + switch (get_fault_type(regs)) { + case USER_FAULT: + mm = current->mm; + down_read(&mm->mmap_sem); + vma = find_vma(mm, addr); + if (!vma) { + up_read(&mm->mmap_sem); + do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP); + break; + } + page = follow_page(vma, addr, FOLL_WRITE | FOLL_GET); + if (IS_ERR_OR_NULL(page)) { + up_read(&mm->mmap_sem); + break; + } + if (arch_make_page_accessible(page)) + send_sig(SIGSEGV, current, 0); + put_page(page); + up_read(&mm->mmap_sem); + break; + case KERNEL_FAULT: + page = phys_to_page(addr); + if (unlikely(!try_get_page(page))) + break; + rc = arch_make_page_accessible(page); + put_page(page); + if (rc) + BUG(); + break; + case VDSO_FAULT: + /* fallthrough */ + case GMAP_FAULT: + /* fallthrough */ + default: + do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP); + WARN_ON_ONCE(1); + } +} +NOKPROBE_SYMBOL(do_secure_storage_access); + +void do_non_secure_storage_access(struct pt_regs *regs) +{ + unsigned long gaddr = regs->int_parm_long & __FAIL_ADDR_MASK; + struct gmap *gmap = (struct gmap *)S390_lowcore.gmap; + + if (get_fault_type(regs) != GMAP_FAULT) { + do_fault_error(regs, VM_READ | VM_WRITE, VM_FAULT_BADMAP); + WARN_ON_ONCE(1); + return; + } + + if (gmap_convert_to_secure(gmap, gaddr) == -EINVAL) + send_sig(SIGSEGV, current, 0); +} +NOKPROBE_SYMBOL(do_non_secure_storage_access); + +#else +void do_secure_storage_access(struct pt_regs *regs) +{ + default_trap_handler(regs); +} + +void do_non_secure_storage_access(struct pt_regs *regs) +{ + default_trap_handler(regs); +} +#endif --- linux-riscv-5.4.0.orig/arch/s390/mm/gmap.c +++ linux-riscv-5.4.0/arch/s390/mm/gmap.c @@ -2548,6 +2548,22 @@ } EXPORT_SYMBOL_GPL(s390_enable_sie); +int gmap_mark_unmergeable(void) +{ + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + + for (vma = mm->mmap; vma; vma = vma->vm_next) { + if (ksm_madvise(vma, vma->vm_start, vma->vm_end, + MADV_UNMERGEABLE, &vma->vm_flags)) { + return -ENOMEM; + } + } + mm->def_flags &= ~VM_MERGEABLE; + return 0; +} +EXPORT_SYMBOL_GPL(gmap_mark_unmergeable); + /* * Enable storage key handling from now on and initialize the storage * keys with the default key. @@ -2593,7 +2609,6 @@ int s390_enable_skey(void) { struct mm_struct *mm = current->mm; - struct vm_area_struct *vma; int rc = 0; down_write(&mm->mmap_sem); @@ -2601,16 +2616,11 @@ goto out_up; mm->context.uses_skeys = 1; - for (vma = mm->mmap; vma; vma = vma->vm_next) { - if (ksm_madvise(vma, vma->vm_start, vma->vm_end, - MADV_UNMERGEABLE, &vma->vm_flags)) { - mm->context.uses_skeys = 0; - rc = -ENOMEM; - goto out_up; - } + rc = gmap_mark_unmergeable(); + if (rc) { + mm->context.uses_skeys = 0; + goto out_up; } - mm->def_flags &= ~VM_MERGEABLE; - walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, NULL); out_up: @@ -2640,3 +2650,38 @@ up_write(&mm->mmap_sem); } EXPORT_SYMBOL_GPL(s390_reset_cmma); + +/* + * make inaccessible pages accessible again + */ +static int __s390_reset_acc(pte_t *ptep, unsigned long addr, + unsigned long next, struct mm_walk *walk) +{ + pte_t pte = READ_ONCE(*ptep); + + if (pte_present(pte)) + WARN_ON_ONCE(uv_convert_from_secure(pte_val(pte) & PAGE_MASK)); + return 0; +} + +static const struct mm_walk_ops reset_acc_walk_ops = { + .pte_entry = __s390_reset_acc, +}; + +#include +void s390_reset_acc(struct mm_struct *mm) +{ + /* + * we might be called during + * reset: we walk the pages and clear + * close of all kvm file descriptors: we walk the pages and clear + * exit of process on fd closure: vma already gone, do nothing + */ + if (!mmget_not_zero(mm)) + return; + down_read(&mm->mmap_sem); + walk_page_range(mm, 0, TASK_SIZE, &reset_acc_walk_ops, NULL); + up_read(&mm->mmap_sem); + mmput(mm); +} +EXPORT_SYMBOL_GPL(s390_reset_acc); --- linux-riscv-5.4.0.orig/arch/s390/mm/hugetlbpage.c +++ linux-riscv-5.4.0/arch/s390/mm/hugetlbpage.c @@ -2,7 +2,7 @@ /* * IBM System z Huge TLB Page Support for Kernel. * - * Copyright IBM Corp. 2007,2016 + * Copyright IBM Corp. 2007,2020 * Author(s): Gerald Schaefer */ @@ -11,6 +11,9 @@ #include #include +#include +#include +#include /* * If the bit selected by single-bit bitmask "a" is set within "x", move @@ -267,3 +270,98 @@ return 1; } __setup("hugepagesz=", setup_hugepagesz); + +static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct vm_unmapped_area_info info; + + info.flags = 0; + info.length = len; + info.low_limit = current->mm->mmap_base; + info.high_limit = TASK_SIZE; + info.align_mask = PAGE_MASK & ~huge_page_mask(h); + info.align_offset = 0; + return vm_unmapped_area(&info); +} + +static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file, + unsigned long addr0, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct vm_unmapped_area_info info; + unsigned long addr; + + info.flags = VM_UNMAPPED_AREA_TOPDOWN; + info.length = len; + info.low_limit = max(PAGE_SIZE, mmap_min_addr); + info.high_limit = current->mm->mmap_base; + info.align_mask = PAGE_MASK & ~huge_page_mask(h); + info.align_offset = 0; + addr = vm_unmapped_area(&info); + + /* + * A failed mmap() very likely causes application failure, + * so fall back to the bottom-up function here. This scenario + * can happen with large stack limits and large mmap() + * allocations. + */ + if (addr & ~PAGE_MASK) { + VM_BUG_ON(addr != -ENOMEM); + info.flags = 0; + info.low_limit = TASK_UNMAPPED_BASE; + info.high_limit = TASK_SIZE; + addr = vm_unmapped_area(&info); + } + + return addr; +} + +unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + struct hstate *h = hstate_file(file); + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + int rc; + + if (len & ~huge_page_mask(h)) + return -EINVAL; + if (len > TASK_SIZE - mmap_min_addr) + return -ENOMEM; + + if (flags & MAP_FIXED) { + if (prepare_hugepage_range(file, addr, len)) + return -EINVAL; + goto check_asce_limit; + } + + if (addr) { + addr = ALIGN(addr, huge_page_size(h)); + vma = find_vma(mm, addr); + if (TASK_SIZE - len >= addr && addr >= mmap_min_addr && + (!vma || addr + len <= vm_start_gap(vma))) + goto check_asce_limit; + } + + if (mm->get_unmapped_area == arch_get_unmapped_area) + addr = hugetlb_get_unmapped_area_bottomup(file, addr, len, + pgoff, flags); + else + addr = hugetlb_get_unmapped_area_topdown(file, addr, len, + pgoff, flags); + if (addr & ~PAGE_MASK) + return addr; + +check_asce_limit: + if (addr + len > current->mm->context.asce_limit && + addr + len <= TASK_SIZE) { + rc = crst_table_upgrade(mm, addr + len); + if (rc) + return (unsigned long) rc; + } + return addr; +} --- linux-riscv-5.4.0.orig/arch/s390/mm/init.c +++ linux-riscv-5.4.0/arch/s390/mm/init.c @@ -291,10 +291,8 @@ { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); vmem_remove_mapping(start, size); } #endif /* CONFIG_MEMORY_HOTPLUG */ --- linux-riscv-5.4.0.orig/arch/s390/mm/maccess.c +++ linux-riscv-5.4.0/arch/s390/mm/maccess.c @@ -70,7 +70,7 @@ spin_unlock_irqrestore(&s390_kernel_write_lock, flags); } -static int __memcpy_real(void *dest, void *src, size_t count) +static int __no_sanitize_address __memcpy_real(void *dest, void *src, size_t count) { register unsigned long _dest asm("2") = (unsigned long) dest; register unsigned long _len1 asm("3") = (unsigned long) count; @@ -91,19 +91,23 @@ return rc; } -static unsigned long _memcpy_real(unsigned long dest, unsigned long src, - unsigned long count) +static unsigned long __no_sanitize_address _memcpy_real(unsigned long dest, + unsigned long src, + unsigned long count) { int irqs_disabled, rc; unsigned long flags; if (!count) return 0; - flags = __arch_local_irq_stnsm(0xf8UL); + flags = arch_local_irq_save(); irqs_disabled = arch_irqs_disabled_flags(flags); if (!irqs_disabled) trace_hardirqs_off(); + __arch_local_irq_stnsm(0xf8); // disable DAT rc = __memcpy_real((void *) dest, (void *) src, (size_t) count); + if (flags & PSW_MASK_DAT) + __arch_local_irq_stosm(0x04); // enable DAT if (!irqs_disabled) trace_hardirqs_on(); __arch_local_irq_ssm(flags); @@ -115,9 +119,15 @@ */ int memcpy_real(void *dest, void *src, size_t count) { - if (S390_lowcore.nodat_stack != 0) - return CALL_ON_STACK(_memcpy_real, S390_lowcore.nodat_stack, - 3, dest, src, count); + int rc; + + if (S390_lowcore.nodat_stack != 0) { + preempt_disable(); + rc = CALL_ON_STACK(_memcpy_real, S390_lowcore.nodat_stack, 3, + dest, src, count); + preempt_enable(); + return rc; + } /* * This is a really early memcpy_real call, the stacks are * not set up yet. Just call _memcpy_real on the early boot --- linux-riscv-5.4.0.orig/arch/s390/net/bpf_jit_comp.c +++ linux-riscv-5.4.0/arch/s390/net/bpf_jit_comp.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -1369,7 +1370,7 @@ } memset(&jit, 0, sizeof(jit)); - jit.addrs = kcalloc(fp->len + 1, sizeof(*jit.addrs), GFP_KERNEL); + jit.addrs = kvcalloc(fp->len + 1, sizeof(*jit.addrs), GFP_KERNEL); if (jit.addrs == NULL) { fp = orig_fp; goto out; @@ -1422,7 +1423,7 @@ if (!fp->is_func || extra_pass) { bpf_prog_fill_jited_linfo(fp, jit.addrs + 1); free_addrs: - kfree(jit.addrs); + kvfree(jit.addrs); kfree(jit_data); fp->aux->jit_data = NULL; } --- linux-riscv-5.4.0.orig/arch/s390/pci/pci.c +++ linux-riscv-5.4.0/arch/s390/pci/pci.c @@ -423,7 +423,7 @@ if (zpci_use_mio(zdev)) pdev->resource[i].start = - (resource_size_t __force) zdev->bars[i].mio_wb; + (resource_size_t __force) zdev->bars[i].mio_wt; else pdev->resource[i].start = (resource_size_t __force) pci_iomap_range_fh(pdev, i, 0, 0); @@ -530,7 +530,7 @@ flags |= IORESOURCE_MEM_64; if (zpci_use_mio(zdev)) - addr = (unsigned long) zdev->bars[i].mio_wb; + addr = (unsigned long) zdev->bars[i].mio_wt; else addr = ZPCI_ADDR(entry); size = 1UL << zdev->bars[i].size; @@ -934,5 +934,5 @@ void zpci_rescan(void) { if (zpci_is_enabled()) - clp_rescan_pci_devices_simple(); + clp_rescan_pci_devices_simple(NULL); } --- linux-riscv-5.4.0.orig/arch/s390/pci/pci_clp.c +++ linux-riscv-5.4.0/arch/s390/pci/pci_clp.c @@ -240,12 +240,14 @@ } /* - * Enable/Disable a given PCI function defined by its function handle. + * Enable/Disable a given PCI function and update its function handle if + * necessary */ -static int clp_set_pci_fn(u32 *fh, u8 nr_dma_as, u8 command) +static int clp_set_pci_fn(struct zpci_dev *zdev, u8 nr_dma_as, u8 command) { struct clp_req_rsp_set_pci *rrb; int rc, retries = 100; + u32 fid = zdev->fid; rrb = clp_alloc_block(GFP_KERNEL); if (!rrb) @@ -256,7 +258,7 @@ rrb->request.hdr.len = sizeof(rrb->request); rrb->request.hdr.cmd = CLP_SET_PCI_FN; rrb->response.hdr.len = sizeof(rrb->response); - rrb->request.fh = *fh; + rrb->request.fh = zdev->fh; rrb->request.oc = command; rrb->request.ndas = nr_dma_as; @@ -269,12 +271,17 @@ } } while (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY); - if (!rc && rrb->response.hdr.rsp == CLP_RC_OK) - *fh = rrb->response.fh; - else { + if (rc || rrb->response.hdr.rsp != CLP_RC_OK) { zpci_err("Set PCI FN:\n"); zpci_err_clp(rrb->response.hdr.rsp, rc); - rc = -EIO; + } + + if (!rc && rrb->response.hdr.rsp == CLP_RC_OK) { + zdev->fh = rrb->response.fh; + } else if (!rc && rrb->response.hdr.rsp == CLP_RC_SETPCIFN_ALRDY && + rrb->response.fh == 0) { + /* Function is already in desired state - update handle */ + rc = clp_rescan_pci_devices_simple(&fid); } clp_free_block(rrb); return rc; @@ -282,18 +289,17 @@ int clp_enable_fh(struct zpci_dev *zdev, u8 nr_dma_as) { - u32 fh = zdev->fh; int rc; - rc = clp_set_pci_fn(&fh, nr_dma_as, CLP_SET_ENABLE_PCI_FN); - zpci_dbg(3, "ena fid:%x, fh:%x, rc:%d\n", zdev->fid, fh, rc); + rc = clp_set_pci_fn(zdev, nr_dma_as, CLP_SET_ENABLE_PCI_FN); + zpci_dbg(3, "ena fid:%x, fh:%x, rc:%d\n", zdev->fid, zdev->fh, rc); if (rc) goto out; - zdev->fh = fh; if (zpci_use_mio(zdev)) { - rc = clp_set_pci_fn(&fh, nr_dma_as, CLP_SET_ENABLE_MIO); - zpci_dbg(3, "ena mio fid:%x, fh:%x, rc:%d\n", zdev->fid, fh, rc); + rc = clp_set_pci_fn(zdev, nr_dma_as, CLP_SET_ENABLE_MIO); + zpci_dbg(3, "ena mio fid:%x, fh:%x, rc:%d\n", + zdev->fid, zdev->fh, rc); if (rc) clp_disable_fh(zdev); } @@ -309,11 +315,8 @@ if (!zdev_enabled(zdev)) return 0; - rc = clp_set_pci_fn(&fh, 0, CLP_SET_DISABLE_PCI_FN); + rc = clp_set_pci_fn(zdev, 0, CLP_SET_DISABLE_PCI_FN); zpci_dbg(3, "dis fid:%x, fh:%x, rc:%d\n", zdev->fid, fh, rc); - if (!rc) - zdev->fh = fh; - return rc; } @@ -370,10 +373,14 @@ static void __clp_update(struct clp_fh_list_entry *entry, void *data) { struct zpci_dev *zdev; + u32 *fid = data; if (!entry->vendor_id) return; + if (fid && *fid != entry->fid) + return; + zdev = get_zdev_by_fid(entry->fid); if (!zdev) return; @@ -413,7 +420,10 @@ return rc; } -int clp_rescan_pci_devices_simple(void) +/* Rescan PCI functions and refresh function handles. If fid is non-NULL only + * refresh the handle of the function matching @fid + */ +int clp_rescan_pci_devices_simple(u32 *fid) { struct clp_req_rsp_list_pci *rrb; int rc; @@ -422,7 +432,7 @@ if (!rrb) return -ENOMEM; - rc = clp_list_pci(rrb, NULL, __clp_update); + rc = clp_list_pci(rrb, fid, __clp_update); clp_free_block(rrb); return rc; --- linux-riscv-5.4.0.orig/arch/s390/pci/pci_sysfs.c +++ linux-riscv-5.4.0/arch/s390/pci/pci_sysfs.c @@ -13,6 +13,8 @@ #include #include +#include "../../../drivers/pci/pci.h" + #include #define zpci_attr(name, fmt, member) \ @@ -49,31 +51,50 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { + struct kernfs_node *kn; struct pci_dev *pdev = to_pci_dev(dev); struct zpci_dev *zdev = to_zpci(pdev); - int ret; - - if (!device_remove_file_self(dev, attr)) - return count; + int ret = 0; + /* Can't use device_remove_self() here as that would lead us to lock + * the pci_rescan_remove_lock while holding the device' kernfs lock. + * This would create a possible deadlock with disable_slot() which is + * not directly protected by the device' kernfs lock but takes it + * during the device removal which happens under + * pci_rescan_remove_lock. + * + * This is analogous to sdev_store_delete() in + * drivers/scsi/scsi_sysfs.c + */ + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr); + WARN_ON_ONCE(!kn); + /* device_remove_file() serializes concurrent calls ignoring all but + * the first + */ + device_remove_file(dev, attr); + + /* A concurrent call to recover_store() may slip between + * sysfs_break_active_protection() and the sysfs file removal. + * Once it unblocks from pci_lock_rescan_remove() the original pdev + * will already be removed. + */ pci_lock_rescan_remove(); - pci_stop_and_remove_bus_device(pdev); - ret = zpci_disable_device(zdev); - if (ret) - goto error; - - ret = zpci_enable_device(zdev); - if (ret) - goto error; - - pci_rescan_bus(zdev->bus); - pci_unlock_rescan_remove(); - - return count; - -error: + if (pci_dev_is_added(pdev)) { + pci_stop_and_remove_bus_device(pdev); + ret = zpci_disable_device(zdev); + if (ret) + goto out; + + ret = zpci_enable_device(zdev); + if (ret) + goto out; + pci_rescan_bus(zdev->bus); + } +out: pci_unlock_rescan_remove(); - return ret; + if (kn) + sysfs_unbreak_active_protection(kn); + return ret ? ret : count; } static DEVICE_ATTR_WO(recover); --- linux-riscv-5.4.0.orig/arch/s390/purgatory/Makefile +++ linux-riscv-5.4.0/arch/s390/purgatory/Makefile @@ -15,8 +15,10 @@ $(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S FORCE $(call if_changed_rule,as_o_S) -$(obj)/string.o: $(srctree)/arch/s390/lib/string.c FORCE - $(call if_changed_rule,cc_o_c) +KCOV_INSTRUMENT := n +GCOV_PROFILE := n +UBSAN_SANITIZE := n +KASAN_SANITIZE := n KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes KBUILD_CFLAGS += -Wno-pointer-sign -Wno-sign-compare --- linux-riscv-5.4.0.orig/arch/s390/purgatory/string.c +++ linux-riscv-5.4.0/arch/s390/purgatory/string.c @@ -0,0 +1,3 @@ +// SPDX-License-Identifier: GPL-2.0 +#define __HAVE_ARCH_MEMCMP /* arch function */ +#include "../lib/string.c" --- linux-riscv-5.4.0.orig/arch/sh/include/cpu-sh2a/cpu/sh7269.h +++ linux-riscv-5.4.0/arch/sh/include/cpu-sh2a/cpu/sh7269.h @@ -78,8 +78,15 @@ GPIO_FN_WDTOVF, /* CAN */ - GPIO_FN_CTX1, GPIO_FN_CRX1, GPIO_FN_CTX0, GPIO_FN_CTX0_CTX1, - GPIO_FN_CRX0, GPIO_FN_CRX0_CRX1, GPIO_FN_CRX0_CRX1_CRX2, + GPIO_FN_CTX2, GPIO_FN_CRX2, + GPIO_FN_CTX1, GPIO_FN_CRX1, + GPIO_FN_CTX0, GPIO_FN_CRX0, + GPIO_FN_CTX0_CTX1, GPIO_FN_CRX0_CRX1, + GPIO_FN_CTX0_CTX1_CTX2, GPIO_FN_CRX0_CRX1_CRX2, + GPIO_FN_CTX2_PJ21, GPIO_FN_CRX2_PJ20, + GPIO_FN_CTX1_PJ23, GPIO_FN_CRX1_PJ22, + GPIO_FN_CTX0_CTX1_PJ23, GPIO_FN_CRX0_CRX1_PJ22, + GPIO_FN_CTX0_CTX1_CTX2_PJ21, GPIO_FN_CRX0_CRX1_CRX2_PJ20, /* DMAC */ GPIO_FN_TEND0, GPIO_FN_DACK0, GPIO_FN_DREQ0, --- linux-riscv-5.4.0.orig/arch/sh/include/cpu-sh4/cpu/sh7734.h +++ linux-riscv-5.4.0/arch/sh/include/cpu-sh4/cpu/sh7734.h @@ -134,7 +134,7 @@ GPIO_FN_EX_WAIT1, GPIO_FN_SD1_DAT0_A, GPIO_FN_DREQ2, GPIO_FN_CAN1_TX_C, GPIO_FN_ET0_LINK_C, GPIO_FN_ET0_ETXD5_A, GPIO_FN_EX_WAIT0, GPIO_FN_TCLK1_B, - GPIO_FN_RD_WR, GPIO_FN_TCLK0, + GPIO_FN_RD_WR, GPIO_FN_TCLK0, GPIO_FN_CAN_CLK_B, GPIO_FN_ET0_ETXD4, GPIO_FN_EX_CS5, GPIO_FN_SD1_CMD_A, GPIO_FN_ATADIR, GPIO_FN_QSSL_B, GPIO_FN_ET0_ETXD3_A, GPIO_FN_EX_CS4, GPIO_FN_SD1_WP_A, GPIO_FN_ATAWR, GPIO_FN_QMI_QIO1_B, --- linux-riscv-5.4.0.orig/arch/sh/mm/init.c +++ linux-riscv-5.4.0/arch/sh/mm/init.c @@ -434,9 +434,7 @@ { unsigned long start_pfn = PFN_DOWN(start); unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ --- linux-riscv-5.4.0.orig/arch/sparc/Kconfig +++ linux-riscv-5.4.0/arch/sparc/Kconfig @@ -65,7 +65,6 @@ select HAVE_KRETPROBES select HAVE_KPROBES select HAVE_RCU_TABLE_FREE if SMP - select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE select HAVE_MEMBLOCK_NODE_MAP select HAVE_ARCH_TRANSPARENT_HUGEPAGE select HAVE_DYNAMIC_FTRACE --- linux-riscv-5.4.0.orig/arch/sparc/include/asm/io_64.h +++ linux-riscv-5.4.0/arch/sparc/include/asm/io_64.h @@ -407,6 +407,7 @@ } #define ioremap_nocache(X,Y) ioremap((X),(Y)) +#define ioremap_uc(X,Y) ioremap((X),(Y)) #define ioremap_wc(X,Y) ioremap((X),(Y)) #define ioremap_wt(X,Y) ioremap((X),(Y)) --- linux-riscv-5.4.0.orig/arch/sparc/include/asm/tlb_64.h +++ linux-riscv-5.4.0/arch/sparc/include/asm/tlb_64.h @@ -28,6 +28,15 @@ #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) #define tlb_flush(tlb) flush_tlb_pending() +/* + * SPARC64's hardware TLB fill does not use the Linux page-tables + * and therefore we don't need a TLBI when freeing page-table pages. + */ + +#ifdef CONFIG_HAVE_RCU_TABLE_FREE +#define tlb_needs_table_invalidate() (false) +#endif + #include #endif /* _SPARC64_TLB_H */ --- linux-riscv-5.4.0.orig/arch/sparc/include/uapi/asm/ipcbuf.h +++ linux-riscv-5.4.0/arch/sparc/include/uapi/asm/ipcbuf.h @@ -15,19 +15,19 @@ struct ipc64_perm { - __kernel_key_t key; - __kernel_uid_t uid; - __kernel_gid_t gid; - __kernel_uid_t cuid; - __kernel_gid_t cgid; + __kernel_key_t key; + __kernel_uid32_t uid; + __kernel_gid32_t gid; + __kernel_uid32_t cuid; + __kernel_gid32_t cgid; #ifndef __arch64__ - unsigned short __pad0; + unsigned short __pad0; #endif - __kernel_mode_t mode; - unsigned short __pad1; - unsigned short seq; - unsigned long long __unused1; - unsigned long long __unused2; + __kernel_mode_t mode; + unsigned short __pad1; + unsigned short seq; + unsigned long long __unused1; + unsigned long long __unused2; }; #endif /* __SPARC_IPCBUF_H */ --- linux-riscv-5.4.0.orig/arch/sparc/kernel/vmlinux.lds.S +++ linux-riscv-5.4.0/arch/sparc/kernel/vmlinux.lds.S @@ -172,12 +172,14 @@ } PERCPU_SECTION(SMP_CACHE_BYTES) -#ifdef CONFIG_JUMP_LABEL . = ALIGN(PAGE_SIZE); .exit.text : { EXIT_TEXT } -#endif + + .exit.data : { + EXIT_DATA + } . = ALIGN(PAGE_SIZE); __init_end = .; --- linux-riscv-5.4.0.orig/arch/um/Kconfig +++ linux-riscv-5.4.0/arch/um/Kconfig @@ -14,6 +14,7 @@ select HAVE_FUTEX_CMPXCHG if FUTEX select HAVE_DEBUG_KMEMLEAK select HAVE_DEBUG_BUGVERBOSE + select HAVE_COPY_THREAD_TLS select GENERIC_IRQ_SHOW select GENERIC_CPU_DEVICES select GENERIC_CLOCKEVENTS --- linux-riscv-5.4.0.orig/arch/um/drivers/Kconfig +++ linux-riscv-5.4.0/arch/um/drivers/Kconfig @@ -337,7 +337,7 @@ endmenu config VIRTIO_UML - tristate "UML driver for virtio devices" + bool "UML driver for virtio devices" select VIRTIO help This driver provides support for virtio based paravirtual device --- linux-riscv-5.4.0.orig/arch/um/drivers/virtio_uml.c +++ linux-riscv-5.4.0/arch/um/drivers/virtio_uml.c @@ -4,12 +4,12 @@ * * Copyright(c) 2019 Intel Corporation * - * This module allows virtio devices to be used over a vhost-user socket. + * This driver allows virtio devices to be used over a vhost-user socket. * * Guest devices can be instantiated by kernel module or command line * parameters. One device will be created for each parameter. Syntax: * - * [virtio_uml.]device=:[:] + * virtio_uml.device=:[:] * where: * := vhost-user socket path to connect * := virtio device id (as in virtio_ids.h) @@ -83,7 +83,7 @@ return 0; } -static int full_read(int fd, void *buf, int len) +static int full_read(int fd, void *buf, int len, bool abortable) { int rc; @@ -93,7 +93,7 @@ buf += rc; len -= rc; } - } while (len && (rc > 0 || rc == -EINTR)); + } while (len && (rc > 0 || rc == -EINTR || (!abortable && rc == -EAGAIN))); if (rc < 0) return rc; @@ -104,7 +104,7 @@ static int vhost_user_recv_header(int fd, struct vhost_user_msg *msg) { - return full_read(fd, msg, sizeof(msg->header)); + return full_read(fd, msg, sizeof(msg->header), true); } static int vhost_user_recv(int fd, struct vhost_user_msg *msg, @@ -118,7 +118,7 @@ size = msg->header.size; if (size > max_payload_size) return -EPROTO; - return full_read(fd, &msg->payload, size); + return full_read(fd, &msg->payload, size, false); } static int vhost_user_recv_resp(struct virtio_uml_device *vu_dev, --- linux-riscv-5.4.0.orig/arch/um/include/asm/common.lds.S +++ linux-riscv-5.4.0/arch/um/include/asm/common.lds.S @@ -83,8 +83,8 @@ __preinit_array_end = .; } .init_array : { - /* dummy - we call this ourselves */ __init_array_start = .; + *(.init_array) __init_array_end = .; } .fini_array : { --- linux-riscv-5.4.0.orig/arch/um/include/asm/ptrace-generic.h +++ linux-riscv-5.4.0/arch/um/include/asm/ptrace-generic.h @@ -36,7 +36,7 @@ extern unsigned long getreg(struct task_struct *child, int regno); extern int putreg(struct task_struct *child, int regno, unsigned long value); -extern int arch_copy_tls(struct task_struct *new); +extern int arch_set_tls(struct task_struct *new, unsigned long tls); extern void clear_flushed_tls(struct task_struct *task); extern int syscall_trace_enter(struct pt_regs *regs); extern void syscall_trace_leave(struct pt_regs *regs); --- linux-riscv-5.4.0.orig/arch/um/kernel/dyn.lds.S +++ linux-riscv-5.4.0/arch/um/kernel/dyn.lds.S @@ -103,6 +103,7 @@ be empty, which isn't pretty. */ . = ALIGN(32 / 8); .preinit_array : { *(.preinit_array) } + .init_array : { *(.init_array) } .fini_array : { *(.fini_array) } .data : { INIT_TASK_DATA(KERNEL_STACK_SIZE) --- linux-riscv-5.4.0.orig/arch/um/kernel/process.c +++ linux-riscv-5.4.0/arch/um/kernel/process.c @@ -153,8 +153,8 @@ userspace(¤t->thread.regs.regs, current_thread_info()->aux_fp_regs); } -int copy_thread(unsigned long clone_flags, unsigned long sp, - unsigned long arg, struct task_struct * p) +int copy_thread_tls(unsigned long clone_flags, unsigned long sp, + unsigned long arg, struct task_struct * p, unsigned long tls) { void (*handler)(void); int kthread = current->flags & PF_KTHREAD; @@ -188,7 +188,7 @@ * Set a new TLS for the child thread? */ if (clone_flags & CLONE_SETTLS) - ret = arch_copy_tls(p); + ret = arch_set_tls(p, tls); } return ret; --- linux-riscv-5.4.0.orig/arch/um/os-Linux/main.c +++ linux-riscv-5.4.0/arch/um/os-Linux/main.c @@ -170,7 +170,7 @@ * that they won't be delivered after the exec, when * they are definitely not expected. */ - unblock_signals_trace(); + unblock_signals(); os_info("\n"); /* Reboot */ --- linux-riscv-5.4.0.orig/arch/x86/boot/compressed/acpi.c +++ linux-riscv-5.4.0/arch/x86/boot/compressed/acpi.c @@ -393,7 +393,13 @@ table = table_addr + sizeof(struct acpi_table_srat); while (table + sizeof(struct acpi_subtable_header) < table_end) { + sub_table = (struct acpi_subtable_header *)table; + if (!sub_table->length) { + debug_putstr("Invalid zero length SRAT subtable.\n"); + return 0; + } + if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) { struct acpi_srat_mem_affinity *ma; --- linux-riscv-5.4.0.orig/arch/x86/boot/compressed/head_64.S +++ linux-riscv-5.4.0/arch/x86/boot/compressed/head_64.S @@ -244,6 +244,11 @@ leal efi32_config(%ebp), %eax movl %eax, efi_config(%ebp) + /* Disable paging */ + movl %cr0, %eax + btrl $X86_CR0_PG_BIT, %eax + movl %eax, %cr0 + jmp startup_32 ENDPROC(efi32_stub_entry) #endif --- linux-riscv-5.4.0.orig/arch/x86/boot/compressed/kaslr_64.c +++ linux-riscv-5.4.0/arch/x86/boot/compressed/kaslr_64.c @@ -29,9 +29,6 @@ #define __PAGE_OFFSET __PAGE_OFFSET_BASE #include "../../mm/ident_map.c" -/* Used by pgtable.h asm code to force instruction serialization. */ -unsigned long __force_order; - /* Used to track our page table allocation area. */ struct alloc_pgt_data { unsigned char *pgt_buf; --- linux-riscv-5.4.0.orig/arch/x86/boot/video-vga.c +++ linux-riscv-5.4.0/arch/x86/boot/video-vga.c @@ -188,7 +188,7 @@ vga_set_vertical_end(60*8); } -static int vga_set_mode(struct mode_info *mode) +static int __attribute__((optimize("no-jump-tables"))) vga_set_mode(struct mode_info *mode) { /* Set the basic mode */ vga_set_basic_mode(); --- linux-riscv-5.4.0.orig/arch/x86/entry/entry_32.S +++ linux-riscv-5.4.0/arch/x86/entry/entry_32.S @@ -172,7 +172,7 @@ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI .if \no_user_check == 0 /* coming from usermode? */ - testl $SEGMENT_RPL_MASK, PT_CS(%esp) + testl $USER_SEGMENT_RPL_MASK, PT_CS(%esp) jz .Lend_\@ .endif /* On user-cr3? */ @@ -205,64 +205,76 @@ #define CS_FROM_ENTRY_STACK (1 << 31) #define CS_FROM_USER_CR3 (1 << 30) #define CS_FROM_KERNEL (1 << 29) +#define CS_FROM_ESPFIX (1 << 28) .macro FIXUP_FRAME /* * The high bits of the CS dword (__csh) are used for CS_FROM_*. * Clear them in case hardware didn't do this for us. */ - andl $0x0000ffff, 3*4(%esp) + andl $0x0000ffff, 4*4(%esp) #ifdef CONFIG_VM86 - testl $X86_EFLAGS_VM, 4*4(%esp) + testl $X86_EFLAGS_VM, 5*4(%esp) jnz .Lfrom_usermode_no_fixup_\@ #endif - testl $SEGMENT_RPL_MASK, 3*4(%esp) + testl $USER_SEGMENT_RPL_MASK, 4*4(%esp) jnz .Lfrom_usermode_no_fixup_\@ - orl $CS_FROM_KERNEL, 3*4(%esp) + orl $CS_FROM_KERNEL, 4*4(%esp) /* * When we're here from kernel mode; the (exception) stack looks like: * - * 5*4(%esp) - - * 4*4(%esp) - flags - * 3*4(%esp) - cs - * 2*4(%esp) - ip - * 1*4(%esp) - orig_eax - * 0*4(%esp) - gs / function + * 6*4(%esp) - + * 5*4(%esp) - flags + * 4*4(%esp) - cs + * 3*4(%esp) - ip + * 2*4(%esp) - orig_eax + * 1*4(%esp) - gs / function + * 0*4(%esp) - fs * * Lets build a 5 entry IRET frame after that, such that struct pt_regs * is complete and in particular regs->sp is correct. This gives us - * the original 5 enties as gap: + * the original 6 enties as gap: * - * 12*4(%esp) - - * 11*4(%esp) - gap / flags - * 10*4(%esp) - gap / cs - * 9*4(%esp) - gap / ip - * 8*4(%esp) - gap / orig_eax - * 7*4(%esp) - gap / gs / function - * 6*4(%esp) - ss - * 5*4(%esp) - sp - * 4*4(%esp) - flags - * 3*4(%esp) - cs - * 2*4(%esp) - ip - * 1*4(%esp) - orig_eax - * 0*4(%esp) - gs / function + * 14*4(%esp) - + * 13*4(%esp) - gap / flags + * 12*4(%esp) - gap / cs + * 11*4(%esp) - gap / ip + * 10*4(%esp) - gap / orig_eax + * 9*4(%esp) - gap / gs / function + * 8*4(%esp) - gap / fs + * 7*4(%esp) - ss + * 6*4(%esp) - sp + * 5*4(%esp) - flags + * 4*4(%esp) - cs + * 3*4(%esp) - ip + * 2*4(%esp) - orig_eax + * 1*4(%esp) - gs / function + * 0*4(%esp) - fs */ pushl %ss # ss pushl %esp # sp (points at ss) - addl $6*4, (%esp) # point sp back at the previous context - pushl 6*4(%esp) # flags - pushl 6*4(%esp) # cs - pushl 6*4(%esp) # ip - pushl 6*4(%esp) # orig_eax - pushl 6*4(%esp) # gs / function + addl $7*4, (%esp) # point sp back at the previous context + pushl 7*4(%esp) # flags + pushl 7*4(%esp) # cs + pushl 7*4(%esp) # ip + pushl 7*4(%esp) # orig_eax + pushl 7*4(%esp) # gs / function + pushl 7*4(%esp) # fs .Lfrom_usermode_no_fixup_\@: .endm .macro IRET_FRAME + /* + * We're called with %ds, %es, %fs, and %gs from the interrupted + * frame, so we shouldn't use them. Also, we may be in ESPFIX + * mode and therefore have a nonzero SS base and an offset ESP, + * so any attempt to access the stack needs to use SS. (except for + * accesses through %esp, which automatically use SS.) + */ testl $CS_FROM_KERNEL, 1*4(%esp) jz .Lfinished_frame_\@ @@ -276,31 +288,40 @@ movl 5*4(%esp), %eax # (modified) regs->sp movl 4*4(%esp), %ecx # flags - movl %ecx, -4(%eax) + movl %ecx, %ss:-1*4(%eax) movl 3*4(%esp), %ecx # cs andl $0x0000ffff, %ecx - movl %ecx, -8(%eax) + movl %ecx, %ss:-2*4(%eax) movl 2*4(%esp), %ecx # ip - movl %ecx, -12(%eax) + movl %ecx, %ss:-3*4(%eax) movl 1*4(%esp), %ecx # eax - movl %ecx, -16(%eax) + movl %ecx, %ss:-4*4(%eax) popl %ecx - lea -16(%eax), %esp + lea -4*4(%eax), %esp popl %eax .Lfinished_frame_\@: .endm -.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0 skip_gs=0 +.macro SAVE_ALL pt_regs_ax=%eax switch_stacks=0 skip_gs=0 unwind_espfix=0 cld .if \skip_gs == 0 PUSH_GS .endif - FIXUP_FRAME pushl %fs + + pushl %eax + movl $(__KERNEL_PERCPU), %eax + movl %eax, %fs +.if \unwind_espfix > 0 + UNWIND_ESPFIX_STACK +.endif + popl %eax + + FIXUP_FRAME pushl %es pushl %ds pushl \pt_regs_ax @@ -313,8 +334,6 @@ movl $(__USER_DS), %edx movl %edx, %ds movl %edx, %es - movl $(__KERNEL_PERCPU), %edx - movl %edx, %fs .if \skip_gs == 0 SET_KERNEL_GS %edx .endif @@ -324,8 +343,8 @@ .endif .endm -.macro SAVE_ALL_NMI cr3_reg:req - SAVE_ALL +.macro SAVE_ALL_NMI cr3_reg:req unwind_espfix=0 + SAVE_ALL unwind_espfix=\unwind_espfix BUG_IF_WRONG_CR3 @@ -357,6 +376,7 @@ 2: popl %es 3: popl %fs POP_GS \pop + IRET_FRAME .pushsection .fixup, "ax" 4: movl $0, (%esp) jmp 1b @@ -395,7 +415,8 @@ .macro CHECK_AND_APPLY_ESPFIX #ifdef CONFIG_X86_ESPFIX32 -#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + (GDT_ENTRY_ESPFIX_SS * 8) +#define GDT_ESPFIX_OFFSET (GDT_ENTRY_ESPFIX_SS * 8) +#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + GDT_ESPFIX_OFFSET ALTERNATIVE "jmp .Lend_\@", "", X86_BUG_ESPFIX @@ -1075,7 +1096,6 @@ /* Restore user state */ RESTORE_REGS pop=4 # skip orig_eax/error_code .Lirq_return: - IRET_FRAME /* * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization * when returning from IPI handler and when returning from @@ -1128,30 +1148,43 @@ * We can't call C functions using the ESPFIX stack. This code reads * the high word of the segment base from the GDT and swiches to the * normal stack and adjusts ESP with the matching offset. + * + * We might be on user CR3 here, so percpu data is not mapped and we can't + * access the GDT through the percpu segment. Instead, use SGDT to find + * the cpu_entry_area alias of the GDT. */ #ifdef CONFIG_X86_ESPFIX32 /* fixup the stack */ - mov GDT_ESPFIX_SS + 4, %al /* bits 16..23 */ - mov GDT_ESPFIX_SS + 7, %ah /* bits 24..31 */ + pushl %ecx + subl $2*4, %esp + sgdt (%esp) + movl 2(%esp), %ecx /* GDT address */ + /* + * Careful: ECX is a linear pointer, so we need to force base + * zero. %cs is the only known-linear segment we have right now. + */ + mov %cs:GDT_ESPFIX_OFFSET + 4(%ecx), %al /* bits 16..23 */ + mov %cs:GDT_ESPFIX_OFFSET + 7(%ecx), %ah /* bits 24..31 */ shl $16, %eax + addl $2*4, %esp + popl %ecx addl %esp, %eax /* the adjusted stack pointer */ pushl $__KERNEL_DS pushl %eax lss (%esp), %esp /* switch to the normal stack segment */ #endif .endm + .macro UNWIND_ESPFIX_STACK + /* It's safe to clobber %eax, all other regs need to be preserved */ #ifdef CONFIG_X86_ESPFIX32 movl %ss, %eax /* see if on espfix stack */ cmpw $__ESPFIX_SS, %ax - jne 27f - movl $__KERNEL_DS, %eax - movl %eax, %ds - movl %eax, %es + jne .Lno_fixup_\@ /* switch to normal stack */ FIXUP_ESPFIX_STACK -27: +.Lno_fixup_\@: #endif .endm @@ -1341,11 +1374,6 @@ #ifdef CONFIG_XEN_PV ENTRY(xen_hypervisor_callback) - pushl $-1 /* orig_ax = -1 => not a system call */ - SAVE_ALL - ENCODE_FRAME_POINTER - TRACE_IRQS_OFF - /* * Check to see if we got the event in the critical * region in xen_iret_direct, after we've reenabled @@ -1353,16 +1381,17 @@ * iret instruction's behaviour where it delivers a * pending interrupt when enabling interrupts: */ - movl PT_EIP(%esp), %eax - cmpl $xen_iret_start_crit, %eax + cmpl $xen_iret_start_crit, (%esp) jb 1f - cmpl $xen_iret_end_crit, %eax + cmpl $xen_iret_end_crit, (%esp) jae 1f - - jmp xen_iret_crit_fixup - -ENTRY(xen_do_upcall) -1: mov %esp, %eax + call xen_iret_crit_fixup +1: + pushl $-1 /* orig_ax = -1 => not a system call */ + SAVE_ALL + ENCODE_FRAME_POINTER + TRACE_IRQS_OFF + mov %esp, %eax call xen_evtchn_do_upcall #ifndef CONFIG_PREEMPTION call xen_maybe_preempt_hcall @@ -1449,10 +1478,9 @@ common_exception_read_cr2: /* the function address is in %gs's slot on the stack */ - SAVE_ALL switch_stacks=1 skip_gs=1 + SAVE_ALL switch_stacks=1 skip_gs=1 unwind_espfix=1 ENCODE_FRAME_POINTER - UNWIND_ESPFIX_STACK /* fixup %gs */ GS_TO_REG %ecx @@ -1474,9 +1502,8 @@ common_exception: /* the function address is in %gs's slot on the stack */ - SAVE_ALL switch_stacks=1 skip_gs=1 + SAVE_ALL switch_stacks=1 skip_gs=1 unwind_espfix=1 ENCODE_FRAME_POINTER - UNWIND_ESPFIX_STACK /* fixup %gs */ GS_TO_REG %ecx @@ -1515,6 +1542,10 @@ ASM_CLAC #ifdef CONFIG_X86_ESPFIX32 + /* + * ESPFIX_SS is only ever set on the return to user path + * after we've switched to the entry stack. + */ pushl %eax movl %ss, %eax cmpw $__ESPFIX_SS, %ax @@ -1550,6 +1581,11 @@ movl %ebx, %esp .Lnmi_return: +#ifdef CONFIG_X86_ESPFIX32 + testl $CS_FROM_ESPFIX, PT_CS(%esp) + jnz .Lnmi_from_espfix +#endif + CHECK_AND_APPLY_ESPFIX RESTORE_ALL_NMI cr3_reg=%edi pop=4 jmp .Lirq_return @@ -1557,23 +1593,42 @@ #ifdef CONFIG_X86_ESPFIX32 .Lnmi_espfix_stack: /* - * create the pointer to lss back + * Create the pointer to LSS back */ pushl %ss pushl %esp addl $4, (%esp) - /* copy the iret frame of 12 bytes */ - .rept 3 - pushl 16(%esp) - .endr - pushl %eax - SAVE_ALL_NMI cr3_reg=%edi + + /* Copy the (short) IRET frame */ + pushl 4*4(%esp) # flags + pushl 4*4(%esp) # cs + pushl 4*4(%esp) # ip + + pushl %eax # orig_ax + + SAVE_ALL_NMI cr3_reg=%edi unwind_espfix=1 ENCODE_FRAME_POINTER - FIXUP_ESPFIX_STACK # %eax == %esp + + /* clear CS_FROM_KERNEL, set CS_FROM_ESPFIX */ + xorl $(CS_FROM_ESPFIX | CS_FROM_KERNEL), PT_CS(%esp) + xorl %edx, %edx # zero error code - call do_nmi + movl %esp, %eax # pt_regs pointer + jmp .Lnmi_from_sysenter_stack + +.Lnmi_from_espfix: RESTORE_ALL_NMI cr3_reg=%edi - lss 12+4(%esp), %esp # back to espfix stack + /* + * Because we cleared CS_FROM_KERNEL, IRET_FRAME 'forgot' to + * fix up the gap and long frame: + * + * 3 - original frame (exception) + * 2 - ESPFIX block (above) + * 6 - gap (FIXUP_FRAME) + * 5 - long frame (FIXUP_FRAME) + * 1 - orig_ax + */ + lss (1+5+6)*4(%esp), %esp # back to espfix stack jmp .Lirq_return #endif END(nmi) --- linux-riscv-5.4.0.orig/arch/x86/entry/syscall_32.c +++ linux-riscv-5.4.0/arch/x86/entry/syscall_32.c @@ -10,13 +10,11 @@ #ifdef CONFIG_IA32_EMULATION /* On X86_64, we use struct pt_regs * to pass parameters to syscalls */ #define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *); - -/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */ -extern asmlinkage long sys_ni_syscall(const struct pt_regs *); - +#define __sys_ni_syscall __ia32_sys_ni_syscall #else /* CONFIG_IA32_EMULATION */ #define __SYSCALL_I386(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); extern asmlinkage long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); +#define __sys_ni_syscall sys_ni_syscall #endif /* CONFIG_IA32_EMULATION */ #include @@ -29,6 +27,6 @@ * Smells like a compiler bug -- it doesn't work * when the & below is removed. */ - [0 ... __NR_syscall_compat_max] = &sys_ni_syscall, + [0 ... __NR_syscall_compat_max] = &__sys_ni_syscall, #include }; --- linux-riscv-5.4.0.orig/arch/x86/entry/syscall_64.c +++ linux-riscv-5.4.0/arch/x86/entry/syscall_64.c @@ -4,11 +4,17 @@ #include #include #include +#include #include #include -/* this is a lie, but it does not hurt as sys_ni_syscall just returns -EINVAL */ -extern asmlinkage long sys_ni_syscall(const struct pt_regs *); +extern asmlinkage long sys_ni_syscall(void); + +SYSCALL_DEFINE0(ni_syscall) +{ + return sys_ni_syscall(); +} + #define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(const struct pt_regs *); #define __SYSCALL_X32(nr, sym, qual) __SYSCALL_64(nr, sym, qual) #include @@ -23,7 +29,7 @@ * Smells like a compiler bug -- it doesn't work * when the & below is removed. */ - [0 ... __NR_syscall_max] = &sys_ni_syscall, + [0 ... __NR_syscall_max] = &__x64_sys_ni_syscall, #include }; @@ -40,7 +46,7 @@ * Smells like a compiler bug -- it doesn't work * when the & below is removed. */ - [0 ... __NR_syscall_x32_max] = &sys_ni_syscall, + [0 ... __NR_syscall_x32_max] = &__x64_sys_ni_syscall, #include }; --- linux-riscv-5.4.0.orig/arch/x86/entry/syscalls/syscall_32.tbl +++ linux-riscv-5.4.0/arch/x86/entry/syscalls/syscall_32.tbl @@ -124,13 +124,13 @@ 110 i386 iopl sys_iopl __ia32_sys_iopl 111 i386 vhangup sys_vhangup __ia32_sys_vhangup 112 i386 idle -113 i386 vm86old sys_vm86old sys_ni_syscall +113 i386 vm86old sys_vm86old __ia32_sys_ni_syscall 114 i386 wait4 sys_wait4 __ia32_compat_sys_wait4 115 i386 swapoff sys_swapoff __ia32_sys_swapoff 116 i386 sysinfo sys_sysinfo __ia32_compat_sys_sysinfo 117 i386 ipc sys_ipc __ia32_compat_sys_ipc 118 i386 fsync sys_fsync __ia32_sys_fsync -119 i386 sigreturn sys_sigreturn sys32_sigreturn +119 i386 sigreturn sys_sigreturn __ia32_compat_sys_sigreturn 120 i386 clone sys_clone __ia32_compat_sys_x86_clone 121 i386 setdomainname sys_setdomainname __ia32_sys_setdomainname 122 i386 uname sys_newuname __ia32_sys_newuname @@ -177,14 +177,14 @@ 163 i386 mremap sys_mremap __ia32_sys_mremap 164 i386 setresuid sys_setresuid16 __ia32_sys_setresuid16 165 i386 getresuid sys_getresuid16 __ia32_sys_getresuid16 -166 i386 vm86 sys_vm86 sys_ni_syscall +166 i386 vm86 sys_vm86 __ia32_sys_ni_syscall 167 i386 query_module 168 i386 poll sys_poll __ia32_sys_poll 169 i386 nfsservctl 170 i386 setresgid sys_setresgid16 __ia32_sys_setresgid16 171 i386 getresgid sys_getresgid16 __ia32_sys_getresgid16 172 i386 prctl sys_prctl __ia32_sys_prctl -173 i386 rt_sigreturn sys_rt_sigreturn sys32_rt_sigreturn +173 i386 rt_sigreturn sys_rt_sigreturn __ia32_compat_sys_rt_sigreturn 174 i386 rt_sigaction sys_rt_sigaction __ia32_compat_sys_rt_sigaction 175 i386 rt_sigprocmask sys_rt_sigprocmask __ia32_compat_sys_rt_sigprocmask 176 i386 rt_sigpending sys_rt_sigpending __ia32_compat_sys_rt_sigpending --- linux-riscv-5.4.0.orig/arch/x86/entry/vdso/vdso32-setup.c +++ linux-riscv-5.4.0/arch/x86/entry/vdso/vdso32-setup.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include --- linux-riscv-5.4.0.orig/arch/x86/events/amd/core.c +++ linux-riscv-5.4.0/arch/x86/events/amd/core.c @@ -14,6 +14,10 @@ static DEFINE_PER_CPU(unsigned long, perf_nmi_tstamp); static unsigned long perf_nmi_window; +/* AMD Event 0xFFF: Merge. Used with Large Increment per Cycle events */ +#define AMD_MERGE_EVENT ((0xFULL << 32) | 0xFFULL) +#define AMD_MERGE_EVENT_ENABLE (AMD_MERGE_EVENT | ARCH_PERFMON_EVENTSEL_ENABLE) + static __initconst const u64 amd_hw_cache_event_ids [PERF_COUNT_HW_CACHE_MAX] [PERF_COUNT_HW_CACHE_OP_MAX] @@ -246,6 +250,7 @@ [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, [PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60, + [PERF_COUNT_HW_CACHE_MISSES] = 0x0964, [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287, @@ -301,6 +306,25 @@ return offset; } +/* + * AMD64 events are detected based on their event codes. + */ +static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc) +{ + return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff); +} + +static inline bool amd_is_pair_event_code(struct hw_perf_event *hwc) +{ + if (!(x86_pmu.flags & PMU_FL_PAIR)) + return false; + + switch (amd_get_event_code(hwc)) { + case 0x003: return true; /* Retired SSE/AVX FLOPs */ + default: return false; + } +} + static int amd_core_hw_config(struct perf_event *event) { if (event->attr.exclude_host && event->attr.exclude_guest) @@ -316,15 +340,10 @@ else if (event->attr.exclude_guest) event->hw.config |= AMD64_EVENTSEL_HOSTONLY; - return 0; -} + if ((x86_pmu.flags & PMU_FL_PAIR) && amd_is_pair_event_code(&event->hw)) + event->hw.flags |= PERF_X86_EVENT_PAIR; -/* - * AMD64 events are detected based on their event codes. - */ -static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc) -{ - return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff); + return 0; } static inline int amd_is_nb_event(struct hw_perf_event *hwc) @@ -864,6 +883,29 @@ } } +static struct event_constraint pair_constraint; + +static struct event_constraint * +amd_get_event_constraints_f17h(struct cpu_hw_events *cpuc, int idx, + struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (amd_is_pair_event_code(hwc)) + return &pair_constraint; + + return &unconstrained; +} + +static void amd_put_event_constraints_f17h(struct cpu_hw_events *cpuc, + struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (is_counter_pair(hwc)) + --cpuc->n_pair; +} + static ssize_t amd_event_sysfs_show(char *page, u64 config) { u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT) | @@ -907,33 +949,15 @@ static int __init amd_core_pmu_init(void) { + u64 even_ctr_mask = 0ULL; + int i; + if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE)) return 0; - /* Avoid calulating the value each time in the NMI handler */ + /* Avoid calculating the value each time in the NMI handler */ perf_nmi_window = msecs_to_jiffies(100); - switch (boot_cpu_data.x86) { - case 0x15: - pr_cont("Fam15h "); - x86_pmu.get_event_constraints = amd_get_event_constraints_f15h; - break; - case 0x17: - pr_cont("Fam17h "); - /* - * In family 17h, there are no event constraints in the PMC hardware. - * We fallback to using default amd_get_event_constraints. - */ - break; - case 0x18: - pr_cont("Fam18h "); - /* Using default amd_get_event_constraints. */ - break; - default: - pr_err("core perfctr but no constraints; unknown hardware!\n"); - return -ENODEV; - } - /* * If core performance counter extensions exists, we must use * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also @@ -948,6 +972,32 @@ */ x86_pmu.amd_nb_constraints = 0; + if (boot_cpu_data.x86 == 0x15) { + pr_cont("Fam15h "); + x86_pmu.get_event_constraints = amd_get_event_constraints_f15h; + } + if (boot_cpu_data.x86 >= 0x17) { + pr_cont("Fam17h+ "); + /* + * Family 17h and compatibles have constraints for Large + * Increment per Cycle events: they may only be assigned an + * even numbered counter that has a consecutive adjacent odd + * numbered counter following it. + */ + for (i = 0; i < x86_pmu.num_counters - 1; i += 2) + even_ctr_mask |= 1 << i; + + pair_constraint = (struct event_constraint) + __EVENT_CONSTRAINT(0, even_ctr_mask, 0, + x86_pmu.num_counters / 2, 0, + PERF_X86_EVENT_PAIR); + + x86_pmu.get_event_constraints = amd_get_event_constraints_f17h; + x86_pmu.put_event_constraints = amd_put_event_constraints_f17h; + x86_pmu.perf_ctr_pair_en = AMD_MERGE_EVENT_ENABLE; + x86_pmu.flags |= PMU_FL_PAIR; + } + pr_cont("core perfctr, "); return 0; } --- linux-riscv-5.4.0.orig/arch/x86/events/amd/uncore.c +++ linux-riscv-5.4.0/arch/x86/events/amd/uncore.c @@ -190,15 +190,12 @@ /* * NB and Last level cache counters (MSRs) are shared across all cores - * that share the same NB / Last level cache. Interrupts can be directed - * to a single target core, however, event counts generated by processes - * running on other cores cannot be masked out. So we do not support - * sampling and per-thread events. + * that share the same NB / Last level cache. On family 16h and below, + * Interrupts can be directed to a single target core, however, event + * counts generated by processes running on other cores cannot be masked + * out. So we do not support sampling and per-thread events via + * CAP_NO_INTERRUPT, and we do not enable counter overflow interrupts: */ - if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK) - return -EINVAL; - - /* and we do not enable counter overflow interrupts */ hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB; hwc->idx = -1; @@ -306,7 +303,7 @@ .start = amd_uncore_start, .stop = amd_uncore_stop, .read = amd_uncore_read, - .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, }; static struct pmu amd_llc_pmu = { @@ -317,7 +314,7 @@ .start = amd_uncore_start, .stop = amd_uncore_stop, .read = amd_uncore_read, - .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, }; static struct amd_uncore *amd_uncore_alloc(unsigned int cpu) --- linux-riscv-5.4.0.orig/arch/x86/events/core.c +++ linux-riscv-5.4.0/arch/x86/events/core.c @@ -375,7 +375,7 @@ * LBR and BTS are still mutually exclusive. */ if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt) - return 0; + goto out; if (!atomic_inc_not_zero(&x86_pmu.lbr_exclusive[what])) { mutex_lock(&pmc_reserve_mutex); @@ -387,6 +387,7 @@ mutex_unlock(&pmc_reserve_mutex); } +out: atomic_inc(&active_events); return 0; @@ -397,11 +398,15 @@ void x86_del_exclusive(unsigned int what) { + atomic_dec(&active_events); + + /* + * See the comment in x86_add_exclusive(). + */ if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt) return; atomic_dec(&x86_pmu.lbr_exclusive[what]); - atomic_dec(&active_events); } int x86_setup_perfctr(struct perf_event *event) @@ -612,6 +617,7 @@ int idx; for (idx = 0; idx < x86_pmu.num_counters; idx++) { + struct hw_perf_event *hwc = &cpuc->events[idx]->hw; u64 val; if (!test_bit(idx, cpuc->active_mask)) @@ -621,6 +627,8 @@ continue; val &= ~ARCH_PERFMON_EVENTSEL_ENABLE; wrmsrl(x86_pmu_config_addr(idx), val); + if (is_counter_pair(hwc)) + wrmsrl(x86_pmu_config_addr(idx + 1), 0); } } @@ -693,7 +701,7 @@ int counter; /* counter index */ int unassigned; /* number of events to be assigned left */ int nr_gp; /* number of GP counters used */ - unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; + u64 used; }; /* Total max is X86_PMC_IDX_MAX, but we are O(n!) limited */ @@ -750,8 +758,12 @@ sched->saved_states--; sched->state = sched->saved[sched->saved_states]; - /* continue with next counter: */ - clear_bit(sched->state.counter++, sched->state.used); + /* this assignment didn't work out */ + /* XXX broken vs EVENT_PAIR */ + sched->state.used &= ~BIT_ULL(sched->state.counter); + + /* try the next one */ + sched->state.counter++; return true; } @@ -776,20 +788,32 @@ if (c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED)) { idx = INTEL_PMC_IDX_FIXED; for_each_set_bit_from(idx, c->idxmsk, X86_PMC_IDX_MAX) { - if (!__test_and_set_bit(idx, sched->state.used)) - goto done; + u64 mask = BIT_ULL(idx); + + if (sched->state.used & mask) + continue; + + sched->state.used |= mask; + goto done; } } /* Grab the first unused counter starting with idx */ idx = sched->state.counter; for_each_set_bit_from(idx, c->idxmsk, INTEL_PMC_IDX_FIXED) { - if (!__test_and_set_bit(idx, sched->state.used)) { - if (sched->state.nr_gp++ >= sched->max_gp) - return false; + u64 mask = BIT_ULL(idx); - goto done; - } + if (c->flags & PERF_X86_EVENT_PAIR) + mask |= mask << 1; + + if (sched->state.used & mask) + continue; + + if (sched->state.nr_gp++ >= sched->max_gp) + return false; + + sched->state.used |= mask; + goto done; } return false; @@ -866,12 +890,10 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign) { struct event_constraint *c; - unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; struct perf_event *e; int n0, i, wmin, wmax, unsched = 0; struct hw_perf_event *hwc; - - bitmap_zero(used_mask, X86_PMC_IDX_MAX); + u64 used_mask = 0; /* * Compute the number of events already present; see x86_pmu_add(), @@ -914,6 +936,8 @@ * fastpath, try to reuse previous register */ for (i = 0; i < n; i++) { + u64 mask; + hwc = &cpuc->event_list[i]->hw; c = cpuc->event_constraint[i]; @@ -925,11 +949,16 @@ if (!test_bit(hwc->idx, c->idxmsk)) break; + mask = BIT_ULL(hwc->idx); + if (is_counter_pair(hwc)) + mask |= mask << 1; + /* not already used */ - if (test_bit(hwc->idx, used_mask)) + if (used_mask & mask) break; - __set_bit(hwc->idx, used_mask); + used_mask |= mask; + if (assign) assign[i] = hwc->idx; } @@ -952,6 +981,15 @@ READ_ONCE(cpuc->excl_cntrs->exclusive_present)) gpmax /= 2; + /* + * Reduce the amount of available counters to allow fitting + * the extra Merge events needed by large increment events. + */ + if (x86_pmu.flags & PMU_FL_PAIR) { + gpmax = x86_pmu.num_counters - cpuc->n_pair; + WARN_ON(gpmax <= 0); + } + unsched = perf_assign_events(cpuc->event_constraint, n, wmin, wmax, gpmax, assign); } @@ -1032,6 +1070,8 @@ return -EINVAL; cpuc->event_list[n] = leader; n++; + if (is_counter_pair(&leader->hw)) + cpuc->n_pair++; } if (!dogrp) return n; @@ -1046,6 +1086,8 @@ cpuc->event_list[n] = event; n++; + if (is_counter_pair(&event->hw)) + cpuc->n_pair++; } return n; } @@ -1232,6 +1274,13 @@ wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask); /* + * Clear the Merge event counter's upper 16 bits since + * we currently declare a 48-bit counter width + */ + if (is_counter_pair(hwc)) + wrmsrl(x86_pmu_event_addr(idx + 1), 0); + + /* * Due to erratum on certan cpu we need * a second write to be sure the register * is updated properly @@ -1641,9 +1690,12 @@ ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, char *page) { - struct perf_pmu_events_attr *pmu_attr = \ + struct perf_pmu_events_attr *pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); - u64 config = x86_pmu.event_map(pmu_attr->id); + u64 config = 0; + + if (pmu_attr->id < x86_pmu.max_events) + config = x86_pmu.event_map(pmu_attr->id); /* string trumps id */ if (pmu_attr->event_str) @@ -1712,6 +1764,9 @@ { struct perf_pmu_events_attr *pmu_attr; + if (idx >= x86_pmu.max_events) + return 0; + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr.attr); /* str trumps id */ return pmu_attr->event_str || x86_pmu.event_map(idx) ? attr->mode : 0; --- linux-riscv-5.4.0.orig/arch/x86/events/intel/bts.c +++ linux-riscv-5.4.0/arch/x86/events/intel/bts.c @@ -63,9 +63,17 @@ static struct pmu bts_pmu; +static int buf_nr_pages(struct page *page) +{ + if (!PagePrivate(page)) + return 1; + + return 1 << page_private(page); +} + static size_t buf_size(struct page *page) { - return 1 << (PAGE_SHIFT + page_private(page)); + return buf_nr_pages(page) * PAGE_SIZE; } static void * @@ -83,9 +91,7 @@ /* count all the high order buffers */ for (pg = 0, nbuf = 0; pg < nr_pages;) { page = virt_to_page(pages[pg]); - if (WARN_ON_ONCE(!PagePrivate(page) && nr_pages > 1)) - return NULL; - pg += 1 << page_private(page); + pg += buf_nr_pages(page); nbuf++; } @@ -109,7 +115,7 @@ unsigned int __nr_pages; page = virt_to_page(pages[pg]); - __nr_pages = PagePrivate(page) ? 1 << page_private(page) : 1; + __nr_pages = buf_nr_pages(page); buf->buf[nbuf].page = page; buf->buf[nbuf].offset = offset; buf->buf[nbuf].displacement = (pad ? BTS_RECORD_SIZE - pad : 0); --- linux-riscv-5.4.0.orig/arch/x86/events/intel/core.c +++ linux-riscv-5.4.0/arch/x86/events/intel/core.c @@ -4746,6 +4746,7 @@ break; case INTEL_FAM6_ATOM_TREMONT_D: + case INTEL_FAM6_ATOM_TREMONT: x86_pmu.late_ack = true; memcpy(hw_cache_event_ids, glp_hw_cache_event_ids, sizeof(hw_cache_event_ids)); --- linux-riscv-5.4.0.orig/arch/x86/events/intel/cstate.c +++ linux-riscv-5.4.0/arch/x86/events/intel/cstate.c @@ -40,17 +40,18 @@ * Model specific counters: * MSR_CORE_C1_RES: CORE C1 Residency Counter * perf code: 0x00 - * Available model: SLM,AMT,GLM,CNL + * Available model: SLM,AMT,GLM,CNL,TNT * Scope: Core (each processor core has a MSR) * MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter * perf code: 0x01 * Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM, - * CNL,KBL,CML + * CNL,KBL,CML,TNT * Scope: Core * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter * perf code: 0x02 * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW, - * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL + * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL, + * TNT * Scope: Core * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter * perf code: 0x03 @@ -60,17 +61,18 @@ * MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter. * perf code: 0x00 * Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL, - * KBL,CML,ICL,TGL + * KBL,CML,ICL,TGL,TNT * Scope: Package (physical package) * MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter. * perf code: 0x01 * Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL, - * GLM,CNL,KBL,CML,ICL,TGL + * GLM,CNL,KBL,CML,ICL,TGL,TNT * Scope: Package (physical package) * MSR_PKG_C6_RESIDENCY: Package C6 Residency Counter. * perf code: 0x02 - * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW - * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL + * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW, + * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL, + * TNT * Scope: Package (physical package) * MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter. * perf code: 0x03 @@ -87,7 +89,8 @@ * Scope: Package (physical package) * MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter. * perf code: 0x06 - * Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL + * Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL, + * TNT * Scope: Package (physical package) * */ @@ -640,8 +643,9 @@ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT, glm_cstates), X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_D, glm_cstates), - X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates), + X86_CSTATES_MODEL(INTEL_FAM6_ATOM_TREMONT_D, glm_cstates), + X86_CSTATES_MODEL(INTEL_FAM6_ATOM_TREMONT, glm_cstates), X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_L, icl_cstates), X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE, icl_cstates), --- linux-riscv-5.4.0.orig/arch/x86/events/intel/ds.c +++ linux-riscv-5.4.0/arch/x86/events/intel/ds.c @@ -1713,6 +1713,8 @@ old = ((s64)(prev_raw_count << shift) >> shift); local64_add(new - old + count * period, &event->count); + local64_set(&hwc->period_left, -new); + perf_event_update_userpage(event); return 0; --- linux-riscv-5.4.0.orig/arch/x86/events/intel/uncore_snb.c +++ linux-riscv-5.4.0/arch/x86/events/intel/uncore_snb.c @@ -15,6 +15,7 @@ #define PCI_DEVICE_ID_INTEL_SKL_HQ_IMC 0x1910 #define PCI_DEVICE_ID_INTEL_SKL_SD_IMC 0x190f #define PCI_DEVICE_ID_INTEL_SKL_SQ_IMC 0x191f +#define PCI_DEVICE_ID_INTEL_SKL_E3_IMC 0x1918 #define PCI_DEVICE_ID_INTEL_KBL_Y_IMC 0x590c #define PCI_DEVICE_ID_INTEL_KBL_U_IMC 0x5904 #define PCI_DEVICE_ID_INTEL_KBL_UQ_IMC 0x5914 @@ -658,6 +659,10 @@ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0), }, { /* IMC */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SKL_E3_IMC), + .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0), + }, + { /* IMC */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_Y_IMC), .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0), }, @@ -826,6 +831,7 @@ IMC_DEV(SKL_HQ_IMC, &skl_uncore_pci_driver), /* 6th Gen Core H Quad Core */ IMC_DEV(SKL_SD_IMC, &skl_uncore_pci_driver), /* 6th Gen Core S Dual Core */ IMC_DEV(SKL_SQ_IMC, &skl_uncore_pci_driver), /* 6th Gen Core S Quad Core */ + IMC_DEV(SKL_E3_IMC, &skl_uncore_pci_driver), /* Xeon E3 V5 Gen Core processor */ IMC_DEV(KBL_Y_IMC, &skl_uncore_pci_driver), /* 7th Gen Core Y */ IMC_DEV(KBL_U_IMC, &skl_uncore_pci_driver), /* 7th Gen Core U */ IMC_DEV(KBL_UQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core U Quad Core */ --- linux-riscv-5.4.0.orig/arch/x86/events/intel/uncore_snbep.c +++ linux-riscv-5.4.0/arch/x86/events/intel/uncore_snbep.c @@ -369,11 +369,6 @@ #define SNR_M2M_PCI_PMON_BOX_CTL 0x438 #define SNR_M2M_PCI_PMON_UMASK_EXT 0xff -/* SNR PCIE3 */ -#define SNR_PCIE3_PCI_PMON_CTL0 0x508 -#define SNR_PCIE3_PCI_PMON_CTR0 0x4e8 -#define SNR_PCIE3_PCI_PMON_BOX_CTL 0x4e4 - /* SNR IMC */ #define SNR_IMC_MMIO_PMON_FIXED_CTL 0x54 #define SNR_IMC_MMIO_PMON_FIXED_CTR 0x38 @@ -4328,27 +4323,12 @@ .format_group = &snr_m2m_uncore_format_group, }; -static struct intel_uncore_type snr_uncore_pcie3 = { - .name = "pcie3", - .num_counters = 4, - .num_boxes = 1, - .perf_ctr_bits = 48, - .perf_ctr = SNR_PCIE3_PCI_PMON_CTR0, - .event_ctl = SNR_PCIE3_PCI_PMON_CTL0, - .event_mask = SNBEP_PMON_RAW_EVENT_MASK, - .box_ctl = SNR_PCIE3_PCI_PMON_BOX_CTL, - .ops = &ivbep_uncore_pci_ops, - .format_group = &ivbep_uncore_format_group, -}; - enum { SNR_PCI_UNCORE_M2M, - SNR_PCI_UNCORE_PCIE3, }; static struct intel_uncore_type *snr_pci_uncores[] = { [SNR_PCI_UNCORE_M2M] = &snr_uncore_m2m, - [SNR_PCI_UNCORE_PCIE3] = &snr_uncore_pcie3, NULL, }; @@ -4357,10 +4337,6 @@ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x344a), .driver_data = UNCORE_PCI_DEV_FULL_DATA(12, 0, SNR_PCI_UNCORE_M2M, 0), }, - { /* PCIe3 */ - PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a), - .driver_data = UNCORE_PCI_DEV_FULL_DATA(4, 0, SNR_PCI_UNCORE_PCIE3, 0), - }, { /* end: all zeroes */ } }; @@ -4536,6 +4512,7 @@ INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"), INTEL_UNCORE_EVENT_DESC(write.scale, "3.814697266e-6"), INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"), + { /* end: all zeroes */ }, }; static struct intel_uncore_ops snr_uncore_imc_freerunning_ops = { --- linux-riscv-5.4.0.orig/arch/x86/events/msr.c +++ linux-riscv-5.4.0/arch/x86/events/msr.c @@ -75,8 +75,9 @@ case INTEL_FAM6_ATOM_GOLDMONT: case INTEL_FAM6_ATOM_GOLDMONT_D: - case INTEL_FAM6_ATOM_GOLDMONT_PLUS: + case INTEL_FAM6_ATOM_TREMONT_D: + case INTEL_FAM6_ATOM_TREMONT: case INTEL_FAM6_XEON_PHI_KNL: case INTEL_FAM6_XEON_PHI_KNM: --- linux-riscv-5.4.0.orig/arch/x86/events/perf_event.h +++ linux-riscv-5.4.0/arch/x86/events/perf_event.h @@ -77,6 +77,7 @@ #define PERF_X86_EVENT_AUTO_RELOAD 0x0200 /* use PEBS auto-reload */ #define PERF_X86_EVENT_LARGE_PEBS 0x0400 /* use large PEBS */ #define PERF_X86_EVENT_PEBS_VIA_PT 0x0800 /* use PT buffer for PEBS */ +#define PERF_X86_EVENT_PAIR 0x1000 /* Large Increment per Cycle */ struct amd_nb { int nb_id; /* NorthBridge id */ @@ -272,6 +273,7 @@ struct amd_nb *amd_nb; /* Inverted mask of bits to clear in the perf_ctr ctrl registers */ u64 perf_ctr_virt_mask; + int n_pair; /* Large increment events */ void *kfree_on_online[X86_PERF_KFREE_MAX]; }; @@ -686,6 +688,7 @@ * AMD bits */ unsigned int amd_nb_constraints : 1; + u64 perf_ctr_pair_en; /* * Extra registers for events @@ -735,6 +738,7 @@ #define PMU_FL_EXCL_ENABLED 0x8 /* exclusive counter active */ #define PMU_FL_PEBS_ALL 0x10 /* all events are valid PEBS events */ #define PMU_FL_TFA 0x20 /* deal with TSX force abort */ +#define PMU_FL_PAIR 0x40 /* merge counters for large incr. events */ #define EVENT_VAR(_id) event_attr_##_id #define EVENT_PTR(_id) &event_attr_##_id.attr.attr @@ -830,6 +834,11 @@ void x86_pmu_disable_all(void); +static inline bool is_counter_pair(struct hw_perf_event *hwc) +{ + return hwc->flags & PERF_X86_EVENT_PAIR; +} + static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc, u64 enable_mask) { @@ -837,6 +846,14 @@ if (hwc->extra_reg.reg) wrmsrl(hwc->extra_reg.reg, hwc->extra_reg.config); + + /* + * Add enabled Merge event on next counter + * if large increment event being enabled on this counter + */ + if (is_counter_pair(hwc)) + wrmsrl(x86_pmu_config_addr(hwc->idx + 1), x86_pmu.perf_ctr_pair_en); + wrmsrl(hwc->config_base, (hwc->config | enable_mask) & ~disable_mask); } @@ -853,6 +870,9 @@ struct hw_perf_event *hwc = &event->hw; wrmsrl(hwc->config_base, hwc->config); + + if (is_counter_pair(hwc)) + wrmsrl(x86_pmu_config_addr(hwc->idx + 1), 0); } void x86_pmu_enable_event(struct perf_event *event); --- linux-riscv-5.4.0.orig/arch/x86/hyperv/hv_init.c +++ linux-riscv-5.4.0/arch/x86/hyperv/hv_init.c @@ -22,6 +22,14 @@ #include #include +#ifndef PKG_ABI +/* + * Preserve the ability to 'make deb-pkg' since PKG_ABI is provided + * by the Ubuntu build rules. + */ +#define PKG_ABI 0 +#endif + void *hv_hypercall_pg; EXPORT_SYMBOL_GPL(hv_hypercall_pg); @@ -297,7 +305,7 @@ * 1. Register the guest ID * 2. Enable the hypercall and register the hypercall page */ - guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0); + guest_id = generate_guest_id(0x80 /*Canonical*/, LINUX_VERSION_CODE, PKG_ABI); wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id); hv_hypercall_pg = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RX); --- linux-riscv-5.4.0.orig/arch/x86/ia32/ia32_signal.c +++ linux-riscv-5.4.0/arch/x86/ia32/ia32_signal.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include #include @@ -118,7 +119,7 @@ return err; } -asmlinkage long sys32_sigreturn(void) +COMPAT_SYSCALL_DEFINE0(sigreturn) { struct pt_regs *regs = current_pt_regs(); struct sigframe_ia32 __user *frame = (struct sigframe_ia32 __user *)(regs->sp-8); @@ -144,7 +145,7 @@ return 0; } -asmlinkage long sys32_rt_sigreturn(void) +COMPAT_SYSCALL_DEFINE0(rt_sigreturn) { struct pt_regs *regs = current_pt_regs(); struct rt_sigframe_ia32 __user *frame; --- linux-riscv-5.4.0.orig/arch/x86/include/asm/apic.h +++ linux-riscv-5.4.0/arch/x86/include/asm/apic.h @@ -140,6 +140,7 @@ extern void lapic_shutdown(void); extern void sync_Arb_IDs(void); extern void init_bsp_APIC(void); +extern void apic_intr_mode_select(void); extern void apic_intr_mode_init(void); extern void init_apic_mappings(void); void register_lapic_address(unsigned long address); @@ -188,6 +189,7 @@ # define setup_secondary_APIC_clock x86_init_noop static inline void lapic_update_tsc_freq(void) { } static inline void init_bsp_APIC(void) { } +static inline void apic_intr_mode_select(void) { } static inline void apic_intr_mode_init(void) { } static inline void lapic_assign_system_vectors(void) { } static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { } @@ -452,6 +454,14 @@ apic_eoi(); } + +static inline bool lapic_vector_set_in_irr(unsigned int vector) +{ + u32 irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); + + return !!(irr & (1U << (vector % 32))); +} + static inline unsigned default_get_apic_id(unsigned long x) { unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR)); --- linux-riscv-5.4.0.orig/arch/x86/include/asm/apm.h +++ linux-riscv-5.4.0/arch/x86/include/asm/apm.h @@ -35,6 +35,7 @@ __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" + ANNOTATE_RETPOLINE_SAFE /* FRBS */ "lcall *%%cs:apm_bios_entry\n\t" "setc %%al\n\t" "popl %%ebp\n\t" @@ -59,6 +60,7 @@ __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" + ANNOTATE_RETPOLINE_SAFE /* FRBS */ "lcall *%%cs:apm_bios_entry\n\t" "setc %%bl\n\t" "popl %%ebp\n\t" --- linux-riscv-5.4.0.orig/arch/x86/include/asm/cpu_entry_area.h +++ linux-riscv-5.4.0/arch/x86/include/asm/cpu_entry_area.h @@ -78,8 +78,12 @@ /* * The GDT is just below entry_stack and thus serves (on x86_64) as - * a a read-only guard page. + * a read-only guard page. On 32-bit the GDT must be writeable, so + * it needs an extra guard page. */ +#ifdef CONFIG_X86_32 + char guard_entry_stack[PAGE_SIZE]; +#endif struct entry_stack_page entry_stack_page; /* @@ -94,7 +98,6 @@ */ struct cea_exception_stacks estacks; #endif -#ifdef CONFIG_CPU_SUP_INTEL /* * Per CPU debug store for Intel performance monitoring. Wastes a * full page at the moment. @@ -105,11 +108,13 @@ * Reserve enough fixmap PTEs. */ struct debug_store_buffers cpu_debug_buffers; -#endif }; -#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) -#define CPU_ENTRY_AREA_TOT_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) +#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) +#define CPU_ENTRY_AREA_ARRAY_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) + +/* Total size includes the readonly IDT mapping page as well: */ +#define CPU_ENTRY_AREA_TOTAL_SIZE (CPU_ENTRY_AREA_ARRAY_SIZE + PAGE_SIZE) DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); DECLARE_PER_CPU(struct cea_exception_stacks *, cea_exception_stacks); @@ -117,13 +122,14 @@ extern void setup_cpu_entry_areas(void); extern void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags); +/* Single page reserved for the readonly IDT mapping: */ #define CPU_ENTRY_AREA_RO_IDT CPU_ENTRY_AREA_BASE #define CPU_ENTRY_AREA_PER_CPU (CPU_ENTRY_AREA_RO_IDT + PAGE_SIZE) #define CPU_ENTRY_AREA_RO_IDT_VADDR ((void *)CPU_ENTRY_AREA_RO_IDT) #define CPU_ENTRY_AREA_MAP_SIZE \ - (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_TOT_SIZE - CPU_ENTRY_AREA_BASE) + (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_ARRAY_SIZE - CPU_ENTRY_AREA_BASE) extern struct cpu_entry_area *get_cpu_entry_area(int cpu); --- linux-riscv-5.4.0.orig/arch/x86/include/asm/crash.h +++ linux-riscv-5.4.0/arch/x86/include/asm/crash.h @@ -2,6 +2,8 @@ #ifndef _ASM_X86_CRASH_H #define _ASM_X86_CRASH_H +struct kimage; + int crash_load_segments(struct kimage *image); int crash_copy_backup_region(struct kimage *image); int crash_setup_memmap_entries(struct kimage *image, --- linux-riscv-5.4.0.orig/arch/x86/include/asm/fixmap.h +++ linux-riscv-5.4.0/arch/x86/include/asm/fixmap.h @@ -156,7 +156,7 @@ extern pte_t *pkmap_page_table; void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); -void native_set_fixmap(enum fixed_addresses idx, +void native_set_fixmap(unsigned /* enum fixed_addresses */ idx, phys_addr_t phys, pgprot_t flags); #ifndef CONFIG_PARAVIRT_XXL --- linux-riscv-5.4.0.orig/arch/x86/include/asm/fpu/internal.h +++ linux-riscv-5.4.0/arch/x86/include/asm/fpu/internal.h @@ -509,7 +509,7 @@ static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu) { - return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu; + return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu; } /* --- linux-riscv-5.4.0.orig/arch/x86/include/asm/kvm_host.h +++ linux-riscv-5.4.0/arch/x86/include/asm/kvm_host.h @@ -380,12 +380,12 @@ void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root); unsigned long (*get_cr3)(struct kvm_vcpu *vcpu); u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); - int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err, + int (*page_fault)(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 err, bool prefault); void (*inject_page_fault)(struct kvm_vcpu *vcpu, struct x86_exception *fault); - gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access, - struct x86_exception *exception); + gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gpa_t gva_or_gpa, + u32 access, struct x86_exception *exception); gpa_t (*translate_gpa)(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access, struct x86_exception *exception); int (*sync_page)(struct kvm_vcpu *vcpu, @@ -667,10 +667,10 @@ bool pvclock_set_guest_stopped_request; struct { + u8 preempted; u64 msr_val; u64 last_steal; - struct gfn_to_hva_cache stime; - struct kvm_steal_time steal; + struct gfn_to_pfn_cache cache; } st; u64 tsc_offset; @@ -1098,7 +1098,7 @@ void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap); void (*set_virtual_apic_mode)(struct kvm_vcpu *vcpu); void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa); - void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector); + int (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector); int (*sync_pir_to_irr)(struct kvm_vcpu *vcpu); int (*set_tss_addr)(struct kvm *kvm, unsigned int addr); int (*set_identity_map_addr)(struct kvm *kvm, u64 ident_addr); @@ -1128,6 +1128,7 @@ bool (*xsaves_supported)(void); bool (*umip_emulated)(void); bool (*pt_supported)(void); + bool (*pku_supported)(void); int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr); void (*request_immediate_exit)(struct kvm_vcpu *vcpu); @@ -1450,7 +1451,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); -int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u64 error_code, +int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code, void *insn, int insn_len); void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva); void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid); --- linux-riscv-5.4.0.orig/arch/x86/include/asm/mce.h +++ linux-riscv-5.4.0/arch/x86/include/asm/mce.h @@ -290,6 +290,7 @@ /* These may be used by multiple smca_hwid_mcatypes */ enum smca_bank_types { SMCA_LS = 0, /* Load Store */ + SMCA_LS_V2, /* Load Store */ SMCA_IF, /* Instruction Fetch */ SMCA_L2_CACHE, /* L2 Cache */ SMCA_DE, /* Decoder Unit */ --- linux-riscv-5.4.0.orig/arch/x86/include/asm/msr-index.h +++ linux-riscv-5.4.0/arch/x86/include/asm/msr-index.h @@ -510,6 +510,8 @@ #define MSR_K7_HWCR 0xc0010015 #define MSR_K7_HWCR_SMMLOCK_BIT 0 #define MSR_K7_HWCR_SMMLOCK BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT) +#define MSR_K7_HWCR_IRPERF_EN_BIT 30 +#define MSR_K7_HWCR_IRPERF_EN BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT) #define MSR_K7_FID_VID_CTL 0xc0010041 #define MSR_K7_FID_VID_STATUS 0xc0010042 --- linux-riscv-5.4.0.orig/arch/x86/include/asm/nmi.h +++ linux-riscv-5.4.0/arch/x86/include/asm/nmi.h @@ -41,7 +41,6 @@ struct list_head list; nmi_handler_t handler; u64 max_duration; - struct irq_work irq_work; unsigned long flags; const char *name; }; --- linux-riscv-5.4.0.orig/arch/x86/include/asm/pci-direct.h +++ linux-riscv-5.4.0/arch/x86/include/asm/pci-direct.h @@ -10,9 +10,11 @@ extern u32 read_pci_config(u8 bus, u8 slot, u8 func, u8 offset); extern u8 read_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset); extern u16 read_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset); +extern u32 pci_early_find_cap(int bus, int slot, int func, int cap); extern void write_pci_config(u8 bus, u8 slot, u8 func, u8 offset, u32 val); extern void write_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset, u8 val); extern void write_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset, u16 val); +extern unsigned int pci_early_clear_msi; extern int early_pci_allowed(void); #endif /* _ASM_X86_PCI_DIRECT_H */ --- linux-riscv-5.4.0.orig/arch/x86/include/asm/pgtable_32_types.h +++ linux-riscv-5.4.0/arch/x86/include/asm/pgtable_32_types.h @@ -44,11 +44,11 @@ * Define this here and validate with BUILD_BUG_ON() in pgtable_32.c * to avoid include recursion hell */ -#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) +#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 39) -#define CPU_ENTRY_AREA_BASE \ - ((FIXADDR_TOT_START - PAGE_SIZE * (CPU_ENTRY_AREA_PAGES + 1)) \ - & PMD_MASK) +/* The +1 is for the readonly IDT page: */ +#define CPU_ENTRY_AREA_BASE \ + ((FIXADDR_TOT_START - PAGE_SIZE*(CPU_ENTRY_AREA_PAGES+1)) & PMD_MASK) #define LDT_BASE_ADDR \ ((CPU_ENTRY_AREA_BASE - PAGE_SIZE) & PMD_MASK) --- linux-riscv-5.4.0.orig/arch/x86/include/asm/segment.h +++ linux-riscv-5.4.0/arch/x86/include/asm/segment.h @@ -31,6 +31,18 @@ */ #define SEGMENT_RPL_MASK 0x3 +/* + * When running on Xen PV, the actual privilege level of the kernel is 1, + * not 0. Testing the Requested Privilege Level in a segment selector to + * determine whether the context is user mode or kernel mode with + * SEGMENT_RPL_MASK is wrong because the PV kernel's privilege level + * matches the 0x3 mask. + * + * Testing with USER_SEGMENT_RPL_MASK is valid for both native and Xen PV + * kernels because privilege level 2 is never used. + */ +#define USER_SEGMENT_RPL_MASK 0x2 + /* User mode is privilege level 3: */ #define USER_RPL 0x3 --- linux-riscv-5.4.0.orig/arch/x86/include/asm/syscall_wrapper.h +++ linux-riscv-5.4.0/arch/x86/include/asm/syscall_wrapper.h @@ -6,6 +6,8 @@ #ifndef _ASM_X86_SYSCALL_WRAPPER_H #define _ASM_X86_SYSCALL_WRAPPER_H +struct pt_regs; + /* Mapping of registers to parameters for syscalls on x86-64 and x32 */ #define SC_X86_64_REGS_TO_ARGS(x, ...) \ __MAP(x,__SC_ARGS \ @@ -28,13 +30,21 @@ * kernel/sys_ni.c and SYS_NI in kernel/time/posix-stubs.c to cover this * case as well. */ +#define __IA32_COMPAT_SYS_STUB0(x, name) \ + asmlinkage long __ia32_compat_sys_##name(const struct pt_regs *regs);\ + ALLOW_ERROR_INJECTION(__ia32_compat_sys_##name, ERRNO); \ + asmlinkage long __ia32_compat_sys_##name(const struct pt_regs *regs)\ + { \ + return __se_compat_sys_##name(); \ + } + #define __IA32_COMPAT_SYS_STUBx(x, name, ...) \ asmlinkage long __ia32_compat_sys##name(const struct pt_regs *regs);\ ALLOW_ERROR_INJECTION(__ia32_compat_sys##name, ERRNO); \ asmlinkage long __ia32_compat_sys##name(const struct pt_regs *regs)\ { \ return __se_compat_sys##name(SC_IA32_REGS_TO_ARGS(x,__VA_ARGS__));\ - } \ + } #define __IA32_SYS_STUBx(x, name, ...) \ asmlinkage long __ia32_sys##name(const struct pt_regs *regs); \ @@ -48,16 +58,23 @@ * To keep the naming coherent, re-define SYSCALL_DEFINE0 to create an alias * named __ia32_sys_*() */ -#define SYSCALL_DEFINE0(sname) \ - SYSCALL_METADATA(_##sname, 0); \ - asmlinkage long __x64_sys_##sname(void); \ - ALLOW_ERROR_INJECTION(__x64_sys_##sname, ERRNO); \ - SYSCALL_ALIAS(__ia32_sys_##sname, __x64_sys_##sname); \ - asmlinkage long __x64_sys_##sname(void) - -#define COND_SYSCALL(name) \ - cond_syscall(__x64_sys_##name); \ - cond_syscall(__ia32_sys_##name) + +#define SYSCALL_DEFINE0(sname) \ + SYSCALL_METADATA(_##sname, 0); \ + asmlinkage long __x64_sys_##sname(const struct pt_regs *__unused);\ + ALLOW_ERROR_INJECTION(__x64_sys_##sname, ERRNO); \ + SYSCALL_ALIAS(__ia32_sys_##sname, __x64_sys_##sname); \ + asmlinkage long __x64_sys_##sname(const struct pt_regs *__unused) + +#define COND_SYSCALL(name) \ + asmlinkage __weak long __x64_sys_##name(const struct pt_regs *__unused) \ + { \ + return sys_ni_syscall(); \ + } \ + asmlinkage __weak long __ia32_sys_##name(const struct pt_regs *__unused)\ + { \ + return sys_ni_syscall(); \ + } #define SYS_NI(name) \ SYSCALL_ALIAS(__x64_sys_##name, sys_ni_posix_timers); \ @@ -75,15 +92,24 @@ * of the x86-64-style parameter ordering of x32 syscalls. The syscalls common * with x86_64 obviously do not need such care. */ +#define __X32_COMPAT_SYS_STUB0(x, name, ...) \ + asmlinkage long __x32_compat_sys_##name(const struct pt_regs *regs);\ + ALLOW_ERROR_INJECTION(__x32_compat_sys_##name, ERRNO); \ + asmlinkage long __x32_compat_sys_##name(const struct pt_regs *regs)\ + { \ + return __se_compat_sys_##name();\ + } + #define __X32_COMPAT_SYS_STUBx(x, name, ...) \ asmlinkage long __x32_compat_sys##name(const struct pt_regs *regs);\ ALLOW_ERROR_INJECTION(__x32_compat_sys##name, ERRNO); \ asmlinkage long __x32_compat_sys##name(const struct pt_regs *regs)\ { \ return __se_compat_sys##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\ - } \ + } #else /* CONFIG_X86_X32 */ +#define __X32_COMPAT_SYS_STUB0(x, name) #define __X32_COMPAT_SYS_STUBx(x, name, ...) #endif /* CONFIG_X86_X32 */ @@ -94,6 +120,17 @@ * mapping of registers to parameters, we need to generate stubs for each * of them. */ +#define COMPAT_SYSCALL_DEFINE0(name) \ + static long __se_compat_sys_##name(void); \ + static inline long __do_compat_sys_##name(void); \ + __IA32_COMPAT_SYS_STUB0(x, name) \ + __X32_COMPAT_SYS_STUB0(x, name) \ + static long __se_compat_sys_##name(void) \ + { \ + return __do_compat_sys_##name(); \ + } \ + static inline long __do_compat_sys_##name(void) + #define COMPAT_SYSCALL_DEFINEx(x, name, ...) \ static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \ static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\ @@ -181,15 +218,19 @@ * macros to work correctly. */ #ifndef SYSCALL_DEFINE0 -#define SYSCALL_DEFINE0(sname) \ - SYSCALL_METADATA(_##sname, 0); \ - asmlinkage long __x64_sys_##sname(void); \ - ALLOW_ERROR_INJECTION(__x64_sys_##sname, ERRNO); \ - asmlinkage long __x64_sys_##sname(void) +#define SYSCALL_DEFINE0(sname) \ + SYSCALL_METADATA(_##sname, 0); \ + asmlinkage long __x64_sys_##sname(const struct pt_regs *__unused);\ + ALLOW_ERROR_INJECTION(__x64_sys_##sname, ERRNO); \ + asmlinkage long __x64_sys_##sname(const struct pt_regs *__unused) #endif #ifndef COND_SYSCALL -#define COND_SYSCALL(name) cond_syscall(__x64_sys_##name) +#define COND_SYSCALL(name) \ + asmlinkage __weak long __x64_sys_##name(const struct pt_regs *__unused) \ + { \ + return sys_ni_syscall(); \ + } #endif #ifndef SYS_NI @@ -201,7 +242,6 @@ * For VSYSCALLS, we need to declare these three syscalls with the new * pt_regs-based calling convention for in-kernel use. */ -struct pt_regs; asmlinkage long __x64_sys_getcpu(const struct pt_regs *regs); asmlinkage long __x64_sys_gettimeofday(const struct pt_regs *regs); asmlinkage long __x64_sys_time(const struct pt_regs *regs); --- linux-riscv-5.4.0.orig/arch/x86/include/asm/x86_init.h +++ linux-riscv-5.4.0/arch/x86/include/asm/x86_init.h @@ -51,12 +51,14 @@ * are set up. * @intr_init: interrupt init code * @trap_init: platform specific trap setup + * @intr_mode_select: interrupt delivery mode selection * @intr_mode_init: interrupt delivery mode setup */ struct x86_init_irqs { void (*pre_vector_init)(void); void (*intr_init)(void); void (*trap_init)(void); + void (*intr_mode_select)(void); void (*intr_mode_init)(void); }; --- linux-riscv-5.4.0.orig/arch/x86/kernel/acpi/wakeup_32.S +++ linux-riscv-5.4.0/arch/x86/kernel/acpi/wakeup_32.S @@ -3,6 +3,7 @@ #include #include #include +#include # Copyright 2003, 2008 Pavel Machek #include #include +#include # Copyright 2003 Pavel Machek = 0x40; bytes++) { - u8 id; - - pos &= ~3; - id = read_pci_config_byte(bus, slot, func, pos+PCI_CAP_LIST_ID); - if (id == 0xff) - break; - if (id == cap) - return pos; - pos = read_pci_config_byte(bus, slot, func, - pos+PCI_CAP_LIST_NEXT); - } - return 0; -} - /* Read a standard AGPv3 bridge header */ static u32 __init read_agp(int bus, int slot, int func, int cap, u32 *order) { @@ -240,8 +214,8 @@ case PCI_CLASS_BRIDGE_HOST: case PCI_CLASS_BRIDGE_OTHER: /* needed? */ /* AGP bridge? */ - cap = find_cap(bus, slot, func, - PCI_CAP_ID_AGP); + cap = pci_early_find_cap(bus, slot, + func, PCI_CAP_ID_AGP); if (!cap) break; *valid_agp = 1; --- linux-riscv-5.4.0.orig/arch/x86/kernel/apic/apic.c +++ linux-riscv-5.4.0/arch/x86/kernel/apic/apic.c @@ -830,8 +830,17 @@ if (!tsc_khz || !cpu_khz) return true; - /* Is there an APIC at all? */ - if (!boot_cpu_has(X86_FEATURE_APIC)) + /* Is there an APIC at all or is it disabled? */ + if (!boot_cpu_has(X86_FEATURE_APIC) || disable_apic) + return true; + + /* + * If interrupt delivery mode is legacy PIC or virtual wire without + * configuration, the local APIC timer wont be set up. Make sure + * that the PIT is initialized. + */ + if (apic_intr_mode == APIC_PIC || + apic_intr_mode == APIC_VIRTUAL_WIRE_NO_CONFIG) return true; /* Virt guests may lack ARAT, but still have DEADLINE */ @@ -1322,7 +1331,7 @@ enum apic_intr_mode_id apic_intr_mode __ro_after_init; -static int __init apic_intr_mode_select(void) +static int __init __apic_intr_mode_select(void) { /* Check kernel option */ if (disable_apic) { @@ -1384,6 +1393,12 @@ return APIC_SYMMETRIC_IO; } +/* Select the interrupt delivery mode for the BSP */ +void __init apic_intr_mode_select(void) +{ + apic_intr_mode = __apic_intr_mode_select(); +} + /* * An initial setup of the virtual wire mode. */ @@ -1440,8 +1455,6 @@ { bool upmode = IS_ENABLED(CONFIG_UP_LATE_INIT); - apic_intr_mode = apic_intr_mode_select(); - switch (apic_intr_mode) { case APIC_PIC: pr_info("APIC: Keep in PIC mode(8259)\n"); --- linux-riscv-5.4.0.orig/arch/x86/kernel/apic/io_apic.c +++ linux-riscv-5.4.0/arch/x86/kernel/apic/io_apic.c @@ -1727,9 +1727,10 @@ static inline bool ioapic_irqd_mask(struct irq_data *data) { - /* If we are moving the irq we need to mask it */ + /* If we are moving the IRQ we need to mask it */ if (unlikely(irqd_is_setaffinity_pending(data))) { - mask_ioapic_irq(data); + if (!irqd_irq_masked(data)) + mask_ioapic_irq(data); return true; } return false; @@ -1766,7 +1767,9 @@ */ if (!io_apic_level_ack_pending(data->chip_data)) irq_move_masked_irq(data); - unmask_ioapic_irq(data); + /* If the IRQ is masked in the core, leave it: */ + if (!irqd_irq_masked(data)) + unmask_ioapic_irq(data); } } #else --- linux-riscv-5.4.0.orig/arch/x86/kernel/apic/msi.c +++ linux-riscv-5.4.0/arch/x86/kernel/apic/msi.c @@ -23,10 +23,8 @@ static struct irq_domain *msi_default_domain; -static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg) +static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg) { - struct irq_cfg *cfg = irqd_cfg(data); - msg->address_hi = MSI_ADDR_BASE_HI; if (x2apic_enabled()) @@ -47,6 +45,127 @@ MSI_DATA_VECTOR(cfg->vector); } +static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg) +{ + __irq_msi_compose_msg(irqd_cfg(data), msg); +} + +static void irq_msi_update_msg(struct irq_data *irqd, struct irq_cfg *cfg) +{ + struct msi_msg msg[2] = { [1] = { }, }; + + __irq_msi_compose_msg(cfg, msg); + irq_data_get_irq_chip(irqd)->irq_write_msi_msg(irqd, msg); +} + +static int +msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force) +{ + struct irq_cfg old_cfg, *cfg = irqd_cfg(irqd); + struct irq_data *parent = irqd->parent_data; + unsigned int cpu; + int ret; + + /* Save the current configuration */ + cpu = cpumask_first(irq_data_get_effective_affinity_mask(irqd)); + old_cfg = *cfg; + + /* Allocate a new target vector */ + ret = parent->chip->irq_set_affinity(parent, mask, force); + if (ret < 0 || ret == IRQ_SET_MASK_OK_DONE) + return ret; + + /* + * For non-maskable and non-remapped MSI interrupts the migration + * to a different destination CPU and a different vector has to be + * done careful to handle the possible stray interrupt which can be + * caused by the non-atomic update of the address/data pair. + * + * Direct update is possible when: + * - The MSI is maskable (remapped MSI does not use this code path)). + * The quirk bit is not set in this case. + * - The new vector is the same as the old vector + * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up) + * - The new destination CPU is the same as the old destination CPU + */ + if (!irqd_msi_nomask_quirk(irqd) || + cfg->vector == old_cfg.vector || + old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR || + cfg->dest_apicid == old_cfg.dest_apicid) { + irq_msi_update_msg(irqd, cfg); + return ret; + } + + /* + * Paranoia: Validate that the interrupt target is the local + * CPU. + */ + if (WARN_ON_ONCE(cpu != smp_processor_id())) { + irq_msi_update_msg(irqd, cfg); + return ret; + } + + /* + * Redirect the interrupt to the new vector on the current CPU + * first. This might cause a spurious interrupt on this vector if + * the device raises an interrupt right between this update and the + * update to the final destination CPU. + * + * If the vector is in use then the installed device handler will + * denote it as spurious which is no harm as this is a rare event + * and interrupt handlers have to cope with spurious interrupts + * anyway. If the vector is unused, then it is marked so it won't + * trigger the 'No irq handler for vector' warning in do_IRQ(). + * + * This requires to hold vector lock to prevent concurrent updates to + * the affected vector. + */ + lock_vector_lock(); + + /* + * Mark the new target vector on the local CPU if it is currently + * unused. Reuse the VECTOR_RETRIGGERED state which is also used in + * the CPU hotplug path for a similar purpose. This cannot be + * undone here as the current CPU has interrupts disabled and + * cannot handle the interrupt before the whole set_affinity() + * section is done. In the CPU unplug case, the current CPU is + * about to vanish and will not handle any interrupts anymore. The + * vector is cleaned up when the CPU comes online again. + */ + if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector]))) + this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED); + + /* Redirect it to the new vector on the local CPU temporarily */ + old_cfg.vector = cfg->vector; + irq_msi_update_msg(irqd, &old_cfg); + + /* Now transition it to the target CPU */ + irq_msi_update_msg(irqd, cfg); + + /* + * All interrupts after this point are now targeted at the new + * vector/CPU. + * + * Drop vector lock before testing whether the temporary assignment + * to the local CPU was hit by an interrupt raised in the device, + * because the retrigger function acquires vector lock again. + */ + unlock_vector_lock(); + + /* + * Check whether the transition raced with a device interrupt and + * is pending in the local APICs IRR. It is safe to do this outside + * of vector lock as the irq_desc::lock of this interrupt is still + * held and interrupts are disabled: The check is not accessing the + * underlying vector store. It's just checking the local APIC's + * IRR. + */ + if (lapic_vector_set_in_irr(cfg->vector)) + irq_data_get_irq_chip(irqd)->irq_retrigger(irqd); + + return ret; +} + /* * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices, * which implement the MSI or MSI-X Capability Structure. @@ -58,6 +177,7 @@ .irq_ack = irq_chip_ack_parent, .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_compose_msi_msg = irq_msi_compose_msg, + .irq_set_affinity = msi_set_affinity, .flags = IRQCHIP_SKIP_SET_WAKE, }; @@ -146,6 +266,8 @@ } if (!msi_default_domain) pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); + else + msi_default_domain->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK; } #ifdef CONFIG_IRQ_REMAP --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/amd.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/amd.c @@ -28,6 +28,7 @@ static const int amd_erratum_383[]; static const int amd_erratum_400[]; +static const int amd_erratum_1054[]; static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum); /* @@ -615,9 +616,9 @@ return; clear_all: - clear_cpu_cap(c, X86_FEATURE_SME); + setup_clear_cpu_cap(X86_FEATURE_SME); clear_sev: - clear_cpu_cap(c, X86_FEATURE_SEV); + setup_clear_cpu_cap(X86_FEATURE_SEV); } } @@ -978,6 +979,15 @@ /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */ if (!cpu_has(c, X86_FEATURE_XENPV)) set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); + + /* + * Turn on the Instructions Retired free counter on machines not + * susceptible to erratum #1054 "Instructions Retired Performance + * Counter May Be Inaccurate". + */ + if (cpu_has(c, X86_FEATURE_IRPERF) && + !cpu_has_amd_erratum(c, amd_erratum_1054)) + msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_IRPERF_EN_BIT); } #ifdef CONFIG_X86_32 @@ -1105,6 +1115,10 @@ static const int amd_erratum_383[] = AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf)); +/* #1054: Instructions Retired Performance Counter May Be Inaccurate */ +static const int amd_erratum_1054[] = + AMD_OSVW_ERRATUM(0, AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf)); + static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum) { --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/bugs.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/bugs.c @@ -39,6 +39,7 @@ static void __init ssb_select_mitigation(void); static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); +static void __init mds_print_mitigation(void); static void __init taa_select_mitigation(void); /* The base value of the SPEC_CTRL MSR that always has to be preserved. */ @@ -108,6 +109,12 @@ mds_select_mitigation(); taa_select_mitigation(); + /* + * As MDS and TAA mitigations are inter-related, print MDS + * mitigation until after TAA mitigation selection is done. + */ + mds_print_mitigation(); + arch_smt_update(); #ifdef CONFIG_X86_32 @@ -245,6 +252,12 @@ (mds_nosmt || cpu_mitigations_auto_nosmt())) cpu_smt_disable(false); } +} + +static void __init mds_print_mitigation(void) +{ + if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) + return; pr_info("%s\n", mds_strings[mds_mitigation]); } @@ -304,8 +317,12 @@ return; } - /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */ - if (taa_mitigation == TAA_MITIGATION_OFF) + /* + * TAA mitigation via VERW is turned off if both + * tsx_async_abort=off and mds=off are specified. + */ + if (taa_mitigation == TAA_MITIGATION_OFF && + mds_mitigation == MDS_MITIGATION_OFF) goto out; if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) @@ -339,6 +356,15 @@ if (taa_nosmt || cpu_mitigations_auto_nosmt()) cpu_smt_disable(false); + /* + * Update MDS mitigation, if necessary, as the mds_user_clear is + * now enabled for TAA mitigation. + */ + if (mds_mitigation == MDS_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_MDS)) { + mds_mitigation = MDS_MITIGATION_FULL; + mds_select_mitigation(); + } out: pr_info("%s\n", taa_strings[taa_mitigation]); } --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/common.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/common.c @@ -464,7 +464,7 @@ * cpuid bit to be set. We need to ensure that we * update that bit in this CPU's "cpu_info". */ - get_cpu_cap(c); + set_cpu_cap(c, X86_FEATURE_OSPKE); } #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/mce/amd.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/mce/amd.c @@ -78,6 +78,7 @@ static struct smca_bank_name smca_names[] = { [SMCA_LS] = { "load_store", "Load Store Unit" }, + [SMCA_LS_V2] = { "load_store", "Load Store Unit" }, [SMCA_IF] = { "insn_fetch", "Instruction Fetch Unit" }, [SMCA_L2_CACHE] = { "l2_cache", "L2 Cache" }, [SMCA_DE] = { "decode_unit", "Decode Unit" }, @@ -138,6 +139,7 @@ /* ZN Core (HWID=0xB0) MCA types */ { SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1FFFFF }, + { SMCA_LS_V2, HWID_MCATYPE(0xB0, 0x10), 0xFFFFFF }, { SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF }, { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF }, { SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF }, @@ -266,10 +268,10 @@ smca_set_misc_banks_map(bank, cpu); /* Return early if this bank was already initialized. */ - if (smca_banks[bank].hwid) + if (smca_banks[bank].hwid && smca_banks[bank].hwid->hwid_mcatype != 0) return; - if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { + if (rdmsr_safe(MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { pr_warn("Failed to read MCA_IPID for bank %d\n", bank); return; } @@ -1161,9 +1163,12 @@ .store = store, }; +static void threshold_block_release(struct kobject *kobj); + static struct kobj_type threshold_ktype = { .sysfs_ops = &threshold_ops, .default_attrs = default_attrs, + .release = threshold_block_release, }; static const char *get_name(unsigned int bank, struct threshold_block *b) @@ -1196,8 +1201,9 @@ return buf_mcatype; } -static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank, - unsigned int block, u32 address) +static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb, + unsigned int bank, unsigned int block, + u32 address) { struct threshold_block *b = NULL; u32 low, high; @@ -1241,16 +1247,12 @@ INIT_LIST_HEAD(&b->miscj); - if (per_cpu(threshold_banks, cpu)[bank]->blocks) { - list_add(&b->miscj, - &per_cpu(threshold_banks, cpu)[bank]->blocks->miscj); - } else { - per_cpu(threshold_banks, cpu)[bank]->blocks = b; - } + if (tb->blocks) + list_add(&b->miscj, &tb->blocks->miscj); + else + tb->blocks = b; - err = kobject_init_and_add(&b->kobj, &threshold_ktype, - per_cpu(threshold_banks, cpu)[bank]->kobj, - get_name(bank, b)); + err = kobject_init_and_add(&b->kobj, &threshold_ktype, tb->kobj, get_name(bank, b)); if (err) goto out_free; recurse: @@ -1258,7 +1260,7 @@ if (!address) return 0; - err = allocate_threshold_blocks(cpu, bank, block, address); + err = allocate_threshold_blocks(cpu, tb, bank, block, address); if (err) goto out_free; @@ -1343,8 +1345,6 @@ goto out_free; } - per_cpu(threshold_banks, cpu)[bank] = b; - if (is_shared_bank(bank)) { refcount_set(&b->cpus, 1); @@ -1355,9 +1355,13 @@ } } - err = allocate_threshold_blocks(cpu, bank, 0, msr_ops.misc(bank)); - if (!err) - goto out; + err = allocate_threshold_blocks(cpu, b, bank, 0, msr_ops.misc(bank)); + if (err) + goto out_free; + + per_cpu(threshold_banks, cpu)[bank] = b; + + return 0; out_free: kfree(b); @@ -1366,8 +1370,12 @@ return err; } -static void deallocate_threshold_block(unsigned int cpu, - unsigned int bank) +static void threshold_block_release(struct kobject *kobj) +{ + kfree(to_block(kobj)); +} + +static void deallocate_threshold_block(unsigned int cpu, unsigned int bank) { struct threshold_block *pos = NULL; struct threshold_block *tmp = NULL; @@ -1377,13 +1385,11 @@ return; list_for_each_entry_safe(pos, tmp, &head->blocks->miscj, miscj) { - kobject_put(&pos->kobj); list_del(&pos->miscj); - kfree(pos); + kobject_put(&pos->kobj); } - kfree(per_cpu(threshold_banks, cpu)[bank]->blocks); - per_cpu(threshold_banks, cpu)[bank]->blocks = NULL; + kobject_put(&head->blocks->kobj); } static void __threshold_remove_blocks(struct threshold_bank *b) --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/mce/core.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/mce/core.c @@ -814,8 +814,8 @@ if (quirk_no_way_out) quirk_no_way_out(i, m, regs); + m->bank = i; if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >= MCE_PANIC_SEVERITY) { - m->bank = i; mce_read_aux(m, i); *msg = tmp; return 1; --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/mce/intel.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/mce/intel.c @@ -489,17 +489,18 @@ return; if ((val & 3UL) == 1UL) { - /* PPIN available but disabled: */ + /* PPIN locked in disabled mode */ return; } - /* If PPIN is disabled, but not locked, try to enable: */ - if (!(val & 3UL)) { + /* If PPIN is disabled, try to enable */ + if (!(val & 2UL)) { wrmsrl_safe(MSR_PPIN_CTL, val | 2UL); rdmsrl_safe(MSR_PPIN_CTL, &val); } - if ((val & 3UL) == 2UL) + /* Is the enable bit set? */ + if (val & 2UL) set_cpu_cap(c, X86_FEATURE_INTEL_PPIN); } } --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/mce/therm_throt.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/mce/therm_throt.c @@ -188,7 +188,7 @@ /* if we just entered the thermal event */ if (new_event) { if (event == THERMAL_THROTTLING_EVENT) - pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n", + pr_warn("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n", this_cpu, level == CORE_LEVEL ? "Core" : "Package", state->count); --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/resctrl/core.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/resctrl/core.c @@ -618,7 +618,7 @@ if (static_branch_unlikely(&rdt_mon_enable_key)) rmdir_mondata_subdir_allrdtgrp(r, d->id); list_del(&d->list); - if (is_mbm_enabled()) + if (r->mon_capable && is_mbm_enabled()) cancel_delayed_work(&d->mbm_over); if (is_llc_occupancy_enabled() && has_busy_rmid(r, d)) { /* --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/resctrl/internal.h +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/resctrl/internal.h @@ -57,6 +57,7 @@ } DECLARE_STATIC_KEY_FALSE(rdt_enable_key); +DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); /** * struct mon_evt - Entry in the event list of a resource --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/resctrl/monitor.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/resctrl/monitor.c @@ -514,7 +514,7 @@ mutex_lock(&rdtgroup_mutex); - if (!static_branch_likely(&rdt_enable_key)) + if (!static_branch_likely(&rdt_mon_enable_key)) goto out_unlock; d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3]); @@ -543,7 +543,7 @@ unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - if (!static_branch_likely(&rdt_enable_key)) + if (!static_branch_likely(&rdt_mon_enable_key)) return; cpu = cpumask_any(&dom->cpu_mask); dom->mbm_work_cpu = cpu; --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1741,9 +1741,6 @@ struct rdt_domain *d; int cpu; - if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) - return -ENOMEM; - if (level == RDT_RESOURCE_L3) update = l3_qos_cfg_update; else if (level == RDT_RESOURCE_L2) @@ -1751,6 +1748,9 @@ else return -EINVAL; + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) + return -ENOMEM; + r_l = &rdt_resources_all[level]; list_for_each_entry(d, &r_l->domains, list) { /* Pick one CPU from each domain instance to update MSR */ @@ -1970,7 +1970,7 @@ if (rdt_mon_capable) { ret = mongroup_create_dir(rdtgroup_default.kn, - NULL, "mon_groups", + &rdtgroup_default, "mon_groups", &kn_mongrp); if (ret < 0) goto out_info; @@ -2205,7 +2205,11 @@ list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) { free_rmid(sentry->mon.rmid); list_del(&sentry->mon.crdtgrp_list); - kfree(sentry); + + if (atomic_read(&sentry->waitcount) != 0) + sentry->flags = RDT_DELETED; + else + kfree(sentry); } } @@ -2243,7 +2247,11 @@ kernfs_remove(rdtgrp->kn); list_del(&rdtgrp->rdtgroup_list); - kfree(rdtgrp); + + if (atomic_read(&rdtgrp->waitcount) != 0) + rdtgrp->flags = RDT_DELETED; + else + kfree(rdtgrp); } /* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */ update_closid_rmid(cpu_online_mask, &rdtgroup_default); @@ -2446,7 +2454,7 @@ /* * Create the mon_data directory first. */ - ret = mongroup_create_dir(parent_kn, NULL, "mon_data", &kn); + ret = mongroup_create_dir(parent_kn, prgrp, "mon_data", &kn); if (ret) return ret; @@ -2645,7 +2653,7 @@ uint files = 0; int ret; - prdtgrp = rdtgroup_kn_lock_live(prgrp_kn); + prdtgrp = rdtgroup_kn_lock_live(parent_kn); if (!prdtgrp) { ret = -ENODEV; goto out_unlock; @@ -2718,7 +2726,7 @@ kernfs_activate(kn); /* - * The caller unlocks the prgrp_kn upon success. + * The caller unlocks the parent_kn upon success. */ return 0; @@ -2729,7 +2737,7 @@ out_free_rgrp: kfree(rdtgrp); out_unlock: - rdtgroup_kn_unlock(prgrp_kn); + rdtgroup_kn_unlock(parent_kn); return ret; } @@ -2767,7 +2775,7 @@ */ list_add_tail(&rdtgrp->mon.crdtgrp_list, &prgrp->mon.crdtgrp_list); - rdtgroup_kn_unlock(prgrp_kn); + rdtgroup_kn_unlock(parent_kn); return ret; } @@ -2810,7 +2818,7 @@ * Create an empty mon_groups directory to hold the subset * of tasks and cpus to monitor. */ - ret = mongroup_create_dir(kn, NULL, "mon_groups", NULL); + ret = mongroup_create_dir(kn, rdtgrp, "mon_groups", NULL); if (ret) { rdt_last_cmd_puts("kernfs subdir error\n"); goto out_del_list; @@ -2826,7 +2834,7 @@ out_common_fail: mkdir_rdt_prepare_clean(rdtgrp); out_unlock: - rdtgroup_kn_unlock(prgrp_kn); + rdtgroup_kn_unlock(parent_kn); return ret; } @@ -2952,13 +2960,13 @@ closid_free(rdtgrp->closid); free_rmid(rdtgrp->mon.rmid); + rdtgroup_ctrl_remove(kn, rdtgrp); + /* * Free all the child monitor group rmids. */ free_all_child_rdtgrp(rdtgrp); - rdtgroup_ctrl_remove(kn, rdtgrp); - return 0; } --- linux-riscv-5.4.0.orig/arch/x86/kernel/cpu/tsx.c +++ linux-riscv-5.4.0/arch/x86/kernel/cpu/tsx.c @@ -115,11 +115,12 @@ tsx_disable(); /* - * tsx_disable() will change the state of the - * RTM CPUID bit. Clear it here since it is now - * expected to be not set. + * tsx_disable() will change the state of the RTM and HLE CPUID + * bits. Clear them here since they are now expected to be not + * set. */ setup_clear_cpu_cap(X86_FEATURE_RTM); + setup_clear_cpu_cap(X86_FEATURE_HLE); } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) { /* @@ -131,10 +132,10 @@ tsx_enable(); /* - * tsx_enable() will change the state of the - * RTM CPUID bit. Force it here since it is now - * expected to be set. + * tsx_enable() will change the state of the RTM and HLE CPUID + * bits. Force them here since they are now expected to be set. */ setup_force_cpu_cap(X86_FEATURE_RTM); + setup_force_cpu_cap(X86_FEATURE_HLE); } } --- linux-riscv-5.4.0.orig/arch/x86/kernel/doublefault.c +++ linux-riscv-5.4.0/arch/x86/kernel/doublefault.c @@ -65,6 +65,9 @@ .ss = __KERNEL_DS, .ds = __USER_DS, .fs = __KERNEL_PERCPU, +#ifndef CONFIG_X86_32_LAZY_GS + .gs = __KERNEL_STACK_CANARY, +#endif .__cr3 = __pa_nodebug(swapper_pg_dir), }; --- linux-riscv-5.4.0.orig/arch/x86/kernel/early-quirks.c +++ linux-riscv-5.4.0/arch/x86/kernel/early-quirks.c @@ -28,6 +28,37 @@ #include #include +static void __init early_pci_clear_msi(int bus, int slot, int func) +{ + int pos; + u16 ctrl; + + if (likely(!pci_early_clear_msi)) + return; + + pr_info_once("Clearing MSI/MSI-X enable bits early in boot (quirk)\n"); + + pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSI); + if (pos) { + ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS); + ctrl &= ~PCI_MSI_FLAGS_ENABLE; + write_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS, ctrl); + + /* Read again to flush previous write */ + ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS); + } + + pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSIX); + if (pos) { + ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS); + ctrl &= ~PCI_MSIX_FLAGS_ENABLE; + write_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS, ctrl); + + /* Read again to flush previous write */ + ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS); + } +} + static void __init fix_hypertransport_config(int num, int slot, int func) { u32 htcfg; @@ -710,10 +741,15 @@ */ { PCI_VENDOR_ID_INTEL, 0x0f00, PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, + { PCI_VENDOR_ID_INTEL, 0x3e20, + PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, { PCI_VENDOR_ID_INTEL, 0x3ec4, PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, + { PCI_VENDOR_ID_INTEL, 0x8a12, + PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, { PCI_VENDOR_ID_BROADCOM, 0x4331, PCI_CLASS_NETWORK_OTHER, PCI_ANY_ID, 0, apple_airport_reset}, + { PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, early_pci_clear_msi}, {} }; @@ -766,6 +802,10 @@ PCI_HEADER_TYPE); if ((type & 0x7f) == PCI_HEADER_TYPE_BRIDGE) { + /* pci_early_clear_msi scans the buses differently. */ + if (pci_early_clear_msi) + return -1; + sec = read_pci_config_byte(num, slot, func, PCI_SECONDARY_BUS); if (sec > num) early_pci_scan_bus(sec); @@ -792,8 +832,13 @@ void __init early_quirks(void) { + int bus; + if (!early_pci_allowed()) return; early_pci_scan_bus(0); + /* pci_early_clear_msi scans more buses. */ + for (bus = 1; pci_early_clear_msi && bus < 256; bus++) + early_pci_scan_bus(bus); } --- linux-riscv-5.4.0.orig/arch/x86/kernel/fpu/signal.c +++ linux-riscv-5.4.0/arch/x86/kernel/fpu/signal.c @@ -352,6 +352,7 @@ fpregs_unlock(); return 0; } + fpregs_deactivate(fpu); fpregs_unlock(); } @@ -403,6 +404,8 @@ } if (!ret) fpregs_mark_activate(); + else + fpregs_deactivate(fpu); fpregs_unlock(); err_out: --- linux-riscv-5.4.0.orig/arch/x86/kernel/head_32.S +++ linux-riscv-5.4.0/arch/x86/kernel/head_32.S @@ -26,6 +26,7 @@ #include #include #include +#include /* Physical address */ #define pa(X) ((X) - __PAGE_OFFSET) @@ -153,6 +154,7 @@ movl pa(subarch_entries)(,%eax,4), %eax subl $__PAGE_OFFSET, %eax + ANNOTATE_RETPOLINE_SAFE jmp *%eax .Lbad_subarch: @@ -302,6 +304,7 @@ movl setup_once_ref,%eax andl %eax,%eax jz 1f # Did we do this already? + ANNOTATE_RETPOLINE_SAFE call *%eax 1: @@ -571,6 +574,16 @@ # error "Kernel PMDs should be 1, 2 or 3" # endif .align PAGE_SIZE /* needs to be page-sized too */ + +#ifdef CONFIG_PAGE_TABLE_ISOLATION + /* + * PTI needs another page so sync_initial_pagetable() works correctly + * and does not scribble over the data which is placed behind the + * actual initial_page_table. See clone_pgd_range(). + */ + .fill 1024, 4, 0 +#endif + #endif .data --- linux-riscv-5.4.0.orig/arch/x86/kernel/ima_arch.c +++ linux-riscv-5.4.0/arch/x86/kernel/ima_arch.c @@ -10,8 +10,6 @@ static enum efi_secureboot_mode get_sb_mode(void) { - efi_char16_t efi_SecureBoot_name[] = L"SecureBoot"; - efi_char16_t efi_SetupMode_name[] = L"SecureBoot"; efi_guid_t efi_variable_guid = EFI_GLOBAL_VARIABLE_GUID; efi_status_t status; unsigned long size; @@ -25,7 +23,7 @@ } /* Get variable contents into buffer */ - status = efi.get_variable(efi_SecureBoot_name, &efi_variable_guid, + status = efi.get_variable(L"SecureBoot", &efi_variable_guid, NULL, &size, &secboot); if (status == EFI_NOT_FOUND) { pr_info("ima: secureboot mode disabled\n"); @@ -38,7 +36,7 @@ } size = sizeof(setupmode); - status = efi.get_variable(efi_SetupMode_name, &efi_variable_guid, + status = efi.get_variable(L"SetupMode", &efi_variable_guid, NULL, &size, &setupmode); if (status != EFI_SUCCESS) /* ignore unknown SetupMode */ --- linux-riscv-5.4.0.orig/arch/x86/kernel/nmi.c +++ linux-riscv-5.4.0/arch/x86/kernel/nmi.c @@ -104,18 +104,22 @@ } fs_initcall(nmi_warning_debugfs); -static void nmi_max_handler(struct irq_work *w) +static void nmi_check_duration(struct nmiaction *action, u64 duration) { - struct nmiaction *a = container_of(w, struct nmiaction, irq_work); + u64 whole_msecs = READ_ONCE(action->max_duration); int remainder_ns, decimal_msecs; - u64 whole_msecs = READ_ONCE(a->max_duration); + + if (duration < nmi_longest_ns || duration < action->max_duration) + return; + + action->max_duration = duration; remainder_ns = do_div(whole_msecs, (1000 * 1000)); decimal_msecs = remainder_ns / 1000; printk_ratelimited(KERN_INFO "INFO: NMI handler (%ps) took too long to run: %lld.%03d msecs\n", - a->handler, whole_msecs, decimal_msecs); + action->handler, whole_msecs, decimal_msecs); } static int nmi_handle(unsigned int type, struct pt_regs *regs) @@ -142,11 +146,7 @@ delta = sched_clock() - delta; trace_nmi_handler(a->handler, (int)delta, thishandled); - if (delta < nmi_longest_ns || delta < a->max_duration) - continue; - - a->max_duration = delta; - irq_work_queue(&a->irq_work); + nmi_check_duration(a, delta); } rcu_read_unlock(); @@ -164,8 +164,6 @@ if (!action->handler) return -EINVAL; - init_irq_work(&action->irq_work, nmi_max_handler); - raw_spin_lock_irqsave(&desc->lock, flags); /* --- linux-riscv-5.4.0.orig/arch/x86/kernel/reboot.c +++ linux-riscv-5.4.0/arch/x86/kernel/reboot.c @@ -32,6 +32,7 @@ #include #include #include +#include /* * Power off function, if any @@ -127,11 +128,11 @@ /* Jump to the identity-mapped low memory code */ #ifdef CONFIG_X86_32 - asm volatile("jmpl *%0" : : + asm volatile(ANNOTATE_RETPOLINE_SAFE "jmpl *%0" : : "rm" (real_mode_header->machine_real_restart_asm), "a" (type)); #else - asm volatile("ljmpl *%0" : : + asm volatile(ANNOTATE_RETPOLINE_SAFE "ljmpl *%0" : : "m" (real_mode_header->machine_real_restart_asm), "D" (type)); #endif @@ -478,7 +479,46 @@ DMI_MATCH(DMI_PRODUCT_NAME, "VGN-Z540N"), }, }, - + { /* Handle problems with rebooting on the Latitude E6520. */ + .callback = set_pci_reboot, + .ident = "Dell Latitude E6520", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Latitude E6520"), + }, + }, + { /* Handle problems with rebooting on the OptiPlex 790. */ + .callback = set_pci_reboot, + .ident = "Dell OptiPlex 790", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 790"), + }, + }, + { /* Handle problems with rebooting on the OptiPlex 990. */ + .callback = set_pci_reboot, + .ident = "Dell OptiPlex 990", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 990"), + }, + }, + { /* Handle problems with rebooting on the Latitude E6220. */ + .callback = set_pci_reboot, + .ident = "Dell Latitude E6220", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Latitude E6220"), + }, + }, + { /* Handle problems with rebooting on the OptiPlex 390. */ + .callback = set_pci_reboot, + .ident = "Dell OptiPlex 390", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 390"), + }, + }, { } }; --- linux-riscv-5.4.0.orig/arch/x86/kernel/relocate_kernel_32.S +++ linux-riscv-5.4.0/arch/x86/kernel/relocate_kernel_32.S @@ -8,6 +8,7 @@ #include #include #include +#include /* * Must be relocatable PIC code callable as a C function @@ -165,6 +166,7 @@ movl CP_PA_SWAP_PAGE(%edi), %esp addl $PAGE_SIZE, %esp 2: + ANNOTATE_RETPOLINE_SAFE call *%edx /* get the re-entry point of the peer system */ --- linux-riscv-5.4.0.orig/arch/x86/kernel/relocate_kernel_64.S +++ linux-riscv-5.4.0/arch/x86/kernel/relocate_kernel_64.S @@ -9,6 +9,7 @@ #include #include #include +#include /* * Must be relocatable PIC code callable as a C function @@ -192,6 +193,7 @@ 1: popq %rdx leaq PAGE_SIZE(%r10), %rsp + ANNOTATE_RETPOLINE_SAFE call *%rdx /* get the re-entry point of the peer system */ --- linux-riscv-5.4.0.orig/arch/x86/kernel/setup.c +++ linux-riscv-5.4.0/arch/x86/kernel/setup.c @@ -73,6 +73,7 @@ #include #include #include +#include #include #include