
Commit afe586a

Liang Chen authored and rkhuangtao committed
Merge tag 'v5.10.65-rt53'
Linux 5.10.65-rt53

* tag 'v5.10.65-rt53': (303 commits)
  Linux 5.10.65-rt53
  Linux 5.10.59-rt52
  locking/rwsem-rt: Remove might_sleep() in __up_read()
  Linux 5.10.59-rt51
  Linux 5.10.58-rt50
  Linux 5.10.56-rt49
  printk: Enhance the condition check of msleep in pr_flush()
  Linux 5.10.56-rt48
  Linux 5.10.52-rt47
  Linux 5.10.47-rt46
  sched: Don't defer CPU pick to migration_cpu_stop()
  sched: Simplify set_affinity_pending refcounts
  sched: Fix affine_move_task() self-concurrency
  sched: Optimize migration_cpu_stop()
  sched: Collate affine_move_task() stoppers
  sched: Simplify migration_cpu_stop()
  sched: Fix migration_cpu_stop() requeueing
  Linux 5.10.47-rt45
  Linux 5.10.44-rt44
  Linux 5.10.42-rt43
  ...

Conflicts:
  arch/x86/Kconfig
  block/blk-mq.c
  drivers/tty/serial/8250/8250.h
  drivers/tty/serial/8250/8250_port.c
  fs/proc/base.c
  include/linux/sched.h
  include/net/netns/xfrm.h
  kernel/cgroup/cpuset.c
  kernel/locking/rtmutex.c
  kernel/printk/printk.c
  kernel/sched/core.c
  kernel/sched/cpupri.c
  kernel/sched/sched.h
  kernel/softirq.c
  mm/page_alloc.c
  mm/slub.c
  mm/zsmalloc.c

Change-Id: I53f01b9f8f32aef1b2d639840bf50acc82b85320
2 parents c231756 + 3e2c5af commit afe586a

File tree: 422 files changed (+9256 / -5258 lines)


Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
Lines changed: 2 additions & 2 deletions

@@ -38,7 +38,7 @@ sections.
 RCU-preempt Expedited Grace Periods
 ===================================
 
-``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
+``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
 The overall flow of the handling of a given CPU by an RCU-preempt
 expedited grace period is shown in the following diagram:
 
@@ -112,7 +112,7 @@ things.
 RCU-sched Expedited Grace Periods
 ---------------------------------
 
-``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
+``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
 the handling of a given CPU by an RCU-sched expedited grace period is
 shown in the following diagram:

Documentation/RCU/Design/Requirements/Requirements.rst
Lines changed: 13 additions & 13 deletions

@@ -78,7 +78,7 @@ RCU treats a nested set as one big RCU read-side critical section.
 Production-quality implementations of ``rcu_read_lock()`` and
 ``rcu_read_unlock()`` are extremely lightweight, and in fact have
 exactly zero overhead in Linux kernels built for production use with
-``CONFIG_PREEMPT=n``.
+``CONFIG_PREEMPTION=n``.
 
 This guarantee allows ordering to be enforced with extremely low
 overhead to readers, for example:
@@ -1182,7 +1182,7 @@ and has become decreasingly so as memory sizes have expanded and memory
 costs have plummeted. However, as I learned from Matt Mackall's
 `bloatwatch <http://elinux.org/Linux_Tiny-FAQ>`__ efforts, memory
 footprint is critically important on single-CPU systems with
-non-preemptible (``CONFIG_PREEMPT=n``) kernels, and thus `tiny
+non-preemptible (``CONFIG_PREEMPTION=n``) kernels, and thus `tiny
 RCU <https://lkml.kernel.org/g/[email protected]>`__
 was born. Josh Triplett has since taken over the small-memory banner
 with his `Linux kernel tinification <https://tiny.wiki.kernel.org/>`__
@@ -1498,7 +1498,7 @@ limitations.
 
 Implementations of RCU for which ``rcu_read_lock()`` and
 ``rcu_read_unlock()`` generate no code, such as Linux-kernel RCU when
-``CONFIG_PREEMPT=n``, can be nested arbitrarily deeply. After all, there
+``CONFIG_PREEMPTION=n``, can be nested arbitrarily deeply. After all, there
 is no overhead. Except that if all these instances of
 ``rcu_read_lock()`` and ``rcu_read_unlock()`` are visible to the
 compiler, compilation will eventually fail due to exhausting memory,
@@ -1771,7 +1771,7 @@ implementation can be a no-op.
 
 However, once the scheduler has spawned its first kthread, this early
 boot trick fails for ``synchronize_rcu()`` (as well as for
-``synchronize_rcu_expedited()``) in ``CONFIG_PREEMPT=y`` kernels. The
+``synchronize_rcu_expedited()``) in ``CONFIG_PREEMPTION=y`` kernels. The
 reason is that an RCU read-side critical section might be preempted,
 which means that a subsequent ``synchronize_rcu()`` really does have to
 wait for something, as opposed to simply returning immediately.
@@ -2010,7 +2010,7 @@ the following:
 5 rcu_read_unlock();
 6 do_something_with(v, user_v);
 
-If the compiler did make this transformation in a ``CONFIG_PREEMPT=n`` kernel
+If the compiler did make this transformation in a ``CONFIG_PREEMPTION=n`` kernel
 build, and if ``get_user()`` did page fault, the result would be a quiescent
 state in the middle of an RCU read-side critical section. This misplaced
 quiescent state could result in line 4 being a use-after-free access,
@@ -2289,10 +2289,10 @@ decides to throw at it.
 
 The Linux kernel is used for real-time workloads, especially in
 conjunction with the `-rt
-patchset <https://rt.wiki.kernel.org/index.php/Main_Page>`__. The
+patchset <https://wiki.linuxfoundation.org/realtime/>`__. The
 real-time-latency response requirements are such that the traditional
 approach of disabling preemption across RCU read-side critical sections
-is inappropriate. Kernels built with ``CONFIG_PREEMPT=y`` therefore use
+is inappropriate. Kernels built with ``CONFIG_PREEMPTION=y`` therefore use
 an RCU implementation that allows RCU read-side critical sections to be
 preempted. This requirement made its presence known after users made it
 clear that an earlier `real-time
@@ -2414,7 +2414,7 @@ includes ``rcu_read_lock_bh()``, ``rcu_read_unlock_bh()``,
 ``call_rcu_bh()``, ``rcu_barrier_bh()``, and
 ``rcu_read_lock_bh_held()``. However, the update-side APIs are now
 simple wrappers for other RCU flavors, namely RCU-sched in
-CONFIG_PREEMPT=n kernels and RCU-preempt otherwise.
+CONFIG_PREEMPTION=n kernels and RCU-preempt otherwise.
 
 Sched Flavor (Historical)
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2432,11 +2432,11 @@ not have this property, given that any point in the code outside of an
 RCU read-side critical section can be a quiescent state. Therefore,
 *RCU-sched* was created, which follows “classic” RCU in that an
 RCU-sched grace period waits for pre-existing interrupt and NMI
-handlers. In kernels built with ``CONFIG_PREEMPT=n``, the RCU and
+handlers. In kernels built with ``CONFIG_PREEMPTION=n``, the RCU and
 RCU-sched APIs have identical implementations, while kernels built with
-``CONFIG_PREEMPT=y`` provide a separate implementation for each.
+``CONFIG_PREEMPTION=y`` provide a separate implementation for each.
 
-Note well that in ``CONFIG_PREEMPT=y`` kernels,
+Note well that in ``CONFIG_PREEMPTION=y`` kernels,
 ``rcu_read_lock_sched()`` and ``rcu_read_unlock_sched()`` disable and
 re-enable preemption, respectively. This means that if there was a
 preemption attempt during the RCU-sched read-side critical section,
@@ -2599,10 +2599,10 @@ userspace execution also delimit tasks-RCU read-side critical sections.
 
 The tasks-RCU API is quite compact, consisting only of
 ``call_rcu_tasks()``, ``synchronize_rcu_tasks()``, and
-``rcu_barrier_tasks()``. In ``CONFIG_PREEMPT=n`` kernels, trampolines
+``rcu_barrier_tasks()``. In ``CONFIG_PREEMPTION=n`` kernels, trampolines
 cannot be preempted, so these APIs map to ``call_rcu()``,
 ``synchronize_rcu()``, and ``rcu_barrier()``, respectively. In
-``CONFIG_PREEMPT=y`` kernels, trampolines can be preempted, and these
+``CONFIG_PREEMPTION=y`` kernels, trampolines can be preempted, and these
 three APIs are therefore implemented by separate functions that check
 for voluntary context switches.
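
For readers skimming this merge, the Requirements.rst hunks above all concern the same pattern: rcu_read_lock()/rcu_read_unlock() delimit read-side critical sections, and an updater publishes a new version and then waits out pre-existing readers with synchronize_rcu(). A hedged sketch of that pairing, illustrative only and not code touched by this commit (struct route, current_route and rt_lock are invented names):

/*
 * Illustrative sketch only -- struct route, current_route and rt_lock are
 * invented names, not code touched by this commit.
 */
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct route {
	int ifindex;
};

static struct route __rcu *current_route;
static DEFINE_SPINLOCK(rt_lock);	/* serializes updaters */

/* Reader: lock-free; compiles to plain loads when CONFIG_PREEMPTION=n. */
static int route_lookup(void)
{
	struct route *r;
	int ifindex = -1;

	rcu_read_lock();
	r = rcu_dereference(current_route);
	if (r)
		ifindex = r->ifindex;
	rcu_read_unlock();

	return ifindex;
}

/* Updater: publish the new version, wait out pre-existing readers, free. */
static void route_replace(struct route *new_r)
{
	struct route *old_r;

	spin_lock(&rt_lock);
	old_r = rcu_dereference_protected(current_route,
					  lockdep_is_held(&rt_lock));
	rcu_assign_pointer(current_route, new_r);
	spin_unlock(&rt_lock);

	synchronize_rcu();	/* every reader that could see old_r is done */
	kfree(old_r);
}

With CONFIG_PREEMPTION=n the reader side above compiles to plain loads, which is the zero-overhead property the first hunk refers to; with CONFIG_PREEMPTION=y the reader may be preempted, which is why synchronize_rcu() cannot simply return once the scheduler is running, as the hunk around line 1771 notes.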

Documentation/RCU/checklist.rst
Lines changed: 1 addition & 1 deletion

@@ -214,7 +214,7 @@ over a rather long period of time, but improvements are always welcome!
 	the rest of the system.
 
 7.	As of v4.20, a given kernel implements only one RCU flavor,
-	which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
+	which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
 	If the updater uses call_rcu() or synchronize_rcu(),
 	then the corresponding readers my use rcu_read_lock() and
 	rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),

Documentation/RCU/rcubarrier.rst
Lines changed: 3 additions & 3 deletions

@@ -9,7 +9,7 @@ RCU (read-copy update) is a synchronization mechanism that can be thought
 of as a replacement for read-writer locking (among other things), but with
 very low-overhead readers that are immune to deadlock, priority inversion,
 and unbounded latency. RCU read-side critical sections are delimited
-by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPT
+by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
 kernels, generate no code whatsoever.
 
 This means that RCU writers are unaware of the presence of concurrent
@@ -329,10 +329,10 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last
 	to smp_call_function() and further to smp_call_function_on_cpu(),
 	causing this latter to spin until the cross-CPU invocation of
 	rcu_barrier_func() has completed. This by itself would prevent
-	a grace period from completing on non-CONFIG_PREEMPT kernels,
+	a grace period from completing on non-CONFIG_PREEMPTION kernels,
 	since each CPU must undergo a context switch (or other quiescent
 	state) before the grace period can complete. However, this is
-	of no use in CONFIG_PREEMPT kernels.
+	of no use in CONFIG_PREEMPTION kernels.
 
 	Therefore, on_each_cpu() disables preemption across its call
 	to smp_call_function() and also across the local call to
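
The rcubarrier.rst hunks above sit in a document about waiting for all in-flight call_rcu() callbacks, typically before module unload. As a hedged illustration only (my_entry, my_list and my_lock are invented; call_rcu(), list_del_rcu() and rcu_barrier() are the real APIs):

/*
 * Illustrative sketch only -- my_entry, my_list and my_lock are invented;
 * not code from this commit.
 */
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_entry {
	struct list_head node;
	struct rcu_head rcu;
};

static LIST_HEAD(my_list);
static DEFINE_SPINLOCK(my_lock);

static void my_entry_free_rcu(struct rcu_head *rcu)
{
	kfree(container_of(rcu, struct my_entry, rcu));
}

static void my_entry_del(struct my_entry *e)
{
	spin_lock(&my_lock);
	list_del_rcu(&e->node);
	spin_unlock(&my_lock);
	call_rcu(&e->rcu, my_entry_free_rcu);	/* free after a grace period */
}

static void __exit my_module_exit(void)
{
	/* ...delete all remaining entries with my_entry_del()... */

	/*
	 * Wait until every callback queued by call_rcu() above has run;
	 * otherwise a callback could fire after the module text is gone.
	 */
	rcu_barrier();
}
module_exit(my_module_exit);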

Documentation/RCU/stallwarn.rst
Lines changed: 2 additions & 2 deletions

@@ -25,7 +25,7 @@ warnings:
 
 -	A CPU looping with bottom halves disabled.
 
-	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
+	For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
 	without invoking schedule(). If the looping in the kernel is
 	really expected and desirable behavior, you might need to add
 	some calls to cond_resched().
@@ -44,7 +44,7 @@ warnings:
 	result in the ``rcu_.*kthread starved for`` console-log message,
 	which will include additional debugging information.
 
-	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
+	A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might
 	happen to preempt a low-priority task in the middle of an RCU
 	read-side critical section. This is especially damaging if
 	that low-priority task is not permitted to run on any other CPU,
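
The first stallwarn.rst hunk above notes that on !CONFIG_PREEMPTION kernels a CPU looping in the kernel without invoking schedule() can trigger a stall warning, and that cond_resched() is the usual remedy. A hedged sketch of that remedy (struct item, items[] and process_item() are invented for illustration):

/*
 * Illustrative sketch only -- struct item, items[] and process_item()
 * are invented for this example.
 */
#include <linux/sched.h>
#include <linux/types.h>

struct item {
	u64 payload;
};

static void process_item(struct item *it)
{
	it->payload++;		/* stand-in for the real per-item work */
}

static void process_all_items(struct item *items, unsigned long nr_items)
{
	unsigned long i;

	for (i = 0; i < nr_items; i++) {
		process_item(&items[i]);

		/*
		 * On a !CONFIG_PREEMPTION kernel, a long loop that never
		 * calls schedule() never passes through a quiescent state,
		 * so RCU may report a CPU stall.  cond_resched() provides
		 * that quiescent state (and a chance to reschedule).
		 */
		cond_resched();
	}
}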

Documentation/RCU/whatisRCU.rst
Lines changed: 5 additions & 5 deletions

@@ -684,7 +684,7 @@ Quick Quiz #1:
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 This section presents a "toy" RCU implementation that is based on
 "classic RCU". It is also short on performance (but only for updates) and
-on features such as hotplug CPU and the ability to run in CONFIG_PREEMPT
+on features such as hotplug CPU and the ability to run in CONFIG_PREEMPTION
 kernels. The definitions of rcu_dereference() and rcu_assign_pointer()
 are the same as those shown in the preceding section, so they are omitted.
 ::
@@ -740,7 +740,7 @@ Quick Quiz #2:
 Quick Quiz #3:
 	If it is illegal to block in an RCU read-side
 	critical section, what the heck do you do in
-	PREEMPT_RT, where normal spinlocks can block???
+	CONFIG_PREEMPT_RT, where normal spinlocks can block???
 
 :ref:`Answers to Quick Quiz <8_whatisRCU>`
 
@@ -1094,7 +1094,7 @@ Quick Quiz #2:
 	overhead is **negative**.
 
 Answer:
-	Imagine a single-CPU system with a non-CONFIG_PREEMPT
+	Imagine a single-CPU system with a non-CONFIG_PREEMPTION
 	kernel where a routing table is used by process-context
 	code, but can be updated by irq-context code (for example,
 	by an "ICMP REDIRECT" packet). The usual way of handling
@@ -1121,10 +1121,10 @@ Answer:
 Quick Quiz #3:
 	If it is illegal to block in an RCU read-side
 	critical section, what the heck do you do in
-	PREEMPT_RT, where normal spinlocks can block???
+	CONFIG_PREEMPT_RT, where normal spinlocks can block???
 
 Answer:
-	Just as PREEMPT_RT permits preemption of spinlock
+	Just as CONFIG_PREEMPT_RT permits preemption of spinlock
 	critical sections, it permits preemption of RCU
 	read-side critical sections. It also permits
 	spinlocks blocking while in RCU read-side critical
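
The partially quoted answer to Quick Quiz #2 above describes a routing table that is read from process context but updated from irq context. Because irq context cannot block, such an updater defers the free with kfree_rcu() rather than calling synchronize_rcu(); a hedged sketch (rt_entry, rt_table and rt_table_lock are invented names, and updaters are assumed to run only in irq context):

/*
 * Illustrative sketch only -- rt_entry, rt_table and rt_table_lock are
 * invented names; updaters are assumed to run only in irq context.
 */
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct rt_entry {
	u32 dst;
	struct rcu_head rcu;
};

static struct rt_entry __rcu *rt_table;
static DEFINE_SPINLOCK(rt_table_lock);

/* Updater, irq context (e.g. on an ICMP REDIRECT): must not block. */
static void rt_update_from_irq(struct rt_entry *new_rt)
{
	struct rt_entry *old_rt;

	spin_lock(&rt_table_lock);	/* assumption: updaters only in irq context */
	old_rt = rcu_dereference_protected(rt_table,
					   lockdep_is_held(&rt_table_lock));
	rcu_assign_pointer(rt_table, new_rt);
	spin_unlock(&rt_table_lock);

	if (old_rt)
		kfree_rcu(old_rt, rcu);	/* synchronize_rcu() would block here */
}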

Documentation/admin-guide/kernel-parameters.txt
Lines changed: 11 additions & 0 deletions

@@ -4182,6 +4182,10 @@
 			value, meaning that RCU_SOFTIRQ is used by default.
 			Specify rcutree.use_softirq=0 to use rcuc kthreads.
 
+			But note that CONFIG_PREEMPT_RT=y kernels disable
+			this kernel boot parameter, forcibly setting it
+			to zero.
+
 	rcutree.rcu_fanout_exact=	[KNL]
 			Disable autobalancing of the rcu_node combining
 			tree. This is used by rcutorture, and might
@@ -4560,6 +4564,13 @@
 			only normal grace-period primitives. No effect
 			on CONFIG_TINY_RCU kernels.
 
+			But note that CONFIG_PREEMPT_RT=y kernels enables
+			this kernel boot parameter, forcibly setting
+			it to the value one, that is, converting any
+			post-boot attempt at an expedited RCU grace
+			period to instead use normal non-expedited
+			grace-period processing.
+
 	rcupdate.rcu_task_ipi_delay=	[KNL]
 			Set time in jiffies during which RCU tasks will
 			avoid sending IPIs, starting with the beginning

Documentation/driver-api/io-mapping.rst
Lines changed: 39 additions & 53 deletions

@@ -20,78 +20,64 @@ A mapping object is created during driver initialization using::
 	mappable, while 'size' indicates how large a mapping region to
 	enable. Both are in bytes.
 
-	This _wc variant provides a mapping which may only be used
-	with the io_mapping_map_atomic_wc or io_mapping_map_wc.
+	This _wc variant provides a mapping which may only be used with
+	io_mapping_map_local_wc() or io_mapping_map_wc().
 
-With this mapping object, individual pages can be mapped either atomically
-or not, depending on the necessary scheduling environment. Of course, atomic
-maps are more efficient::
+With this mapping object, individual pages can be mapped either temporarily
+or long term, depending on the requirements. Of course, temporary maps are
+more efficient.
 
-	void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
-				       unsigned long offset)
+	void *io_mapping_map_local_wc(struct io_mapping *mapping,
+				      unsigned long offset)
 
-	'offset' is the offset within the defined mapping region.
-	Accessing addresses beyond the region specified in the
-	creation function yields undefined results. Using an offset
-	which is not page aligned yields an undefined result. The
-	return value points to a single page in CPU address space.
+	'offset' is the offset within the defined mapping region. Accessing
+	addresses beyond the region specified in the creation function yields
+	undefined results. Using an offset which is not page aligned yields an
+	undefined result. The return value points to a single page in CPU address
+	space.
 
-	This _wc variant returns a write-combining map to the
-	page and may only be used with mappings created by
-	io_mapping_create_wc
+	This _wc variant returns a write-combining map to the page and may only be
+	used with mappings created by io_mapping_create_wc()
 
-	Note that the task may not sleep while holding this page
-	mapped.
+	Temporary mappings are only valid in the context of the caller. The mapping
+	is not guaranteed to be globaly visible.
 
-	::
+	io_mapping_map_local_wc() has a side effect on X86 32bit as it disables
+	migration to make the mapping code work. No caller can rely on this side
+	effect.
 
-	void io_mapping_unmap_atomic(void *vaddr)
+	Nested mappings need to be undone in reverse order because the mapping
+	code uses a stack for keeping track of them::
 
-	'vaddr' must be the value returned by the last
-	io_mapping_map_atomic_wc call. This unmaps the specified
-	page and allows the task to sleep once again.
+	 addr1 = io_mapping_map_local_wc(map1, offset1);
+	 addr2 = io_mapping_map_local_wc(map2, offset2);
+	 ...
+	 io_mapping_unmap_local(addr2);
+	 io_mapping_unmap_local(addr1);
 
-	If you need to sleep while holding the lock, you can use the non-atomic
-	variant, although they may be significantly slower.
+	The mappings are released with::
 
-	::
+	void io_mapping_unmap_local(void *vaddr)
+
+	'vaddr' must be the value returned by the last io_mapping_map_local_wc()
+	call. This unmaps the specified mapping and undoes eventual side effects of
+	the mapping function.
+
+	If you need to sleep while holding a mapping, you can use the regular
+	variant, although this may be significantly slower::
 
 	void *io_mapping_map_wc(struct io_mapping *mapping,
 				unsigned long offset)
 
-	This works like io_mapping_map_atomic_wc except it allows
-	the task to sleep while holding the page mapped.
+	This works like io_mapping_map_local_wc() except it has no side effects and
+	the pointer is globaly visible.
 
-
-	::
+	The mappings are released with::
 
 	void io_mapping_unmap(void *vaddr)
 
-	This works like io_mapping_unmap_atomic, except it is used
-	for pages mapped with io_mapping_map_wc.
+	Use for pages mapped with io_mapping_map_wc().
 
 At driver close time, the io_mapping object must be freed::
 
 	void io_mapping_free(struct io_mapping *mapping)
-
-Current Implementation
-======================
-
-The initial implementation of these functions uses existing mapping
-mechanisms and so provides only an abstraction layer and no new
-functionality.
-
-On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole
-range, creating a permanent kernel-visible mapping to the resource. The
-map_atomic and map functions add the requested offset to the base of the
-virtual address returned by ioremap_wc.
-
-On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses
-kmap_atomic_pfn to map the specified page in an atomic fashion;
-kmap_atomic_pfn isn't really supposed to be used with device pages, but it
-provides an efficient mapping for this usage.
-
-On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and
-io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which
-performs an IPI to inform all processors about the new mapping. This results
-in a significant performance penalty.
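
The rewritten io-mapping.rst above replaces the atomic map/unmap pair with io_mapping_map_local_wc()/io_mapping_unmap_local(). A hedged end-to-end sketch of the documented lifecycle (my_dev, MY_BAR_BASE and MY_BAR_SIZE are invented placeholders; only the io_mapping_* calls come from the API described above):

/*
 * Illustrative sketch only -- my_dev, MY_BAR_BASE and MY_BAR_SIZE are
 * invented placeholders; the io_mapping_* calls are the documented API.
 */
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/io-mapping.h>

#define MY_BAR_BASE	0xd0000000UL	/* hypothetical resource base */
#define MY_BAR_SIZE	(16UL << 20)	/* hypothetical resource size */

struct my_dev {
	struct io_mapping *iomap;
};

static int my_dev_init(struct my_dev *dev)
{
	dev->iomap = io_mapping_create_wc(MY_BAR_BASE, MY_BAR_SIZE);
	return dev->iomap ? 0 : -ENOMEM;
}

static void my_dev_write_word(struct my_dev *dev, unsigned long page_offset,
			      u32 value)
{
	/* Temporary mapping: valid only in this context; do not sleep here. */
	void __iomem *vaddr = io_mapping_map_local_wc(dev->iomap, page_offset);

	writel(value, vaddr);
	io_mapping_unmap_local(vaddr);
}

static void my_dev_fini(struct my_dev *dev)
{
	io_mapping_free(dev->iomap);
}

Nested local mappings, if any, must be undone in reverse order, as the addr1/addr2 example in the new documentation text shows.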

arch/Kconfig
Lines changed: 7 additions & 1 deletion

@@ -37,6 +37,7 @@ config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING
 	depends on HAVE_OPROFILE
+	depends on !PREEMPT_RT
 	select RING_BUFFER
 	select RING_BUFFER_ALLOW_SWAP
 	help
@@ -756,6 +757,12 @@ config HAVE_TIF_NOHZ
 config HAVE_VIRT_CPU_ACCOUNTING
 	bool
 
+config HAVE_VIRT_CPU_ACCOUNTING_IDLE
+	bool
+	help
+	  Architecture has its own way to account idle CPU time and therefore
+	  doesn't implement vtime_account_idle().
+
 config ARCH_HAS_SCALED_CPUTIME
 	bool
 
@@ -770,7 +777,6 @@ config HAVE_VIRT_CPU_ACCOUNTING_GEN
 	  some 32-bit arches may require multiple accesses, so proper
 	  locking is needed to protect against concurrent accesses.
 
-
 config HAVE_IRQ_TIME_ACCOUNTING
 	bool
 	help

arch/alpha/include/asm/kmap_types.h

Lines changed: 0 additions & 15 deletions
This file was deleted.
