Merge branches 'consolidate.2019.08.01b', 'fixes.2019.08.12a',... (31da0670)

Documentation/RCU/Design/Requirements/Requirements.html

+72 −1

Original line number	Diff line number	Diff line
		@@ -2129,6 +2129,8 @@ Some of the relevant points of interest are as follows:
		<li> <a href="#Hotplug CPU">Hotplug CPU</a>.
		<li> <a href="#Scheduler and RCU">Scheduler and RCU</a>.
		<li> <a href="#Tracing and RCU">Tracing and RCU</a>.
		+<li> <a href="#Accesses to User Memory and RCU">
		+Accesses to User Memory and RCU</a>.
		<li> <a href="#Energy Efficiency">Energy Efficiency</a>.
		<li> <a href="#Scheduling-Clock Interrupts and RCU">
		Scheduling-Clock Interrupts and RCU</a>.
		@@ -2512,7 +2514,7 @@ disabled across the entire RCU read-side critical section.
		<p>
		It is possible to use tracing on RCU code, but tracing itself
		uses RCU.
		-For this reason, <tt>rcu_dereference_raw_notrace()</tt>
		+For this reason, <tt>rcu_dereference_raw_check()</tt>
		is provided for use by tracing, which avoids the destructive
		recursion that could otherwise ensue.
		This API is also used by virtualization in some architectures,
		@@ -2521,6 +2523,75 @@ cannot be used.
		The tracing folks both located the requirement and provided the
		needed fix, so this surprise requirement was relatively painless.

		+<h3><a name="Accesses to User Memory and RCU">
		+Accesses to User Memory and RCU</a></h3>
		+
		+<p>
		+The kernel needs to access user-space memory, for example, to access
		+data referenced by system-call parameters.
		+The <tt>get_user()</tt> macro does this job.
		+
		+<p>
		+However, user-space memory might well be paged out, which means
		+that <tt>get_user()</tt> might well page-fault and thus block while
		+waiting for the resulting I/O to complete.
		+It would be a very bad thing for the compiler to reorder
		+a <tt>get_user()</tt> invocation into an RCU read-side critical
		+section.
		+For example, suppose that the source code looked like this:
		+
		+<blockquote>
		+<pre>
		+1 rcu_read_lock();
		+2 p = rcu_dereference(gp);
		+3 v = p->value;
		+4 rcu_read_unlock();
		+5 get_user(user_v, user_p);
		+6 do_something_with(v, user_v);
		+</pre>
		+</blockquote>
		+
		+<p>
		+The compiler must not be permitted to transform this source code into
		+the following:
		+
		+<blockquote>
		+<pre>
		+1 rcu_read_lock();
		+2 p = rcu_dereference(gp);
		+3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
		+4 v = p->value;
		+5 rcu_read_unlock();
		+6 do_something_with(v, user_v);
		+</pre>
		+</blockquote>
		+
		+<p>
		+If the compiler did make this transformation in a
		+<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
		+page fault, the result would be a quiescent state in the middle
		+of an RCU read-side critical section.
		+This misplaced quiescent state could result in line 4 being
		+a use-after-free access, which could be bad for your kernel's
		+actuarial statistics.
		+Similar examples can be constructed with the call to <tt>get_user()</tt>
		+preceding the <tt>rcu_read_lock()</tt>.
		+
		+<p>
		+Unfortunately, <tt>get_user()</tt> doesn't have any particular
		+ordering properties, and in some architectures the underlying <tt>asm</tt>
		+isn't even marked <tt>volatile</tt>.
		+And even if it was marked <tt>volatile</tt>, the above access to
		+<tt>p->value</tt> is not volatile, so the compiler would not have any
		+reason to keep those two accesses in order.
		+
		+<p>
		+Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
		+and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
		+at least for outermost instances of <tt>rcu_read_lock()</tt> and
		+<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
		+sections.
		+
		<h3><a name="Energy Efficiency">Energy Efficiency</a></h3>

		<p>
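
Note on the "Accesses to User Memory and RCU" hunk above: the requirement that rcu_read_lock() and rcu_read_unlock() act as compiler barriers can be illustrated with a small user-space sketch. This is not the kernel's implementation. The names model_rcu_read_lock(), model_rcu_read_unlock(), model_get_user(), read_then_fetch() and demo() are hypothetical stand-ins invented for this note, and the sketch assumes GCC or Clang, where an empty asm statement with a "memory" clobber serves as a compiler-only barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier (GCC/Clang extension): emits no instructions, but
 * the compiler may not move memory accesses across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for a CONFIG_PREEMPT=n style build, where the
 * read-side markers are assumed to reduce to a compiler barrier. */
static inline void model_rcu_read_lock(void)   { barrier(); }
static inline void model_rcu_read_unlock(void) { barrier(); }

struct item { int value; };
static struct item *gp;	/* stand-in for an RCU-protected pointer */

/* Hypothetical stand-in for get_user(): a plain copy that cannot fault. */
static int model_get_user(int *dst, const int *src)
{
	*dst = *src;
	return 0;
}

/* Mirrors the six-line example: read under the lock, then fetch user data. */
static int read_then_fetch(const int *user_p, int *sum)
{
	struct item *p;
	int v, user_v;

	assert(user_p != NULL && sum != NULL);

	model_rcu_read_lock();
	p = gp;			/* kernel code would use rcu_dereference(gp) */
	v = p->value;
	model_rcu_read_unlock();	/* the load above may not sink below this */

	if (model_get_user(&user_v, user_p))	/* may not be hoisted above it */
		return -1;
	*sum = v + user_v;
	return 0;
}

/* Small driver: publishes one item and combines it with a "user" word. */
int demo(void)
{
	static struct item it = { .value = 40 };
	int user_word = 2;
	int sum = 0;

	gp = &it;
	if (read_then_fetch(&user_word, &sum))
		return -1;
	return sum;
}
```

In this model, demo() returns 42. The point of the sketch is only the placement of barrier(): without it, nothing in plain C would stop a compiler from interleaving the p->value load with the user-memory access.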

Documentation/RCU/stallwarn.txt

+6 −0

		@@ -57,6 +57,12 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
		CONFIG_PREEMPT_RCU case, you might see stall-warning
		messages.

		+You can use the rcutree.kthread_prio kernel boot parameter to
		+increase the scheduling priority of RCU's kthreads, which can
		+help avoid this problem. However, please note that doing this
		+can increase your system's context-switch rate and thus degrade
		+performance.
		+
		o A periodic interrupt whose handler takes longer than the time
		interval between successive pairs of interrupts. This can
		prevent RCU's kthreads and softirq handlers from running.

Documentation/admin-guide/kernel-parameters.txt

+4 −0

		@@ -4047,6 +4047,10 @@
		rcutorture.verbose= [KNL]
		Enable additional printk() statements.

		+rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
		+Dump ftrace buffer after reporting RCU CPU
		+stall warning.
		+
		rcupdate.rcu_cpu_stall_suppress= [KNL]
		Suppress RCU CPU stall warning messages.
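
Note on the two documentation hunks above: both parameters are given on the kernel command line. The fragment below is a sketch of how they might be combined. The values are assumptions chosen for illustration: the priority 2 is arbitrary (the range the kernel accepts depends on the build configuration), and 1 is assumed to be the value that enables the ftrace dump.

```text
rcutree.kthread_prio=2 rcupdate.rcu_cpu_stall_ftrace_dump=1
```

As the stallwarn.txt text warns, raising the priority of RCU's kthreads can increase the context-switch rate, so a setting like this is better suited to debugging stall warnings than to being left on by default.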

MAINTAINERS

+1 −1

		@@ -9326,7 +9326,7 @@ F: drivers/misc/lkdtm/*

		LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
		M: Alan Stern <stern@rowland.harvard.edu>
		-M: Andrea Parri <andrea.parri@amarulasolutions.com>
		+M: Andrea Parri <parri.andrea@gmail.com>
		M: Will Deacon <will@kernel.org>
		M: Peter Zijlstra <peterz@infradead.org>
		M: Boqun Feng <boqun.feng@gmail.com>

arch/arm/kernel/smp.c

+2 −4

		@@ -264,15 +264,13 @@ int __cpu_disable(void)
		return 0;
		}

		-static DECLARE_COMPLETION(cpu_died);
		-
		/*
		* called on the thread which is asking for a CPU to be shutdown -
		* waits until shutdown has completed, or it is timed out.
		*/
		void __cpu_die(unsigned int cpu)
		{
		-if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
		+if (!cpu_wait_death(cpu, 5)) {
		pr_err("CPU%u: cpu didn't die\n", cpu);
		return;
		}
		@@ -319,7 +317,7 @@ void arch_cpu_idle_dead(void)
		* this returns, power and/or clocks can be removed at any point
		* from this CPU and its cache by platform_cpu_kill().
		*/
		-complete(&cpu_died);
		+(void)cpu_report_death();

		/*
		* Ensure that the cache lines associated with that completion are