Commit 31da0670 authored by Paul E. McKenney

Merge branches 'consolidate.2019.08.01b', 'fixes.2019.08.12a', 'lists.2019.08.13a' and 'torture.2019.08.01b' into HEAD

consolidate.2019.08.01b: Further consolidation cleanups
fixes.2019.08.12a: Miscellaneous fixes
lists.2019.08.13a: Optional lockdep arguments for RCU list macros
torture.2019.08.01b: Torture-test updates
+72 −1
@@ -2129,6 +2129,8 @@ Some of the relevant points of interest are as follows:
<li>	<a href="#Hotplug CPU">Hotplug CPU</a>.
<li>	<a href="#Scheduler and RCU">Scheduler and RCU</a>.
<li>	<a href="#Tracing and RCU">Tracing and RCU</a>.
<li>	<a href="#Accesses to User Memory and RCU">
Accesses to User Memory and RCU</a>.
<li>	<a href="#Energy Efficiency">Energy Efficiency</a>.
<li>	<a href="#Scheduling-Clock Interrupts and RCU">
	Scheduling-Clock Interrupts and RCU</a>.
@@ -2512,7 +2514,7 @@ disabled across the entire RCU read-side critical section.
<p>
It is possible to use tracing on RCU code, but tracing itself
uses RCU.
-For this reason, <tt>rcu_dereference_raw_notrace()</tt>
+For this reason, <tt>rcu_dereference_raw_check()</tt>
is provided for use by tracing, which avoids the destructive
recursion that could otherwise ensue.
This API is also used by virtualization in some architectures,
@@ -2521,6 +2523,75 @@ cannot be used.
The tracing folks both located the requirement and provided the
needed fix, so this surprise requirement was relatively painless.

<h3><a name="Accesses to User Memory and RCU">
Accesses to User Memory and RCU</a></h3>

<p>
The kernel needs to access user-space memory, for example, to access
data referenced by system-call parameters.
The <tt>get_user()</tt> macro does this job.

<p>
However, user-space memory might well be paged out, which means
that <tt>get_user()</tt> might well page-fault and thus block while
waiting for the resulting I/O to complete.
It would be a very bad thing for the compiler to reorder
a <tt>get_user()</tt> invocation into an RCU read-side critical
section.
For example, suppose that the source code looked like this:

<blockquote>
<pre>
 1 rcu_read_lock();
 2 p = rcu_dereference(gp);
 3 v = p-&gt;value;
 4 rcu_read_unlock();
 5 get_user(user_v, user_p);
 6 do_something_with(v, user_v);
</pre>
</blockquote>

<p>
The compiler must not be permitted to transform this source code into
the following:

<blockquote>
<pre>
 1 rcu_read_lock();
 2 p = rcu_dereference(gp);
 3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
 4 v = p-&gt;value;
 5 rcu_read_unlock();
 6 do_something_with(v, user_v);
</pre>
</blockquote>

<p>
If the compiler did make this transformation in a
<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
page fault, the result would be a quiescent state in the middle
of an RCU read-side critical section.
This misplaced quiescent state could result in line&nbsp;4 being
a use-after-free access, which could be bad for your kernel's
actuarial statistics.
Similar examples can be constructed with the call to <tt>get_user()</tt>
preceding the <tt>rcu_read_lock()</tt>.

<p>
Unfortunately, <tt>get_user()</tt> doesn't have any particular
ordering properties, and in some architectures the underlying <tt>asm</tt>
isn't even marked <tt>volatile</tt>.
And even if it was marked <tt>volatile</tt>, the above access to
<tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
reason to keep those two accesses in order.

<p>
Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
at least for outermost instances of <tt>rcu_read_lock()</tt> and
<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
sections.

<h3><a name="Energy Efficiency">Energy Efficiency</a></h3>

<p>
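
The compiler-barrier requirement described in the documentation hunk above can be illustrated in user space. What follows is a minimal sketch, not kernel code and not part of this commit. Every sketch_* name is hypothetical, sketch_get_user() is a plain load standing in for get_user(), and the empty asm with a "memory" clobber relies on a GCC/Clang extension. It mirrors roughly what rcu_read_lock() and rcu_read_unlock() reduce to in a non-preemptible build, where they provide no runtime work but still forbid the compiler from moving memory accesses across them.

```c
#include <stddef.h>

/* Compiler-only barrier: an empty asm that claims to clobber memory,
 * so the compiler may not move memory accesses across it.
 * (GCC/Clang extension; assumption: one of those compilers is used.) */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for rcu_read_lock()/rcu_read_unlock(). */
static inline void sketch_rcu_read_lock(void)   { sketch_barrier(); }
static inline void sketch_rcu_read_unlock(void) { sketch_barrier(); }

struct sketch_item { int value; };

static struct sketch_item sketch_default = { .value = 42 };
static struct sketch_item *sketch_gp = &sketch_default;

/* Hypothetical stand-in for get_user(): a plain load that, in the
 * kernel, could page-fault and block. */
static int sketch_get_user(int *dst, const int *user_src)
{
	*dst = *user_src;
	return 0;
}

static int sketch_reader(const int *user_p)
{
	struct sketch_item *p;
	int v, user_v;

	sketch_rcu_read_lock();
	p = sketch_gp;            /* stands in for rcu_dereference(gp) */
	v = p->value;
	sketch_rcu_read_unlock(); /* the load below cannot be hoisted above this */
	sketch_get_user(&user_v, user_p);
	return v + user_v;
}
```

Without the two barriers, nothing in this code would stop the compiler from performing the load in sketch_get_user() before the read of p->value, which is the transformation the documentation text rules out.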
+6 −0
@@ -57,6 +57,12 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
	CONFIG_PREEMPT_RCU case, you might see stall-warning
	messages.

	You can use the rcutree.kthread_prio kernel boot parameter to
	increase the scheduling priority of RCU's kthreads, which can
	help avoid this problem.  However, please note that doing this
	can increase your system's context-switch rate and thus degrade
	performance.

o	A periodic interrupt whose handler takes longer than the time
	interval between successive pairs of interrupts.  This can
	prevent RCU's kthreads and softirq handlers from running.
+4 −0
@@ -4047,6 +4047,10 @@
	rcutorture.verbose= [KNL]
			Enable additional printk() statements.

	rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
			Dump ftrace buffer after reporting RCU CPU
			stall warning.

	rcupdate.rcu_cpu_stall_suppress= [KNL]
			Suppress RCU CPU stall warning messages.

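As a hedged illustration of the two boot parameters mentioned in the documentation hunks above, a kernel command line might carry a fragment such as the following. The value 50 is an arbitrary example, not a recommendation, and the rest of the command line is elided.

```
rcutree.kthread_prio=50 rcupdate.rcu_cpu_stall_ftrace_dump=1
```

The first setting raises the real-time priority of RCU's kthreads, with the context-switch cost noted above, and the second requests an ftrace buffer dump after a stall warning is reported.
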
+1 −1
@@ -9326,7 +9326,7 @@ F: drivers/misc/lkdtm/*

LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
M:	Alan Stern <stern@rowland.harvard.edu>
-M:	Andrea Parri <andrea.parri@amarulasolutions.com>
+M:	Andrea Parri <parri.andrea@gmail.com>
M:	Will Deacon <will@kernel.org>
M:	Peter Zijlstra <peterz@infradead.org>
M:	Boqun Feng <boqun.feng@gmail.com>
+2 −4
@@ -264,15 +264,13 @@ int __cpu_disable(void)
	return 0;
}

-static DECLARE_COMPLETION(cpu_died);
-
/*
 * called on the thread which is asking for a CPU to be shutdown -
 * waits until shutdown has completed, or it is timed out.
 */
void __cpu_die(unsigned int cpu)
{
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	if (!cpu_wait_death(cpu, 5)) {
		pr_err("CPU%u: cpu didn't die\n", cpu);
		return;
	}
@@ -319,7 +317,7 @@ void arch_cpu_idle_dead(void)
	 * this returns, power and/or clocks can be removed at any point
	 * from this CPU and its cache by platform_cpu_kill().
	 */
-	complete(&cpu_died);
+	(void)cpu_report_death();

	/*
	 * Ensure that the cache lines associated with that completion are
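
The hunks above replace a completion with a report/wait handshake between the dying CPU and the CPU that requested the shutdown. The following is a simplified user-space model of that kind of handshake, assuming C11 atomics and POSIX nanosleep(). It is a sketch only and does not reproduce the kernel's cpu_wait_death() or cpu_report_death(): all sketch_* names and states are hypothetical, and the timeout is in milliseconds here rather than seconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical states, loosely modeled on hotplug-style state tracking. */
enum sketch_state {
	SKETCH_ONLINE,
	SKETCH_DEAD,        /* dying side reported in time */
	SKETCH_POST_DEAD,   /* waiter acknowledged the report */
	SKETCH_BROKEN,      /* waiter timed out and gave up */
	SKETCH_DEAD_FROZEN  /* dying side reported after the waiter gave up */
};

static _Atomic int sketch_cpu_state = SKETCH_ONLINE;

/* Dying side: publish death. Returns false if the waiter already gave up. */
static bool sketch_report_death(void)
{
	int old = atomic_load(&sketch_cpu_state);
	int new;

	do {
		new = (old == SKETCH_BROKEN) ? SKETCH_DEAD_FROZEN : SKETCH_DEAD;
	} while (!atomic_compare_exchange_weak(&sketch_cpu_state, &old, new));
	return new == SKETCH_DEAD;
}

/* Waiting side: poll until death is reported or timeout_ms elapses. */
static bool sketch_wait_death(int timeout_ms)
{
	const struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000 };
	int old;

	while (atomic_load(&sketch_cpu_state) != SKETCH_DEAD && timeout_ms-- > 0)
		nanosleep(&one_ms, NULL);

	old = atomic_load(&sketch_cpu_state);
	for (;;) {
		if (old == SKETCH_DEAD) {
			atomic_store(&sketch_cpu_state, SKETCH_POST_DEAD);
			return true;
		}
		/* Timed out: record that, unless the state changed under us. */
		if (atomic_compare_exchange_weak(&sketch_cpu_state, &old,
						 SKETCH_BROKEN))
			return false;
		/* CAS failed, so 'old' holds the fresh state; re-evaluate. */
	}
}
```

Unlike a bare completion, a shared state word lets a late report notice that the waiter has already timed out, which is why sketch_report_death() returns a status that the caller may choose to ignore, as the (void) cast in the hunk above does with the kernel function.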