Commit af96134d authored by Linus Torvalds
Pull RCU updates from Paul McKenney:
 "Documentation updates

  Miscellaneous fixes, perhaps most notably:

   - Remove RCU_NONIDLE(). The new visibility of most of the idle loop
     to RCU has obsoleted this API.

   - Make the RCU_SOFTIRQ callback-invocation time limit also apply to
     the rcuc kthreads that invoke callbacks for CONFIG_PREEMPT_RT.

   - Add a jiffies-based callback-invocation time limit to handle
     long-running callbacks. (The local_clock() function is only invoked
     once per 32 callbacks due to its high overhead.)

   - Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs, which
     fixes a bug that can occur on systems with non-contiguous CPU
     numbering.

  kvfree_rcu updates:

   - Eliminate the single-argument variant of k[v]free_rcu() now that
     all uses have been converted to k[v]free_rcu_mightsleep().

   - Add WARN_ON_ONCE() checks for k[v]free_rcu*() freeing callbacks too
     soon. Yes, this is closing the barn door after the horse has
     escaped, but Murphy says that there will be more horses.

  Callback-offloading updates:

   - Fix a number of bugs involving the shrinker and lazy callbacks.

  Tasks RCU updates

  Torture-test updates"

* tag 'rcu.2023.06.22a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (32 commits)
  torture: Remove duplicated argument -enable-kvm for ppc64
  doc/rcutorture: Add description of rcutorture.stall_cpu_block
  rcu/rcuscale: Stop kfree_scale_thread thread(s) after unloading rcuscale
  rcu/rcuscale: Move rcu_scale_*() after kfree_scale_cleanup()
  rcutorture: Correct name of use_softirq module parameter
  locktorture: Add long_hold to adjust lock-hold delays
  rcu/nocb: Make shrinker iterate only over NOCB CPUs
  rcu-tasks: Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs
  rcu: Make rcu_cpu_starting() rely on interrupts being disabled
  rcu: Mark rcu_cpu_kthread() accesses to ->rcu_cpu_has_work
  rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp
  rcu: Employ jiffies-based backstop to callback time limit
  rcu: Check callback-invocation time limit for rcuc kthreads
  rcu: Remove RCU_NONIDLE()
  rcu: Add more RCU files to kernel-api.rst
  rcu-tasks: Clarify the cblist_init_generic() function's pr_info() output
  rcu-tasks: Avoid pr_info() with spin lock in cblist_init_generic()
  rcu/nocb: Recheck lazy callbacks under the ->nocb_lock from shrinker
  rcu/nocb: Fix shrinker race against callback enqueuer
  rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading
  ...
parents 1ef6663a 2e31da75
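
The pull message above mentions two cooperating time checks for callback invocation: local_clock() is consulted only once per 32 callbacks, and a jiffies-based limit catches long-running callbacks in between. The userspace C sketch below illustrates one way such a scheme can fit together. It is not the kernel's code: the simulated clock, the 1 ms tick, and every name in it (precise_clock(), coarse_ticks(), ticks_after(), time_limit_hit(), run_batch()) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_TICK 1000000ULL	/* assumption: 1 ms per coarse tick */

static uint64_t fake_ns;		/* simulated time in nanoseconds */
static unsigned long precise_reads;	/* how often the costly clock was read */

/* Stand-in for a precise but expensive clock such as local_clock(). */
static uint64_t precise_clock(void)
{
	precise_reads++;
	return fake_ns;
}

/* Stand-in for a cheap, coarse counter such as jiffies. */
static unsigned long coarse_ticks(void)
{
	return (unsigned long)(fake_ns / NS_PER_TICK);
}

/* Wraparound-tolerant "a is later than b" for tick counters. */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Return true if the batch has used up its time budget.  The precise
 * clock is read only on every 32nd callback, or earlier if the coarse
 * counter says that the budget has certainly been exceeded.
 */
static bool time_limit_hit(long count, uint64_t ns_limit,
			   unsigned long tick_limit)
{
	assert(count > 0);
	if ((count & 31) != 0 && !ticks_after(coarse_ticks(), tick_limit))
		return false;
	return precise_clock() >= ns_limit;
}

/* Simulate a batch whose callbacks each take cb_ns; return how many ran. */
static long run_batch(long ncbs, uint64_t cb_ns, uint64_t budget_ns)
{
	uint64_t ns_limit = fake_ns + budget_ns;
	unsigned long tick_limit = coarse_ticks() +
				   (unsigned long)(budget_ns / NS_PER_TICK) + 1;
	long count = 0;

	while (count < ncbs) {
		fake_ns += cb_ns;	/* "invoke" one callback */
		count++;
		if (time_limit_hit(count, ns_limit, tick_limit))
			break;
	}
	return count;
}
```

In this sketch, a batch of 1000 callbacks of 1 µs each under a 3 ms budget reads the precise clock only on every 32nd callback, whereas callbacks of 2 ms each trip the coarse backstop and end the batch after three callbacks instead of 32.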
+1 −35
@@ -2071,41 +2071,7 @@ call.

Because RCU avoids interrupting idle CPUs, it is illegal to execute an
RCU read-side critical section on an idle CPU. (Kernels built with
``CONFIG_PROVE_RCU=y`` will splat if you try it.) The RCU_NONIDLE()
macro and ``_rcuidle`` event tracing is provided to work around this
restriction. In addition, rcu_is_watching() may be used to test
whether or not it is currently legal to run RCU read-side critical
sections on this CPU. I learned of the need for diagnostics on the one
hand and RCU_NONIDLE() on the other while inspecting idle-loop code.
Steven Rostedt supplied ``_rcuidle`` event tracing, which is used quite
heavily in the idle loop. However, there are some restrictions on the
code placed within RCU_NONIDLE():

#. Blocking is prohibited. In practice, this is not a serious
   restriction given that idle tasks are prohibited from blocking to
   begin with.
#. Although nesting RCU_NONIDLE() is permitted, they cannot nest
   indefinitely deeply. However, given that they can be nested on the
   order of a million deep, even on 32-bit systems, this should not be a
   serious restriction. This nesting limit would probably be reached
   long after the compiler OOMed or the stack overflowed.
#. Any code path that enters RCU_NONIDLE() must sequence out of that
   same RCU_NONIDLE(). For example, the following is grossly
   illegal:

      ::

	  1     RCU_NONIDLE({
	  2       do_something();
	  3       goto bad_idea;  /* BUG!!! */
	  4       do_something_else();});
	  5   bad_idea:


   It is just as illegal to transfer control into the middle of
   RCU_NONIDLE()'s argument. Yes, in theory, you could transfer in
   as long as you also transferred out, but in practice you could also
   expect to get sharply worded review comments.
``CONFIG_PROVE_RCU=y`` will splat if you try it.)

It is similarly socially unacceptable to interrupt an ``nohz_full`` CPU
running in userspace. RCU must therefore track ``nohz_full`` userspace
+0 −1
@@ -1117,7 +1117,6 @@ All: lockdep-checked RCU utility APIs::

	RCU_LOCKDEP_WARN
	rcu_sleep_check
	RCU_NONIDLE

All: Unchecked RCU-protected pointer access::

+78 −62
Original line number Diff line number Diff line
@@ -4731,43 +4731,6 @@
			the propagation of recent CPU-hotplug changes up
			the rcu_node combining tree.

	rcutree.use_softirq=	[KNL]
			If set to zero, move all RCU_SOFTIRQ processing to
			per-CPU rcuc kthreads.  Defaults to a non-zero
			value, meaning that RCU_SOFTIRQ is used by default.
			Specify rcutree.use_softirq=0 to use rcuc kthreads.

			But note that CONFIG_PREEMPT_RT=y kernels disable
			this kernel boot parameter, forcibly setting it
			to zero.

	rcutree.rcu_fanout_exact= [KNL]
			Disable autobalancing of the rcu_node combining
			tree.  This is used by rcutorture, and might
			possibly be useful for architectures having high
			cache-to-cache transfer latencies.

	rcutree.rcu_fanout_leaf= [KNL]
			Change the number of CPUs assigned to each
			leaf rcu_node structure.  Useful for very
			large systems, which will choose the value 64,
			and for NUMA systems with large remote-access
			latencies, which will choose a value aligned
			with the appropriate hardware boundaries.

	rcutree.rcu_min_cached_objs= [KNL]
			Minimum number of objects which are cached and
			maintained per one CPU. Object size is equal
			to PAGE_SIZE. The cache allows to reduce the
			pressure to page allocator, also it makes the
			whole algorithm to behave better in low memory
			condition.

	rcutree.rcu_delay_page_cache_fill_msec= [KNL]
			Set the page-cache refill delay (in milliseconds)
			in response to low-memory conditions.  The range
			of permitted values is in the range 0:100000.

	rcutree.jiffies_till_first_fqs= [KNL]
			Set delay from grace-period initialization to
			first attempt to force quiescent states.
@@ -4806,21 +4769,6 @@
			When RCU_NOCB_CPU is set, also adjust the
			priority of NOCB callback kthreads.

	rcutree.rcu_divisor= [KNL]
			Set the shift-right count to use to compute
			the callback-invocation batch limit bl from
			the number of callbacks queued on this CPU.
			The result will be bounded below by the value of
			the rcutree.blimit kernel parameter.  Every bl
			callbacks, the softirq handler will exit in
			order to allow the CPU to do other work.

			Please note that this callback-invocation batch
			limit applies only to non-offloaded callback
			invocation.  Offloaded callbacks are instead
			invoked in the context of an rcuoc kthread, which
			scheduler will preempt as it does any other task.

	rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
			On callback-offloaded (rcu_nocbs) CPUs,
			RCU reduces the lock contention that would
@@ -4834,14 +4782,6 @@
			the ->nocb_bypass queue.  The definition of "too
			many" is supplied by this kernel boot parameter.

	rcutree.rcu_nocb_gp_stride= [KNL]
			Set the number of NOCB callback kthreads in
			each group, which defaults to the square root
			of the number of CPUs.	Larger numbers reduce
			the wakeup overhead on the global grace-period
			kthread, but increases that same overhead on
			each group's NOCB grace-period kthread.

	rcutree.qhimark= [KNL]
			Set threshold of queued RCU callbacks beyond which
			batch limiting is disabled.
@@ -4859,6 +4799,56 @@
			on rcutree.qhimark at boot time and to zero to
			disable more aggressive help enlistment.

	rcutree.rcu_delay_page_cache_fill_msec= [KNL]
			Set the page-cache refill delay (in milliseconds)
			in response to low-memory conditions.  The range
			of permitted values is in the range 0:100000.

	rcutree.rcu_divisor= [KNL]
			Set the shift-right count to use to compute
			the callback-invocation batch limit bl from
			the number of callbacks queued on this CPU.
			The result will be bounded below by the value of
			the rcutree.blimit kernel parameter.  Every bl
			callbacks, the softirq handler will exit in
			order to allow the CPU to do other work.

			Please note that this callback-invocation batch
			limit applies only to non-offloaded callback
			invocation.  Offloaded callbacks are instead
			invoked in the context of an rcuoc kthread, which
			scheduler will preempt as it does any other task.

	rcutree.rcu_fanout_exact= [KNL]
			Disable autobalancing of the rcu_node combining
			tree.  This is used by rcutorture, and might
			possibly be useful for architectures having high
			cache-to-cache transfer latencies.

	rcutree.rcu_fanout_leaf= [KNL]
			Change the number of CPUs assigned to each
			leaf rcu_node structure.  Useful for very
			large systems, which will choose the value 64,
			and for NUMA systems with large remote-access
			latencies, which will choose a value aligned
			with the appropriate hardware boundaries.

	rcutree.rcu_min_cached_objs= [KNL]
			Minimum number of objects which are cached and
			maintained per one CPU. Object size is equal
			to PAGE_SIZE. The cache allows to reduce the
			pressure to page allocator, also it makes the
			whole algorithm to behave better in low memory
			condition.

	rcutree.rcu_nocb_gp_stride= [KNL]
			Set the number of NOCB callback kthreads in
			each group, which defaults to the square root
			of the number of CPUs.	Larger numbers reduce
			the wakeup overhead on the global grace-period
			kthread, but increases that same overhead on
			each group's NOCB grace-period kthread.

	rcutree.rcu_kick_kthreads= [KNL]
			Cause the grace-period kthread to get an extra
			wake_up() if it sleeps three times longer than
@@ -4866,6 +4856,13 @@
			This wake_up() will be accompanied by a
			WARN_ONCE() splat and an ftrace_dump().

	rcutree.rcu_resched_ns= [KNL]
			Limit the time spend invoking a batch of RCU
			callbacks to the specified number of nanoseconds.
			By default, this limit is checked only once
			every 32 callbacks in order to limit the pain
			inflicted by local_clock() overhead.

	rcutree.rcu_unlock_delay= [KNL]
			In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
			this specifies an rcu_read_unlock()-time delay
@@ -4880,6 +4877,16 @@
			rcu_node tree with an eye towards determining
			why a new grace period has not yet started.

	rcutree.use_softirq=	[KNL]
			If set to zero, move all RCU_SOFTIRQ processing to
			per-CPU rcuc kthreads.  Defaults to a non-zero
			value, meaning that RCU_SOFTIRQ is used by default.
			Specify rcutree.use_softirq=0 to use rcuc kthreads.

			But note that CONFIG_PREEMPT_RT=y kernels disable
			this kernel boot parameter, forcibly setting it
			to zero.

	rcuscale.gp_async= [KNL]
			Measure performance of asynchronous
			grace-period primitives such as call_rcu().
@@ -5082,8 +5089,17 @@

	rcutorture.stall_cpu_block= [KNL]
			Sleep while stalling if set.  This will result
			in warnings from preemptible RCU in addition
			to any other stall-related activity.
			in warnings from preemptible RCU in addition to
			any other stall-related activity.  Note that
			in kernels built with CONFIG_PREEMPTION=n and
			CONFIG_PREEMPT_COUNT=y, this parameter will
			cause the CPU to pass through a quiescent state.
			Given CONFIG_PREEMPTION=n, this will suppress
			RCU CPU stall warnings, but will instead result
			in scheduling-while-atomic splats.

			Use of this module parameter results in splats.


	rcutorture.stall_cpu_holdoff= [KNL]
			Time to wait (s) after boot before inducing stall.
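
The rcutree.rcu_divisor text in the hunks above describes the batch limit bl as the queued-callback count shifted right by the divisor, bounded below by rcutree.blimit. A minimal C sketch of that arithmetic follows. The helper batch_limit() and its parameter names are hypothetical, and the sketch makes no claim about how the kernel reads or clamps these values.

```c
#include <assert.h>

/* Hypothetical helper: compute the batch limit "bl" as described above. */
static long batch_limit(long queued, int divisor, long blimit)
{
	long bl;

	/* Keep the shift well defined for a signed long. */
	assert(queued >= 0 && divisor >= 0 &&
	       divisor < (int)(sizeof(long) * 8 - 1));

	bl = queued >> divisor;			/* shift right by the divisor */
	return bl < blimit ? blimit : bl;	/* bounded below by blimit */
}
```

For illustration, with a divisor of 7 and a blimit of 10, a CPU with 10000 queued callbacks would get a batch limit of 78, while a CPU with 100 queued callbacks would fall back to the lower bound of 10.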
+12 −0
@@ -412,3 +412,15 @@ Read-Copy Update (RCU)
.. kernel-doc:: include/linux/rcu_sync.h

.. kernel-doc:: kernel/rcu/sync.c

.. kernel-doc:: kernel/rcu/tasks.h

.. kernel-doc:: kernel/rcu/tree_stall.h

.. kernel-doc:: include/linux/rcupdate_trace.h

.. kernel-doc:: include/linux/rcupdate_wait.h

.. kernel-doc:: include/linux/rcuref.h

.. kernel-doc:: include/linux/rcutree.h
+1 −1
@@ -17822,7 +17822,7 @@ M: Boqun Feng <boqun.feng@gmail.com>
R:	Steven Rostedt <rostedt@goodmis.org>
R:	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
R:	Lai Jiangshan <jiangshanlai@gmail.com>
R:	Zqiang <qiang1.zhang@intel.com>
R:	Zqiang <qiang.zhang1211@gmail.com>
L:	rcu@vger.kernel.org
S:	Supported
W:	http://www.rdrop.com/users/paulmck/RCU/