patch-2.3.30 linux/Documentation/vm/locking

diff -u --recursive --new-file v2.3.29/linux/Documentation/vm/locking linux/Documentation/vm/locking
@@ -75,7 +75,11 @@
 The vmlist lock nests with the inode i_shared_lock and the kmem cache
 c_spinlock spinlocks. This is okay, since code that holds i_shared_lock 
 never asks for memory, and the kmem code asks for pages after dropping
-c_spinlock.
+c_spinlock. The vmlist lock also nests with pagecache_lock and 
+pagemap_lru_lock spinlocks, and no code asks for memory with these locks
+held.
+
+The vmlist lock is grabbed while holding the kernel_lock spinning monitor.
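+
+As a minimal sketch of the resulting ordering (illustrative pseudocode
+only -- the vmlist_modify_lock()/vmlist_modify_unlock() wrapper names
+are assumed, and no real call site looks exactly like this):
+
+	/* ordering: kernel_lock -> vmlist lock -> i_shared_lock */
+	lock_kernel();			/* the kernel_lock monitor */
+	vmlist_modify_lock(mm);		/* the vmlist lock */
+	spin_lock(&inode->i_shared_lock);
+	/* ... no memory allocations while i_shared_lock is held ... */
+	spin_unlock(&inode->i_shared_lock);
+	vmlist_modify_unlock(mm);
+	unlock_kernel();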
 
 The vmlist lock can be a sleeping or spin lock. In either case, care
 must be taken that it is not held on entry to the driver methods, since
@@ -85,3 +89,50 @@
 which is also the spinlock that page stealers use to protect changes to
 the victim process' ptes. Thus we have a reduction in the total number
 of locks. 
+
+swap_list_lock/swap_device_lock
+-------------------------------
+The swap devices are chained in priority order from the "swap_list" header.
+The "swap_list" is used for the round-robin swaphandle allocation strategy.
+The number of free swaphandles is maintained in "nr_swap_pages". These two
+together are protected by the swap_list_lock.
+
+The swap_device_lock, which is per swap device, protects the reference 
+counts on the corresponding swaphandles, maintained in the "swap_map"
+array, and the "highest_bit" and "lowest_bit" fields.
+
+Both of these are spinlocks, and are never acquired from interrupt level.
+The locking hierarchy is swap_list_lock -> swap_device_lock.
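+
+As an illustration, a swaphandle allocation respecting this hierarchy
+looks roughly as follows (a sketch only: the swap_list_lock() and
+swap_device_lock() helper spellings are assumed, and the search over
+devices and offsets is elided):
+
+	struct swap_info_struct *p = &swap_info[type];
+
+	swap_list_lock();	/* protects swap_list, nr_swap_pages */
+	swap_device_lock(p);	/* protects p->swap_map and the
+				   highest_bit/lowest_bit fields */
+	/* ... pick a free offset, bump its swap_map count,
+	   decrement nr_swap_pages ... */
+	swap_device_unlock(p);
+	swap_list_unlock();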
+
+There is a race between swap space deletion, or an async readahead swapin
+deciding whether a swap handle is in use (ie worth reading in from disk),
+and an unmap -> swap_free making the handle unused. To close this race,
+the swap delete and readahead code grabs a temporary reference on the
+swaphandle; this keeps the handle in use, so swap_duplicate (called from
+read_swap_cache_async) does not trigger warning messages.
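+
+In outline, the readahead side does something like this (a sketch;
+error handling and the exact calling convention are elided):
+
+	/* temp reference: keeps the handle "in use", so a racing
+	   unmap -> swap_free cannot make it unused under us */
+	if (!swap_duplicate(entry))
+		return;			/* handle already gone */
+	/* ... examine the handle, maybe start the read ... */
+	swap_free(entry);		/* drop the temp reference */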
+
+Swap cache locking
+------------------
+Pages are added to the swap cache with the kernel_lock held, to make sure
+that multiple pages are not added for the same swaphandle (in which case
+all but one of them would be lost).
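+
+Schematically (pseudocode; the allocation flags and the shape of the
+surrounding code are illustrative):
+
+	/* under kernel_lock: between the lookup and the add, no other
+	   CPU can insert a different page for the same swaphandle */
+	page = lookup_swap_cache(entry);
+	if (!page) {
+		page = alloc_page(GFP_USER);
+		add_to_swap_cache(page, entry);
+	}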
+
+Pages are guaranteed not to be removed from the scache if the page is
+"shared": ie, other processes hold a reference on the page or on the
+associated swap handle. The only code that does not follow this rule is
+shrink_mmap, which deletes pages from the swap cache if no process has a
+reference on the page (multiple processes might still have references on
+the corresponding swap handle, though). lookup_swap_cache() races with
+shrink_mmap when establishing a reference on a scache page, so it must
+check whether the page it located is still in the swap cache, or whether
+shrink_mmap deleted it. (This race exists because shrink_mmap looks at
+the page reference count with pagecache_lock held, but then drops
+pagecache_lock before deleting the page from the scache.)
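+
+The recheck looks roughly like this (pseudocode; find_get_swap_page()
+is a hypothetical helper that looks the page up and grabs a reference
+on it under pagecache_lock):
+
+repeat:
+	page = find_get_swap_page(entry);
+	if (!page)
+		return NULL;
+	if (!PageSwapCache(page)) {
+		/* shrink_mmap got there first and deleted the page
+		   from the scache after our lookup -- retry */
+		page_cache_release(page);
+		goto repeat;
+	}
+	return page;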
+
+do_wp_page and do_swap_page have MP races in them while trying to figure
+out whether a page is "shared", by looking at the sum page_count +
+swap_count. To preserve this sum, the page lock _must_ be acquired before
+calling is_page_shared (else processes might switch their swap_count
+references to page_count references after the page count has been
+snapshotted).
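+
+So the callers follow this pattern (a sketch; the lock_page() and
+UnlockPage() spellings are assumed):
+
+	lock_page(page);	/* freezes page_count <-> swap_count
+				   transfers for this page */
+	if (is_page_shared(page)) {
+		/* ... shared: must copy, cannot reuse in place ... */
+	}
+	UnlockPage(page);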
+
+Swap device deletion code currently breaks all the scache assumptions,
+since it grabs neither mmap_sem nor page_table_lock.
