16-Aug-2016: Analysis of race condition warnings in the Linux kernel


Pavel Andrianov has completed the Google Summer of Code 2016 project "Analysis of race condition warnings in the Linux kernel" for The Linux Foundation.

Race conditions are a kind of bug that is hard to detect, because they may manifest themselves only on rare schedules, and hard to fix, because they often require rethinking and carefully selecting a synchronization mechanism.

The LDV Tools static verification framework analyzes Linux kernel modules and detects both incorrect uses of the API between modules and the kernel core and data races, which occur when two or more threads can access the same shared data simultaneously without proper synchronization. Such races are often symptoms of bugs in the kernel, but they are not always bugs themselves.
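The kind of data race described here can be illustrated with a small sketch. It is a user-space analogy built on POSIX threads, not kernel code and not output of LDV Tools; the names `counter`, `racy_worker`, `locked_worker` and `run_two_threads` are hypothetical and chosen only for this example. Two threads increment the same shared variable: without a lock the increments may interleave and be lost, while with a mutex every access is serialized.

```c
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Shared data touched by both threads (hypothetical example state). */
long counter;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Racy worker: the read-modify-write of counter is not synchronized,
 * so two threads can interleave and lose updates. */
void *racy_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        counter++;                      /* data race on counter */
    return NULL;
}

/* Fixed worker: every access to counter happens under counter_lock. */
void *locked_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Resets counter, runs the given worker in two threads, waits for both,
 * and returns the final value of counter. */
long run_two_threads(void *(*worker)(void *))
{
    pthread_t a, b;

    counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Calling `run_two_threads(locked_worker)` always yields `2 * ITERATIONS`, whereas `run_two_threads(racy_worker)` may return a smaller value on some schedules and the expected value on others, which is why such bugs are hard to catch by testing alone. Kernel code would typically use its own primitives, such as spinlocks, rather than a pthread mutex.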

As a result, LDV Tools reports a number of warnings for a kernel module. Not all warnings are bugs, because LDV Tools is imprecise both in its analysis of the module itself and in its assumptions about the module's environment. Pavel's task was to analyze the warnings for the Linux kernel and to report the bugs found to kernel developers.

First, Pavel launched LDV Tools on the subdirectory drivers/net/wireless. He analyzed about 100 warnings and found about twenty suspicious cases. After careful investigation, he reported two real race conditions, and both were fixed.

Second, he launched the tool on all drivers in Linux kernel 4.5. The tool reported potential races in about 400 kernel modules. During GSoC 2016, Pavel managed to analyze about 100 modules and found that many warnings stem from inaccuracies in the assumptions about the module environment, in the handling of synchronization primitives, in the analysis of shared data, in path conditions, and so on. He classified about 20 warnings as real data races, but not all were reported, because some of them were detected in old unsupported modules. The full list of reported bugs is presented below:

7 applied patches:

  1. 30462b5 ("rtlwifi: Remove unused parameter from rtl_ps_set_rf_state()")
  2. 204e2ab ("rtlwifi: rtl8188ee: Fix potential race condition")
  3. c3ae8ec4 ("rtlwifi: rtl8192ee: Fix potential race condition")
  4. 31c2e76 ("rtlwifi: rtl8723be: Fix potential race condition")
  5. 4f29b34 ("rtlwifi: rtl8723ae: Fix potential race condition")
  6. 300c32c ("rtlwifi: rtl8821ae: Fix potential race condition")
  7. f52b041 ("libertas: Add spinlock to avoid race condition")

Reported cases (Linux Kernel Mailing List):

  1. 2016/6/14/488 ("wcn36xx: potential race condition")
  2. 2016/7/1/492 ("A potential race")
  3. 2016/8/2/158 ("wl3501_cs: Add spinlock to wl3501_reset")
  4. 2016/8/8/84 ("A potential race in drivers/atm/eni.ko")
  5. 2016/8/12/113 ("A potential data race in drivers/net/ethernet/smsc/smc91c92_cs.ko")
  6. 2016/8/12/143 ("A potential race in drivers/scsi/megaraid.ko")
  7. 2016/8/12/217 ("A potential data race in drivers/scsi/mvumi.ko")
  8. 2016/8/15/246 ("Potential data race in drivers/net/ethernet/sis/sis190.ko")
  9. 2016/8/15/291 ("A potential data race in drivers/isdn/hardware/eicon/diva_mnt.ko")

Many thanks to Vaishali Thakkar for help with understanding kernel peculiarities and for reviewing bug reports and patches.