20-Aug-2021: Analysis & Fixing of Race Condition Warnings in the Linux Kernel

Saubhik Mukherjee has finished the GSoC 2021 Project "LSB: Analysis & Fixing of Race Condition Warnings in the Linux Kernel" for The Linux Foundation. The task is to catch bugs due to race conditions in Linux kernel 5.4 device drivers. A software verification framework called Klever automates the verification against a concurrency safety requirement specification. Using the Klever web interface, the generated error traces are analyzed and classified as false warnings or real bugs. In case of bugs, a bug fix patch is prepared and sent to the driver subsystem developers, or a bug report is sent. Causes of the false warnings are also determined to improve the tool.

Out of 311 unsafe traces analyzed, 30 of them were actual or benign bugs. This gives a true positive rate of about 9.6%. Many of the bugs were in obsolete or staging drivers and already rejected in previous GSoCs of this project. Out of the 281 false positives, 154 (55%) of them are related to EMG (Environment Model Generator), and 128 (45%) of them are related to problems in the verifier tool. EMG is a model of interaction between driver activities, for example, sequence of callback calls. Source code analysis is a task of the verifier tool.

Saubhik is a Computer Science graduate student at Georgia Institute of Technology, Atlanta, GA. His interest is in writing systems software and wishes to contribute further to the Linux kernel in the future.

GSoC 2021 was done part-time in a 10-week time window. Saubhik started getting involved in the project from about two weeks before the official GSoC start date. Throughout the period, Saubhik divided his time between investigating the error traces and preparing patches and framing questions to the mentors and the mailing lists.

For managing verification processes and for analysis of verification results, Klever provides a multi-user web interface. It shows both details, e.g. code coverage and error traces that contain possible values of variables and function arguments as well as all statements from program entry points to found errors. The feature of viewing the source code from the error trace is useful for quick reference. The error trace also contains helpful information regarding registration/de-registration of various callbacks or interrupt handlers, highlights uses of locks etc. Different threads are displayed in different colors alongside the complete trace: one thread for the module init & exit, another thread for a callback, and so on. Once an error trace is marked as a false positive or a bug, similar traces are automatically marked with the same tag and description. Also, experts can compare verification results obtained at different times.

Saubhik also identified some areas of improvement for the Klever tool. There was a platform driver specification problem with Environment Model Generator (EMG) which does not join power management callback thread in the remove function for platform drivers. This was fixed. Also there was some EMG related problem with Industrial I/O drivers which was also fixed in subsequent runs. This improves verification results by fixing existing specifications and developing new ones.

Saubhik expresses his gratitude to mentors Pavel Andrianov and Ilja Zakharov. The email discussions were extremely helpful which involved discussions about tricky error traces, potential fixes, patch reviews, and framing questions to ask the community.

Patches and Mailing List Discussions:

  1. [PATCH] net: appletalk: cops: Fix data race in cops_probe1 (Accepted)
  2. [PATCH] tty: serial: owl: Fix data race in owl_uart_remove (Rejected)
  3. [question] Is it possible to remove an active port (without shutdown)?
  4. [question] De-registration does not remove port
  5. [PATCH v2] char: tpm: vtpm_proxy: Fix race in init (Applied & Tested)
  6. [question] potential race between capinc_tty_init & capi_release