Issue
How do I safely use direct IO while forking in a multi-threaded application?
If all of the following conditions are met, the copy-on-write page copy that follows is not properly synchronized with the outstanding direct IO:
- (a) a thread within a multi-threaded process has direct IO outstanding to a buffer whose size is not a multiple of the page size and/or whose address is not page-aligned,
- (b) the process forks, and
- (c) a copy-on-write fault occurs on a page backing that buffer.
This can result in a race condition in which the newly copied page contains only part of the DMA data that it should contain, causing silent corruption. The corruption can be in memory or on disk, depending on whether the direct IO operation is a read or a write. Either type of data corruption can compromise the data integrity of the application and possibly the entire system.
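Condition (a) can be tested before an IO request is submitted. The following is a minimal sketch, not part of the original article: the helper name dio_buffer_is_page_safe is hypothetical, and it only inspects the address and length it is given.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns true when the buffer avoids condition (a),
 * i.e. it starts on a page boundary and spans a whole number of pages.
 * The pointer is never dereferenced; only its numeric value is checked.
 */
static bool dio_buffer_is_page_safe(const void *buf, size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    size_t ps;

    if (page_size <= 0)
        return false;               /* page size unknown: treat as unsafe */
    ps = (size_t)page_size;

    return len != 0 &&
           ((uintptr_t)buf % ps) == 0 &&   /* page-aligned address */
           (len % ps) == 0;                /* size is a multiple of the page size */
}
```

A caller could assert on this check in debug builds just before issuing a direct IO read or write.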
Environment
Red Hat Enterprise Linux
Resolution
You can safely use direct IO while forking in a multi-threaded application with a buffer for direct IO that satisfies the following conditions:
- its size is a multiple of the page size, and
- it is allocated at a page-aligned memory address
Example:
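#define _XOPEN_SOURCE 600
#include <stdlib.h>
#include <unistd.h>
...
int main(void)
{
    long page_size;
    void *buf;      // posix_memalign() expects a void **
    ...
    page_size = sysconf(_SC_PAGESIZE);
    // X is a positive integer, so the size is a multiple of the page size
    if (posix_memalign(&buf, (size_t)page_size,
                       (size_t)page_size * X) != 0) {
        // allocation failed; buf is not usable
        return 1;
    }
    ...
}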
Note: Do not share the buffer with other threads.
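When the amount of data is not already a multiple of the page size, the request can be rounded up before allocating. The following is a minimal sketch under that assumption; the helper name alloc_dio_buffer and its out_len parameter are hypothetical and not part of the original example.

```c
#define _XOPEN_SOURCE 600
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate a page-aligned buffer whose size is
 * min_len rounded up to a whole number of pages. On success the rounded
 * size is stored in *out_len and the buffer is returned; release it with
 * free(). Returns NULL on failure.
 */
static void *alloc_dio_buffer(size_t min_len, size_t *out_len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t ps, pages;

    if (page_size <= 0 || min_len == 0)
        return NULL;
    ps = (size_t)page_size;

    /* Round up to whole pages and guard against size_t overflow. */
    pages = min_len / ps + (min_len % ps != 0);
    if (pages > SIZE_MAX / ps)
        return NULL;

    /* posix_memalign() returns 0 on success, an error number otherwise. */
    if (posix_memalign(&buf, ps, pages * ps) != 0)
        return NULL;

    *out_len = pages * ps;
    return buf;
}
```

The returned buffer and the rounded length are what would be passed to the direct IO read or write. As the note above says, the buffer should stay private to the thread that performs the IO.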
Comment
There were multiple unsuccessful attempts to fix this race condition without compromising performance. Either the race was not fully resolved or the performance consequences were too severe:
1.) The copy-on-write operation was moved from the page fault handler to the fork code, but this does not fully prevent the race condition.
2.) A lock_page() call was added to the direct IO path. This does prevent the race condition, but at a huge performance cost.
In the end, Linux simply lives with this race. Therefore, all Linux users should avoid running multi-threaded applications that perform direct IO and fork operations at the same time.