Thursday, March 22, 2018

Linux C API in C++: File read() hangs on large file

I'm writing a disk benchmark app that is intended to run on a cluster and gather performance statistics from the nodes. This is for an assignment in a cluster computing course, so I don't have the option of using IOZone or any of the standard benchmarks (hopefully the italics will forestall the inevitable "use a benchmark tool" comments). Because I lack proficiency in C, I'm writing the app in C++, though I am of course using the Linux C system calls to bypass the C++ runtime's I/O caching behavior.

The problem is simple enough. Upon making the read() system call, the program appears to hang indefinitely. The target file is 1 GB of binary data and I'm attempting to read directly into buffers that can be 1, 10 or 100 MB in size. I'm using std::vector to implement dynamically-sized buffers and handing off &vec[0] to read(). I'm also calling open() with the O_DIRECT flag to bypass kernel caching.

The essential coding details are captured below:

#include <fcntl.h>    // open, O_RDONLY, O_DIRECT, O_LARGEFILE
#include <unistd.h>   // read
#include <string>
#include <vector>

std::string fpath{"/path/to/file"};
auto fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);

size_t buf_sz{1024*1024};          // 1 MiB buffer
std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                   // Result is a 0-filled buffer of size buf_sz

auto bytes_read = read(fd, &buffer[0], buf_sz);

My experience with the Linux API is very limited, so I'm not certain where I'm going wrong, but I have a laundry list of suspicions. I've put some time into researching the problem but haven't landed on anything that suggests a probable cause or fix. Most examples of read() hanging involve pipes or non-standard I/O devices (e.g., serial ports); plain disk I/O, not so much.

So is there something critical I'm neglecting to do? Is there some obscure caveat about calling read() on large files from C++ while bypassing the kernel's I/O caches? Is there a subtle problem with using STL containers to allocate memory that gets passed to system calls? It isn't apparent from the simplified code above, but the buffers I'm allocating are on the heap, so perhaps virtual memory becomes an issue? I don't see how, but I'm admittedly fuzzy on the memory-handling details of the Linux kernel. I apologize for the long list of questions, but I need a basic understanding of what's causing the problem before I can start asking targeted questions.

Poking through the executable with gdb shows that buffers are allocated correctly, and the file I've tested with checks out in xxd. I'm using g++ 7.3.1 (with C++11 support) to compile my code on a Fedora Server 27 VM. I suppose it's possible that the VM is poorly configured for the kinds of loads I'm putting on it, but I'd like to rule out any programmatic mistakes first. If it turns out the details are relevant I'll post specs on the VM, physical host and/or hypervisor.
