This is merely a proposal ticket, and I haven't thought it through yet.
hiresperf uses kernel_write to dump ring buffer data to a file. However, when the polling rate and the logging rate are both high, kernel_write may compete significantly with the profiled applications for cache lines.
kernel_write is essentially buffered file I/O. Since we are mainly dumping data in this case, the buffering seems unnecessary.
One option we can consider is using Direct I/O.
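
To make the idea concrete, here is a rough userspace sketch of what a Direct I/O write path looks like on Linux. It is not hiresperf code and not a proposed patch: the names `dio_open`, `dio_write`, `dio_round_up` and the 4096-byte `DIO_ALIGN` are all hypothetical, and the real alignment requirement depends on the device and filesystem. The sketch only illustrates the constraints that come with `O_DIRECT` (aligned buffer, aligned length, aligned file offset) and falls back to buffered I/O where the filesystem rejects the flag. Whether the same approach carries over cleanly to the in-kernel kernel_write path is something I have not verified.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0           /* no O_DIRECT on this platform: sketch degrades to buffered I/O */
#endif

/* Assumed alignment for buffer address, length and file offset (hypothetical value). */
#define DIO_ALIGN 4096

/* Round len up to the next multiple of DIO_ALIGN. */
static size_t dio_round_up(size_t len)
{
    return (len + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;
}

/*
 * Open path for writing with O_DIRECT. If the filesystem rejects the flag
 * (EINVAL, e.g. on tmpfs), retry with a normal buffered open.
 * *is_direct tells the caller which mode was obtained.
 */
static int dio_open(const char *path, int *is_direct)
{
    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    int fd = open(path, flags | O_DIRECT, 0644);

    *is_direct = (fd >= 0 && O_DIRECT != 0);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, flags, 0644);
    return fd;
}

/*
 * Copy len bytes into an aligned, zero-padded bounce buffer and write the
 * whole padded buffer. Returns the number of bytes written (including
 * padding) or -1 on error. As long as every write is a multiple of
 * DIO_ALIGN, the file offset stays aligned for the next call.
 */
static ssize_t dio_write(int fd, const void *data, size_t len)
{
    size_t padded = dio_round_up(len);
    void *buf = NULL;
    ssize_t n;

    assert(data != NULL || len == 0);
    if (padded == 0)
        return 0;
    if (posix_memalign(&buf, DIO_ALIGN, padded) != 0)
        return -1;

    memset(buf, 0, padded);
    memcpy(buf, data, len);
    n = write(fd, buf, padded);   /* may be a short write; caller should check */
    free(buf);
    return n;
}
```

A caller would do `fd = dio_open(path, &direct)` once and then call `dio_write(fd, chunk, chunk_len)` per ring buffer chunk. Note the trade-off this exposes: the padding changes the on-disk layout, so the log reader would need to know the chunk framing, and the bounce-buffer copy here is itself extra memory traffic that a real implementation would want to avoid by keeping the ring buffer aligned in the first place.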