Optimize fs::write
#134730
Write then truncate instead of truncate then write.
|
Do you have numbers? How much faster are we talking? We should have a library benchmark (if we don't already have one). |
|
I need to run a proper benchmark but if the end file size is about the same then there seems to be an order of magnitude difference, which seems significant regardless:

    use std::fs::OpenOptions;
    use std::io::{Seek, Write};

    // Truncate on open, then write.
    fn write_file_truncate(file_name: &str, data: &[u8]) {
        if let Ok(mut file) = OpenOptions::new()
            .write(true)
            .create(true)
            .truncate(true)
            .open(file_name)
        {
            file.write_all(data).unwrap();
        }
    }

    // Write over the existing contents, then truncate to the written length.
    fn write_file_set_len(file_name: &str, data: &[u8]) {
        if let Ok(mut file) = OpenOptions::new()
            .write(true)
            .create(true)
            .open(file_name)
        {
            file.write_all(data).unwrap();
            let pos = file.stream_position().unwrap();
            // set_len returns a Result; don't silently drop a failure.
            file.set_len(pos).unwrap();
        }
    }

    static DATA: &str = include_str!("p&p.txt");

    fn main() {
        let now = std::time::Instant::now();
        for _ in 0..1000 {
            write_file_truncate("p&p.txt", DATA.as_bytes());
            //write_file_set_len("p&p.txt", DATA.as_bytes());
        }
        println!("{} ms", now.elapsed().as_millis());
    }

Where p&p.txt is Pride and Prejudice. The difference was 200 to 500 ms for truncate vs. 60 to 80 ms for set_len. So this allows writing Pride and Prejudice an order of magnitude faster. |
|
Benchmark on Linux:
|
|
Would marking a file as sparse make any difference here? At $work I've got significant speedups from that when incrementally writing a file on NTFS, but it was a different IO pattern than this.
That's likely due to auto_da_alloc. It's trying to be "helpful" here by adding an implicit fsync for this particular pattern. It can be disabled via mount options. |
|
I think this might change behavior around special files on Linux. E.g.

    >>> null = open("/dev/null", "w+")
    >>> null.truncate(1024)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 22] Invalid argument

Previously, |
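
For reference, roughly the same check can be written in Rust. This is a minimal sketch: `set_len_on_dev_null` is a hypothetical helper name, and the expectation that it returns an error rests on the Linux behaviour shown in the Python session above (other platforms may behave differently).

```rust
use std::fs::OpenOptions;
use std::io;

/// Try to resize a special file, as a write-then-truncate would.
/// Hypothetical helper for illustration only.
fn set_len_on_dev_null() -> io::Result<()> {
    // Open the character device for writing; no create or truncate flags.
    let file = OpenOptions::new().write(true).open("/dev/null")?;
    // On Linux this is expected to fail, since /dev/null cannot be resized.
    file.set_len(1024)
}
```

Any change that adds a `set_len` call after the write would hit this error path for such files, so it would have to decide whether to surface or swallow the error.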
|
Ah, that severely dampens my enthusiasm for this. We could work around that by ignoring
For the record, here are some benchmarks I did:
Windows ReFS
WSL ext4 |
I doubt it in this case. It's useful when you want to fill in a lot of zeros at virtually no cost but if you're actually writing data sequentially up to the file size then it doesn't really help. |
|
I'm going to close this and move discussion back to #127606. As I said, I'm no longer thinking this is a good idea. |
Doing a write then truncate instead of truncate then write is much faster on Windows (and potentially some filesystems on other systems too). A downside is that it may leave the file in an inconsistent state if File::set_len fails.
Fixes #127606
I'm nominating for libs-api because this may not honour the API of std::fs::write. Maybe t-libs can also think of a reason not to do this.
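
The write-then-truncate idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual change to std::fs::write: `write_then_truncate` is a hypothetical name, and the sketch assumes the whole buffer is written from offset 0.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Write `data` to `path`, then cut the file down to the written length.
/// Hypothetical helper; not the real std::fs::write implementation.
fn write_then_truncate(path: &Path, data: &[u8]) -> io::Result<()> {
    // Open without truncating, so existing contents stay in place for now.
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;
    // Overwrite starting at the beginning of the file.
    file.write_all(data)?;
    // Drop leftover bytes from a longer previous version. If this call
    // fails, the file holds the new data followed by stale old bytes.
    file.set_len(data.len() as u64)
}
```

The final `set_len` is where the inconsistent-state concern comes from: the new contents are already on disk by the time it can fail.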