std::io::LineWriter: cap write_vectored newline scan to avoid quadrat…#155797
std::io::LineWriter: cap write_vectored newline scan to avoid quadrat…#155797devnexen wants to merge 2 commits into
Conversation
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
40453e0 to
e974bd8
Compare
| // that many slices per call, so newlines past the cap can't influence | ||
| // this attempt anyway. Without the cap, repeated calls from | ||
| // write_all_vectored() make this scan quadratic. | ||
| const MAX_BUFS_TO_SCAN: usize = 1024; |
There was a problem hiding this comment.
For this code to be portable it has to work on all platforms, not just unix.
We do have fn max_iov() in sys::fd::unix, but idk how windows or other OSes behave.
There was a problem hiding this comment.
Fair point — the comment was making a Linux-shaped claim about platform-agnostic code. I looked at sys::fd::*::max_iov() first, but it's a private const fn with no
Windows definition, so reaching it from io::buffered would mean adding a new sys::io::MAX_IOV across every platform module. There's also no single right value to wire
it to: there are two disjoint max_iov()s in tree (fd vs Solid socket) because the limit depends on the transport, and LineWriter is generic over W: Write so it can't
know which applies.
The cap here isn't really about the syscall anyway — it's about not doing more memchr work per call than makes sense. 1024 happens to be the value of UIO_MAXIOV /
IOV_MAX on Linux, the BSDs, Apple, Cygwin, and Solid (same hardcoded number, same reasoning). On POSIX-fallback Unixes (16) or Windows (no cap) it still does its
quadratic-guard job, and write_all_vectored's outer loop keeps things correct.
There was a problem hiding this comment.
Shouldn't this (to be correct) take the head of bufs? But I'm also struggling to see why we care about code trying to pass thousands of buffers to write_vectored, is that common? It feels like it's a pretty expensive thing in itself: it means allocating at least 16 * 1024 = 16kb of pointers, and especially for line-buffered output that seems like it wouldn't actually be helpful?
There was a problem hiding this comment.
It already does ; &bufs[..bufs.len().min(MAX_BUFS_TO_SCAN)] is the head; the .rev() is on the iterator, to find the last newline within those first 1024. Bonus: lines
passed to the inner write_vectored is also ≤1024, matching IOV_MAX.
On whether thousands of bufs is realistic: the FIXME came from #121956, which fixed the same quadratic pattern in BufWriter/Stdout/Stderr and left LineWriter flagged
because the newline scan made it harder. The cap is a no-op when bufs.len() < 1024, so the common path is unaffected
There was a problem hiding this comment.
This is taking the head for the purposes of looking for last_newline_buf_idx -- for scan -- but we then use regular bufs below. The None path of the scan specifically falls back on writing back the full buffer (write_vectored(bufs)).
I think in that case we could have a buffered newline that doesn't get flushed (gets stuck in the inner BufWriter), right?
…ic write_all_vectored.
e974bd8 to
5dcf27d
Compare

…ic write_all_vectored.