Description
Hi,
I've been rummaging around this repo for a bit lately, getting a bit more familiar with the code by looking at some bugs.
What I really like working with is performance though and I noticed that even though files are formatting independently of each other (in some cases), it's all run single-threaded.
In most cases that's entirely reasonable since rustfmt
is using rustc-types that are inherently not thread-safe, but in some cases it's possible to run each format file in a thread. So I tested that out.
The performance benefits are pretty staggering, of course, the measurements are not especially scientific, and it's only measured on my machine, but we're not talking about percentage points here but 3x - 10x for big projects on my machine.
Running rustfmt
on all files in tokio
takes about 400ms while running multithreaded rustfmt
takes about 50ms.
Running rustfmt
on all files an unpublished large codebase takes about 5000ms, while multithreaded is about 800ms.
Running cargo fmt --all
on tokio
is around 265ms, while running cargo fmt --all
multithreaded is around 80ms.
Running cargo fmt --all
on the same unpublished large codebase is around 1800ms, multithreaded it's 570ms.
On a smaller repo with just a 7 files, multithreaded was about 2x faster.
Downsides
Making rustfmt
run concurrently will cause problems because of the good old C
-ghost, shared globals, in this case it's stdout
and stderr
, any kind of interaction with those facilities will cause a mess.
Exemplified:
File A and File B is processed concurrently, in verbose.
There's a diff in File B, and an error in file A.
You could get some output lite:
"Using rustfmt config file rustfmt.cfg for filea.rs"
"Using rustfmt config file rustfmt.cfg for fileb.rs"
Diff at fileb.rs l 322
Error writing files {msg}
(Error in file A, but you won't know that).
This is a non-trivial problem since stdout
and stderr
while they are internally synced for a given (e)println
are not synced on the file-parsing level. And locking stdout
and/or stderr
until a file-pass finishes would nullify most of the benefits of multithreading completely.
Proposed solution
Reword formatting runs to take an output buffer and then print that chronologically based on which file is processed.
This preserves previous printing behaviour completely, but requires a pretty sweeping change where these buffers have to be passed down anywhere printing is wanted.
In practice there's only one part of the code where this becomes particularly problematic, and that's the diff-writer which uses term
, which prints colored messages to the terminal.
There's another potential issue, which is preserving ordering between writes to stdout
and stderr
.
I.E. Buffers are passed down into the application, the code prints into those instead.
Now the code first prints with an eprintln
then a println
. The calling code gets the buffers back and is responsible for flushing them to stderr
and stdout
respectively, in which order should the calling code flush them?
This isn't necessarily a huge problem. Ideally code deep down the call-stack shouldn't be printing, and refactoring that may be the best thing to do for clarity anyway, but it's something to consider.