Disable inlining of custom `io::Error` destructor by kotauskas · Pull Request #149146 · rust-lang/rust

kotauskas · 2025-11-20T12:55:38Z

Inlining the destructor of ErrorData (which currently gets inlined through the destructor of the packed Repr) is in no way helpful in real programs, as the source of the error will not be inlined, so there will not be any match assumptions to gain. The cost, meanwhile, is a code size increase by a factor of up to 5.4 in the case of dropping multiple io::Results in the same function. Accordingly, this disables the inlining to avoid unhelpful code bloat in opt-level = 3 programs.

The destructor of ErrorData on 32-bit platforms might be suffering from the same problem, but fixing that would require some sort of annotation that puts #[inline(never)] on the compiler-generated part of the destructor.
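
To make the scenario concrete, here is a minimal sketch of the kind of function the description refers to: one that produces and discards several io::Result values, so that each discard is a separate drop site for a potential io::Error. The helper names `parse_even` and `count_even` are hypothetical; they are not taken from the PR or from its Godbolt link.

```rust
use std::io;

/// Hypothetical fallible helper: succeeds for even inputs and otherwise
/// returns a custom (heap-allocated) `io::Error`.
pub fn parse_even(n: u32) -> io::Result<u32> {
    if n % 2 == 0 {
        Ok(n)
    } else {
        Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("{n} is odd"),
        ))
    }
}

/// Produces three `io::Result`s and drops each one right after inspecting it.
/// Each statement below is its own drop site for a possible `io::Error`.
pub fn count_even(a: u32, b: u32, c: u32) -> u32 {
    let first = parse_even(a).is_ok();
    let second = parse_even(b).is_ok();
    let third = parse_even(c).is_ok();
    u32::from(first) + u32::from(second) + u32::from(third)
}
```

If the destructor is inlined at every drop site, a function shaped like `count_even` carries one copy of the destructor body per statement, which is the kind of duplication the description is concerned with.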

rustbot · 2025-11-20T12:55:43Z

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

joboet · 2025-11-20T15:21:49Z

I don't think preventing inlining of the whole function is a good idea. For cases where the compiler knows the error variant, inlining allows removing the entire destructor. I think it'd be better to prevent inlining the destructor of the custom variant, which is the only one that actually needs destruction.

kotauskas · 2025-11-20T19:16:23Z

> For cases where the compiler knows the error variant, inlining allows removing the entire destructor.

Those are highly unlikely to occur on hot paths in real programs: the construction point of the error would have to get inlined into the destruction point. It's most certainly not worth inlining the branches (which still worsen the code size, albeit not as much as inlining the destructor) into every io::Error destruction point in the hope that a handful of them will be slightly faster in the particular case of someone having replaced a bunch of functions that return io::Result with stubs that always return Err(io::Error::from(io::ErrorKind::Unsupported)). The fact that this only happens when stubbing on #[cfg] makes it especially unlikely that the destructor elision made possible by inlining the branches would yield meaningful improvements.
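
The stubbing scenario mentioned in this comment might look like the following sketch. `read_sensor` and `sensor_or_default` are hypothetical names; in real code the stub would typically sit behind a #[cfg] attribute for platforms that lack the feature. Because the stub always returns a simple (non-custom) error, a caller that inlines it can see that there is nothing to free, and that is the situation in which an inlined variant check lets the destructor be removed.

```rust
use std::io;

/// Hypothetical stub for a platform without the feature: always fails with
/// a simple error kind that owns no heap allocation.
#[inline]
pub fn read_sensor() -> io::Result<u32> {
    Err(io::Error::from(io::ErrorKind::Unsupported))
}

/// Discards the error immediately. If `read_sensor` is inlined here, the
/// optimizer can know that the error is not the custom variant.
pub fn sensor_or_default() -> u32 {
    read_sensor().unwrap_or(0)
}
```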

tgross35 · 2025-11-21T01:53:38Z

We can at least
@bors2 try @rust-timer queue

Is removing #[inline] alone, without forbidding inlining, enough to do anything here?

rust-bors · 2025-11-21T04:10:31Z

☀️ Try build successful (CI)
Build commit: 07c4fb2 (07c4fb26dcf5a66012bc0079769c55cae7a6a00a, parent: 5f7653df82f7076960d5760830554c98f4cab215)

rust-timer · 2025-11-21T05:30:16Z

Finished benchmarking commit (07c4fb2): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.7%	[-1.6%, -0.3%]	8
Improvements ✅ (secondary)	-6.2%	[-18.3%, -0.1%]	13
All ❌✅ (primary)	-0.7%	[-1.6%, -0.3%]	8

Max RSS (memory usage)

Results (primary 1.3%, secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.5%	[3.5%, 3.5%]	1
Regressions ❌ (secondary)	1.8%	[1.6%, 2.0%]	2
Improvements ✅ (primary)	-1.0%	[-1.0%, -1.0%]	1
Improvements ✅ (secondary)	-3.4%	[-5.0%, -1.4%]	13
All ❌✅ (primary)	1.3%	[-1.0%, 3.5%]	2

Cycles

Results (primary 0.2%, secondary -2.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.5%	[2.5%, 2.5%]	1
Regressions ❌ (secondary)	4.1%	[1.6%, 7.2%]	8
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-7.7%	[-15.7%, -2.0%]	10
All ❌✅ (primary)	0.2%	[-2.2%, 2.5%]	2

Binary size

Results (primary -0.3%, secondary -1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.1%, 0.7%]	6
Regressions ❌ (secondary)	6.9%	[6.9%, 6.9%]	1
Improvements ✅ (primary)	-0.4%	[-1.0%, -0.1%]	41
Improvements ✅ (secondary)	-1.2%	[-13.2%, -0.0%]	94
All ❌✅ (primary)	-0.3%	[-1.0%, 0.7%]	47

Bootstrap: 472.873s -> 472.231s (-0.14%)
Artifact size: 388.93 MiB -> 388.44 MiB (-0.12%)

scottmcm · 2025-11-21T07:12:40Z

+    // meanwhile, is a code size increase by a factor of up to 5.4 in the case
+    // of dropping multiple io::Results in the same function
+    // (https://godbolt.org/z/8hfGchjsT).
+    #[inline(never)]


Pondering: why is this the correct one to inline(never)? What if we made decode_repr be inline(never) and allowed inlining the destructor as that turns into just a call to decode_repr?

Said otherwise, if this drop is a problem, wouldn't data and data_mut and such also not want this inlining?

decode_repr is a generic function in which C can be one of &Custom, &mut Custom, or Box<Custom>. In the case of the references, which are instantiated by data and data_mut, no types with destructors are involved and there are no destructors to shuffle around. In the Box<Custom> case, which is what's called by the destructor of Repr, decode_repr returns ownership of that which needs its destructor called and never calls any destructors itself, meaning that putting #[inline(never)] on it and removing it from Repr::drop would just make the destructor get inlined again (except the code bloat would be even worse because the branching would get duplicated).
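
The distinction between the three instantiations can be checked with a simplified model. The `ErrorData` and `Custom` types below are stand-ins written for this illustration, not the standard library's definitions, and `drop_glue_report` is a hypothetical helper. Note that `std::mem::needs_drop` is documented as a conservative hint, so the `false` answers reflect what current compilers report rather than a guarantee.

```rust
use std::mem::needs_drop;

/// Stand-in for the payload of a custom error (hypothetical field).
pub struct Custom {
    pub message: String,
}

/// Simplified model of the decoded error representation, generic over how
/// the custom payload is held.
pub enum ErrorData<C> {
    Os(i32),
    Simple(u8),
    Custom(C),
}

/// Reports whether each instantiation has a destructor to run:
/// shared reference, mutable reference, owning box (in that order).
pub fn drop_glue_report() -> (bool, bool, bool) {
    (
        needs_drop::<ErrorData<&'static Custom>>(),
        needs_drop::<ErrorData<&'static mut Custom>>(),
        needs_drop::<ErrorData<Box<Custom>>>(),
    )
}
```

Only the Box<Custom> instantiation is expected to report a destructor, which matches the point that the reference-returning accessors have nothing to outline.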

The ideal thing would be to ensure ErrorData's drop is outlined but only when it has a nontrivial drop, right? I was thinking this could be done with something like:

    trait OutlineDrop {
        const SHOULD_OUTLINE: bool = false;
    }

    impl<T> OutlineDrop for &T {}
    impl<T> OutlineDrop for &mut T {}
    impl OutlineDrop for Box<Custom> {
        const SHOULD_OUTLINE: bool = true;
    }

    impl<C: OutlineDrop> Drop for ErrorData<C> {
        #[inline(always)]
        fn drop(&mut self) {
            #[inline(never)]
            fn drop_slow<C: OutlineDrop>(this: &mut ErrorData<C>) {
                // Insert a variant with no drop needed, then drop the custom variant
                // within this outlined function.
                let mut dst = ErrorData::Os(0);
                mem::swap(this, &mut dst);
                drop(dst);
            }

            // Only call the outlined drop
            if C::SHOULD_OUTLINE && matches!(self, ErrorData::Custom(_)) {
                drop_slow(self);
            }
        }
    }

But from a quick check it seems like the matches! check is outlined, despite the #[inline(always)], which completely defeats the purpose.

The current optimization is kind of weird for this too. Looking at your godbolt link, it seems like it decides to inline the first drop and outline the second drop? Interesting heuristics.

I do think it's worth double checking whether there is an option to get this behavior in more places the nontrivial drop may happen, as Scott mentioned. But if there doesn't seem to be a good way, I don't see any reason not to make this change given the perf results.

What is the result without any #[inline]/#[inline(never)] btw? Seems like that could give it some flexibility.

kotauskas · 2025-11-21T08:34:01Z

A note on the code size figures: the only benchmarks that regressed are rlibs (probably code that would otherwise be instantiated downstream being moved to the rlibs themselves, in actuality likely reducing the code size after linking), and the executables are wholly in the green.

tgross35 · 2026-01-03T10:15:32Z

Would you be able to follow up with the questions at #149146 (comment)?

@rustbot author

rustbot · 2026-01-03T10:15:36Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

kotauskas · 2026-01-03T23:40:26Z

@rust-timer queue

kotauskas · 2026-01-03T23:43:13Z

@tgross35 I don't actually have a proper setup for building and benchmarking standard library changes, so I'll need someone with appropriate permissions to invoke rust-timer to test how it performs with neither #[inline] nor #[inline(never)].
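
For reference, the three attribute choices under discussion can be written side by side as below. The function names are hypothetical and the bodies are deliberately trivial. In every form the attribute is only a hint to the compiler, so the actual inlining decisions can only be confirmed by inspecting generated code or by a perf run such as the one requested here.

```rust
/// `#[inline]`: suggests that inlining is desirable.
#[inline]
pub fn is_nonzero_hinted(x: u32) -> bool {
    x != 0
}

/// No attribute: the decision is left entirely to the compiler's heuristics.
pub fn is_nonzero_default(x: u32) -> bool {
    x != 0
}

/// `#[inline(never)]`: suggests that the function should stay out of line.
#[inline(never)]
pub fn is_nonzero_outlined(x: u32) -> bool {
    x != 0
}
```

All three behave identically at the language level; only code generation may differ.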

joboet · 2026-01-04T10:04:23Z

@bors try @rust-timer queue

the8472 · 2026-05-22T19:08:12Z

@bors try @rust-timer queue

rust-bors · 2026-05-22T21:18:28Z

☀️ Try build successful (CI)
Build commit: e7abc97 (e7abc97a754a3491611e1ebe2980e62d11e201b3, parent: b52edc25bfbaa955b4b83c10f998e5224c3478b2)

rust-timer · 2026-05-22T21:59:55Z

Finished benchmarking commit (e7abc97): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.8%	[0.4%, 2.4%]	6
Regressions ❌ (secondary)	3.9%	[0.8%, 9.8%]	3
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.4%]	2
Improvements ✅ (secondary)	-3.9%	[-7.7%, -1.0%]	6
All ❌✅ (primary)	0.5%	[-0.6%, 2.4%]	8

Max RSS (memory usage)

Results (primary -0.7%, secondary -2.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.1%	[4.1%, 4.1%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.1%	[-3.6%, -2.6%]	2
Improvements ✅ (secondary)	-2.2%	[-2.4%, -2.1%]	3
All ❌✅ (primary)	-0.7%	[-3.6%, 4.1%]	3

Cycles

Results (primary 2.6%, secondary -1.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.6%	[2.6%, 2.6%]	1
Regressions ❌ (secondary)	4.9%	[2.3%, 7.6%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.5%	[-8.4%, -2.9%]	7
All ❌✅ (primary)	2.6%	[2.6%, 2.6%]	1

Binary size

Results (primary 0.0%, secondary -0.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 1.2%]	54
Regressions ❌ (secondary)	0.2%	[0.0%, 1.8%]	68
Improvements ✅ (primary)	-0.4%	[-2.9%, -0.0%]	13
Improvements ✅ (secondary)	-6.0%	[-8.3%, -1.0%]	6
All ❌✅ (primary)	0.0%	[-2.9%, 1.2%]	67

Bootstrap: 513.181s -> 510.613s (-0.50%)
Artifact size: 400.52 MiB -> 400.44 MiB (-0.02%)

kotauskas · 2026-05-22T22:45:09Z

I've eyeballed the standard library from the try build and it seems like the culprit of the dismal results is that decode_repr is optimizing very poorly. Will come up with a more direct way of doing this tomorrow.
@rustbot author

rustbot · 2026-05-23T21:15:32Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

kotauskas · 2026-05-23T21:19:06Z

With me having finally set up x.py on my machine, the codegen of stage 1 std is looking good now. This should be the last iteration.

Kobzol · 2026-05-23T21:21:35Z

@bors try @rust-timer queue

rust-bors · 2026-05-23T23:31:00Z

☀️ Try build successful (CI)
Build commit: 719220c (719220c985fd5f95c8f0904e3fe52ba04b754025, parent: 54333ff079780f803f65dcee30c544050b35f544)

rust-timer · 2026-05-24T00:12:29Z

Finished benchmarking commit (719220c): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	1.5%	[0.5%, 2.6%]	2
Regressions ❌ (secondary)	1.1%	[1.1%, 1.1%]	1
Improvements ✅ (primary)	-0.5%	[-0.9%, -0.3%]	7
Improvements ✅ (secondary)	-7.6%	[-13.9%, -3.5%]	7
All ❌✅ (primary)	-0.0%	[-0.9%, 2.6%]	9

Max RSS (memory usage)

Results (primary -7.0%, secondary -1.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.9%	[1.4%, 2.3%]	3
Improvements ✅ (primary)	-7.0%	[-9.2%, -4.8%]	2
Improvements ✅ (secondary)	-3.2%	[-5.9%, -2.2%]	4
All ❌✅ (primary)	-7.0%	[-9.2%, -4.8%]	2

Cycles

Results (primary -1.2%, secondary -7.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.3%	[2.3%, 2.3%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.0%	[-3.6%, -2.3%]	2
Improvements ✅ (secondary)	-7.6%	[-15.0%, -3.6%]	6
All ❌✅ (primary)	-1.2%	[-3.6%, 2.3%]	3

Binary size

Results (primary -0.3%, secondary -4.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 1.0%]	8
Regressions ❌ (secondary)	0.6%	[0.0%, 1.0%]	4
Improvements ✅ (primary)	-0.7%	[-4.9%, -0.0%]	13
Improvements ✅ (secondary)	-6.0%	[-14.7%, -0.4%]	9
All ❌✅ (primary)	-0.3%	[-4.9%, 1.0%]	21

Bootstrap: 510.282s -> 511.986s (0.33%)
Artifact size: 400.55 MiB -> 400.30 MiB (-0.06%)

kotauskas · 2026-05-24T13:44:23Z

I've made the amazing discovery that the most recent version of the compiler is actually doing the thing this PR originally proposed:

On the left is the latest nightly, outlining the destructor as this PR originally did via #[inline(never)] on the Drop implementation, and on the right is the latest version of this PR, where I inline the error kind check and avoid the call if the error isn't custom. That's why I can't seem to beat the very first entirely green perf run: the comparison is essentially between the old version of the PR and the new one, rather than between the inlining spam seen in the Godbolt link and this PR.

Still, I can't reproduce the ripgrep size regression on my machine, and it's actually a huge improvement:

Also worth mentioning: I've eyeballed some standard library disassembly and the drop_custom call does in fact get elided when the error is known to not be custom. My conjecture that it wouldn't happen was erroneous.
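
The shape described here (an inlined check of the error kind, with an out-of-line call only for custom errors) can be modeled roughly as follows. This is a sketch under stated assumptions, not the PR's code: the `Error`, `ErrorData`, and `Custom` types are simplified stand-ins without bit-packing, the `CUSTOM_DROPS` counter exists only to make the behavior observable, and only the name `drop_custom` is borrowed from the comment above.

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destroyed custom payloads (demonstration only).
pub static CUSTOM_DROPS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the heap-allocated payload of a custom error.
pub struct Custom {
    pub message: String,
}

impl Drop for Custom {
    fn drop(&mut self) {
        CUSTOM_DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

pub enum ErrorData {
    Os(i32),
    Simple(u8),
    // `ManuallyDrop` suppresses the automatic drop glue for the box, so the
    // only code that frees it is `drop_custom` below.
    Custom(ManuallyDrop<Box<Custom>>),
}

pub struct Error(ErrorData);

impl Error {
    pub fn os(code: i32) -> Self {
        Error(ErrorData::Os(code))
    }

    pub fn custom(message: &str) -> Self {
        let payload = Box::new(Custom { message: message.to_owned() });
        Error(ErrorData::Custom(ManuallyDrop::new(payload)))
    }
}

/// Out-of-line slow path that frees the boxed payload.
///
/// # Safety
/// Must be called at most once per payload, and the payload must not be
/// used afterwards.
#[inline(never)]
unsafe fn drop_custom(payload: &mut ManuallyDrop<Box<Custom>>) {
    // SAFETY: upheld by the caller, see above.
    unsafe { ManuallyDrop::drop(payload) }
}

impl Drop for Error {
    // The cheap variant check is a candidate for inlining at each drop site;
    // the expensive part stays behind a single call.
    #[inline]
    fn drop(&mut self) {
        if let ErrorData::Custom(payload) = &mut self.0 {
            // SAFETY: `drop` runs once and the payload is not touched again.
            unsafe { drop_custom(payload) }
        }
    }
}
```

With this layout, a drop site where the variant is statically known not to be `Custom` has no reason to keep the call, which is consistent with the elision reported above.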

With all that being said, I think the PR is good to land.
@rustbot ready
@rustbot label: +perf-regression-triaged

rustbot assigned tgross35 Nov 20, 2025

rustbot added the S-waiting-on-review and T-libs labels Nov 20, 2025

rust-bors Bot added a commit that referenced this pull request Nov 21, 2025

Auto merge of #149146 - kotauskas:patch-1, r=<try>

07c4fb2

Disable inlining of packed `io::Error` destructor

rustbot added the S-waiting-on-perf label Nov 21, 2025

rustbot removed the S-waiting-on-perf label Nov 21, 2025

scottmcm reviewed Nov 21, 2025

rustbot added the S-waiting-on-author label and removed the S-waiting-on-review label Jan 3, 2026

rust-bors Bot added a commit that referenced this pull request Jan 4, 2026

Auto merge of #149146 - kotauskas:patch-1, r=<try>

2263b42

Disable inlining of packed `io::Error` destructor

rustbot added the S-waiting-on-perf label Jan 4, 2026

kotauskas force-pushed the patch-1 branch from c9de6f2 to 4298a05 May 22, 2026 17:39

rustbot added the S-waiting-on-perf label May 22, 2026

rust-bors Bot pushed a commit that referenced this pull request May 22, 2026

Auto merge of #149146 - kotauskas:patch-1, r=<try>

e7abc97

Disable inlining of custom `io::Error` destructor

rustbot removed the S-waiting-on-perf label May 22, 2026

rustbot added the S-waiting-on-author label and removed the S-waiting-on-review label May 22, 2026

Disable inlining of custom io::Error destructor

7237890

kotauskas force-pushed the patch-1 branch from 4298a05 to 7237890 May 23, 2026 21:15

rustbot added the S-waiting-on-perf label May 23, 2026

rust-bors Bot pushed a commit that referenced this pull request May 23, 2026

Auto merge of #149146 - kotauskas:patch-1, r=<try>

719220c

Disable inlining of custom `io::Error` destructor

rustbot removed the S-waiting-on-perf label May 24, 2026
