feat: improve integer parsing performance#156695
Draft
gilescope wants to merge 3 commits into
Draft
Conversation
Signed-off-by: Giles Cope <gilescope@gmail.com>
This comment has been minimized.
This comment has been minimized.
Signed-off-by: Giles Cope <gilescope@gmail.com>
This comment has been minimized.
This comment has been minimized.
Signed-off-by: Giles Cope <gilescope@gmail.com>
ds84182
reviewed
May 18, 2026
| /// ``` | ||
| #[unstable(feature = "int_from_ascii", issue = "134821")] | ||
| #[inline] | ||
| // `inline(always)` so the body's `radix`-dependent `ilog` bound |
There was a problem hiding this comment.
bits and is_signed_ty are both compile time constants, and there are only 34 possible radix', so there is a better solution than forcing this to be inlined.
Contributor
Author
There was a problem hiding this comment.
I tried, but the perf was worse. We could make the always inline just be for arm or we could close this and accept that at some point arm64 llvm will improve?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Const functions have improved significantly. Rather than the rough approximation of
can_not_overflowthat was introduced earlier (#95399), we can be accurate on digits that can lead to the fast path.There is also a slight specialisation for
u128to improve its parsing performance specifically as we can reduce the u128 multiplications that are done for fewer digits.(I've done additional fuzz testing on both implementations to ensure there's no regressions. And full disclosure: this PR was designed by Claude and I while I was having a bath yesterday using the
/remote-control)Benchmarks Arm64
Apple M3 Max, ./x bench library/coretests --test-args=from_str, against main. New u128/i128 and long-string benches added in this PR.
38 wins (≥5% faster), 1 regression (≥5% slower), 16 within ±5%. Best −55%, worst +13%.
The regression is structural - it's always that i32 / radix 36 combo.
Benchmarks x86
On x86_64 (AMD Ryzen 9 5950X): all 55 benches stayed within ±1.5%.
x86 already looks about as optimum as can be.
I've included the results of some additional benchmarks in here
_longthat exercise the longer path more. I could include them in the PR but one could have too many benchmarks.That said, if you prefer minimal changes in this PR and just want most of the performance wins, it seems changing
#[inline]to#[inline(always)]here will give you most of the perf wins with a one line change:I'll leave the choice up to you. I guess there must be some compile time cost to executing the const function, but there is an elegance that we have the maximum number hitting the unchecked path as possible.