fix(compiler): Correct parsetree caching behaviour by spotandjake · Pull Request #2280 · grain-lang/grain

spotandjake · 2025-04-19T06:57:25Z

This corrects the behaviour of parsetree caching. We noticed the other day that the LSP wasn't updating, which prompted #2267. This PR fixes the root cause of that bug by validating that the source hashes match before we reuse a cached parsetree.

I think with this change it would be safe to remove the compiler reset, allowing the LSP to run a little faster. However, I don't want to do that in this PR, as I think we still need to validate the behaviour of the other things we don't reset (DependencyGraph and Env, Ctype levels, Warnings, FS_Access). Testing locally, everything was working again though.

Note: After I made this PR, I realized we use file_older in the dependencytree to check whether something is dirty. I'm wondering if it would be better to use that API instead of my source-hashing semantics, but I would like a bit of feedback before I go and do that.

ospencer · 2025-04-22T15:48:42Z

+    Option.fold(~none=None, ~some=Hashtbl.find_opt(cached_parsetrees), name)
+  ) {
+  | Some((cached_source, cached_program))
+      when cached_source == Hashtbl.hash(source) =>


Hashing the entire parsetree is super expensive. I would use file_older or some other mechanism to verify that the file hasn't changed and that the cache is valid.

I'm only hashing the source, which is just hashing a string, right? I am not against using file_older though ~~but I don't think that would work here because of compile_string caching~~.

That's not as bad; I thought it was hashing the AST.

Nope, I am essentially just using the source hash to check whether the contents behind the name have changed, in which case we should build a new parsetree rather than use the cache.
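For readers following along, here is a minimal OCaml sketch of the validation being described. It is not the code from this PR: `parsetree`, `parse`, `parse_calls`, and `parse_cached` are hypothetical stand-ins, and the name is assumed to be a plain string rather than an option. Only `cached_parsetrees` and the use of `Hashtbl.hash` on the source come from the diff above.

```ocaml
(* Hypothetical stand-in for the compiler's parsetree type. *)
type parsetree = { line_count : int }

(* Counts real parses so that cache hits can be observed. *)
let parse_calls = ref 0

(* Hypothetical stand-in for the real parser. *)
let parse (source : string) : parsetree =
  incr parse_calls;
  { line_count = List.length (String.split_on_char '\n' source) }

(* name -> (hash of the source that produced the tree, the tree) *)
let cached_parsetrees : (string, int * parsetree) Hashtbl.t =
  Hashtbl.create 16

let parse_cached ~(name : string) (source : string) : parsetree =
  let source_hash = Hashtbl.hash source in
  match Hashtbl.find_opt cached_parsetrees name with
  | Some (cached_hash, cached_program) when cached_hash = source_hash ->
    (* Same name and same source hash: reuse the cached tree. *)
    cached_program
  | _ ->
    (* Missing or stale entry: reparse and overwrite the entry. *)
    let program = parse source in
    Hashtbl.replace cached_parsetrees name (source_hash, program);
    program
```

Calling `parse_cached ~name:"main.gr"` twice with the same source should parse once; calling it again with edited source should reparse. Because only the hash is compared, two different sources whose hashes collide would be treated as identical, so this is a trade-off rather than an exact check.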

I think it would be useful to test this with a large file (maybe like 25k-50k lines) and make sure the hashing isn't really noticeable compared to the parsing. Also, we should make an issue to only keep a number of cached parsetrees, because this is basically a memory leak in its current form.
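One possible shape for "only keep a number of cached parsetrees" is a small bounded cache with first-in, first-out eviction. This is only a sketch of the idea, not a proposal for the actual follow-up issue: the `bounded_cache` type and the `create`, `find_opt`, and `add` functions are hypothetical names, and FIFO is an assumed policy chosen for brevity.

```ocaml
(* Hypothetical bounded cache: holds at most [capacity] entries and
   evicts the oldest-inserted entry first (FIFO). *)
type 'a bounded_cache = {
  capacity : int;
  table : (string, 'a) Hashtbl.t;
  order : string Queue.t; (* insertion order of the keys in [table] *)
}

let create (capacity : int) : 'a bounded_cache =
  if capacity <= 0 then invalid_arg "create: capacity must be positive";
  { capacity; table = Hashtbl.create capacity; order = Queue.create () }

let find_opt (cache : 'a bounded_cache) (name : string) : 'a option =
  Hashtbl.find_opt cache.table name

let add (cache : 'a bounded_cache) (name : string) (value : 'a) : unit =
  if Hashtbl.mem cache.table name then
    (* Existing key: update in place and keep its queue position. *)
    Hashtbl.replace cache.table name value
  else begin
    if Hashtbl.length cache.table >= cache.capacity then begin
      (* Cache is full: drop the entry that was inserted first. *)
      let oldest = Queue.pop cache.order in
      Hashtbl.remove cache.table oldest
    end;
    Hashtbl.replace cache.table name value;
    Queue.push name cache.order
  end
```

With `create 2`, adding entries for three different names leaves only the last two in the table. A least-recently-used policy would likely suit an LSP better, at the cost of more bookkeeping on every lookup.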

We do not. They're both O(n), but remember that runtime complexity just describes how the performance of the algorithm grows with more input, not the actual running time. You can have two O(n) algorithms, but one might take 20x longer than the other one. Both being O(n) just means that with more data, that one will still only take 20x longer than the other one.

Strings are easy to hash and have quick, optimized algorithms, whereas hashing a big data structure requires chasing a bunch of pointers, offset calculations, etc. ASTs are also a lot more data than the source strings.

Coming back to this, do you think it would be acceptable to test with Unix.time() and just ensure the time taken in A is less than B?

My concern with this approach is that if the times are too close, a CPU hiccup could cause our tests to flake.

Apologies if you thought I meant I wanted perf tests in the test suite. I just want benchmarks that you run on your machine, with the numbers reported back here.
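A local, throwaway benchmark of this kind could look roughly like the sketch below. It uses only the OCaml standard library; `time_it` and `synthetic_source` are hypothetical helpers, the generated text is not meant to be valid Grain, and `String.split_on_char` is merely a cheap stand-in for the real parser, so the printed numbers say nothing about the compiler itself.

```ocaml
(* Run [f] [runs] times and return the average processor time per
   run, in seconds, as reported by Sys.time. *)
let time_it ?(runs = 10) (f : unit -> 'a) : float =
  let start = Sys.time () in
  for _i = 1 to runs do
    (* opaque_identity keeps the optimizer from discarding the call. *)
    ignore (Sys.opaque_identity (f ()))
  done;
  (Sys.time () -. start) /. float_of_int runs

(* Build a synthetic source of [lines] lines, standing in for a
   large input file. *)
let synthetic_source (lines : int) : string =
  let buf = Buffer.create (lines * 16) in
  for i = 1 to lines do
    Buffer.add_string buf (Printf.sprintf "let x%d = %d\n" i i)
  done;
  Buffer.contents buf

let () =
  let source = synthetic_source 50_000 in
  let hash_time = time_it (fun () -> Hashtbl.hash source) in
  (* Stand-in for parsing: splitting into lines touches every byte. *)
  let split_time =
    time_it (fun () -> String.split_on_char '\n' source)
  in
  Printf.printf "hash: %.6fs  split: %.6fs\n" hash_time split_time
```

To get numbers that matter for this PR, the stand-in would be replaced with a call into the actual parser and the synthetic text with a real 25k-50k line Grain file.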

The results from testing this are below. It seems that it's way faster to have the cache: the top image is with the caching logic implemented and the bottom is without the caching logic enabled.

The reason I mentioned O(n) above was just as a note on scaling: if the cache is that much faster for smaller files, it should scale roughly the same for larger files.

I agree that we should open an issue about leaking parsetrees, though I think when we delete them is a complicated question. We may also want to consider skipping the hash altogether for short inputs: we could do a very quick check of the byte length of the string and, if it's under a threshold, just not cache. I'll open that issue whenever this is ready to be merged.
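The byte-length idea could be sketched as follows. This is an illustration only: `min_cacheable_bytes` and `parse_maybe_cached` are hypothetical names, the 4096-byte threshold is an arbitrary placeholder that would need to come from benchmarking, and the cache and parser are passed in as arguments so the example stays self-contained.

```ocaml
(* Hypothetical threshold in bytes; a real value would come from
   benchmarking. *)
let min_cacheable_bytes = 4096

let parse_maybe_cached
    ~(cache : (string, int * 'tree) Hashtbl.t)
    ~(parse : string -> 'tree)
    ~(name : string)
    (source : string) : 'tree =
  if String.length source < min_cacheable_bytes then
    (* Short input: String.length is O(1), so we skip both the hash
       and the cache and just parse. *)
    parse source
  else
    let source_hash = Hashtbl.hash source in
    match Hashtbl.find_opt cache name with
    | Some (cached_hash, tree) when cached_hash = source_hash -> tree
    | _ ->
      let tree = parse source in
      Hashtbl.replace cache name (source_hash, tree);
      tree
```

Short inputs never enter the table under this scheme, which also slightly reduces how quickly the cache grows.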

As just one more note, I think we should avoid using file_older where possible, to try and reduce the compiler's dependence on the file system.

Thanks for checking!

spotandjake added the compiler label Apr 19, 2025

spotandjake self-assigned this Apr 19, 2025

spotandjake requested review from alex-snezhko, marcusroberts, ospencer, peblair and phated as code owners April 19, 2025 06:57

spotandjake changed the title ~~fix(grainc): Correct cache parsetree cache behaviour~~ fix(grainc): Correct parsetree caching behaviour Apr 19, 2025

ospencer reviewed Apr 22, 2025

View reviewed changes

fix(grainc): Correct cache parsetree cache behaviour

6d70250

spotandjake force-pushed the spotandjake-parsetree-cache branch from 7cb911d to 6d70250 Compare February 5, 2026 19:43

ospencer approved these changes Feb 8, 2026

View reviewed changes

ospencer changed the title ~~fix(grainc): Correct parsetree caching behaviour~~ fix(compiler): Correct parsetree caching behaviour Feb 8, 2026

ospencer added this pull request to the merge queue Feb 8, 2026

spotandjake mentioned this pull request Feb 8, 2026

Cleanup cached parsetrees #2359

Open

Merged via the queue into grain-lang:main with commit 5f3f54d Feb 8, 2026
12 checks passed

github-actions Bot mentioned this pull request Feb 8, 2026

chore: release main #2306

Merged

ospencer merged 1 commit into grain-lang:main from spotandjake:spotandjake-parsetree-cache

spotandjake commented Apr 19, 2025 (edited)

ospencer Apr 22, 2025

spotandjake Apr 23, 2025 (edited)

ospencer Apr 23, 2025

spotandjake Apr 23, 2025

ospencer Apr 24, 2025

ospencer Apr 27, 2025

spotandjake Oct 29, 2025 (edited)

ospencer Oct 30, 2025

spotandjake Feb 5, 2026 (edited)

ospencer Feb 8, 2026

2 participants
