Skip to content

gh-145685: per-type method cache implementation#150160

Open
kumaraditya303 wants to merge 29 commits into
python:mainfrom
kumaraditya303:mrocache
Open

gh-145685: per-type method cache implementation#150160
kumaraditya303 wants to merge 29 commits into
python:mainfrom
kumaraditya303:mrocache

Conversation

@kumaraditya303

@kumaraditya303 kumaraditya303 commented May 20, 2026

Copy link
Copy Markdown
Contributor

@kumaraditya303 kumaraditya303 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 27, 2026
@bedevere-bot

Copy link
Copy Markdown

🤖 New build scheduled with the buildbot fleet by @kumaraditya303 for commit 728021c 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F150160%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 27, 2026
Comment thread Python/typecache.c
Comment thread Python/typecache.c
Comment thread Python/typecache.c Outdated
Comment thread Python/typecache.c Outdated
Comment thread Include/internal/pycore_typecache.h Outdated
Comment thread Python/typecache.c Outdated
@JelleZijlstra JelleZijlstra removed their request for review June 3, 2026 12:48
@kumaraditya303 kumaraditya303 requested a review from itamaro as a code owner June 4, 2026 11:07
@kumaraditya303 kumaraditya303 requested a review from colesbury June 4, 2026 18:05
Comment thread Include/internal/pycore_typecache.h Outdated

@colesbury colesbury left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments below (mostly found by Claude). Also:

  • Have you run single-threaded benchmarks to ensure that we haven't introduced any regressions with this?
  • Does the type_lookup benchmark scale as well as the other scaling benchmarks? In other words, should we consider this a partial or full fix for the type cache scaling issues?

Comment thread Objects/typeobject.c
Comment thread Objects/typeobject.c Outdated
Comment thread Python/typecache.c
Comment thread Include/internal/pycore_typecache.h Outdated
@read-the-docs-community

read-the-docs-community Bot commented Jun 16, 2026

Copy link
Copy Markdown

Comment thread Python/typecache.c
}
}
new_cache->version_tag = type->tp_version_tag;
cache_set(type, new_cache);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The racy cache_set here can mask that two or more threads have concurrently resized and set the new cache, possibly leaking one or more allocations.

Personally, I think resizing here should not be lock-free. I don't see any good reason why it should.

Also, taking a quick glance at the tests, I don't see any that exercise the resizing path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How? cache_resize is called under TYPE_LOCK.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, the atomic in cache_set is because there can be concurrent readers (but not other writers).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry I missed that comment. I don't suppose an assert(type_lock is held) would be easy to add?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the comment at the top "Lock-free per type method cache implementation" is misleading in light of this.

@kumaraditya303

Copy link
Copy Markdown
Contributor Author

Have you run single-threaded benchmarks to ensure that we haven't introduced any regressions with this?

Benchmark results show that it is 1% slower on gil enabled build https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20260616-3.16.0a0-4fbec93/bm-20260616-vultr-x86_64-kumaraditya303-mrocache-3.16.0a0-4fbec93-vs-base.md

I think it it is within the error of benchmark machines and not significant.

@markshannon

Copy link
Copy Markdown
Member

I'm very wary of dismissing 1% slower as "within the noise" without some other evidence that there is no slowdown for a change like this.

It also looks like you've missed several of the stats.
How do the stats look for cache hits, misses and collisions?

How does this change memory use?

@colesbury

Copy link
Copy Markdown
Contributor

Results from running locally on my machine (via Claude, GIL-enabled, clang-20, optimized, lto=thin) look good: https://gist.github.com/colesbury/1e5c1eb4ccc1e626696b568d18d8bbc4. Cache misses drop substantially, but the overall effect is small (possibly within the noise margin) because:

  • the MRO cache doesn't matter as much now that we rely heavily on specialization.
  • the MRO hit rate was already high on these benchmarks

A few more suggestions:

  • We should drop the type_cache_collisions stat because it's not longer tracked and doesn't make sense in the same way. Previously a collision in the MRO cache would overwrite the previous cache entry; now we do linear probing.
  • We should track the number of times we resize/clear cache a type's cache

@kumaraditya303

Copy link
Copy Markdown
Contributor Author

We should track the number of times we resize/clear cache a type's cache

I've added stats for resizes and invalidations and removed collisions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants