gh-95004: specialize access to enums and fix scaling on free-threading#148184
kumaraditya303 merged 7 commits into python:main
Conversation
Fidget-Spinner
left a comment
Thanks for doing this, I have just 3 comments.
Fidget-Spinner
left a comment
Pretty close, just one question and one minor nit.
Misc/NEWS.d/next/Core_and_Builtins/2026-04-06-18-25-53.gh-issue-95004.CQeT_H.rst
colesbury
left a comment
I just tried this on a 96-core AWS machine and the scaling is not great: ~2.9x faster.
- Given that, I'm pretty ambivalent about the change.
- I don't like to add benchmarks to Tools/ftscalingbench/ftscalingbench.py that don't scale well.
On my mac I see 4.2x faster:

❯ ./python.exe Tools/ftscalingbench/ftscalingbench.py
Running benchmarks with 10 threads
object_cfunction 3.2x faster
cmodule_function 2.8x faster
object_lookup_special 4.3x faster
context_manager 4.9x faster
mult_constant 2.3x faster
generator 2.3x faster
pymethod 3.0x faster
pyfunction 2.7x faster
module_function 2.7x faster
load_string_const 4.1x faster
load_tuple_const 3.6x faster
create_pyobject 4.4x faster
create_closure 4.4x faster
create_dict 3.8x faster
create_frozendict 4.0x faster
thread_local_read 3.5x faster
method_caller 3.3x faster
instantiate_dataclass 4.9x faster
instantiate_namedtuple 4.9x faster
instantiate_typing_namedtuple 4.9x faster
super_call 4.6x faster
classmethod_call 3.8x faster
staticmethod_call 3.5x faster
deepcopy 1.8x slower
setattr_non_interned 4.5x faster
enum_attr 4.2x faster
I can remove the benchmark if you prefer.
Could it be that the Enum itself is not using deferred refcounting? It's a LOAD_GLOBAL_MODULE which increfs and then decrefs at each LOAD_ATTR.
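For context, a minimal sketch (with hypothetical names, not the actual ftscalingbench code) of the access pattern being discussed: reading an Enum member through a module-level global compiles to a LOAD_GLOBAL followed by LOAD_ATTR, which is exactly the incref/decref pair that limits scaling when the object lacks deferred refcounting.

```python
import dis
from enum import Enum

class Color(Enum):
    RED = 1

def hot_loop(n):
    # Each iteration performs LOAD_GLOBAL (Color) then LOAD_ATTR (RED, value):
    # on free-threaded builds, every LOAD_ATTR increfs/decrefs the owner
    # unless deferred refcounting is enabled for it.
    total = 0
    for _ in range(n):
        total += Color.RED.value
    return total

ops = {ins.opname for ins in dis.get_instructions(hot_loop)}
print("LOAD_GLOBAL" in ops, "LOAD_ATTR" in ops)
```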
I found the problem: it seems that this perpetually deopts at the first _GUARD_TYPE_VERSION. That then causes a re-specialization, which is obviously bottlenecked on a lot of things. So it seems the current specialization/deopt needs to be fixed. Investigating now.
(Before/after profiler screenshots not preserved.)

Seems the pre-existing specialization for METACLASS_CHECK was bugged. Diff to fix this:

```diff
diff --git a/Python/specialize.c b/Python/specialize.c
index 355b6eabdb7..bfa7b8148e4 100644
--- a/Python/specialize.c
+++ b/Python/specialize.c
@@ -1220,13 +1220,14 @@ specialize_class_load_attr(PyObject *owner, _Py_CODEUNIT *instr,
 #ifdef Py_GIL_DISABLED
     maybe_enable_deferred_ref_count(descr);
 #endif
-    write_u32(cache->type_version, tp_version);
     write_ptr(cache->descr, descr);
     if (metaclass_check) {
-        write_u32(cache->keys_version, meta_version);
+        write_u32(cache->keys_version, tp_version);
+        write_u32(cache->type_version, meta_version);
         specialize(instr, LOAD_ATTR_CLASS_WITH_METACLASS_CHECK);
     }
     else {
+        write_u32(cache->type_version, tp_version);
         specialize(instr, LOAD_ATTR_CLASS);
     }
     Py_XDECREF(descr);
```

This is bugged in 3.14 as well it seems: 5d3201f
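A toy model (plain Python, not CPython internals) of why swapping the two cache slots matters: for the METACLASS_CHECK variant the first guard compares the *metaclass's* version tag against the type_version slot, so storing tp_version there makes that guard fail on every execution, producing the perpetual deopt/re-specialize loop described above. The names below are illustrative only.

```python
def guard_passes(cache, tp_version, meta_version):
    """Sketch of the two guards run by LOAD_ATTR_CLASS_WITH_METACLASS_CHECK."""
    if cache["type_version"] != meta_version:
        # first guard (_GUARD_TYPE_VERSION) checks the metaclass's version;
        # a mismatch deopts and triggers re-specialization
        return False
    if cache["keys_version"] != tp_version:
        # second guard checks the class's own version
        return False
    return True

# Suppose the class has version 101 and its metaclass has version 7.
buggy = {"type_version": 101, "keys_version": 7}   # tp_version in the wrong slot
fixed = {"type_version": 7, "keys_version": 101}   # meta_version first, as in the diff

print(guard_passes(buggy, tp_version=101, meta_version=7))  # False: deopts every time
print(guard_passes(fixed, tp_version=101, meta_version=7))  # True: cache hit
```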
@colesbury can you please try this again? Thank you!
I see much better scaling now on AMD Ryzen Threadripper 3970X 32-Core Processor:

taskset -c 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 ./python Tools/ftscalingbench/ftscalingbench.py enum_attr -t 16 --scale 10000
Running benchmarks with 16 threads
enum_attr 15.5x faster

Thanks @Fidget-Spinner for fixing this
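For readers without the benchmark tool at hand, a minimal sketch (assumed structure, not the actual ftscalingbench code) of what an enum_attr scaling benchmark exercises: several threads all reading the same Enum member attribute, which is the contended access that the specialization fix makes scale.

```python
import threading
from enum import Enum

class Color(Enum):
    RED = 1

def worker(n, out, i):
    # Repeated Enum attribute reads: the contended operation being measured.
    total = 0
    for _ in range(n):
        total += Color.RED.value
    out[i] = total

def run(threads=4, n=100_000):
    out = [0] * threads
    ts = [threading.Thread(target=worker, args=(n, out, i)) for i in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return out

print(run(threads=2, n=1000))
```

The real tool additionally times a single-threaded baseline against the multi-threaded run to report the "Nx faster" figures quoted above.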