Parse files in parallel when possible #21175

Open

ilevkivskyi wants to merge 3 commits into python:master from ilevkivskyi:parallel-native-parse

Conversation

@ilevkivskyi
Member

The idea is simple: the new parser doesn't need the GIL, so we can parse files in parallel. Not sure why, but the most I see is a ~4-5x speed-up with 8 threads; adding more threads doesn't make it visibly faster (I have 16 physical cores).

Some notes on implementation:

  • I use the stdlib ThreadPoolExecutor; it seems to work OK (see the sketch after this list).
  • I refactored parse_file() a bit so that we can parallelize (mostly) just the actual parsing. I see measurable degradation if I try to parallelize all of parse_file().
  • I do not use psutil because it is an optional dependency. We may want to make it a required dependency at some point.
  • It looks like there is a weird mypyc bug that causes ast_serialize to sometimes be None in some threads. I add an ugly workaround for now.
  • I only implement parallelization in the coordinator process. The counterpart for workers can be done once "Split type-checking into interface and implementation in parallel workers" #21119 is merged (it will be trivial).
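
A minimal sketch of the approach, for orientation only: parse_single_file() and parse_all() are made-up names standing in for the real refactored code, not the actual mypy API.

import os
from concurrent.futures import ThreadPoolExecutor

def parse_single_file(path: str) -> object:
    # Placeholder for the mypyc-compiled parser, which releases the GIL;
    # only because of that do threads (rather than processes) help here.
    with open(path, "rb") as f:
        return f.read()

def parse_all(paths: list[str]) -> list[object]:
    # Same conservative fallback as in the diff discussed below.
    available_threads = os.cpu_count() or 2
    with ThreadPoolExecutor(max_workers=available_threads) as pool:
        # map() preserves input order, so results line up with paths.
        return list(pool.map(parse_single_file, paths))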

cc @JukkaL


@github-actions
Contributor

github-actions bot commented Apr 6, 2026

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

# TODO: we should probably use psutil instead.
# With psutil we can get a number of physical cores, while all stdlib
# functions include virtual cores (which is not optimal for performance).
available_threads = os.cpu_count() or 2 # conservative fallback
Contributor


Yes, len(psutil.Process().cpu_affinity()) is better everywhere except Darwin/macOS, where psutil doesn't support that; though I still suggest taking the minimum of that and os.cpu_count(), as the latter respects -X cpu_count and/or PYTHON_CPU_COUNT for Python 3.13+ users (especially containerized users).

If you don't want to add a psutil dependency yet, I recommend os.sched_getaffinity(0), which is how os.process_cpu_count() is implemented on Python 3.13+. (You should also still call os.cpu_count() and use it if it is smaller, for the same reasons as above.)
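
A minimal sketch of that suggestion (suggested_thread_count is a hypothetical name; os.sched_getaffinity() is Unix-only, so a real version needs a fallback):

import os

def suggested_thread_count() -> int:
    # The affinity mask is the set of CPUs this process may actually run on
    # (e.g. when pinned in a container); cap it by os.cpu_count(), which
    # honors -X cpu_count / PYTHON_CPU_COUNT on Python 3.13+.
    return min(len(os.sched_getaffinity(0)), os.cpu_count() or 2)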

Member Author


Yeah, I tried sched_getaffinity() but it is not available on Python 3.10 (which we still support). I guess we may need to write a separate helper with various fallback logic to make this ~reliable.

Contributor

@mr-c commented Apr 6, 2026


Huh, I see os.sched_getaffinity() all the way back to Python 3.3: https://docs.python.org/3.3/library/os.html#os.sched_getaffinity

However, the docs also note: "They are only available on some Unix platforms."

So maybe your platform didn't implement it until a later Python version.

Yeah, a helper function + memoization would be very helpful here.
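
Something along these lines, perhaps (a sketch only: available_parse_threads is a hypothetical name, and the fallback order simply mirrors the discussion above):

import functools
import os

@functools.cache  # memoized: the answer doesn't change within a run
def available_parse_threads() -> int:
    limit = os.cpu_count() or 2  # conservative fallback, as in the diff
    try:
        # Unix-only; this is how os.process_cpu_count() works on 3.13+.
        return min(len(os.sched_getaffinity(0)), limit)
    except AttributeError:
        pass
    try:
        import psutil  # optional dependency
        return min(len(psutil.Process().cpu_affinity()), limit)
    except (ImportError, AttributeError):
        # psutil not installed, or cpu_affinity() unsupported (macOS).
        pass
    return limit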
