
gh-122931: Allow stable API extensions to include a multiarch tuple in the filename#122917

Open
stefanor wants to merge 6 commits into
python:mainfrom
stefanor:stable-abi-multiarch
Conversation

@stefanor

@stefanor stefanor commented Aug 11, 2024

Contributor

This permits stable ABI extensions for multiple architectures to be co-installed into the same directory, without clashing with each other, the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will select the first suffix containing .abi3, as the target filename. We do this to protect older Python versions predating this patch.
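The ordering argument can be illustrated with a minimal sketch. It assumes, as the description above states, that the build tool picks the first suffix containing `.abi3`. The helper name `first_abi3_suffix` and the spelling `.abi3-x86_64-linux-gnu.so` for the new suffix are illustrative assumptions, not taken from setuptools or from this patch.

```python
def first_abi3_suffix(suffixes):
    """Return the first suffix containing '.abi3', or None if there is none."""
    return next((s for s in suffixes if ".abi3" in s), None)


# Hypothetical suffix list after this patch, in interpreter search order.
patched_suffixes = [
    ".cpython-314-x86_64-linux-gnu.so",
    ".abi3.so",
    ".abi3-x86_64-linux-gnu.so",  # assumed spelling of the new suffix
    ".so",
]

# With the plain suffix listed first, a "first match" rule keeps choosing it,
# so the built file stays importable by interpreters without the patch.
chosen = first_abi3_suffix(patched_suffixes)
print(chosen)
```

If the multiarch suffix were listed first instead, the same rule would pick a filename that older interpreters do not search for.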

@stefanor stefanor force-pushed the stable-abi-multiarch branch 2 times, most recently from 5e380a1 to 7185a58 Compare August 12, 2024 07:35
stefanor added a commit to stefanor/cpython that referenced this pull request Aug 12, 2024
…ple in the filename

This permits stable ABI extensions for multiple architectures to be
co-installed into the same directory, without clashing with each other,
the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will
select the first suffix containing .abi3, as the target filename.
We do this to protect older Python versions predating this patch.
@stefanor stefanor force-pushed the stable-abi-multiarch branch from 7185a58 to 2955027 Compare August 12, 2024 07:51
@stefanor stefanor changed the title Allow stable API extensions to include a multiarch tuple in the filename gh-122931 Allow stable API extensions to include a multiarch tuple in the filename Aug 12, 2024
@stefanor stefanor force-pushed the stable-abi-multiarch branch from 2955027 to c5b58a4 Compare August 12, 2024 09:21

@vstinner vstinner left a comment

Member

You should document this change in https://docs.python.org/dev/whatsnew/3.14.html (Doc/whatsnew/3.14.rst). I expect that you would elaborate a bit on the usage, how to use the feature, why it's needed, etc.

@stefanor stefanor requested a review from encukou as a code owner August 28, 2024 12:03
@stefanor

Contributor Author

You should document this change in https://docs.python.org/dev/whatsnew/3.14.html

Done.

FWIW, in Debian we plan to backport this to 3.13 too.

@vstinner

Member

I'm only aware of Debian who uses "multiarch". Do other operating systems also use it? Maybe Debian variants, Ubuntu, and Ubuntu variants?

This change will slow down any "import module". I don't recall if there is a cache for that or not.

@stefanor

Contributor Author

Maybe Debian variants, Ubuntu, and Ubuntu variants?

All Debian derivatives, yes. They typically don't deviate very much, when it comes to plumbing.

@stefanor

Contributor Author

This change will slow down any "import module". I don't recall if there is a cache for that or not.

What do you want to do about that? Is there a benchmark you'd like to see results for?

I see a table of 5 entries (including NULL) increasing to 6. That is one extra item to search, when:

  1. Importing C extensions from multiarch filenames.
  2. Importing C extensions with no tag of any kind (foo.so).
  3. Failing to import something because there isn't a C extension with this name.

@vstinner

vstinner commented Sep 2, 2024

Member

What do you want to do about that? Is there a benchmark you'd like to see results for?

Example of command:

strace -o trace python3 -c pass

strace output:

newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.cpython-312-x86_64-linux-gnu.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.abi3.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5884, ...}, 0) = 0

newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5884, ...}, 0) = 0
openat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__pycache__/__init__.cpython-312.pyc", O_RDONLY|O_CLOEXEC) = 3

Internally, when Python imports the encodings package, it has to do 4 fstat() calls. You can see a check on the .abi3.so suffix.

If we add a new entry to the import suffix, it will add one fstat() syscall per import (at least, to import a package). We should consider the impact on performance if we decide to add this feature.
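The probing pattern visible in the strace output can be modelled with a short sketch. This is a simplified simulation, not the actual import machinery: the function name `probes_until_found`, the "after" suffix list and the `.abi3-x86_64-linux-gnu.so` spelling are assumptions made for illustration.

```python
import os


def probes_until_found(directory, name, suffixes, existing):
    """Return the candidate paths checked, in order, stopping at the first hit."""
    checked = []
    for suffix in suffixes:
        path = os.path.join(directory, name + suffix)
        checked.append(path)
        if path in existing:
            break
    return checked


# Suffix order as seen in the trace above, followed by the source suffix.
before = [".cpython-312-x86_64-linux-gnu.so", ".abi3.so", ".so", ".py"]
# Hypothetical order with one extra stable-ABI suffix added.
after = [
    ".cpython-312-x86_64-linux-gnu.so",
    ".abi3.so",
    ".abi3-x86_64-linux-gnu.so",
    ".so",
    ".py",
]

pkg_dir = "/usr/lib64/python3.12/encodings"
existing = {os.path.join(pkg_dir, "__init__.py")}  # only the source file exists

probes_before = len(probes_until_found(pkg_dir, "__init__", before, existing))
probes_after = len(probes_until_found(pkg_dir, "__init__", after, existing))
print(probes_before, probes_after)
```

In this model a pure-Python package costs one more probe once the extra suffix is present, which is the effect described above.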

@freakboy3742

Contributor

I'm only aware of Debian who uses "multiarch". Do other operating systems also use it? Maybe Debian variants, Ubuntu, and Ubuntu variants?

FWIW: macOS, iOS and Android all use multiarch as a configuration value - it's used to differentiate ARM from x86_64 simulators, and devices from simulators/emulators.

However, on iOS and Android, there's limited need to keep binary artefacts in the same folder, as any given executable can only have a single architecture's executables. In addition, in the case of iOS, the binaries need to be migrated to the Frameworks folder and named as Frameworks, so any "side-by-side" benefits would be lost. Any "other platform" executables need to be stripped out as part of the build process, at which point there's no naming conflict.

macOS uses the multiarch config value, but the value is always "darwin", and the universal binary format exists to support multiple architectures in a single binary file. I guess it might be useful to be able to have x86_64 and ARM64 binaries side-by-side... but there's probably only a year or two left in the official supported life of x86_64, so I don't think adding this feature will ultimately be that helpful for the macOS use case.

@vstinner

vstinner commented Sep 2, 2024

Member

@stefanor: Did you consider to maintain this change as a downstream-only patch in Debian?

If you would like to make it upstream, I would suggest making it optional, disabled by default, and add a configure option to enable it. That's how I added some Fedora specific changes, such as:

@stefanor

stefanor commented Sep 3, 2024

Contributor Author

Here are some benchmarks:

Google Sheet with Data

There is no discernable performance difference in minimal interpreter startup python -c ''.

import time
import sys
from subprocess import check_call

t1 = time.perf_counter()
for i in range(1000):
    check_call([sys.executable, "-c", ""])
t2 = time.perf_counter()
print((t2-t1) / 1000)

image

Looking at strace, I see only a single extra syscall.


We can manufacture an import-intensive benchmark:

import pkgutil
import time

t1 = time.perf_counter()
for module in pkgutil.walk_packages(onerror=lambda pkg: None):
    if module.name.startswith("test."):
        continue
    if module.name.endswith(".__main__"):
        continue
    if module.name in {"antigravity", "idlelib.idle", "this", "zen"}:
        continue
    try:
        __import__(module.name)
    except Exception:
        pass
t2 = time.perf_counter()
print((t2 - t1))

image

Analysing this, I see 116 stat syscalls on .so filenames jumping to 232. But in a 100ms total execution time, that's negligible, and I can't see any statistically significant results.
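A count like the one above can be reproduced from an strace log with a small parser. This is a sketch under assumptions: it expects the plain `strace -o FILE` line format shown earlier in this thread (no PID prefix from `-f`), and the names `STAT_ON_SO` and `count_so_stats` are made up for this example.

```python
import re

# Matches stat-family calls (stat, fstat, newfstatat, statx, ...) whose
# quoted path argument ends in ".so".
STAT_ON_SO = re.compile(r'\w*stat\w*\(.*"[^"]*\.so"')


def count_so_stats(strace_lines):
    """Count stat-family syscalls that target a filename ending in .so."""
    return sum(1 for line in strace_lines if STAT_ON_SO.match(line))


# Sample lines in the format of the trace quoted earlier in this thread.
sample = [
    'newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.cpython-312-x86_64-linux-gnu.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)',
    'newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.abi3.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)',
    'newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)',
    'newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5884, ...}, 0) = 0',
    'openat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__pycache__/__init__.cpython-312.pyc", O_RDONLY|O_CLOEXEC) = 3',
]

so_stats = count_so_stats(sample)
print(so_stats)
```

To use it on a real trace, pass the lines of the log file, for example `count_so_stats(open("trace"))`.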

@stefanor

stefanor commented Sep 3, 2024

Contributor Author

I'm only aware of Debian who uses "multiarch".

Also, note that this is used on all linux platforms. The default for (non-stable ABI) extensions is to include the multiarch tuple in the extension filename. Pick a random binary wheel built in manylinux, and you'll see them.
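A quick way to check this on a given interpreter is to print its configured suffixes. The sketch below uses only `sysconfig.get_config_var` and `importlib.machinery.EXTENSION_SUFFIXES`. The helper name `describe_extension_naming` is an illustrative assumption, and `MULTIARCH` may be `None` on platforms or builds that do not define it.

```python
import importlib.machinery
import sysconfig


def describe_extension_naming():
    """Collect the values that determine extension module filenames."""
    return {
        # Multiarch tuple such as "x86_64-linux-gnu"; None if not configured.
        "multiarch": sysconfig.get_config_var("MULTIARCH"),
        # Default suffix that build tools give to non-stable-ABI extensions.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # Every suffix the import system will accept, in search order.
        "search_order": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


info = describe_extension_naming()
for key, value in info.items():
    print(f"{key}: {value}")
```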

@stefanor: Did you consider to maintain this change as a downstream-only patch in Debian?

I would be happy to do that. Although I prefer not carrying downstream-only patches long-term if possible. Look at the mess around dist-packages for example.

Without this MR, there's no real point in supporting the multiarch tags in the non-stable ABI extensions. Either we get the benefit from doing it everywhere, or you say you don't need to support the feature, and we rip it out everywhere. We've got half a feature at the moment. Would you prefer it to all be behind a config argument?

If you would like to make it upstream, I would suggest making it optional, disabled by default, and add a configure option to enable it. That's how I added some Fedora specific changes, such as:

These are not entirely Fedora-specific. And that's a good argument for always trying to upstream.

https://docs.python.org/dev/using/configure.html#cmdoption-with-platlibdir

In Debian we use those paths for our cross-compilers. I imagine there is a scenario where it's useful to have Python headers in there.

https://docs.python.org/dev/using/configure.html#cmdoption-with-wheel-pkg-dir

We're a happy customer of this too. It meant one less patch to carry.

@stefanor

stefanor commented Sep 4, 2024

Contributor Author

There is no discernable performance difference in minimal interpreter startup python -c ''.

Maybe 0.5% slowdown with PGO. For 1 extra syscall, I'm surprised it's visible. This may still not be significant.

image

@vstinner

Member

@erlend-aasland @encukou: Would you mind to have a look at this issue? What do you think? Should it become the default behavior, or should it be a configure option?

@encukou

encukou commented Sep 23, 2024

Member

IMO, the most important thing here is to keep this in mind for abi4 (or whatever we name the free-threading-compatible stable ABI). There it should be the only option.
If we get abi4 in 3.14, it's probably fine to keep this as a downstream-only patch for one more release.

@stefanor

Contributor Author

IMO, the most important thing here is to keep this in mind for abi4 (or whatever we name the free-threading-compatible stable ABI). There it should be the only option.

Yeah, that sounds sensible.

If we get abi4 in 3.14, it's probably fine to keep this as a downstream-only patch for one more release.

Is that expected? We've been on abi3 for quite a while.

@stefanor

Contributor Author

There it should be the only option.

If we want that for the stable ABI, do we want to do the same thing for the regular ABI? It would probably affect a lot of build systems that assume they can name their output extension.so

@encukou

encukou commented Sep 24, 2024

Member

Is that expected? We've been on abi3 for quite a while.

Yes, over a decade. abi3 is showing its age, and since it's incompatible with free-threading we'll need a new alternative soon.
abi3 is not going away, but changing it this late in the cycle might not be worth it.

If we want that for the stable ABI, do we want to do the same thing for the regular ABI?

I don't see much reason to deprecate and remove that, so that we don't break the build systems you mention. (Note that even abi3/abi4 extensions will work with a bare .so extension, since the stable ABI is a subset of the full one.)

@stefanor

stefanor commented Oct 7, 2024

Contributor Author

FWIW, in Debian we plan to backport this to 3.13 too.

I'm going to include it in our 3.13.0 upload.

@stefanor

stefanor commented Dec 19, 2024

Contributor Author

🔔 (I'd like to remind people to please consider this)

@stefanor

Contributor Author

It is a continuous source of friction and latent bugs. From a packaging tool author as well as package-with-c-exts author perspective, it's definitely not ideal.

Noted that you wouldn't recommend anybody else implement this interface. Being closer to it, I have a much more positive view. If you look at the problems it's solving, I really don't see any other obvious way to go about it that would cause less friction. It massively reduces the amount of duplication and churn we have to do in the distro. It should also be a useful tool for anybody who wants to ship something for multiple Python releases and/or architectures.

How would you like to update the NEWS messaging to represent your view here?

If this is all something that Python wants to deprecate, we should design a replacement for it. We just had a packaging summit at PyCon. If that was something we should have discussed, I wish I'd known that.

@stefanor stefanor force-pushed the stable-abi-multiarch branch from 0117088 to 7b071c2 Compare May 20, 2026 16:31
@rgommers

Contributor

How would you like to update the NEWS messaging to represent your view here?

I think that entry is fine, it's just a factual description of the technical change in this PR. I don't think it needs to touch on anything bigger-picture.

If this is all something that Python wants to deprecate

I certainly don't speak for Python (I'm not even a core dev), but: I don't think anyone wants to push for that. It's fine to exist.

We just had a packaging summit at PyCon. If that was something we should have discussed, I wish I'd known that.

The key thing that seems needed is integration testing for interleaved installs of CPython, so that next time any requirements get captured in time, rather than afterwards. That came up on Discourse before; I can't find the link right now, but it was recent, so I guess you remember?

@github-actions github-actions Bot removed the stale Stale PR or inactive for long period of time. label May 22, 2026
@vstinner

Member

@encukou:

@vstinner, what would it take for you to reconsider?

This change adds 2 more stat() syscalls per Python import.

Example with ./python -c 'import pip'.

  • import encodings does 7 stat() syscalls before and 9 syscalls after.
  • import pip does 7 stat() syscalls before and 9 syscalls after.

Example with ./python -m pip --version (measure using strace):

  • Before: 4,027 syscalls
  • After: 4,106 syscalls (+79)

My position didn't change. IMO it's a bad idea to add syscalls on all Unix platforms whereas only Debian, Ubuntu and variants need this change. As I wrote previously, I would prefer adding a configure option and use this option on Debian/Ubuntu.

We are working hard to reduce the Python startup time (lazy import is the latest major work on that). Adding "useless" syscalls sounds counter productive to me.

I would be perfectly fine with a configure option, so Debian/Ubuntu no longer have to maintain a patch downstream.

@stefanor

Contributor Author

@vstinner: There are a few other components to this.

  1. We have an opportunity to implement this for abi4t now, and not need the extra stat. Can we do this?
  2. We have an asymmetry between abi4 and the default non-stable ABI filenames. This is an opportunity to resolve that, with the temporary addition of an extra stat.

@brettcannon brettcannon changed the title gh-122931 Allow stable API extensions to include a multiarch tuple in the filename gh-122931: Allow stable API extensions to include a multiarch tuple in the filename May 25, 2026
@vstinner

Member

What are abi4 and abi4t?

Python only supports abi3 and abi3t: https://docs.python.org/dev/c-api/stable.html#stable-application-binary-interfaces.

@stefanor

Contributor Author

That is, of course, what I meant.

It was proposed earlier in this MR, that we defer the change for abi4, and now we have abi3t, which is logically equivalent.

@eli-schwartz

Contributor

As I mentioned the historical background for adding platform triplet information in the above linked ticket, I also happened to notice there was quite some discussion about the benefits. The background: #67169

I think that these are useful considerations that are still just as relevant today as they were in the past.

And we are already past the point of no return in committing to an additional syscall, I think (given we are settled on looking for abi3t.so), which means there isn't even a performance problem to be had in switching from one syscall to one syscall. So, what precisely is the objection to at least handling this?

@vstinner, can you confirm you have no objection? Previously you said your objection was about adding new syscalls, but when asked specifically about abi3t you took a detour into arguing about what the final name of abi3t is, and never really replied after that. The exact same thing happened in October of 2025, when you were asked about getting this right at least for the next ABI and instead of answering the question you said "right now there are no plans for an abi4": #122917 (comment)

abi4 is here. It is called abi3t. Debating which name to use is strange arguing over semantics. Everyone agrees it is a new ABI, with a new ABI soname, and it's the one after abi3.so.

What name shall we look it up using? abi3t.so or abi3t-powerpc64le-linux-gnu.so? Can this PR land if @stefanor updates it to only modify abi3t?

(@stefanor, please proactively do so anyway. I often find that people are much more resistant to "will you accept it if I XYZ" than "I did XYZ, will you accept it now". It's weird but it does help, so hopefully that will be enough here.)

@rgommers

Contributor

What name shall we look it up using? abi3t.so or abi3t-powerpc64le-linux-gnu.so?

If this is still changed, it really should be done yesterday, and it might arguably already be too late to do the "or" rather than "look for abi3t-powerpc64le-linux-gnu.so first, fall back to abi3t.so". Because build backends and build systems are already released with abi3t support, and doing the unconditional change to `abi3t-xxx.so` would need new releases of Maturin, scikit-build-core and CMake at this point (extension name is hardcoded there currently I believe).

It's unlikely any wheels were already uploaded to PyPI, but it's not impossible - those would have to be yanked or hard-deleted (with lock files, yanking may not be enough).
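The "look for the multiarch name first, fall back to abi3t.so" ordering mentioned above can be sketched as a simple ordered lookup. This is only an illustration of the ordering, not the import system's implementation: `resolve_extension`, the module name `spam` and the two-element suffix list are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_extension(directory, name, suffixes):
    """Return the first existing file for `name`, trying suffixes in order."""
    for suffix in suffixes:
        candidate = Path(directory) / (name + suffix)
        if candidate.is_file():
            return candidate
    return None


# Hypothetical preference order: multiarch-qualified name first, bare name second.
preferred = [".abi3t-powerpc64le-linux-gnu.so", ".abi3t.so"]

with tempfile.TemporaryDirectory() as tmp:
    # A build backend that hardcodes the bare name is still found via the fallback.
    (Path(tmp) / "spam.abi3t.so").touch()
    only_bare = resolve_extension(tmp, "spam", preferred).name

    # Once a multiarch-qualified file is co-installed, it takes precedence.
    (Path(tmp) / "spam.abi3t-powerpc64le-linux-gnu.so").touch()
    both = resolve_extension(tmp, "spam", preferred).name

print(only_bare)
print(both)
```

With a fallback, files produced by already-released build tools keep working. An unconditional rename corresponds to dropping the second suffix from the list.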

@vstinner

vstinner commented Jun 26, 2026

Member

What name shall we look it up using? abi3t.so or abi3t-powerpc64le-linux-gnu.so? Can this PR land if @stefanor updates it to only modify abi3t?

Python 3.15 looks for '.abi3t.so' suffix:

$ python3.15
>>> import _imp; _imp.extension_suffixes()
['.cpython-315-x86_64-linux-gnu.so', '.abi3.so', '.abi3t.so', '.so']

I'm not aware of a plan to look for '.abi3t-powerpc64le-linux-gnu.so'. Well, I mean outside this issue :-)

UPDATE: The '.abi3t.so' suffix was decided in PEP 803: https://peps.python.org/pep-0803/#filename-tag.

@vstinner

Member

I'm not sure that this pull request is the best place to gather feedback on supporting another filename tag for C extensions. Someone should open a dedicated topic on discuss.python.org.

It's unlikely any wheels were already uploaded to PyPI, but it's not impossible - those would have to be yanked or hard-deleted (with lock files, yanking may not be enough).

Changing filenames impacts many things, not just _imp.extension_suffixes(). For example, should setuptools be updated to include the multiarch tuple in the filename? Should pip be updated to accept it as well?

How to support Python 3.14 and older which don't support it? Should the two filenames be uploaded to PyPI for each release? Such questions should be answered first.

@stefanor

Contributor Author

Can this PR land if @stefanor updates it to only modify abi3t?

@eli-schwartz: Absolutely, I can do that. Commits are cheap :)

abi4 is here. It is called abi3t. Debating which name to use is strange arguing over semantics. Everyone agrees it is a new ABI, with a new ABI soname, and it's the one after abi3.so.

Yeah, that's my feeling too. I expected that the people involved in this PR discussion would help to make this happen for abi3t, from the starting gate.


If this is still changed, it really should be done yesterday

@rgommers: Agreed. But I've been up against a hard rock here.

The biggest objection to this change has been performance (of supporting both names), and that's also what we'd get from just doing this right for abi3t from the start.


For example, should setuptools be updated to include the multiarch tuple in the filename? Should pip be updated to accept it as well?

@vstinner: I did the analysis of tool support in #122931 (comment) Yes, some work needs to happen, but I'm willing to drive it.

How to support Python 3.14 and older which don't support it? Should the two filenames be uploaded to PyPI for each release? Such questions should be answered first.

I will adjust this PR, on Eli's advice, to target abi3t only. 3.14 doesn't support abi3t so that isn't a concern, this question becomes moot. We only need to support one filename.

@rgommers

Contributor

@vstinner: I did the analysis of tool support in #122931 (comment) Yes, some work needs to happen, but I'm willing to drive it.

That list still seems accurate. packaging/pip/uv shouldn't be affected, it's build backends/systems only AFAIK.

@stefanor stefanor requested a review from itamaro as a code owner June 26, 2026 12:21
@stefanor stefanor force-pushed the stable-abi-multiarch branch from f88b12a to f84ef65 Compare June 26, 2026 12:27
stefanor added 6 commits June 26, 2026 09:12
@stefanor stefanor force-pushed the stable-abi-multiarch branch from f84ef65 to 67e2548 Compare June 26, 2026 13:15
@stefanor

stefanor commented Jun 26, 2026

Contributor Author

Rebased onto main to get the benefit of #150208 for tests.

@eli-schwartz

Contributor

If this is still changed, it really should be done yesterday, and it might arguably already be too late to do the "or" rather than "look for abi3t-powerpc64le-linux-gnu.so first, fall back to abi3t.so". Because build backends and build systems are already released with abi3t support, and doing the unconditional change to `abi3t-xxx.so` would need new releases of Maturin, scikit-build-core and CMake at this point (extension name is hardcoded there currently I believe).

I'm certain that build backends will be happy to do new releases in order to align with the technical decisions of an evolving situation in unstable, unreleased cpython. This isn't a barrier.

It's unlikely any wheels were already uploaded to PyPI, but it's not impossible - those would have to be yanked or hard-deleted (with lock files, yanking may not be enough).

If they were uploaded then they're already broken by default since the ABI won't be frozen until rc1. Given cpython refuses to limit itself out of concern that wheels with wrong ABI / crashing extensions will have already been uploaded, it would be heavily inconsistent to limit itself out of concern that wheels with undetected / wrong filename extensions might have been uploaded to PyPI.

They're both just as "naughty" / unsupported, but the breakage if someone did the naughty thing is far worse for wrong ABI, and that's already a routine event in cpython development.

@vstinner

Member

doing the unconditional change to `abi3t-xxx.so` would need new releases of Maturin, scikit-build-core and CMake at this point (extension name is hardcoded there currently I believe).

Again, that's why I think that a discussion on discuss.python.org would be better (to coordinate). Maybe in the Packaging category.

@stefanor

Contributor Author
