In the long run, I think having a higher limit would be the best solution. However, we should also discuss what we can do to save space. One suggestion is to remove old pre-releases, although we should probably consult core before touching pre-releases prior to 2.10.
Thanks for starting the discussion.
Earlier today, I deleted the following releases which had been yanked previously:
10.0.0: contains extra files which shouldn’t be included
9.6.0: contains extra files which shouldn’t be included
9.5.0: Accidentally contains a breaking change
9.0.0: (no reason given)
I count ~35 pre-releases from 3.0.0b1 (Feb 2021) through 10.0.0rc1 (May 2024) which I think would be safe to delete (though maybe we yank them first?). That should give us some more space back while we wait for the request to be processed.
I’m not a fan of deleting existing releases. Deleting yanked and pre-releases is IMO OK as a last resort (as in the current case), but I would avoid even deleting pre-releases if possible.
@gundalow: it’s mainly a personal preference, IMO releases are immutable and should stay there forever, if there aren’t very good reasons for not keeping them. (Like having malicious code in them, or some legal reasons.) The pre-releases are part of the release history like every other release.
But I’d definitely still prefer deleting them over not being able to publish new releases.
Agreed that deleting releases shouldn’t be taken lightly, but judging from prior experience, we have no idea when someone might get around to evaluating the PyPI quota increase request. If hard decisions have to be made, I’d suggest starting with the oldest alphas, then the oldest betas, and so on, and only as needed (i.e., not pre-emptively killing off all old pre-releases).
I like @nitzmahone’s proposal. The oldest pre-releases are for Ansible 2.5, so 2.5.0a1 would be the first release to be deleted once we need more space (but not before that).
Monitor the space in PyPI and request increase when needed
Come up with the rules of deletion of released packages (if/when needed and which one to be deleted, the process to be followed before and after the deletion)
How and where to archive the deleted packages.
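For the monitoring item above, here is a minimal sketch of how the space used per release could be estimated. It assumes the project-level PyPI JSON endpoint (`https://pypi.org/pypi/ansible/json`) and its `releases` mapping with a `size` field per file; the function names are made up for illustration, and the total may differ from what PyPI counts against the quota.

```python
import json
from urllib.request import urlopen


def release_sizes(payload):
    """Map each version to the summed size (in bytes) of its files.

    `payload` is expected to look like the PyPI JSON API response:
    {"releases": {"<version>": [{"size": <bytes>, ...}, ...]}}
    """
    return {
        version: sum(f.get("size", 0) for f in files)
        for version, files in payload.get("releases", {}).items()
    }


def total_gib(payload):
    """Total size of all listed files, in GiB."""
    return sum(release_sizes(payload).values()) / 2**30


def fetch_project(name="ansible"):
    """Download the project metadata (network call, not run on import)."""
    with urlopen(f"https://pypi.org/pypi/{name}/json") as resp:
        return json.load(resp)
```

Something like `total_gib(fetch_project())` could then be checked before each release, and `release_sizes` shows which versions take the most space.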
As for the rules, I agree with @felixfontein’s first comment on deleting existing releases.
I also wanted to spell out how people are able to hit yanked releases: pip install ansible never considers them during dependency resolution. Only if somebody pinned the exact release will it be installed.
So people affected by fully removing such releases are going to be those who favor reproducible deployments and pin ansible in their requirements files and scripts. We’ve seen one case, evidently, but there may be people having such pins but running their automation periodically. JFTR.
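To make the difference between yanking and deleting concrete, here is a simplified sketch of the selection behaviour described above. It is not pip's actual resolver; the `pick` function and the release data are made up for illustration.

```python
def pick(releases, pin=None):
    """Choose a version from `releases`, a dict {version: is_yanked}
    listed from oldest to newest.

    - With an exact pin, the pinned version is used even if yanked,
      as long as it still exists on the index.
    - Without a pin, yanked versions are ignored.
    """
    if pin is not None:
        return pin if pin in releases else None
    available = [v for v, yanked in releases.items() if not yanked]
    return available[-1] if available else None


# Illustrative data only: 9.5.0 is marked as yanked here.
index = {"9.4.0": False, "9.5.0": True, "9.5.1": False}

unpinned = pick(index)                 # yanked 9.5.0 is skipped
pinned = pick(index, pin="9.5.0")      # exact pin still resolves

del index["9.5.0"]                     # release fully deleted
after_delete = pick(index, pin="9.5.0")  # the pin no longer resolves
```

The last line is the case that hurts people with pinned requirements: a yanked release still installs for them, a deleted one does not.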
Additionally, PyPI stats are available via BigQuery: Statistics · PyPI. We should be able to inspect it somehow and verify that the releases being removed have low downloads.
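As a rough sketch of such a check, the snippet below only builds a query string for the public `bigquery-public-data.pypi.file_downloads` table; running it needs a BigQuery client and a billing project, which are left out. The helper name and the 90-day window are assumptions, and the version strings are interpolated directly, so it should only be used with trusted input.

```python
def download_count_query(project, versions, days=90):
    """Build a BigQuery SQL string counting recent downloads per version."""
    quoted = ", ".join(f"'{v}'" for v in versions)
    return f"""
SELECT file.version AS version, COUNT(*) AS downloads
FROM `bigquery-public-data.pypi.file_downloads`
WHERE file.project = '{project}'
  AND file.version IN ({quoted})
  AND DATE(timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)
GROUP BY version
ORDER BY downloads
""".strip()


query = download_count_query("ansible", ["2.5.0a1", "2.5.0a2"])
```

Versions that do not show up in the result had no downloads in the chosen window.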
So alphas from oldest to newest, then betas from oldest to newest, and then rcs from oldest to newest? Or generally pre-releases from oldest to newest? I lean toward the latter.
I fully agree. Let’s not do this generally, only when (as you put it) hard decisions have to be made in order to be able to do a new release.
I hope it won’t come to this, but this could mean we might have to delete the current 11.0.0a1 and 11.0.0a2 releases while keeping 2.5.0b1 which is 6 1/2 years old.
Why is it more important to keep old betas than current / pretty new alphas? I’m open to both, I just want to understand. I would have said the other way round makes more sense.
I don’t see the point in keeping old alpha/beta versions at all. They served their purpose back in their day, but IMO it is extremely unlikely that any new value will be obtained from those releases.
And, of course, if there is something to be achieved, we could/should copy these files over to a simple httpd server or file server anywhere and remove them from PyPI. PyPI is not a storage solution.
I somewhat agree with @felixfontein. I also have a bad feeling about deleting old packages / versions. I think we should keep them, at least for historical reasons. Even if they’re only pre-releases.
On the other hand, I agree with @russoz that PyPI might not be the right place to do this.
I’ve been searching for ways to archive PyPI packages in case there’s already a solution and stumbled upon Software Heritage. It looks like they have a way to archive PyPI packages. Would this be a way to a) delete old packages from PyPI and b) not lose them completely?
GitHub also has size limits. I don’t know how they work and what exceeding them results in; in the worst case (I can think of right now), it could happen that we exceed the size limit of the ansible-community organization, and suddenly we can no longer add commits to any of the repositories in there until we start deleting things. I don’t think size limits would/should work that way, but
Current size seems to be 9.814 GiB. I think this is too much, we won’t be able to do the three releases (9.13.0, 10.7.0 and 11.1.0) planned for December 3rd.
That’s a great idea. Deleting all wheels of pre-releases should free up a few hundred megabytes, while the source dists are still there for historical purposes.
So maybe the refined proposal:
First delete wheels of pre-releases, starting with the oldest ones.
Once all wheels of pre-releases are gone and we need more space, start with alpha 1 pre-releases, from oldest to newest.
After that, the alpha 2 releases from oldest to newest.
After that, alpha 3; then beta 1; then beta 2; then beta 3 (did we ever do that?); then rc1; then rc2.
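The ordering in the refined proposal can be sketched as a sort key. This is only an illustration: the helper names are made up, it handles plain `X.Y.ZaN` / `bN` / `rcN` version strings only, and it uses the version number as a stand-in for release age, which is an assumption.

```python
import re

# Matches e.g. "2.5.0a1", "3.0.0b1", "10.0.0rc1".
_PRE = re.compile(r"^(\d+(?:\.\d+)*)(a|b|rc)(\d+)$")
_STAGE = {"a": 0, "b": 1, "rc": 2}


def deletion_key(version):
    """Sort key: stage (alpha < beta < rc), then pre-release number,
    then the release version as a proxy for age."""
    m = _PRE.match(version)
    if m is None:
        raise ValueError(f"not a pre-release: {version}")
    release = tuple(int(part) for part in m.group(1).split("."))
    return (_STAGE[m.group(2)], int(m.group(3)), release)


def deletion_order(versions):
    """Return pre-releases in the order they would be deleted."""
    return sorted(versions, key=deletion_key)


order = deletion_order(
    ["11.0.0a2", "2.5.0b1", "11.0.0a1", "2.5.0a1", "10.0.0rc1", "3.0.0b1"]
)
# order == ["2.5.0a1", "11.0.0a1", "11.0.0a2", "2.5.0b1", "3.0.0b1", "10.0.0rc1"]
```

Note that under this ordering 11.0.0a1 and 11.0.0a2 go before 2.5.0b1, which is the consequence raised earlier in the thread. Sorting by `(release, stage, number)` instead would give the plain oldest-to-newest alternative.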