Page MenuHomePhabricator

RFC removing old code / code cleanup
Closed, ResolvedPublic

Description

Some core methods were marked as deprecated a long time ago. For example setAction() was deprecated about five or six years ago (and is still used in checkimages). Anyway do we have any suggestion for removing old or deprecated code parts? Or should it always be kept?

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 399755 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [IMPR] Introduce a timestamp in deprecated decorator

https://gerrit.wikimedia.org/r/399755

Change 399755 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [IMPR] Introduce a timestamp in deprecated decorator

https://gerrit.wikimedia.org/r/399755

If we are going to have a solely date-based version scheme, then this patch looks fine. But if semantic versioning is preferred, then I'd suggest using the version number instead of timestamp.

The current versioning system (3.0.20171212) does not look like either of them. I don't know why we have that 0 between the major version and patch version. For example in 3.0.20170713 We have added thanks log support (new functionality) which normally should bump the minor version, but we have not done so.

I'd suggest using the version number instead of timestamp.

The problem with that is that versioning is today done separately from doing a deprecating commit and to the one doing the commit the current version isn't necessarily clear.

The current versioning system (3.0.20171212) does not look like either of them. I don't know why we have that 0 between the major version and patch version. For example in 3.0.20170713 We have added thanks log support (new functionality) which normally should bump the minor version, but we have not done so.

I believe the current scheme is date versioning only but with a 3.0 prefix to distinguish it from the 2.0 efforts. We could possibly change to 3.DATE to clarify that but I'm not sure this it's needed.

This reminds me off

And as a one off we should define when/why/how we release somewhere on mediawiki.org so that new people (and us) can find it in a year.

@valhallasw Did you get an opportunity to do that?

The problem with that is that versioning is today done separately from doing a deprecating commit and to the one doing the commit the current version isn't necessarily clear.

I suppose we could use after=<latest_version> as the deprecation parameter. However, the idea was mainly for semantic versioning. For date versioning, it might not matter.

I believe the current scheme is date versioning only but with a 3.0 prefix to distinguish it from the 2.0 efforts. We could possibly change to 3.DATE to clarify that but I'm not sure this it's needed.

I guess the main question that remains is how we are going to handle backward incompatible changes in this versioning scheme.
If we are going to bump the main version for any backward incompatible change, then maybe having since=<major_version> would be enough. On the other hand, if deprecated code removal is going to be date-based only, e.g. after 1 year of deprecation as some have suggested, then a timestamp looks more appropriate.

... if deprecated code removal is going to be date-based only, e.g. after 1 year of deprecation as some have suggested, then a timestamp looks more appropriate.

Since regular releases was a problem in the past I would probably go with this to avoid re-adding some of those stumbling blocks.

The problem with that is that versioning is today done separately from doing a deprecating commit and to the one doing the commit the current version isn't necessarily clear.

I suppose we could use after=<latest_version> as the deprecation parameter. However, the idea was mainly for semantic versioning. For date versioning, it might not matter.

latest_version is completly unclear or in other words it is undefined. A lot of deprecation warnings are done when creating the rewrite branch from previous compat branch. We could say that compat is the 1.0 branch [1] whereas 2.0 was a separate branch based on rewrite development where some patches where ported to from time to time. After we dropped that 2.0 branch we started the 3.0 counting [2].

Finding out the deprecation history, the timestamp is the only usefull value we could extract from repository.

I believe the current scheme is date versioning only but with a 3.0 prefix to distinguish it from the 2.0 efforts. We could possibly change to 3.DATE to clarify that but I'm not sure this it's needed.

I think we should indicate the major 3 release Independent of the timestamp versioning.

I guess the main question that remains is how we are going to handle backward incompatible changes in this versioning scheme.
If we are going to bump the main version for any backward incompatible change, then maybe having since=<major_version> would be enough. On the other hand, if deprecated code removal is going to be date-based only, e.g. after 1 year of deprecation as some have suggested, then a timestamp looks more appropriate.

I propose to increase the minor release for backward incompatibility e.g. remove older depecated parts. They get a tag which is the version number [3] and could easily restored from repository (but without any backports).

But I have no preferences about the mayor release number; maybe this could indicate a new python code base or revoking some (like py 2.6 for example)

[1] https://pypi.org/project/PyWikipediaBot/#history
[2] https://pypi.org/project/pywikibot/#history
[3] https://gerrit.wikimedia.org/r/#/admin/projects/pywikibot/core,tags

I propose to increase the minor release for backward incompatibility e.g. remove older depecated parts.

Increasing minor version for backward incompatibility? Is there any package that does that? I think it'll confuse users.

I personally prefer a full semver system, no dates. I believe it has a well understood meaning. However, a decision needs to be made and I don't mind other systems, as long as we are clear about how the version should be changed.

Increasing minor version for backward incompatibility? Is there any package that does that? I think it'll confuse users.

pywikibot was preparing to do it with 2.1

I personally prefer a full semver system, no dates. I believe it has a well understood meaning. However, a decision needs to be made and I don't mind other systems, as long as we are clear about how the version should be changed.

I am fine with semver too.

Change 409823 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [PEP440] Enable semver versioning for pywikibot

https://gerrit.wikimedia.org/r/409823

Well, per semver backwards incompatibility should be noted as 4.0.0 (major,minor,patch). But I personally more like 3.1.20180212 (major.incompatibility.stamp).

But I personally more like 3.1.20180212 (major.incompatibility.stamp).

Under semver any incompatibility in API is considered a major change. In a major.incompatibility.stamp scheme, what kind of change will be associated with major version change? Do you know any prominent python package with this scheme?

I agree with Dalba. I've no idea for what the major release stands except 1.0 means trunk, 2.0 means the interim rewrite stable branch and 3 the current master which never will change in future except we have a complete code redesign.

I think semver release counting is wellknown what it means and is PEP440 compatible. And finally we can prepare the next release before published at pypi for example. Currently we mark the release after publishing which is also wrong because a new version is started then.

OK. I'm not against semver, I used it a lot in some of my older projects so I can see some serious questions we should solve before presenting this change: How this change will affect the developers? Will we require to increment PATCH number for every bugfix made and MINOR for every improvement? Or this change is just for PyPi and HISTORY.rst purposes and no developer needs to struggle with that?

PS: If we will follow semver, pywikibot could rapidly increment to 5.0.0 in several months if Py3.3 and Py2.6 abandoned separately. How would the Pywikibot users react about this massive jump? Or should dropping of old Python release support be excluded of backward incompatible changes in other way semver allows (build metadata)?

Will we require to increment PATCH number for every bugfix made and MINOR for every improvement?

Based on our current release cycle, I presume that we can keep developing on the dev (pre-release) version. The person who is going to update the changelog and do the release, should look at the changes since the last release and bump the version accordingly.

PS: If we will follow semver, pywikibot could rapidly increment to 5.0.0 in several months if Py3.3 and Py2.6 abandoned separately.

I was hoping for it to be done in a single release, but if not...

How would the Pywikibot users react about this massive jump?

I don't know how they would react, but the same reaction that would be appropriate for an incompatibility-bump in a major.incompatibility.stamp scheme would be appropriate here, too. (Under both circumstances they should check the changelog and see if the breaking change affects them or not, and if it does, they should either update their code, or keep using the older pywikibot version.)

Or should dropping of old Python release support be excluded of backward incompatible changes in other way semver allows (build metadata)?

That question holds for current scheme, too. Should we bump the incompatibility or should we use build metadata? (The point being that even if switch over to semver, things won't change in that front.) Such a change would perhaps require a separate RfC/announcement, updating the documentation, and AFAIK using semver build metadata is not permitted under PEP 440.

The person who is going to update the changelog and do the release, should look at the changes since the last release and bump the version accordingly.

Do we have anyone/anyones willing to take that on? If I understand the pywikibot history correctly that has been the major hurdle for releasing in the past and was the reason for the stamp solution to get releases going again.

Do we have anyone/anyones willing to take that on?

I did it several times in past and it is very easy to do it with master branch in opposite to the previous so called stable release 2.0.

If I understand the pywikibot history correctly that has been the major hurdle for releasing in the past and was the reason for the stamp solution to get releases going again.

As I remember it was introduced when we closed the 2.0 stable branch after a long time of um maintenance to have the master branch as the new stable release at pypi.

Personally I uploaded a new release after approximately one month when tests where successful and I tagged the last related commit with the version number created by setup.py. Also I updated the HISTORY file (not the ChangeLog which is really hard to work and I gave up) With those tags developers or bot operators are always able to went back to this release which might be important for breaking changes.

It is a problem of stamp based versioning that the new release is known after uploading it which also looks wrong because after tagging a release a new version is created. With semver this could be changed and the new release is known before uploading it including any doc strings.

The person who is going to update the changelog and do the release, should look at the changes since the last release and bump the version accordingly.

Do we have anyone/anyones willing to take that on? If I understand the pywikibot history correctly that has been the major hurdle for releasing in the past and was the reason for the stamp solution to get releases going again.

That's a good point. I also would like to add that we might not always agree on what is and what is not a breaking change and how often we should bump the version (another source of arguing). For example I consider changing requests requirement a breaking change...

It'll be definitely easier for us (developers) to continue to use a date based versioning, but the burden will be moved to users who will have to check the changes for each new release.

It is a problem of stamp based versioning that the new release is known after uploading it which also looks wrong because after tagging a release a new version is created. With semver this could be changed and the new release is known before uploading it including any doc strings.

Actually I think we can use a hard-coded timestamp as version. It should not matter if the version differs a few days from the exact release date. Perhaps that could be resolved by changing the way are setup.py works.

That's a good point. I also would like to add that we might not always agree on what is and what is not a breaking change and how often we should bump the version (another source of arguing). For example I consider changing requests requirement a breaking change...

This was not a breaking change by the bot framework but from EventStreams web service itself. pwb may run with older versions of requests but newer is needed for the sseclient working with the newly EventStreams. Anyway it doesn't care currently.

Creating a new release monthly after tests pass would be a good period I guess. On the other hand bugfixes for highly used framework parts e.g. regressions could be a good reason to do it earlier (https://gerrit.wikimedia.org/r/#/c/420170/ for example would be a candidate for it).

Actually I think we can use a hard-coded timestamp as version. It should not matter if the version differs a few days from the exact release date. Perhaps that could be resolved by changing the way are setup.py works.

I made a proposal for it and still prefer new versioning .

In https://gerrit.wikimedia.org/r/#/c/pywikibot/core/+/399755/ there can also be found multiple code blocks deprecated for even 4 or 5 years. I think that these deprecated for more than 2 years should be removed as such a long time with deprecation warning seems enough to me.

PS: Probably the versioning issue should have its own task as it sidelined the main task issue described in the task description

  • Any script parts except the are reused by other scripts; on the other hand this should not be done, see T60942.

Does everyone agree on this one? I plan to fix all pep8-naming errors within scripts, which will require renaming some methods, so not requiring deprecation would make things much easier.

Maybe we should move scripts directory to _scripts to make this more explicit?

Maybe we should move scripts directory to _scripts to make this more explicit?

I don't support this idea, _scripts could possibly break too many things.

Personally I would continue to deprecate parts of scripts as before.

Personally I would continue to deprecate parts of scripts as before.

We also omitted deprecating of script parts in past. Scripts are ready to use and I mean it is in the scope of bot operators if they change the script and have to rebase them. The are invited to update the current scripts if theiy want. If deprecation is needed this prevents standardizing and improvements like I've suggested on several scripts due to T196851 and it is mostly impossible to keep both the old and the new implementation.

Maybe we should move scripts directory to _scripts to make this more explicit?

The archive folde would be a place for older scripts if necessary. And we also have the monthly updated repository tags to easily went back to an older implementation.

Personally I would continue to deprecate parts of scripts as before.

We also omitted deprecating of script parts in past. Scripts are ready to use and I mean it is in the scope of bot operators if they change the script and have to rebase them. The are invited to update the current scripts if theiy want. If deprecation is needed this prevents standardizing and improvements like I've suggested on several scripts due to T196851 and it is mostly impossible to keep both the old and the new implementation.

I'm confused, there already are some deprecations in 7 scripts. So we just need to start deprecating in scripts? I would support to finish https://gerrit.wikimedia.org/r/#/c/pywikibot/core/+/399755/ ASAP for this.

I'm confused, there already are some deprecations in 7 scripts. So we just need to start deprecating in scripts? I would support to finish https://gerrit.wikimedia.org/r/#/c/pywikibot/core/+/399755/ ASAP for this.

Personally I never was a friend of deprecation script parts but I would be fine with it as long as we can implement any workflow to drop them (in short time) to enable script improvements.

For example of this imagetransfer patch you may deprecate the two renamed methods but you cannot do anything for the run method. The name has been kept but the behaviour is different.

Another sample of clean_sandbox improvements: There is nothing we can deprecate but might be a lot of breaking changes starting with SandboxBot.site Attribute or the self.translated_content which both becomes invalid.

Or have a look at makecat which is completely rewritten.

What about SemVer version for framework and for each script its own SemVer version?

Change 399755 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] Introduce a timestamp in deprecated decorator

https://gerrit.wikimedia.org/r/399755

Okay, now we can remove deprecations older than 2 ( ? 3? ) years!

Change 449025 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Remove unsupported removeImage and placeImage Page methods

https://gerrit.wikimedia.org/r/449025

Change 449026 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Remove getParsedString Site method

https://gerrit.wikimedia.org/r/449026

Okay, now we can remove deprecations older than 2 ( ? 3? ) years!

I started to remove unsupported code. See related patches.

Change 449026 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Remove getParsedString Site method

https://gerrit.wikimedia.org/r/449026

Change 449025 merged by Mpaa:
[pywikibot/core@master] [cleanup] Remove unsupported removeImage and placeImage Page methods

https://gerrit.wikimedia.org/r/449025

Change 463775 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] remove pre mw 1.14 code

https://gerrit.wikimedia.org/r/463775

Change 463775 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] remove pre mw 1.14 code

https://gerrit.wikimedia.org/r/463775

Change 577598 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Remove outdated has_api method

https://gerrit.wikimedia.org/r/577598

Change 577598 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Remove outdated has_api method

https://gerrit.wikimedia.org/r/577598

Change 585904 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Desupport old deprecated methods

https://gerrit.wikimedia.org/r/585904

Change 585904 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Desupport old deprecated methods

https://gerrit.wikimedia.org/r/585904

Change 409823 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [IMPR] Use one central point for framework version

https://gerrit.wikimedia.org/r/409823

Change 606438 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Cleanup getFilesFromAnHash and getImagesFromAnHash

https://gerrit.wikimedia.org/r/606438

Change 606438 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Cleanup getFilesFromAnHash and getImagesFromAnHash

https://gerrit.wikimedia.org/r/606438

Change 409823 merged by jenkins-bot:
[pywikibot/core@master] [IMPR] Use one central point for framework version

https://gerrit.wikimedia.org/r/409823

Change 612544 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Use FutureWarning for methods deprecated for 6 years

https://gerrit.wikimedia.org/r/612544

Change 612544 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Use FutureWarning for methods deprecated for 6 years

https://gerrit.wikimedia.org/r/612544

Change 616219 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Remove outdated tools

https://gerrit.wikimedia.org/r/616219

Change 616219 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Remove outdated Python 2.6 backports from tools

https://gerrit.wikimedia.org/r/616219

Change 630638 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [cleanup] Show a FutureWarning for methods which are deprecated for 5 years

https://gerrit.wikimedia.org/r/630638

Change 630638 merged by jenkins-bot:
[pywikibot/core@master] [cleanup] Show a FutureWarning for methods which are deprecated for 5 years

https://gerrit.wikimedia.org/r/630638

@Xqt: Hi, all related patches in Gerrit have been merged. Can this task be resolved (via Add Action...Change Status in the dropdown menu), or is there more to do in this task? Asking as you are set as task assignee. Thanks in advance!

Xqt removed Xqt as the assignee of this task.Apr 6 2021, 6:29 AM
Xqt added a project: Tracking-Neverending.

@Xqt: Hi, all related patches in Gerrit have been merged. Can this task be resolved (via Add Action...Change Status in the dropdown menu), or is there more to do in this task? Asking as you are set as task assignee. Thanks in advance!

This is a never ending task now. We have a kind of code cleanup process were code parts get a FutureWarning if it is deprecated for a long time. After FutureWarning is shown the code will be dropped with the overnext release (or later)

I'm not sure having a never ending task is a good idea. You can create a ticket for specific clean ups (python2) or you can use gerrit hashtags if you want to have a way to find patch. You can also create a ticket for cleanups doable after a release maybe? (for example all deletions that can be done in v5 or v6) but a task that never gets resolved doesn't make much sense to me.

I'm not sure having a never ending task is a good idea. You can create a ticket for specific clean ups (python2) or you can use gerrit hashtags if you want to have a way to find patch. You can also create a ticket for cleanups doable after a release maybe? (for example all deletions that can be done in v5 or v6) but a task that never gets resolved doesn't make much sense to me.

Sure, the tasks ends if all deprecations are gone. Currently whe have a lot of it. Cleanups are announced via FutureWarning and ROADMAP but I won't create a task for all of them. Therefore this tracking will be mostly enough.

Removing Tracking-Neverending; not "neverending" by definition (plus no open subtasks and/or unclear scope). Please don't create neverending tasks, thanks.

Xqt claimed this task.

Ok, closing then.