Wikipedia:Bot requests
This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).
You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.
Before making a request, please see the list of frequently denied bots, which are either too complicated to program or lack consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) is added to all pages in a particular category, please be careful to check the category tree for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).
Alternatives to bot requests
- WP:AWBREQ, for simple tasks that involve a handful of articles and/or only need to be done once (e.g. adding a category to a few articles).
- WP:URLREQ, for tasks involving changing or updating URLs to prevent link rot (specialized bots deal with this).
- WP:USURPREQ, for reporting a domain that has been usurped, e.g. |url-status=usurped.
- WP:SQLREQ, for tasks which might be solved with an SQL query (e.g. compiling a list of articles according to certain criteria).
- WP:TEMPREQ, to request a new template written in wiki code or Lua.
- WP:SCRIPTREQ, to request a new user script. Many useful scripts already exist, see Wikipedia:User scripts/List.
- WP:CITEBOTREQ, to request a new feature for WP:Citation bot, a user-initiated bot that fixes citations.
Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).
Citation source replacement with {{Cite Köppen-Geiger cc 2007}}
Hundreds (thousands?) of mountain articles use as a reference a paper titled "Updated world map of the Köppen-Geiger climate classification", published in 2007 in Hydrology and Earth System Sciences, Volume 11, Issue 5. Typically these use {{Cite journal}}, passing in the appropriate values. However, the values were not applied consistently, so the generated references do not always provide all the necessary information, nor is the information as complete as it could be. Thus, I created the template to provide complete information about the source reference so that it's consistent across Wikipedia. I then searched for "Updated world map of the Köppen-Geiger" to find articles using this reference source and started replacing the citation source text with <ref name=KGcc2007>{{Cite Köppen-Geiger cc 2007}}</ref>. The search found other articles on rivers and human settlements also using this source reference. At this point, I have manually edited over 360 pages to make this change, the high majority being mountain articles but also a few articles about rivers and populated places. The search currently returns over 2,800 results. So at this point I think it would be good if a bot could automate this edit to articles using this source reference. Typically in mountain articles, the citation is in a "Climate" section which usually begins with the sentence "Based on the [[Koppen climate classification]],". Other times it's in the lead section. Often the citation uses a named reference, typically "Peel" for the first author, although the paper does have three authors, which is why I chose to use KGcc2007 rather than Peel. Perhaps the reference could be named a bit differently to denote it was a bot edit, e.g. <ref name=KGcc2007be>. I have been using "{{Cite Köppen-Geiger cc 2007}}" as the edit summary. RedWolf (talk) 21:00, 12 November 2025 (UTC)
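If a bot operator picks this up, the core of the task might look something like the following pywikibot sketch. The regex and search handling are assumptions for illustration, not tested code, and not RedWolf's actual method:

# Hypothetical sketch of the requested citation swap.
import re
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("en", "wikipedia")

# Any <ref>...</ref> whose body mentions the 2007 paper's title
CITE_RE = re.compile(
    r"<ref[^>/]*>(?:(?!</ref>).)*?"
    r"Updated world map of the K(?:ö|o)ppen.Geiger"
    r"(?:(?!</ref>).)*?</ref>",
    re.DOTALL | re.IGNORECASE,
)
REPLACEMENT = "<ref name=KGcc2007>{{Cite Köppen-Geiger cc 2007}}</ref>"

for page in pagegenerators.SearchPageGenerator(
        '"Updated world map of the Köppen-Geiger"', namespaces=[0], site=site):
    new_text, count = CITE_RE.subn(REPLACEMENT, page.text)
    if count and new_text != page.text:
        page.text = new_text
        page.save(summary="{{Cite Köppen-Geiger cc 2007}}", minor=True)

A production task would also need to repoint any reuses of the old named reference (e.g. <ref name=Peel/>), which the sketch above ignores.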
- What is the need to mass replace all of these citations with this template? Tenshi! (Talk page) 13:52, 19 November 2025 (UTC)
- pinging RedWolf —usernamekiran (talk) 13:40, 21 November 2025 (UTC)
- In addition to what I already stated above, many of the raw source references are just using ISSN with a large range which if clicked would return a page with over 13,000 results (i.e. ISSN 1027-4606); I don't understand why editors gave a huge range. The template adds the DOI and BIBCODE as well as an archive link so there are direct links to the source paper. All uses of this source will have consistent and basically complete information. As well it provides all the other benefits with using a template (e.g. what links here for usage count). RedWolf (talk) 17:09, 21 November 2025 (UTC)
Using this template causes a no-target error (see Category:Harv and Sfn no-target errors) when used in conjunction with short form references. Lake Nipisso as an example[1]. This could be avoided by whitelisting the template (which can be requested at Module talk:Footnotes), but that won't work as long as you're using #invoke for the cite (which short form refs just don't support). -- LCU ActivelyDisinterested «@» °∆t° 20:26, 17 January 2026 (UTC)
- Tried to fix Lake Nipisso by setting ref={{harvid}} but then get a refs duplicate error. Can't change the #invoke because that's the established practice. Why can't the short refs limitation be fixed? RedWolf (talk) 19:33, 7 March 2026 (UTC)
Redirects related to those nominated at RfD
Per the initial discussion at Wikipedia talk:Redirects for discussion#Avoided double redirects of nominated redirects I believe there is consensus for an ongoing bot task that does the following:
- Looks at each redirect nominated at RfD
- Determines whether there are any other redirects, in any namespace, that meet one or more of the following criteria:
- Are marked as an avoided-double redirect of a nominated redirect
- Are redirects to the nominated redirect
- Redirect to the same target as the nominated redirect and
- Differ only in the presence or absence of diacritics, and/or
- Differ only in case
- If the bot finds any redirects that match and which are not currently nominated at RfD, then it should post a message in the discussion along the lines of:
- The bot should not take any actions other than leaving the note; the goal is simply to make human editors aware that these redirects exist.
I don't know how frequently the bot should run, but it should probably wait at least 15 minutes after a nomination before checking or editing, so as not to run into edit conflicts or complications: multiple redirects are often nominated individually and the discussions then manually combined. Thryduulf (talk) 13:11, 17 June 2025 (UTC)
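A minimal sketch of how the "differ only in diacritics and/or case" comparison might be implemented; this also covers the punctuation generalisation suggested just below. The function name is illustrative, not part of any existing bot:

import unicodedata

def variant_key(title: str) -> str:
    """Key that compares equal for titles differing only in case,
    diacritics, or punctuation (hyphens/dashes are punctuation)."""
    # Decompose, then drop the combining marks that carried the diacritics
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Drop punctuation characters and fold case
    cleaned = "".join(
        c for c in stripped if not unicodedata.category(c).startswith("P"))
    return cleaned.casefold()

# Variant redirects the bot should report together:
assert variant_key("Köppen climate classification") == \
       variant_key("Koppen climate classification")
assert variant_key("E-mail") == variant_key("EMail")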
- There is a strong consensus; if there are no objections in the next day or so, I'll file a BRFA. In the meantime I'll code up the bot. GalStar (talk) 17:56, 17 June 2025 (UTC)
- I've just thought of a third case to check for: differences only in hyphenation/dashes. Thryduulf (talk) 21:38, 17 June 2025 (UTC)
- Actually that's generalisable to differences only in punctuation. Thryduulf (talk) 03:40, 18 June 2025 (UTC)
- @GalStar is there any update on this? Thryduulf (talk) 20:01, 25 June 2025 (UTC)
- I'm still working on it. I'm still getting some of the underlying tooling working, but I should be done soon. GalStar (talk) 16:40, 26 June 2025 (UTC)
- Thank you. Thryduulf (talk) 16:50, 26 June 2025 (UTC)
- If anyone is wondering, I'm currently porting my code to toolforge, so it can run continuously, and without the unreliability of my home network. This is taking longer than I expected however. GalStar (talk) 17:17, 26 June 2025 (UTC)
BRFA filed GalStar (talk) (contribs) 20:56, 2 July 2025 (UTC)
Restored from Archive 87. The BRFA mentioned above (Wikipedia:Bots/Requests for approval/GraphBot 2) was abandoned before a working bot was written so the task remains outstanding. Wikipedia talk:Redirects for discussion#Avoided double redirects of nominated redirects received no objections to my reinstating this request. Thryduulf (talk) 20:05, 31 December 2025 (UTC)
- @Thryduulf Is there a current discussion that falls into this category that I can test on? Vanderwaalforces (talk) 17:31, 5 February 2026 (UTC)
- I don't know of any off the top of my head. I can't think of an easy way to find any other than by doing what this requests asks a bot to do (i.e. look through all nominated redirects and check for similar ones). Thryduulf (talk) 19:46, 5 February 2026 (UTC)
- @Thryduulf You’re right, I just put some logic together and tried it; I couldn’t find any either. Maybe there currently isn’t any, but I have coded this task btw; maybe I should file a BRFA? Vanderwaalforces (talk) 19:59, 5 February 2026 (UTC)
- @Thryduulf Take a look at my test at testwiki. I intentionally created a redirect drama. What do you think? Vanderwaalforces (talk) 09:42, 9 February 2026 (UTC)
BRFA filed. Vanderwaalforces (talk) 12:58, 9 February 2026 (UTC)
- I'm not going to get a chance to look until this evening, sorry. Thryduulf (talk) 13:36, 9 February 2026 (UTC)
The BRFA has been withdrawn by the operator (Vanderwaalforces), in case any other operator wants to take on the task. Thryduulf (talk) 15:51, 24 February 2026 (UTC)
Move links clean up request
I think that cleanup bots should replace links that haven't been edited once the target page is moved somewhere else. shane (talk to me if you want!) 23:30, 18 January 2026 (UTC)
- Probably a bad idea for a bot task. See WP:NOTBROKEN. Anomie⚔ 00:01, 19 January 2026 (UTC)
Not done per the template at the top, so this will archive. (Do you need to be a botop to mark it?) HurricaneZetaC 20:20, 1 February 2026 (UTC)
Unnecessary disambiguations
Is there a way a bot could find instances of unnecessary disambiguations? Specifically, instances of articles named "Title (parenthetical)" where there isn't currently an article at just "Title". (In other words, something like Floofy (band) existing where Floofy is still a redlink.)
I ask this because sometimes I see people stick (film) or (band) at the end of article names unnecessarily, or sometimes the non-parenthetical title gets deleted via AFD or PROD and the parenthetical version is never moved to reclaim the title. Ten Pound Hammer • (What did I screw up now?) 17:15, 19 January 2026 (UTC)
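For what it's worth, the check itself is cheap to express with pywikibot. A sketch, where the (film)/(band) whitelist anticipates the narrowing suggested below and everything else is an assumption:

import re
import pywikibot

site = pywikibot.Site("en", "wikipedia")
# Conservative starting set of disambiguators
DAB_RE = re.compile(r"^(?P<base>.+) \((?:film|band)\)$")

def redlinked_base(title: str):
    """Return the base title if `title` carries a (film)/(band)
    disambiguator and the undisambiguated title does not exist;
    otherwise None."""
    m = DAB_RE.match(title)
    if m and not pywikibot.Page(site, m.group("base")).exists():
        return m.group("base")
    return None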
- I’ve also seen several instances of this during NPP, I will try coming up with something. Vanderwaalforces (talk) 17:42, 20 January 2026 (UTC)
- I did a quick database query and found a total of 37,887 of these, which is far too large to make any kind of useful report. * Pppery * it has begun... 17:50, 20 January 2026 (UTC)
- Can these be moved by a bot? If better information is needed before a bot run, then maybe sort by disambiguation and move the ones we know for sure can be moved (pages with (film), (country film), etc.). Gonnym (talk) 18:06, 20 January 2026 (UTC)
- Yeah, that might narrow it down. Start with ones that are "Name (film)" in cases where "Name" doesn't exist, then maybe the same with (band), as those are the two I see most often. Ten Pound Hammer • (What did I screw up now?) 18:20, 20 January 2026 (UTC)
- All 37,000 can't be moved by a bot because sometimes the actual name of a proper noun includes parentheses, like Barugh (Great and Little) (okay, Barugh technically exists, but I'm not convinced there aren't any like that where the base name is red). Specific parenthetical disambiguators can probably be botted; see Wikipedia:Database reports/Specific unnecessary disambiguations. * Pppery * it has begun... 18:27, 20 January 2026 (UTC)
- SD0001, I'll be honest I've not done much in the way of database reports, could we get this unnecessary dab report refreshed? Primefac (talk) 15:50, 16 March 2026 (UTC)
- The bot isn't updating because there's no |interval= param. I've added it now for 30 days - feel free to adjust if more frequent updates are desired. It's also possible to just hit "Update the table now" for an on-demand update. – SD0001 (talk) 16:07, 16 March 2026 (UTC)
- Thanks! Didn't even think to look at the page code for that sort of thing. Primefac (talk) 16:18, 16 March 2026 (UTC)
- TenPoundHammer, from my work on Wikipedia:Missing redirects project I obtained User:Qwerfjkl/sandbox/55, which BD2412 organized. — Qwerfjkltalk 20:57, 20 January 2026 (UTC)
- Well, this is quite it then. Vanderwaalforces (talk) 21:01, 20 January 2026 (UTC)
- @Qwerfjkl: I have been meaning to ask your permission to subdivide that page, as it is of rather unwieldy length. Cheers! BD2412 T 21:49, 20 January 2026 (UTC)
- BD2412, by all means. — Qwerfjkltalk 22:28, 20 January 2026 (UTC)
- @Qwerfjkl: sweet, that's a big help Ten Pound Hammer • (What did I screw up now?) 00:46, 21 January 2026 (UTC)
- So could all discrete subsets be moved by bot, provided that the page without "(parenthetical)" still does not exist or redirects to the disambiguated title? Wikiwerner (talk) 17:40, 15 March 2026 (UTC)
- Wikiwerner, these need to be handled on a case-by-case basis in general. — Qwerfjkltalk 10:53, 16 March 2026 (UTC)
- Why? I think it would be a reasonable bot request to take a specific disambiguator, find all pages with that disambiguator where the base name either has never existed or is a single-revision redirect to the disambiguated page, and move them to the base name. * Pppery * it has begun... 13:30, 16 March 2026 (UTC)
- In fact, I was originally going to code this task, but seeing the discussion so far, I paused because I am not sure we want to, even though I think we should too. Vanderwaalforces (talk) 13:57, 16 March 2026 (UTC)
- If someone wants to modify User:Plastikspork/massmove.js to strip out suffixes, any admin would be able to batch-remove disambiguators such as (film). It should theoretically just require shifting your lookup value from the front to the back of the string, but I'm pulled in a few too many directions at the moment so I'm not sure if I'm able to do this myself. Primefac (talk) 15:15, 16 March 2026 (UTC)
- It really was that easy; User:Primefac/massmove2.js allows for stripping of suffixes. Primefac (talk) 15:44, 16 March 2026 (UTC)
- Thank you, duh! Vanderwaalforces (talk) 18:38, 16 March 2026 (UTC)
- There are only a few hundred (film) and (band) articles in the dbase report, and we can add other values to the report based on the discussion below. I figure to minimise disruption maybe do batches of forty or fifty at a time spread out over a couple of days. Primefac (talk) 10:23, 18 March 2026 (UTC)
- I do have concerns that we may end up automating "primary topic" status to a bunch of titles that do not merit that over the title redirecting elsewhere. BD2412 T 14:39, 18 March 2026 (UTC)
- I was only planning on moving pages that have a redlinked base name. Primefac (talk) 18:42, 19 March 2026 (UTC)
- Pppery, what are some examples of disambiguators that you think could be safely stripped? — Qwerfjkltalk 17:03, 16 March 2026 (UTC)
Automatically add Template:AI-retrieved source
Since we have Template:AI-retrieved source, it would be nice if a bot could add this template to refs based on whether they have the utm_source parameter set to an LLM value. (See User:Headbomb/unreliable for a list of these utm_source values).
Sources that were manually verified by someone can simply be marked as "good" by removing the utm_source parameter. Laura240406 (talk) 21:56, 19 January 2026 (UTC)
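The detection itself could be a simple pattern over citation wikitext. A rough sketch, where the utm_source tuple is a placeholder for the real list at User:Headbomb/unreliable:

import re

# Placeholder subset; the real list lives at User:Headbomb/unreliable
LLM_UTM_SOURCES = ("chatgpt.com", "openai", "perplexity")

UTM_RE = re.compile(
    r"[?&]utm_source=(?:%s)" % "|".join(map(re.escape, LLM_UTM_SOURCES)),
    re.IGNORECASE,
)

def tag_if_llm_sourced(citation: str) -> str:
    """Append {{AI-retrieved source}} to a citation whose URL carries an
    LLM utm_source value, skipping citations that are already tagged."""
    if "AI-retrieved source" in citation or not UTM_RE.search(citation):
        return citation
    return citation + "{{AI-retrieved source}}"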
- I would be interested in developing this (n.b. I see that {{AI-retrieved source}} suggests either adding a |checked= parameter or commenting out the template rather than modifying the source URL). — chrs || talk 02:46, 20 January 2026 (UTC)
- We could also change the citation templates, so that if the URL parameter contains e.g. "utm_source=chatgpt.com", then {{AI-retrieved source}} is added at the end of the citation. Wikiwerner (talk) 11:36, 19 March 2026 (UTC)
Infobox links to football statistics
For those players that appear in lists such as List of men's footballers with 1,000 or more official appearances, List of men's footballers with 100 or more international caps, List of women's footballers with 100 or more international caps, List of footballers with 500 or more goals or List of women footballers with 300 or more goals, create a link that would lead to that respective page (or its list subsection) directly from the relevant number in the infobox, as is already the case in the Cristiano Ronaldo article when it comes to the international caps statistic, for example.
For lists such as List of men's footballers with 50 or more international goals, List of women's footballers with 100 or more international goals, List of footballers with 500 or more goals or List of women footballers with 300 or more goals, the same may be done, with the caveat that many of their statistics already link to player-specific lists (like in the case of the Barbra Banda article, for example), which is of course preferable and should not be changed.
I have not managed to find examples of such lists in other sports, otherwise they could be included as well.
Thank you a lot for consideration! BasicWriting (talk) 01:18, 25 January 2026 (UTC)
- @BasicWriting To be honest, your request isn't clear, at least to me. Mind explaining better? Vanderwaalforces (talk) 14:50, 29 January 2026 (UTC)
- I'm not sure how better to explain this. This is a further example of an infobox that does have a link and this of one that doesn't. You may also see some of my mechanical contributions to this end, like the last one here, before I realized the task was too broad. Thanks a lot! BasicWriting (talk) 15:37, 30 January 2026 (UTC)
- @BasicWriting So, I took my time to look at this, and I think it is more technical than it seems; we do not have a clear if–then logic yet. I wanted to ask "Which infobox fields are in scope?" but then I realised that the field is not consistent. For example, in Ivan Perišić, the International caps field that needs to be linked is | nationalcaps4 =, but in another article, like Barbra Banda, the field was | nationalcaps2 =, and for Robert Lewandowski, it is | nationalcaps3 =; in all, that is pretty inconsistent if you ask me. It is not a problem that it is inconsistent, it is only inconsistent for a bot to actually efficiently and correctly work with. Although, I can see that the equivalent | nationalteam# = for these articles is usually only mentioning the "country", which is something we could use to streamline.
- Can you provide a fixed whitelist of list pages the bot may link to? Must the player be explicitly present on the list page? Is linking to the list page sufficient, or must it be player-anchored?
- The request is conceptually valid, but currently underspecified. Vanderwaalforces (talk) 16:09, 2 February 2026 (UTC)
- Thank you for looking into this! The way this particular issue could be streamlined is that, if we take the article about international caps as an example, the international caps that ought, according to my proposal, to link to that respective list, are by definition all the international caps above 100. So the bot could be considering the number itself and linking any number of 100 and above to said list. If only the player articles that the list links to are considered, then those players will very likely be those explicitly present on the list page (and same for other lists), so the whitelist could simply be all the articles being linked to by the nine lists I have listed. However, the important caveat is that, again, some of those statistics do already have links to player-specific lists, like in my List of international goals scored by Barbra Banda case, so I'd envision the bot as checking for those instances too and not changing them. As for anchoring, so far, those infobox links that are present are not anchored so far, but perhaps they could be to make it easier for the viewers.
- The only issue with this strategy would be those players, who have completed the milestone playing for several teams. But this seems to not be an issue with, say, those players who competed for both West Germany and later Germany (like Lothar Matthäus), as their infobox statistics is taken together, and it is probably the case in other instances too (such as, the goals in the infobox are listed separately for various teams, but the infobox does provide a total sum).
- These are my thoughts, but obviously I am not as gifted in coding as to foresee all possible issues with taking this route. BasicWriting (talk) 19:06, 2 February 2026 (UTC)
- @BasicWriting I am still exploring other ways to correctly identify which | nationalteam# = we can correctly look at. Is it true that most of these Country names are almost always linked to “COUNTRY national football team”? Vanderwaalforces (talk) 19:50, 2 February 2026 (UTC)
- I would think so (in case of the international lists, anyway). BasicWriting (talk) 22:33, 2 February 2026 (UTC)
- @BasicWriting For this list and this one too, take anybody listed there as an example, what value in the infobox should be linked to these lists? Please do the same for the last 4 lists you mentioned. Vanderwaalforces (talk) 22:54, 3 February 2026 (UTC)
- Of the lists you've mentioned: for the first one,
| totalcaps =, for the latter,| totalgoals =, at least that is my base understanding. In the latter case, the link would look better if it were not taking the parentheses the goals are shown in, but I think that does indeed happen automatically already. Not every player does have the sum listed in the infobox, but that should not be not an issue, I imagine. (Or we might write a different bot that first sums the infobox statistics, but that is a different task altogether.) - For the list you've mentioned earlier, it would indeed show a country name after
| nationalteam# =, but some of those might be names of non-existing countries. It might be practical to make it navigate to the largest number among the national teams. - In case of women footballers, much of the same holds with the further caveat that their infobox statistics seem to be oftentimes lacking, so the bot won't link many instances.
- In any case, I would, as a precaution, make the bot first look at the number it is linking, and check whether it is truly equal or greater than 1000, 500, 300, 100 etc. in those respective situations. And, again, not apply any of the above in the case there already is a link present in the infobox field, which is going to be the case mostly with
| nationalgoals# =. - In some ways, I have changed my mind and I don't think it would be a good idea to work on the club statistics at all and just keep our focus on the international statistics. The reason for that is that the club statistics do indeed follow various methodologies and the numbers seem way more inconsistent within the lists and the infoboxes. Possibly, they could know more about that in the respective portal.
- Secondly, and I am even less certain of it, given that some of the international goals fields in the infobox already link to player-specific lists, providing a link in the cases they don't to the general list could possibly border with WP:EASTEREGG, among some interpretations of that rule.
- The cases where, I think, we should definitely go forward without any qualms are thus List of men's footballers with 100 or more international caps and List of women's footballers with 100 or more international caps, even though those might be the hardest to program.
- Thank you a lot for your help! BasicWriting (talk) 11:12, 4 February 2026 (UTC)
- @BasicWriting As a matter of fact, those are the only two lists I think we can correctly work on with a bot, because I looked at List of men's footballers with 1,000 or more official appearances and found Tommy Hutchison, for example: in the list table it says "1,178+", but in his article we only have "983". The other lists cannot work because they will very much require human editorial judgment, which would not be appropriate for a bot. Same thing applies to the 300-500 or more goals lists.
- I think we can also work on/with the XX or more international goals since they usually accompany the international caps lists. But just to be clear, did you say you do not want the international goals to be linked if the international caps is linked? Vanderwaalforces (talk) 12:38, 4 February 2026 (UTC)
- No, that's not what I said. What I was saying in the original request was that the international goals themselves already link to player-specific lists, and changing those should be avoided (so the bot should just check for numbers above a certain threshold that are not already part of a link). BasicWriting (talk) 13:32, 4 February 2026 (UTC)
- Ah, got it! I also want to say that, the figures in some of the articles are different from the ones in the list. Take Eseosa Aigbogun for example and the List of women's footballers with 100 or more international caps. Vanderwaalforces (talk) 13:59, 4 February 2026 (UTC)
- It does seem that her particular case might be one of a missing update. But this is again why I said the bot ought to first check the actual number in the infobox. If it is above 100 (in the case of the two lists we ought to go through with), and it is a player whose article is being linked by the list, there is a high probability they did appear more than a hundred times and thus the number ought to be linked through. The men's list does mention some of the cases where the number differs and this is due to whether a country's national football team is a member of FIFA or not and different approaches in counting the appearances against them. Something like that might be the case here as well. So, to conclude, in cases like hers, the bot should not link the article to the list. BasicWriting (talk) 15:13, 4 February 2026 (UTC)
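For the two caps lists, then, the per-article edit could be approximated as below. This is only a sketch of the if-then logic the thread converges on; the template name and guards are assumptions, not the code behind the diffs that follow:

import re
import mwparserfromhell

CAPS_LIST = "List of men's footballers with 100 or more international caps"

def link_caps_fields(article_text: str) -> str:
    """Link bare numeric |nationalcaps#= values of 100+ to the caps list,
    leaving values that already contain a wikilink untouched."""
    code = mwparserfromhell.parse(article_text)
    for tpl in code.filter_templates(
            matches=lambda t: t.name.matches("Infobox football biography")):
        for param in tpl.params:
            if not re.fullmatch(r"nationalcaps\d*", str(param.name).strip()):
                continue
            value = str(param.value).strip()
            # Guard 1: the number really is >= 100; Guard 2: not already linked
            if value.isdigit() and int(value) >= 100 and "[[" not in value:
                param.value = "[[%s|%s]]" % (CAPS_LIST, value)
    return str(code)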
Coding... Vanderwaalforces (talk) 16:02, 4 February 2026 (UTC)
- @BasicWriting Check these diffs, it worked like magic; Special:Diff/1336581922, Special:Diff/1336581949, Special:Diff/1336581978. Vanderwaalforces (talk) 16:28, 4 February 2026 (UTC)
- These are truly epic! Thank you for your work!
- The handballers might indeed be a similar case, but if we take a look at List of female handballers with 1000 or more international goals, I do notice some eventualities, given for example how Jasna Kolar-Merdan played for multiple national teams and her infobox doesn't provide a sum for the appearances. But that is fine if we forgo the cases where the sum is not shown. The male list, which, by the way, has to be moved to "List of men's handballers with 1000 or more international goals" (which I will go on to request myself), seems to have some fringe cases too, like that of Frank-Michael Wahl or Talant Dujshebaev, where the sums are missing (but these instances are few and far between and may be addressed manually). BasicWriting (talk) 22:15, 4 February 2026 (UTC)
- If this is okay, then I can go ahead and file a BRFA for this task. Vanderwaalforces (talk) 16:29, 4 February 2026 (UTC)
- I also observed that List of handballers with 1000 or more international goals might just be in the same situation. Vanderwaalforces (talk) 21:37, 4 February 2026 (UTC)
BRFA filed. Vanderwaalforces (talk) 02:35, 5 February 2026 (UTC)
- @BasicWriting I will not be working with the handballers (men's and women's), I think those are small number of entries altogether and can be manually worked on. Vanderwaalforces (talk) 08:08, 5 February 2026 (UTC)
- Dear Vanderwaalforces, I have finished working on the handballers, and it led me to another (perhaps less complicated) suggestion: we could also try to link all players in these lists to them by virtue of the "See also" section, where we would list the corresponding lists they're in (again, excluding those links already present). This way we could capture even the lists that, for the various reasons we've talked about above, are not applicable for the infoboxes. BasicWriting (talk) 11:37, 6 February 2026 (UTC)
- @BasicWriting That makes sense. I have added that functionality now. Vanderwaalforces (talk) 08:33, 7 February 2026 (UTC)
- Alas, this was Not done. Vanderwaalforces (talk) 14:00, 16 March 2026 (UTC)
- Indeed it wasn't! The consensus on the project page was not there. I do still think it was a good idea on the whole.
- The 'See also' section part would definitely not have suffered from WP:EASTEREGG, and could have covered all nine lists that I've originally mentioned. If you think that part is still worth it, you could file a slimmed-down version of our previous request! BasicWriting (talk) 17:30, 16 March 2026 (UTC)
UTM Bot Request
Hi! I have a bot idea, and since I don't have the knowledge of how to code it, I'll share my idea:
links = document.getElementsByTagName("a");
for (var link of links) {
    var href = link.getAttribute("href");
    if (!href) continue;
    // Resolve fragment-only and root-relative hrefs against the current page
    if (href[0] == "#") href = location.href + href;
    else if (href[0] == "/") href = location.protocol + "//" + location.host + href;
    var source = new URL(href).searchParams.get("utm_source");
    if (source !== null) {
        // Add page to Category "Pages with utm_source={source}"
    }
}
Iterate through all namespaces, then through all pages in that namespace. Please share any questions with me. Thanks! SeaDragon1 (talk) — Happy new year! 14:01, 27 January 2026 (UTC)
- @SeaDragon1 Does that category exist? What problem are we solving with this? Vanderwaalforces (talk) 12:55, 29 January 2026 (UTC)
- It makes sorting easier. SeaDragon1 (talk) — Happy new year! 14:20, 29 January 2026 (UTC)
- Besides, I'm pretty sure there are a lot of non-existent categories that pages still put themselves in. SeaDragon1 (talk) — Happy new year! 14:24, 29 January 2026 (UTC)
- I'm not entirely sure what you're sorting. I have a bot task that removes utm_source tags from URLs, is that what you're wanting? Primefac (talk) 20:15, 1 February 2026 (UTC)
- Well, I mean we can know WHICH pages have WHICH utm_source tags, so we don't have to scour every single page. SeaDragon1 (talk) 20:17, 1 February 2026 (UTC)
- Hello? SeaDragon1 (talk, contribs, happy birthday!) 17:24, 19 February 2026 (UTC)
- HELLO? SeaDragon1 (talk, contributions) 16:09, 20 February 2026 (UTC)
- @Primefac? SeaDragon1 (talk, contributions) 16:09, 20 February 2026 (UTC)
- I personally don’t understand this request tbh, it’s not well put together at least to me. Vanderwaalforces (talk) 17:44, 20 February 2026 (UTC)
- What I mean is:
- We can organize pages based on what UTM source the links in the article have.
- For example, this would be in Category:Pages with utm_source=chatgpt.com:
[https://example.com/page?utm_source=chatgpt.com] SeaDragon1 (talk, contributions) 18:43, 20 February 2026 (UTC)
- I personally don’t understand this request tbh, it’s not well put together at least to me. Vanderwaalforces (talk) 17:44, 20 February 2026 (UTC)
- @Primefac? SeaDragon1 (talk, contributions) 16:09, 20 February 2026 (UTC)
- Primefac, is there any worry of it removing a useful flag of an AI-generated reference? — Qwerfjkltalk 19:57, 20 February 2026 (UTC)
- I'm not particularly concerned about the content of the utm values, I suspect a large number have been removed but utm tracking doesn't necessarily mean the site itself is AI-generated. Primefac (talk) 23:11, 20 February 2026 (UTC)
- Primefac, no, but it could suggest that the source may not verify the text. — Qwerfjkltalk 10:50, 21 February 2026 (UTC)
- Do we have a consensus, or..? SeaDragon1 (talk, contributions) 14:33, 24 February 2026 (UTC)
- HELLO? SeaDragon1 (talk, contributions) 20:13, 12 March 2026 (UTC)
- Primefac? Qwerfjkl? SeaDragon1 (talk, contributions) 20:13, 12 March 2026 (UTC)
- SeaDragon1, I don't think your task, as you described it, is going to get consensus. — Qwerfjkltalk 21:08, 12 March 2026 (UTC)
- I meant, like... is it going to be WP:BRFA filed or not? That's the sort of consensus I'm talking about (let me guess, it's not going to be filed). SeaDragon1 (talk, contributions) 22:41, 12 March 2026 (UTC)
- @SeaDragon1 As a bot operator, I'm having trouble understanding exactly what your request is, and its purpose. I've seen no proof of consensus that this is something that the English Wikipedia editor community wants. For a bot operator to spend the time to write the bot code, test it, ensure that it is compliant with WP:BOTPOL and file a WP:BRFA, they would need to see evidence that the requested bot is something that the community wants. If there is consensus for this being something that is desired, please link it in this discussion. If no such consensus exists, the BRFA will fail regardless of how good of a bot there is.
- Additionally, we are all volunteers, and most importantly, there is no deadline. Please keep that in mind when requesting that the bot operator community write a bot and file a BRFA for you. phuzion (talk) 00:36, 14 March 2026 (UTC)
- ┌───────────────────────────┘
SeaDragon1, it will be filed if someone decides to file it, but I doubt that will happen. — Qwerfjkltalk 13:26, 13 March 2026 (UTC)
- Oh, okay. Sorry. SeaDragon1 (talk, contributions) 02:30, 15 March 2026 (UTC)
Useless non-free no reduce tags
[edit]{{Non-free no reduce}} is intended to keep a bot (I think it is DatBot (talk · contribs)) from downsizing large non-free images to no more than 100k square pixels. Therefore, it is redundant and pointless if an image with size ≤100k has this tag. I recently found an image File:Gangbusters title.png that had this tag with a size of 418×239 px (99902 px2). –LaundryPizza03 (dc̄) 21:16, 10 February 2026 (UTC)
- This page is for bot requests. Are you requesting a bot? SeaDragon1 (talk, contributions) 16:11, 20 February 2026 (UTC)
- Yes, because this is an issue that is trivial to handle and likely to recur in the future. –LaundryPizza03 (dc̄) 04:40, 21 February 2026 (UTC)
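Finding the redundant tags is indeed straightforward to script. A pywikibot sketch of the check, with the caveat that audio files would need separate handling since they have no pixel dimensions:

import pywikibot

site = pywikibot.Site("en", "wikipedia")
tag = pywikibot.Page(site, "Template:Non-free no reduce")

# List files carrying the tag whose area is already at or under the
# 100k px² threshold, i.e. where the tag does nothing
for page in tag.getReferences(only_template_inclusion=True, namespaces=[6]):
    info = pywikibot.FilePage(page).latest_file_info
    if info.width and info.height and info.width * info.height <= 100_000:
        print(page.title(), info.width, info.height)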
- So from a bot perspective, how often (generally speaking) does this sort of thing happen? Primefac (talk) 12:56, 21 February 2026 (UTC)
- Looks like there are currently 204 such images, 41 with the latest file revision in 2025 and one so far in 2026. There are also four audio files with the tag: File:J Dilla - Don't Cry.ogg, File:J Dilla - Time, The Donut of the Heart.ogg, File:Jane Remover - Dreamflasher.ogg, and File:Kanye West - Blood on the Leaves.ogg. I note 57 of the 204 are in Category:Sports uniforms, 54 of them last uploaded in 2024. Many appear to be templated images showing two variations of a uniform, in contrast to others (over 100,000 px²) that have three variations (e.g. File:ECM-Uniform-PHI.png vs File:ECA-Uniform-DET.png), which makes me suspect they started tagging all new template images with {{Non-free no reduce}}. Anomie⚔ 19:55, 21 February 2026 (UTC)
- @27JJ: This is you who uploaded the latest version of sports uniforms. –LaundryPizza03 (dc̄) 19:57, 1 April 2026 (UTC)
- I’ve been applying the tag consistently across template-based uploads to prevent unintended resizing. This was inherited from other contributors. Since it’s unnecessary for images under 100k px, I’ll adjust moving forwards. Feel free to clean up existing files where needed. 27JJ (talk) 20:32, 1 April 2026 (UTC)
archive.today cleanup
Cross-posting Wikipedia:Link rot/URL change requests § Migration away from archive.today, but as a non-standard URL replacement I think this would be more appropriate for a new bot task.
After an RFC, links to archive.today have been deprecated and should be removed as expeditiously as possible. The current instructions are at WP:ATODAY. It would be amazing if a bot could loop through all links to dot-today and:
- Get the page being archived (chopping off the https://archive.(fo|is|li|md|ph|today|vn)/<numbers>/ prefix from archive URLs, like https://archive.today/20120710094053/http://freespace.virgin.net/howard.anderson/loospreparations.htm)
- If the URL is available at archive.org, replace the dot-today link with a dot-org link. Dot-org has an API for fetching already-archived URLs, and it also returns the appropriate date for the |archive-date= (from {{cite xxx}}) or |date= (from {{webarchive}}) parameters.
|archive-date=(from {{cite xxx}}) or|date=(from {{webarchive}})) parameters. - If the page is not archived at archive.org:
- If the original URL is still live, try to archive it at archive.org (after a lot of digging, I found the API instructions for saving a page). If archiving works, replace the dot-today link with dot-org; otherwise, remove the WP:EARLYARCHIVE.
- If the original URL is dead, tag the link with {{New archival link needed|date={{subst:monthyear}}|bot=insert your bot's name here}}
Thanks a million, in advance. Best, HouseBlaster (talk • he/they) 21:50, 22 February 2026 (UTC)
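For the availability check in step 2, archive.org's public Wayback availability endpoint returns the closest capture and its timestamp. A sketch with error handling and rate limiting omitted; the save-page step would be separate:

import requests

WAYBACK_API = "https://archive.org/wayback/available"

def closest_wayback_snapshot(url, timestamp=None):
    """Return (snapshot_url, timestamp) for the closest archive.org
    capture of `url`, or None if nothing is archived there."""
    params = {"url": url}
    if timestamp:  # e.g. the 14-digit stamp chopped out of the dot-today URL
        params["timestamp"] = timestamp
    data = requests.get(WAYBACK_API, params=params, timeout=30).json()
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None

# closest_wayback_snapshot(
#     "http://freespace.virgin.net/howard.anderson/loospreparations.htm",
#     "20120710094053")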
- This is not an appropriate task for a bot. sapphaline (talk) 22:08, 22 February 2026 (UTC)
- I disagree; if we manage to run things like IAbot and WaybackMedic, which have to deal with the same hurdles, we can find a way to make it work. Best, HouseBlaster (talk • he/they) 22:11, 22 February 2026 (UTC)
- IABot and WaybackMedic archive live URLs; of course doing so is 1000 times easier than trying to archive dead URLs. sapphaline (talk) 22:14, 22 February 2026 (UTC)
- They... do in fact archive dead URLs? Unless you select "Add archives to all non-dead references (Optional)", IABot only adds archives to dead URLs. WaybackMedic task 3 and 4 do likewise. Best, HouseBlaster (talk • he/they) 22:23, 22 February 2026 (UTC)
- You have to verify that the content relevant to the citation is actually available at the live url and/or archive.org. This is not always true, and it is not possible to reliably do this by bot. A few days ago I found an instance where the URL was dead and the archive link was of a 404 page; fortunately an older archive snapshot actually had the content saved, but this required my abilities as a human to determine. Thryduulf (talk) 22:51, 22 February 2026 (UTC)
- It should be possible to automatically extract and compare content, and replace the archive.today link with the alternative link if the match percentage is very high, but this would certainly be more resource-intensive than a search-and-replace bot. Bit of brainstorming here: Wikipedia talk:Archive.today guidance#Bot for checking for identical text. Dreamyshade (talk) 23:26, 22 February 2026 (UTC)
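A crude version of that similarity gate, per the brainstorming linked above; the 0.9 threshold is an arbitrary placeholder, and real page-text extraction would be the hard part:

from difflib import SequenceMatcher

def texts_match(archive_today_text: str, candidate_text: str,
                threshold: float = 0.9) -> bool:
    """Only swap the link when the two extracted page texts are nearly
    identical; anything below the threshold goes to a human."""
    return SequenceMatcher(None, archive_today_text,
                           candidate_text).ratio() >= threshold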
- To build off of this, Wikipedia talk:Archive.today guidance#Most common links suggests that there are 600+ instances that can be fixed by a bot swapping from archive.today to the Internet Archive along with 1,000+ instances where a page changed urls and could be updated by changing the link with a bot. --Super Goku V (talk) 10:24, 23 February 2026 (UTC)
- I think a bot is premature. I'm feverishly trying to edit pages from my watchlist (that contain archive.today) in the hopes I can get them all done before "some bot" comes along and yanks them out from under me. I'm occasionally using the archived archive.today link (on a browser with uBlock Origin) to see the page. Sometimes it's worthless (like a 404 page copied) and I can just remove it from the Wikipedia article, but other times I've been able to use other URLs and title wording found on that archive.today page that have led to me successfully finding (a) actual live pages with newer URLs, and (b) archive.org pages which weren't available to me with only the information found in the Wikipedia article. To resolve about 10% (estim.) of the archive.today links, I've needed to view the archive.today copy. ▶ I am Grorp ◀ 23:14, 22 February 2026 (UTC)
- This proposed bot would not entirely remove any links. It would replace links to dot-org ones, and tag ones that can't be replaced. Best, HouseBlaster (talk • he/they) 23:29, 22 February 2026 (UTC)
- Okay. That's acceptable. ▶ I am Grorp ◀ 23:55, 22 February 2026 (UTC)
- A couple big points: We need a bot - the task is too large to do manually, and the RfC close is unambiguous that these links need to go. The bot is technically difficult - We have plenty of comments from people who know what they're talking about indicating that the full task is not possible to automate without many errors. I definitely think we want a bot that gets used in a staged approach. As a first step, peeling the archive off of links we believe are alive leaving the live link solves a large fraction of the problem. After that we can start working on the rest website by website to figure out if other archive sites are consistently good, consistently impossible, or messy, and handle the first 2 automatically while leaving the third group for a manual cleanup. Tazerdadog (talk) 00:09, 23 February 2026 (UTC)
- Remember that while removing archives from a live link may solve today's hot-button issue it is simultaneously creating a different problem for the future if the link is not archived somewhere else. Thryduulf (talk) 01:18, 23 February 2026 (UTC)
- Yeah, a run from IAbot over the affected articles after it stops linking to archive.today would probably be in order. That said, the archive.today issue is absolutely the urgent one. Tazerdadog (talk) 01:56, 23 February 2026 (UTC)
- I strongly disagree that we should wilfully create future problems without regard for how to solve them. The urgency of replacing archive.today is entirely artificial. Thryduulf (talk) 02:01, 23 February 2026 (UTC)
- There is consensus that it should be removed as quickly as possible because it is an unreliable source containing malware. That sounds pretty urgent to me. Best, HouseBlaster (talk • he/they) 02:21, 23 February 2026 (UTC)
- No, there was consensus to remove links "as soon as practicable" due to alleged verifiability issues. Even if there was consensus that the code was malware and/or that it required action (and there was no consensus on that matter alone) the "urgency" is entirely chosen by some of the participants in the RFC. A "solution" that creates new problems without regard to solving them is not a practicable one. Thryduulf (talk) 03:03, 23 February 2026 (UTC)
Many edits might create future problems. But we are worried about the present. I don't see why a solution which might cause isolated problems at some indeterminate point in the future is not practicable. "There is a strong consensus that Wikipedia should not direct its readers towards a website that hijacks users' computers to run a DDoS attack", and then linking to our guideline about malware, sounds like there is consensus that the link is malware. I haven't heard a single argument (other than proof by assertion) to rebut the arguments based on the literal definition of malware. To reiterate, we define malware as "any software intentionally designed to cause disruption to a computer, server, client, or computer network". It is software (obviously), it causes disruption to a computer network (that's what a DDOS is), and it was intentional. It is malware. Best, HouseBlaster (talk • he/they) 03:12, 23 February 2026 (UTC)
- Correct, and that's why removing links to an unreliable source is urgent. HouseBlaster (talk • he/they) 03:26, 23 February 2026 (UTC)
- Except removing the links without replacement isn't urgent for any reason other than the desire of a few self-selecting Wikipedians, and it isn't beneficial or desirable for any reason at all. Further, if you believe that any of the problems being discussed here is links to an unreliable source then you have grossly misunderstood the issues. The issue at hand is ensuring that WP:V will continue to be met in the future, long after the present moral panic has blown over and cooler heads once again prevail. This is an issue that we have no choice but to address, ideally before, but certainly no later than, the time we address the removal of the existing archive. Thryduulf (talk) 03:42, 23 February 2026 (UTC)
- Houseblaster responded to my same concerns above:
This proposed bot would not entirely remove any links. It would replace links to dot-org ones, and tag ones that can't be replaced.
▶ I am Grorp ◀ 03:44, 23 February 2026 (UTC)
- That's good, but it isn't part of this proposal by Tazerdadog: "As a first step, peeling the archive off of links we believe are alive leaving the live link solves a large fraction of the problem." Thryduulf (talk) 05:19, 23 February 2026 (UTC)
- @Thryduulf Having material sourced to a website that we know alters content (you can interact with the evidence of altered screenshots yourself - specifically, compare [2] with the archive.today website, if you need to) is a WP:V issue. We need to ensure WP:V is met - but just because I can verify the claim to a screenshot of a newspaper somebody put on reddit, alongside a "trust me bro, this is legit", doesn't mean it's guaranteed protection under the verification policy. In fact, it isn't - if the claim is related to living people, and I believe it to be contentious, then it's 100% not guaranteed and policy dictates that we remove or better source the material without delay. Now, whether a bot is the best way to do that, I'll leave to people who actually have experience with dealing with bots on Wikipedia. But to say something to the effect of "we have to keep a website we know has forged material around because they're the only way I can verify these sensitive BLP claims (one of the few things for which we actually require an inline citation)" is, quite frankly, ludicrous. GreenLipstickLesbian💌🧸 05:01, 23 February 2026 (UTC)
- Almost none of that is related to the actual reality of the situation, as opposed to unsubstantiated and unverified claims about what archive.today might be doing. When all the hyperbole and exaggeration is put to one side, ensuring WP:V is met tomorrow as well as today is just as important as removing archive.today links that have a tiny possibility of being inaccurate (in none of the cases has anyone demonstrated any changes that are relevant to any material being verified; I don't even recall any allegations of such, though there are so many accusations in so many parallel discussions that I may have missed some). Your final sentence also makes it clear you haven't actually read and/or understood much of what I'm saying. Thryduulf (talk) 05:17, 23 February 2026 (UTC)
- @Thryduulf, if you looked through the list at WP:NOTGOODSOURCE, and compared it to what we know about this collection of archives, do you think you'd be left feeling like it's a reliable archive?
- "It has a reputation for fact-checking and accuracy" – Um, nope. In fact, the owner appears to be going out of their way to ruin their reputation.
- "It is published by a reputable publishing house" – Heading determinedly into the "disreputable" territory, wouldn't you say?
- "It has a professional structure in place for deciding whether to publish something, such as editorial oversight or peer review processes" – Professionalism is a good word to describe what's missing here.
- Remember Lugnuts' parting claim that he'd deliberately falsified information in some articles, and how that has haunted some editors for years, even though there's no evidence that Lugnuts ever did that to any article? This situation is worse than that. We have proof that this website is doing that, and the risk is enough to turn most editors against it. We don't meet WP:V's requirements by having an unreliable archived copy of a now-dead website.
- Also, have you noticed the pattern? We say "at least they're not doing X" on one day, and the next day, they do X. We say, "well, they might be doing X, but at least they're not doing Y", and the next day, they do Y. So please think about WP:BEANS before you post any more comments about how they haven't transgressed some further bright line yet. It's already bad enough. WhatamIdoing (talk) 05:44, 23 February 2026 (UTC)
- WAID has, as expected, said this far more eloquently than I could. One additional point, though, @Thryduulf -- I've stuck to what archive.today has done and threatened to do, and taken those threats seriously. I have made more speculative posts, but overall, not that many. But I could, if you'd like? For example, let's take the statement we got from the WMF:
[3] We know that WMF intervention is a big deal, but we also have not ruled it out, given the seriousness of the security concern for people who click the links that appear across many wikis.
- Hopefully, now that we, as a "few self-selecting Wikipedians", have taken steps to limit the spread of links, the WMF won't need to take further action. But they might. Forcibly removing all links is something they 100% have the right - some may argue an obligation - to do. I'd rather that we, as a community, deal with the links first, making efforts to switch them to alternative archives or alternative sources when possible. On our own terms. Because the WMF has made it clear that they haven't ruled out acting, and I think we'll do a better job. GreenLipstickLesbian💌🧸 06:48, 23 February 2026 (UTC)
- I agree with your interpretation of the WMF's statement. However, it's not the only way this decision could be taken out of our hands. Global blacklisting was proposed today at m:Requests for comment/Deprecate archive.today. WhatamIdoing (talk) 06:59, 23 February 2026 (UTC)
- Stupid question - I know the enWiki blacklist (and the edit filter, in this particular case) ideally doesn't prevent us from editing pages with the blacklisted link, only from adding the link. (Which I've gotten around when requesting cv-revdels by spelling out "www.example.com" as "www dot example dot com".) I think that gets a bit messed up when it comes to reverts. Does anybody know if the global blacklist works similarly, or is the only difference the fact that it impacts all projects? Does it impact all projects, actually? I'm assuming it blocks all namespaces - I think our filter has exceptions for archive.is/archive.today links in project-space areas (which incidentally facilitates cleanup). Have I made a stupid mistake in that assumption? GreenLipstickLesbian💌🧸 08:05, 23 February 2026 (UTC)
- Special:AbuseFilter allows per-namespace options. MediaWiki:Spam-blacklist unfortunately does not. I believe the only thing that's different about the global blacklist is that ours only affects us, and the global one affects all the wikis. WhatamIdoing (talk) 08:19, 23 February 2026 (UTC)
- I am currently workshopping a related proposal at the village pump idea lab. tl;dr: GreenC bot/WaybackMedic and IABot can probably do some of the work on the second part, but it's somewhere between unlikely and unclear whether full automation is technically feasible and desirable. Adding the new cleanup tag specific to this problem (its name may be changed to Template:Deprecated archive per the VPI discussion) and removing problematic links should be the bare minimum, and both of those tasks can be automated. mdm.bla 01:13, 23 February 2026 (UTC)
- Also @HouseBlaster: Currently the tag does not have a |bot= parameter. Can that just be added into the template? mdm.bla 01:15, 23 February 2026 (UTC)
- It can. It (probably) wouldn't change the output; it would just alert editors viewing the page source code that the tag was placed by a bot. Best, HouseBlaster (talk • he/they) 01:17, 23 February 2026 (UTC)
Template adjusted. [new archival link needed] now displays properly with the bot parameter. mdm.bla 01:34, 23 February 2026 (UTC)
- I think you misunderstood the way it normally works: usually, the bot parameter has zero effect on the template. It just exists, and doesn't trip the unknown parameter check. Best, HouseBlaster (talk • he/they) 02:31, 23 February 2026 (UTC)
- @HouseBlaster: Thanks for the adjustments; I'm not a prolific template creator and overthought that whole thing wayyyyyy too much. mdm.bla 05:17, 23 February 2026 (UTC)
Three domain names that we can remove
As Dreamyshade mentioned, roughly 12,000 of these links go to three websites:
- nytimes.com
- newspapers.com
- washingtonpost.com
The first and third are newspapers that maintain their own archives (and so do major public libraries); the middle one is an archive itself. I suggest that unwanted archive links to all three of these could simply be removed by bot. It's only ~2% of the links, but since 100% of the archive links to these domain names are unnecessary for WP:V purposes, nothing more than a simple removal is needed, and we should get that 2% of the job done. WhatamIdoing (talk) 05:24, 23 February 2026 (UTC)
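For what it's worth, a minimal sketch of what that removal logic might look like, assuming pywikibot-style wikitext processing. The host list, the archive.today mirror list, and the regexes here are illustrative only; a production bot would parse citation templates with mwparserfromhell rather than raw regular expressions:
<syntaxhighlight lang="python">
import re

# Hosts whose live links are considered stable per the proposal above
# (illustrative list; extend as consensus dictates).
REDUNDANT_HOSTS = ("nytimes.com", "newspapers.com", "washingtonpost.com")

# Known archive.today mirror domains (assumption: possibly incomplete).
ARCHIVE_TODAY = re.compile(r"archive\.(?:today|is|ph|li|md|vn|fo)\b")

def strip_redundant_archive(template_text: str) -> str:
    """Given the wikitext of one citation template, drop the
    |archive-url=, |archive-date= and |url-status= parameters when the
    live |url= points at a redundant host and the archive link is an
    archive.today snapshot. The live link itself is left untouched."""
    url = re.search(r"\|\s*url\s*=\s*(\S+)", template_text)
    archive = re.search(r"\|\s*archive-url\s*=\s*(\S+)", template_text)
    if not (url and archive):
        return template_text
    if not ARCHIVE_TODAY.search(archive.group(1)):
        return template_text
    if not any(host in url.group(1) for host in REDUNDANT_HOSTS):
        return template_text
    # Remove only the archive parameters; everything else is preserved.
    return re.sub(r"\|\s*(?:archive-url|archive-date|url-status)\s*=[^|}]*",
                  "", template_text)
</syntaxhighlight>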
- If a bot removes archive-urls from articles in my watchlist purview, I'll revert them to restore the archived links. ▶ I am Grorp ◀ 05:43, 23 February 2026 (UTC)
- @Grorp, Newspapers.com is an archive. The source is the newspaper. Newspapers.com is an archive of the source. Archive.today is an archive of an archive of the source. Why do you need an archive of an archive of the source? WhatamIdoing (talk) 05:46, 23 February 2026 (UTC)
- You don't. This subsection and your post above were ambiguous. They seemed to suggest a bot should remove ALL |archive-urls of things like nytimes.com... not just remove archive.today links for nytimes. My bad if I read it as the wrong fork of the ambiguity. ▶ I am Grorp ◀ 05:51, 23 February 2026 (UTC)
- We are only talking about removing "unwanted archive links" (and in my case, only ones that are 100% unnecessary for WP:V purposes). That includes archive.today but also "archive dot lots of other things", since that same owner has multiple domain names. WhatamIdoing (talk) 06:08, 23 February 2026 (UTC)
- Who has conclusively decided what archive links are unnecessary/unwanted? There are multiple Wikipedia pages dedicated to combating link rot, many of which are even encouraging users to archive everything to help preserve verifiability. And evidently according to WP:PLRT the pervasive threat of link rot has become such a concern that all new links added to Wikipedia are automatically archived. This messaging from WP:DEADREF and WP:MDLI is very confusing. --skarz (talk) 18:12, 25 February 2026 (UTC)
- @Skarz, do we need someone to "conclusively" decide every single thing? Or do you think that editors could use their own best judgement to make decisions? Imagine someone saying "Gee, that's a common university textbook. I remember lugging it around in my backpack. Amazon still sells hardback copies. Okay, we probably don't need a link to an online 'archive' for that book". WhatamIdoing (talk) 20:58, 25 February 2026 (UTC)
- You completely sidestepped my legitimate question and gave me some bogus strawman argument. Nice. Reminder that your own words are that you are removing archives that are "100% unnecessary." Nevertheless, I am not aware of entire textbooks being archived on Wikipedia, nor do I support that endeavor. I am, however, aware that Wikipedia policy encourages webpage archival to prevent link rot, which in turn protects WP:V. Thanks, and have a great night. --skarz (talk) 01:09, 26 February 2026 (UTC)
- Well, unlike you, I have seen whole books get archive links spammed into the citation (and yes, I did complain to the bot op at the time), and I do think those are 100% unnecessary.
- Link rot is a potential problem for websites. Link rot is not a problem for dead-tree media. For example, every source archived in Newspapers.com is a scanned copy of a physical newspaper page. Those sources exist on paper. A URL that allows you to read a dead-tree source online is a Wikipedia:Convenience link. It is not a necessity. WhatamIdoing (talk) 03:01, 26 February 2026 (UTC)
- Here's an example of me removing an archive link for a whole text book. The book's available on Amazon for about $25, and Wikipedia:Reliable sources/Cost applies. WhatamIdoing (talk) 03:28, 26 February 2026 (UTC)
- We also had a formally closed, well-attended RFC that concluded all of the archive.today links need to be removed. We're trying to be courteous and do that in a way that minimizes future linkrot, but we do have to do it. In this case, the live links are expected to be stable for these domains. If you want to run behind the bot with an archive-fixing bot, great! But standing in the way because removing some links that absolutely need to be removed might cause some issues in the future is unpersuasive when we have the strongest consensus you can get on Wikipedia that these links, as they are, are causing major issues right now. Frankly, how/if/with what priority these links get rearchived is out of scope here - we have to remove these, and the discussion of how to get them rearchived can be separated out. Tazerdadog (talk) 21:47, 25 February 2026 (UTC)
- I am hardly "standing in the way" of anything; I was asking for clarification on which archive links are considered unnecessary per WP:PLRT. --skarz (talk) 01:18, 26 February 2026 (UTC)
- Some Newspapers.com archive links are necessary because the publication is no longer available there. For instance, WSWG extensively cites a newspaper Newspapers.com only had available for a few days. It is not even searchable now. Sammi Brie (she/her · t · c) 04:55, 24 February 2026 (UTC)
- So? We're not citing the newspapers.com archive. We're citing the newspaper itself. If you're citing The Mulberry Advance, and it happened to be available |via= the Newspapers.com archive, and now it's not available through that archive, then – who cares? It's still a valid printed-on-paper source, even if you can't see it anywhere online. WhatamIdoing (talk) 21:57, 25 February 2026 (UTC)
- Yes please! This is a good start - it's a meaningful chunk of sources, and it should help us dial in the process. Tazerdadog (talk) 06:02, 23 February 2026 (UTC)
Reference spam detector
I would like to see a bot that would detect likely Reference spam, and generate a confidence score internally, the way User:Cluebot NG does. Ideally, at high levels of confidence, perhaps it could just revert as Cluebot does, but in any case, it ought to generate a project page with a table or log of rated edits so that humans could review the results, comment, perhaps define a confidence threshold for auto-reverts, and of course, provide data for refining and tuning the algorithm.
I seem to be spending more and more time analyzing and reverting WP:REFSPAM, and a lot of it is very obvious and really should not need human intervention. If someone is a new editor, adds substantially the same citation to multiple articles with no added content (or brief, near-identical content), and has few or no edits outside one topic area (i.e., is a WP:SPA), the odds are very high they are a ref spammer. Mathglot (talk) 01:34, 24 February 2026 (UTC)
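To make those heuristics concrete, here is a minimal sketch of a first-pass confidence score built from the four signals just described. The feature names, thresholds, and weights are all placeholder assumptions to be tuned against human-assessed examples (or replaced with a trained classifier, which is roughly what Cluebot NG uses):
<syntaxhighlight lang="python">
from dataclasses import dataclass

@dataclass
class EditFeatures:
    account_age_days: int        # how new the editor is
    same_citation_pages: int     # pages where the editor added this same citation
    added_prose_chars: int       # net prose added alongside the citation
    topic_concentration: float   # 0..1 share of edits in one topic area (SPA signal)

def refspam_score(f: EditFeatures) -> float:
    """Combine the four signals above into a 0..1 confidence score.
    Weights and cutoffs are illustrative placeholders only."""
    score = 0.0
    if f.account_age_days < 30:
        score += 0.25
    if f.same_citation_pages >= 3:
        score += 0.35
    if f.added_prose_chars < 200:
        score += 0.15
    score += 0.25 * f.topic_concentration
    return min(score, 1.0)
</syntaxhighlight>
An edit scoring above a community-agreed threshold would go into the review table; only far higher scores would ever qualify for an auto-revert.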
- @Mathglot: this is programmatically possible, a little difficult and time-consuming — but possible. I have a few questions, would it be okay if we continue the discussion on your or my talkpage? —usernamekiran (talk) 08:37, 23 March 2026 (UTC)
- usernamekiran, sure, let's move it, but let's find a more public Wikipedia page or WikiProject page where we have a prayer of attracting other interested comment. Perhaps Wikipedia talk:Spam (540 watchers, 56 pageviews/mo.) or Wikipedia talk:WikiProject Spam (1,218 / 1,926), or a subpage of one of them to centralize possibly extended commentary? My knowledge in this field is ancient now, but I wonder if we compiled a grab-bag of possible features (to add to the four I listed), generated a test set of a few hundred human-assessed spam evaluations, and threw a machine learning bot at it with the feature set, whether that might generate a usable model, at least as proof of concept. Likely with all the advances in AI, some of that can be streamlined, maybe even the assessments? That would be a win. Mathglot (talk) 09:19, 23 March 2026 (UTC)
- That's even better, I should have thought of that. I have a primary workflow in my mind, only for creating the report(s). In the early phase, the bot should rely on heuristics instead of machine learning. Once we create a good confidence scoring mechanism, we can move to the next phase of reverting the edits. But during the first phase, we will need input from other users on the reports — to cross-verify the suspected spam links. In a few hours, I will copy-paste this conversation and a detailed workflow to Wikipedia talk:WikiProject Spam, and notify a few relevant venues of the discussion. Once we create a workflow/logic, I can start on concrete programming. During the discussion, I will create code for detecting the URLs being inserted, associating them with users/articles, and other basic necessary stuff. —usernamekiran (talk) 23:28, 23 March 2026 (UTC)
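That first detection step could be as small as the following sketch, assuming two revision texts are already in hand; the regex is an illustrative approximation of an external-link matcher:
<syntaxhighlight lang="python">
import re

URL_RE = re.compile(r"https?://[^\s|\]}<]+")

def added_urls(old_wikitext: str, new_wikitext: str) -> set[str]:
    """External links present in the new revision but not the old one:
    the raw signal the first-phase report would aggregate per editor
    and per domain before any scoring happens."""
    return set(URL_RE.findall(new_wikitext)) - set(URL_RE.findall(old_wikitext))
</syntaxhighlight>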
- moved. —usernamekiran (talk) 02:28, 24 March 2026 (UTC)
Coding... a base/skeleton code has been created, but given the complexity, it would take at least a month for this to go fully operational in the bot's userspace and to collect enough data. It would take at least another month or two after that to go live in mainspace/BRFA. —usernamekiran (talk) 08:17, 24 March 2026 (UTC)
Fix disambiguate links after page move
After a recent page move, several pages link to a disambiguation page. Requesting a link change from Tennis performance timeline comparison to Tennis performance timeline comparison (women) (1978–present) for the pages listed and sorted here. Note: this is my first time performing a request of this nature, which may not be properly worded. 8rz (talk) 11:40, 26 February 2026 (UTC)
Done with AutoWikiBrowser and JWB. phuzion (talk) 14:19, 26 February 2026 (UTC)
- Many thanks. 8rz (talk) 00:09, 27 February 2026 (UTC)
Bloxx website creation Bot
[edit]I am building a website builder and SEO optimization tool, users by default customers bring in their Socials across yelp, google reviews, instragram et cetera by way of google places API.
One thing that also signals trust is a Wikipedia page with the founding date and basics of the company. For established businesses with important context, I'd like to automatically create the Wikipedia pages via the API with the important details on the company, to signal trust. Jamespentalow (talk) 00:13, 24 March 2026 (UTC)
- Hi @Jamespentalow, Wikipedia is not intended as a directory or something to signal trust, but an encyclopedia. As an encyclopedia, there are criteria for inclusion. The criteria for inclusion for companies are at WP:NCORP. This is a generally higher bar, and thus it's harder to get an article on a company than it is to get one on most other topics. We require high-quality, independent, and reliable sources for articles, and right now there is no AI or other automated software that can gather these, let alone write an article. If you think the company meets WP:NCORP, you're free to take a stab at writing an article about it yourself and going through the Articles for creation process. However, the task is Impossible for a bot. HurricaneZetaC 01:31, 24 March 2026 (UTC)
Fill "Men's association football players not categorized by position"
Hello people. I would like to request a bot to fill the maintenance category Category:Men's association football players not categorized by position. Previously this task was performed monthly by User:RonBot. RonBot was operated by User:Ronhjones, who died in 2019. See Wikipedia:Bots/Requests for approval/RonBot 7 and User:RonBot/7/Source1. Robby.is.on (talk) 01:11, 30 March 2026 (UTC)
Coding... Tenshi! (Talk page) 00:49, 3 April 2026 (UTC)
- I see a slight issue here. Given the original category used has been changed in scope to only men's association football players and a second error category was created for women, how would a bot differentiate between men and women association football players? Tenshi! (Talk page) 21:50, 3 April 2026 (UTC)
- I’d say IF the player article has “men's” in the name of any one of its categories, THEN it is a male player. IF the player article has “women's” in the name of any one of its categories, THEN it is a female player. Sounds bizarre right? Vanderwaalforces (talk) 06:47, 4 April 2026 (UTC)
- That hadn't occurred to me, though I should probably know better. Tenshi! (Talk page) 13:09, 4 April 2026 (UTC)
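A sketch of that check, assuming pywikibot; note that "men's" is a substring of "women's", so the match needs a guard, and articles matching neither or both keywords are returned as None for human review:
<syntaxhighlight lang="python">
import pywikibot

def player_gender_from_categories(page: pywikibot.Page) -> str | None:
    """Guess a footballer's competition gender from category names, per
    the IF/THEN rule suggested above."""
    names = [cat.title().lower() for cat in page.categories()]
    # "men's" is a substring of "women's", hence the explicit exclusion.
    has_men = any("men's" in n and "women's" not in n for n in names)
    has_women = any("women's" in n for n in names)
    if has_men and not has_women:
        return "men"
    if has_women and not has_men:
        return "women"
    return None   # ambiguous or no signal: list for human review
</syntaxhighlight>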
- I’d say IF the player article has “men's” in the name of any one of its categories, THEN it is a male player. IF the player article has “women's” in the name of any one of its categories, THEN it is a female player. Sounds bizarre right? Vanderwaalforces (talk) 06:47, 4 April 2026 (UTC)
BRFA filed Tenshi! (Talk page) 18:50, 6 April 2026 (UTC)
Bot request: Automated citation metadata verification against external APIs
I'd like to propose a bot that automatically verifies citation metadata in Wikipedia articles by cross-referencing structured fields in citation templates against external APIs. Specifically:
- DOIs resolved via the CrossRef API, comparing the returned title and authors against the values in the {{cite journal}} or {{cite book}} template
- PMIDs looked up via the NCBI PubMed E-utilities API, with the same comparison
- ISBNs checked against Open Library or WorldCat
The bot would flag cases where:
- A DOI or PMID resolves to a different paper than described in the citation
- A DOI or PMID does not resolve at all
- There is a significant mismatch between the template metadata and the API-returned metadata (e.g. different title, different authors)
It would not assess whether the source supports the claim it is cited for — only whether the citation is internally consistent and points to a real, correctly identified document.
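For the DOI case, the core check could be as small as the following sketch, using the public CrossRef REST API; the crude normalization stands in for whatever fuzzy title matching the bot would actually use, and author comparison is omitted for brevity:
<syntaxhighlight lang="python">
import requests

def crossref_title_matches(doi: str, cited_title: str) -> bool | None:
    """Resolve a DOI via CrossRef and loosely compare the returned title
    with the citation template's |title=. Returns None when the DOI does
    not resolve or has no title on record (flag for human review)."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    if resp.status_code == 404:
        return None                     # DOI does not resolve at all
    resp.raise_for_status()
    titles = resp.json()["message"].get("title", [])
    if not titles:
        return None
    normalize = lambda s: "".join(c for c in s.lower() if c.isalnum())
    return normalize(titles[0]) == normalize(cited_title)
</syntaxhighlight>
PMIDs would go through the same shape of check against the PubMed E-utilities, with mismatches feeding a report or an inline tag rather than any automatic "correction".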
This would be useful for several reasons. First, it addresses growing concerns about LLM-generated articles introducing incorrect citation metadata (wrong DOIs, fabricated PMIDs, misattributed authors). Second, it would improve citation quality across all articles, regardless of how they were written — human editors also introduce typos, outdated DOIs, and copy-paste errors in citation fields. Third, all required APIs are free and public, making implementation straightforward.
The output could be formatted as a report similar to existing cleanup listings, or the bot could add maintenance tags like {{failed verification}} to individual citations where mismatches are detected.
I encountered this issue firsthand while working on Draft:Age and mathematical productivity, where several citation template fields (PMIDs, DOIs) contained minor errors that pointed to different papers in the same journals. These were caught manually, but an automated tool would have flagged them instantly and could do the same across all of Wikipedia's approximately 230 million citations.
Would there be interest in developing or supporting a bot like this? ~2026-20189-58 (talk) 12:27, 2 April 2026 (UTC)
- I don't know that there are "growing concerns about LLM-generated articles introducing incorrect citation metadata". Rather, there are concerns (not growing, but already in full bloom) about LLM-generated material making its way into Wikipedia, in prose as well as hallucinated sources.
- The proposed bot would instead hide evidence of LLM generation. In the case of DOIs and PMIDs, these are typically copied and pasted into the template by the editor, so there would be no typographical errors.
- I would prefer such a bot not correct the DOI, PMID, or ISBN, but flag them with an appropriate superscript label like possibly hallucinated or something more concise. And this should happen in mainspace and draftspace. ~Anachronist (who / me) (talk) 14:20, 2 April 2026 (UTC)
- Good idea. The important part is that suspicious sources are properly flagged to make review easier. ~2026-20189-58 (talk) 14:29, 2 April 2026 (UTC)
- This is something I could do, but it would need to be discussed somewhere first as I wonder how much traction it would gather. Vanderwaalforces (talk) 09:10, 3 April 2026 (UTC)
I would like to see a tool that can legitimately verify the ISBN metadata vs. the citation metadata. However, there is no public API for that. Often the APIs conflate work vs. edition, US vs. UK edition, hardcover vs. paperback, etc. Untangling these messes is not a simple API query. The more you lean on APIs, the more of a mess it creates: it copies mistakes from the API datasets into Wikipedia. CitationBot has been doing this for years, tagging citations with whatever ISBN it can find. Then you have a split-brain problem: the ISBN points to one edition, the metadata points to a different edition. -- GreenC 20:53, 12 April 2026 (UTC)
Change all instances of Hathi Trust to HathiTrust
Per https://www.hathitrust.org/about/, the name of the organization is HathiTrust, without a space. There appear to be about 2,000 occurrences in mainspace, which is a bit much for AWB. (HathiTrust is a common reference source for books.) Naraht (talk) 14:36, 12 April 2026 (UTC)
- Is this really worth it? I mean, it’s not like it’s a typo or misspelling… Vanderwaalforces (talk) 18:35, 12 April 2026 (UTC)
- Why is 2,000 too much for AWB? It's actually a perfect amount. You can go through the first 200 or so slowly to make sure there are no weird edge cases, and once you are comfortable with the data just smash the key through the rest. Bot ops have to do the same thing, carefully checking a couple hundred edits for mistakes. They need to make requests for approval, etc.; it takes weeks to get approval, and someone might raise an objection ("why bother"). With AWB you can get it done in a few hours, and you don't need to ask permission. -- GreenC 20:36, 12 April 2026 (UTC)
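The substitution itself is the easy part. A sketch of the rule, with a deliberately crude guard against rewriting direct quotations or cited titles that genuinely use the spaced form; the guard heuristic is an assumption, and edge cases are exactly what the slow first pass through AWB is for:
<syntaxhighlight lang="python">
import re

SPACED = re.compile(r"Hathi Trust")

def fix_text(wikitext: str) -> str:
    """Normalize 'Hathi Trust' to 'HathiTrust', skipping any line that
    contains quote marks, where the original spelling may be deliberate
    and should be left for a human to judge."""
    fixed = []
    for line in wikitext.splitlines(keepends=True):
        if '"' in line or "“" in line:
            fixed.append(line)            # possible quotation: leave as-is
        else:
            fixed.append(SPACED.sub("HathiTrust", line))
    return "".join(fixed)
</syntaxhighlight>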
- As a rule of thumb, about 500+ (or maybe even 250+) is when I start thinking about a bot. 2000 is a lot to do with AWB in manual mode. No comment on the task itself. –Novem Linguae (talk) 20:55, 12 April 2026 (UTC)
- I am not even worried about 2k for an amount; I am more concerned about whether or not it is worth it to do the task because of a space in-between the two words and not a typo/misspelling. Vanderwaalforces (talk) 23:10, 12 April 2026 (UTC)
- Are we talking about this appearing in citations? And is it just going to keep on appearing that way in citations? Valereee (talk) 13:50, 18 April 2026 (UTC)
- Greenc "Smash the key...". I've run into complaints for speed of edits for AWB even for simple edits like this. "can't possibly be looking at the edits" and "Only Bots should edit that fast". Is there any official guidance on this?Naraht (talk) 15:17, 15 April 2026 (UTC)
- GreenC fix notification. Naraht (talk) 18:32, 15 April 2026 (UTC)
- WP:MEATBOT gives the guidance, which is admittedly somewhat vague but basically boils down to "feel free to edit quickly but if someone complains you should stop and/or file a BRFA". Primefac (talk) 20:39, 17 April 2026 (UTC)
Did it over the last few days with AWB. Request withdrawn. Naraht (talk) 04:20, 21 April 2026 (UTC)
Arbitration enforcement editnotices
I'm seeking comment on an idea for a bot to automatically create editnotices (including templates like Template:Contentious topics/Arab-Israeli editnotice, which appear when pages are edited) for pages that are protected as an arbitration enforcement action (both pages protected in the future and those currently protected). User:ClerkBot already automatically categorises such protections, over at Wikipedia:Arbitration enforcement log/Protections, and it seems tedious to manually create editnotices for each page, which I believe is the status quo.
I'm not sure if there would be a more efficient way to implement such an idea, such as using a module(?), but if a bot is the right idea, I'd be willing to write it and carry it through consensus-seeking, BRFA, and operation. A couple of questions here:
- Is there a more efficient way to implement the idea?
- The bot would need to create many pages, in the format of this manual edit, meaning WP:MASSCREATE would apply. Is there any technical barrier here, or is it simply the consensus of the community?
- The bot would need pagemover or template editor permissions to override the title blacklist. Would this be a technical issue?
- On the non-technical side: does the community even want this bot...is it a good idea? Sysops, template editors, and pagemovers, would you find a task like this useful? (If this thread is encouraging, my plan is to go to WP:VPPROP for consensus for the task.)
Thanks for your input. Best, Staraction (talk · contribs) 13:47, 14 April 2026 (UTC)
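For scale, the core of the task is small. A sketch, assuming pywikibot, the Template:Editnotices/Page/<full page name> naming scheme, and that the correct contentious-topic editnotice template is known from the protection log entry (the example template name is the one mentioned above):
<syntaxhighlight lang="python">
import pywikibot

site = pywikibot.Site("en", "wikipedia")

def ensure_editnotice(protected_title: str, notice_template: str) -> None:
    """Create the per-page editnotice for an AE-protected page if it
    does not already exist; never overwrite a hand-made notice."""
    notice = pywikibot.Page(site,
                            f"Template:Editnotices/Page/{protected_title}")
    if notice.exists():
        return
    notice.text = "{{%s}}" % notice_template
    notice.save(summary="Bot: creating editnotice for a page protected "
                        "as an arbitration enforcement action")

# e.g. for a page logged under the Arab-Israeli contentious topic:
# ensure_editnotice("Example article",
#                   "Contentious topics/Arab-Israeli editnotice")
</syntaxhighlight>
The bot account would still need pagemover or template editor rights for the title-blacklist override, as noted above.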
- While this is not an argument for opposing a bot, I would assume many such protection actions are done by Twinkle (I know not all admins prefer it). It might be possible to include this automation in Twinkle itself.
- @Staraction, regarding some of your questions, neither WP:MASSCREATE nor granting pagemover (which I think is technically enough and preferred for such a task) would be any problem, provided consensus for doing the task in the first place is there.
- Notified: Wikipedia talk:TW. ~~~~~/Bunnypranav:<ping> 15:52, 14 April 2026 (UTC)
- Wikipedia:Editnotice § Creating editnotices lists the locations for namespace-specific edit notices. For instance, the one for mainspace is located at Template:Editnotices/Namespace/Main. Its implementation invokes Module:Mainspace editnotice, which could be modified to check for some suitable indicator (if one exists) that a page has been protected for a specific purpose. isaacl (talk) 22:14, 14 April 2026 (UTC)
- I believe this is how Template:BLP editnotice works, where the indicator is "either Category:Living people or Category:Possibly living people". I don't think there's a similar category for the different protection reasons, and the {{ARBPIA}} template might not always be present on the article's talk page. Is there another indicator that might be relevant here, other than the category? Staraction (talk · contribs) 07:33, 15 April 2026 (UTC)
- I'd love to see anything that can manage the tedious clerical tasks at AE. Valereee (talk) 12:51, 18 April 2026 (UTC)
- Could the existing logger bot add these, maybe? ScottishFinnishRadish (talk) 13:33, 18 April 2026 (UTC)
- @ScottishFinnishRadish: I think that'd certainly be easiest, although it would be up to @L235 (courtesy ping!) of course. Staraction (talk · contribs) 18:08, 18 April 2026 (UTC)
It's not uncommon for people to forget to notify the people they are complaining about. Could we have a bot detect if a username is inserted as a parameter in the {{userlinks}} template and the user hasn't already had an ANI notice about that thread? This could make sending notices simpler; instead of opening a thread and then going around to talk pages, you just use the template and the bot notifies people for you. This would save time for the OP and for people who have to notify when the OP forgets. QwertyForest (talk) 14:56, 15 April 2026 (UTC)
- I'm not sure this is a good task. First, people are supposed to notify users that have ANI threads made about them, which means that very often, if not in most cases, other users who are mentioned are incidental to what's being discussed.
- E.g. User:Bob keeps vandalizing the subpages of User:Example, see Special:Contributions/Bob.
- User:Example here doesn't need to be notified.
- Headbomb {t · c · p · b} 15:15, 15 April 2026 (UTC)
- When a username is put into the {{userlinks}} template, the user is probably not incidental. Even if a bot was going to notify everyone mentioned whether or not the template was used, User:Example may still appreciate an invitation to the discussion about a user they have been affected by. However, if you wanted to limit messages to the people the thread is about, having the bot only do it with {{userlinks}} usernames should work. QwertyForest (talk) 17:14, 15 April 2026 (UTC)
- How exactly is the bot supposed to tell if someone has gotten an ANI notice? Template:ANI-notice is the most widely used method of notifying, but it isn't required to use that specific template. 45dogs (they/them) (talk page) (contributions) 17:36, 15 April 2026 (UTC)
- I think you've found a problem. I was expecting it to work with the normal template, but custom messages would confuse it. Do you see a way around that? QwertyForest (talk) 18:41, 15 April 2026 (UTC)
- I can't think of anything foolproof. Having the bot look for links to ANI might be sufficient, but it wouldn't be perfect: "notice" is a vague enough term that a message like "hey, you are in a report at ANI", which doesn't use any links, would probably be sufficient to meet the threshold. 45dogs (they/them) (talk page) (contributions) 19:52, 15 April 2026 (UTC)
- I think there's a benefit to human notification: filling out the reason for it. That said, the bot could look at mentions on a delay and check the user's talk page to see if a link is present. The bot would only notify if the target page has not been linked at the user's talk page. Dw31415 (talk) 19:57, 15 April 2026 (UTC)
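A sketch of that delayed check, assuming pywikibot; matching on the thread title in the talk page text is a deliberately crude stand-in for real link parsing, and the {{userlinks}} extraction is a plain regex:
<syntaxhighlight lang="python">
import re
import pywikibot

USERLINKS = re.compile(r"\{\{\s*userlinks\s*\|\s*([^|}]+?)\s*\}\}", re.I)

def users_needing_notice(section_wikitext: str, thread_title: str,
                         site: pywikibot.Site) -> list[str]:
    """Find {{userlinks}} targets in an ANI section and return those
    whose talk page does not yet mention the thread. A mention is the
    only workable signal, since custom notices need not use
    {{ANI-notice}} or any particular wording."""
    pending = []
    for name in set(USERLINKS.findall(section_wikitext)):
        talk = pywikibot.Page(site, f"User talk:{name}")
        text = talk.text if talk.exists() else ""
        if thread_title not in text:
            pending.append(name)
    return pending
</syntaxhighlight>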
Remove background color from List of wars involving the Kingdom of France
Many of the results are simply not clear-cut enough to be assigned a color. This article should have all background colors removed from the outcome column, as on List of wars involving Spain. Otherwise there will always be disagreement on the supposedly "straightforward" outcome of a war. Not to mention the vandalizing that no-name accounts like to do on these types of articles (plus the edit war that's literally going on right now). Bubba6t3411 (talk) 17:02, 15 April 2026 (UTC)
- Does this involve a bot? Dw31415 (talk) 17:52, 15 April 2026 (UTC)
- This is not the correct venue for this request. The article's talk page should be used to achieve consensus. If consensus to remove the backgrounds is achieved, a bot is not needed; a simple find and replace of the background styling should be easy for anyone with a little text-editor experience. – Jonesey95 (talk) 19:16, 15 April 2026 (UTC)
- The talk page for List of wars involving the Kingdom of France is the right venue. See WP:RFCBEFORE for a list of dispute resolution methods. Dw31415 (talk) 11:10, 16 April 2026 (UTC)