Wikipedia:Merge for now

From Wikipedia, the free encyclopedia

This is a proposal under construction. It is in Essay space so that multiple editors can collaborate on it more easily. If you happen to pass by and decide you also want to help, please leave a message at User talk:Marcocapelle.

Update: On 21 October 2023, WP:SMALLCAT was deprecated. All proposals which assume a continued existence or application of SMALLCAT are thus out of date. As of 26 November 2023, WP:MFN still hasn't been approved. So, currently there is no direct replacement of SMALLCAT. Discussions are ongoing on whether WP:MFN should act as a sort of replacement of SMALLCAT, but it seems clear MFN should be further developed first before being approved as a new categorisation guideline.

List of editors of this page


History


Introduction


Wikipedia:Arbitration/Requests/Case/SmallCat dispute/Proposed decision#SmallCat (Finding of Fact no. #1), 26 August 2023:

  • Arbitration Committee (11/11): (...) SmallCat has been part of a guideline since 2006. Originally named "No potential for growth", it was changed after editors were using it to delete categories based purely on numbers. There has been an ongoing desire, never reaching consensus, to apply a strict numerical threshold for SmallCat (jc37 evidence). Use of such numerical thresholds, even if phrased as a "rule of thumb" or similar such phrase, in CFDs is therefore not supported by the guideline. However, reasonable editors can reach differing conclusions about other elements of the guideline, including the potential for growth and whether categories are part of a large overall accepted sub-categorization scheme.

Grown practice


The practice of nominating categories "per WP:SMALLCAT" without (explicitly) discussing the potential for growth, focusing instead on the currently available articles, has existed for many years (see below with examples from 2020). This created an unintentional case law in which categories with fewer than five articles were understood to be small.

Guideline vs grown practice


The WP:OCAT guideline in general, and WP:SMALLCAT in particular, focus on completely and permanently undesirable categories and therefore have very strict thresholds.

The practice mentioned above is not based on WP:SMALLCAT, but until very recently we always imagined that it was sufficiently in the spirit of WP:SMALLCAT because the categories were too small to aid navigation. Now we have come to realize that the aforementioned practice differs from WP:SMALLCAT in two very important respects:

  1. It is not about completely and permanently undesirable categories. As nominations regularly state, there is no objection to recreating the nominated category at a later stage.
  2. It is generally not about deletion, but about merger to parent categories, i.e. without loss of data. So the effect is far less harmful.

In other words, on the fly, a new and very different criterion was developed at CfD without realizing it was new and different.

Now, there is no objection to developing new criteria per se. The unfortunate thing here is that seemingly nobody was aware that it was happening.

The problems of WP:SMALLCAT as formulated


Potential for growth


"Realistic potential for growth" in WP:SMALLCAT is very subjective. The guideline itself currently gives three examples:

1. Husbands of Elizabeth Taylor

This is probably the only example everyone agrees on. As Elizabeth Taylor died in 2011, she will never be able to marry additional husbands to the 4 she had while alive. While per WP:CRYSTAL we can't exclude the possibility that one day she will return from the dead and marry one more husband to meet a hypothetical threshold of 5 items, this is not a very realistic potential.

2. Catalan-speaking countries

Due to renamings of the Category:Countries and territories by official language tree in 2023, it's now theoretically possible to create a Category:Countries and territories where Catalan is an official language, and populate it with 5 items (Andorra, Spain, Balearic Islands, Catalonia, Valencian Community according to List of official languages by country and territory). Moreover, it's always possible that more countries or territories will recognise the Catalan language as an official language. No Wikipedian has a WP:CRYSTALBALL, so who is to say this is not "realistic"? The opinions of 4 Wikipedians in 2006 are not an eternal truth.

3. Schools in Elmira, New York

This example is very questionable. Why would Category:Schools in Elmira, New York have no realistic potential for growth? The place can grow in size, and/or it can acquire a concentration of schools for some reason or another. So how should we objectively determine whether there is realistic potential for growth, and how much growth/how fast growth is required?

The edit history of WP:SMALLCAT also shows frequent disagreements over good examples of categories which have no "realistic potential for growth", which were added, but later removed.

Large overall accepted sub-categorization scheme


The phrase a large overall accepted sub-categorization scheme is part of the WP:SMALLCAT guideline, and has been since it was first developed in December 2006. But it is unclear what it means, why and how this specific wording was developed, and what good examples are. More importantly, there is widespread disagreement on whether it really works to prevent the deletion or merger of certain important/helpful categories, or whether it can be employed as a pretext to oppose the deletion or merger of any category whatsoever, no matter how unimportant/unhelpful that category is. As a result, there is no consensus on how to interpret it, and whether it should stay in the WP:SMALLCAT guideline as it is, should be amended to be clear and work as intended, or be removed from the WP:SMALLCAT guideline for serving no apparent useful purpose. Update: On 21 October 2023, WP:SMALLCAT was deprecated as a whole.

Lack of explicit cut-off

For the risk of gaming by "stuffing", see also User:Nederlandse Leeuw/Emptying categories out of process#SMALLCAT reform
For the risk of gaming by "ECOOPing" (emptying categories out of process, either entirely or with the exception of 1 or 2 items), see also User:Nederlandse Leeuw/Emptying categories out of process#ECOOP by people other than nominator

There is no consensus on whether there should be an explicit cut-off or numerical threshold in order to identify a category as a "Smallcat", or whether it should remain an indeterminate amount, handled on a case-by-case basis at WP:CFD. There are pros and cons.

  • The pro side is that a numerical threshold would provide an objective means to identify a category as a "Smallcat", which could prevent potentially endless debates on how many items a viable category should have at the very least.
  • The con side is that any threshold is vulnerable to WP:GAMING: editors could "stuff" or "ECOOP" a category's items just to enable a keep or a delete/merge outcome, thereby disrupting or subverting the CFD process.

Some relevant observations:

  • Nederlandse Leeuw (3 August 2023): [When WP:SMALLCAT was first developed in December 2006, most disagreements] were apparently exactly about examples of what a large overall accepted sub-categorization scheme looks like, and how many items there should be in a category to be exempt, or that this number should remain vague or unspecified.
  • jc37 (24 July 2023) on the risk of WP:GAMING: "In August 2007, Radiant! changed the words "two or three" to "a handful" - [1]. Then "a handful" changed to "a few" (by me), in the next edit - [2]. There were many reasons to change from a set number. For one thing, it had become divisive. Things were getting nominated due to numbers alone, without actually looking to see if it was part of an overall system. (And had also begun to be set for Speedy Deletion.) As can be seen, "Songs by artist" had really become contentious over this. For example, this was the edit right after Radiant! initially added the section. Which was then re-written in the next edit here. Another reason is semi-related - gaming the system. If you set a finite amount, then: "anything over that amount should be an automatic Keep, right?" Or so went the argument. It also was leading to category "stuffing". As it's not that difficult to find anything anywhere that could maybe fit under a category, just to prevent its deletion. So an indeterminate amount, handled on a case-by-case basis at WP:CFD, was seen to be better. That said, there have always been those who want a set amount, because they have the seeming idealistic hope that it would reduce discussions at CfD, or that it might dissuade category creators from making small categories. Neither of which has been proven out over the years."
  • Arbitration Committee (11/11) FoF #1 (26 August 2023): There has been an ongoing desire, never reaching consensus, to apply a strict numerical threshold for SmallCat (jc37 evidence). Use of such numerical thresholds, even if phrased as a "rule of thumb" or similar such phrase, in CFDs is therefore not supported by the guideline.
(Commentary) Therefore, the text of the guideline would have to be amended by consensus to include such a threshold, and such a consensus will have to be obtained first, because there currently is none (and maybe there won't be one).
Update: On 21 October 2023, WP:SMALLCAT was deprecated as a whole.

Principles


1. Purpose of categories: easy navigation between pages


Rationale


Categories help readers and editors of Wikipedia navigate easily between pages. For that navigation function to be useful, a category needs to be clear on what its purpose and scope are, why the items in it should be in it, and what the relationship is between that category and its parent category/ies and any potential child category/ies.

Readers and editors should just be able to find what they're looking for quickly. WP:CFD exists in order to continuously help them accomplish that goal. That's because we've decided for technical and practical reasons that anyone should be able to create a category, but not everyone should be able to single-handedly delete, merge, or rename a category. That would cause way too much chaos.

To avoid overly large categories (beyond 200 items, a category no longer fits on a single page), subcategories can be created for a more WP:CATSPECIFIC subset of items. Sometimes these subcategories even should be created. At other times, they probably shouldn't, at least not yet. That's where (up)merging for now comes in.

(Up)merging small subcategories for now


Certain categories are permanently undesirable. Most criteria for deletion, merging, renaming or splitting mentioned at WP:OC are for categories that should never exist. However, the "Merge for now" criterion is for categories that don't really aid navigation right now. This is mostly because there are very few items in them, and readers and editors looking for those items are not helped by the fact that those items are put away in a very small, obscure subcategory. That doesn't mean such a category can't be appropriate in the future; it may well be. But for that, it needs a larger number of items to have added navigational value. "Very few items" is generally understood to be fewer than 5, but this threshold is not strict. Context should be taken into account. More on that below.

Upmerging redundant layers for now


Sometimes, categories appear to have been created for the sake of categorisation itself, instead of aiding navigation between articles. Usually, this means a well-intentioned but perhaps overly ambitious or optimistic editor is creating a whole category tree that they expect to be fully populated at some point in the future, but for now they just want to put 1 or 2 articles in a sub-sub-subcategory.

One example where this might or might not be the case is the following (as of August 2023):

Apart from the question whether "singing" is even an "instrument", is it beneficial and practical to have a WP:NARROWCAT tree which combines "nationality, genre and instrument", and then proceeds with 2 redundant layers, then an under-populated layer with only 3 pages and then a poorly-populated, final subsubsubsubcategory with only 8 pages? Which reader benefits from this detour circus? Which reader is thinking, "I'd like to know more about French folk-pop singers, so of course I'm going to type "Category:Musicians by nationality, genre and instrument" into the search bar"? Probably not many. Perhaps someday in the future there might, maybe, possibly, potentially, be at least 5 biographies of Category:French pagan folk metal electric guitarists or something. Who knows? Nobody (WP:CRYSTALBALL). So until then, Category:French musicians by genre and instrument and Category:French folk musicians by instrument don't really help readers and editors navigate between existing articles. The 2nd and 3rd containercat in this example can therefore be upmerged to the (grand)parent containercat. They don't aid navigation right now, so they can be upmerged for now.

Additional essays


Several essays (mostly focused on articles) provide more reasons why certain categories shouldn't be created yet, and are sometimes better "(up)merged for now". (Many of them use the metaphor of building a house, which works quite well for categories, too):

  • Wikipedia:Overcategorization#Narrow intersection (WP:NARROWCAT): the items in the nominated category are too similar to that of a closely related category (often the parent category) to merit a separate (sub)category.
  • Wikipedia:Write the article first (WP:WTAF): although it's often theoretically possible to populate a category with 5 or more items, those items (articles) have often not been written yet. Editors are advised to write the article(s) first before creating a category to house them in.
  • Wikipedia:Don't hope the house will build itself (WP:BUILDER): Building an article or a category (tree) requires appropriate planning and effort. Do it wrong, and we might have wasted time and effort while having to start over. Even if our building is kept, there's a maintenance cost to everything we build. So it's best to make sure we're going to use the building for its intended purpose once it is finished, and not hope one day renters will show up.
  • Wikipedia:An unfinished house is a real problem (WP:REALPROBLEM): An unfinished or under-populated category is not a harmless thing. As such, you should establish its purpose and scope clearly, and properly populate and parent it before moving on to do other things. The community would prefer to keep a newly-built house, but if there are not enough items to populate the house with yet, then it's better to merge it back to the old house where the items were fine, and will be fine for the time being. That's also where readers and editors are more likely to find the items for the time being. It saves everyone time while navigating categories.

2. Merging is not deletion


By merging a nominated category into another closely related category (usually merging a category into a parent category, called "upmerging"), the meaningful connection between the items is preserved. It is not deleted or "destroyed", but put into another category space which usually has a somewhat broader scope and meaning that still applies to all items in the nominated category. Merging acknowledges that connection. A reader or editor will probably still be able to find very easily what they are looking for by navigating between the items in the broader category. A category which has been "(up)merged for now" could be re-created in the future if there are enough items (preferably at least 5) within the broader category to merit a subcategory for a more specific subset of items.

3. Gaming is never appropriate


"Very few items" is generally understood to be fewer than 5, but this threshold is not strict. Context should be taken into account, and this number should not be gamed (WP:GAMING). E.g. editors may not empty a category out of process just to reduce the number of items in a category below 5 in order to enable its upmerging per this criterion "Merge for now". Such behaviour may be considered Wikipedia:Disruptive editing and is sanctionable as such.

4. Direct cross-linking between articles is sometimes an alternative to categorization


In many cases where a category would contain only a few articles, the articles are already interlinked. In that case categorization does not add any value.

Even when they are not yet interlinked, they can easily be interlinked, if only in a "See also" section. Again, this applies when there are only a few other articles.

5. (Re)creating categories is very easy


The only requirement for category pages is that parent categories are made explicit on the category page; all other page content can be added later or not at all. So the barrier to creating or re-creating categories is very low. This is confirmed by the fact that in the last few days about 500 new categories per day have been created (see [3]), largely by editors who hardly ever visit CfD. As we are going to propose a possibly temporary merge, the double work that is involved with it is negligible. Besides, in most cases re-creation will never happen, or it will take many years before it becomes relevant again.

6. The "large overall accepted sub-categorization scheme"


The "large established tree" clause exists so that categorizing editors can add articles to categories without checking beforehand whether that category exists or should exist. Because it is part of an established tree they may rely on the fact that it exists or should exist.

For readers, too, it may be confusing to find almost every other sibling category in a tree except for the subcategory that they are looking for.

For example, Category:Rivers by country is a well-populated, complete tree by country. Should there be a few countries with very few articles about rivers, it may still make sense to keep their categories, so that editors adding a new article can rely on a rivers category existing for any country. For that reason e.g. Category:Rivers of Djibouti may be kept.[according to whom?]

Proposals for WP:MFN


Update: On 21 October 2023, WP:SMALLCAT was deprecated. All proposals which assume a continued existence or application of SMALLCAT are thus out of date. As of 26 November 2023, WP:MFN still hasn't been approved. So, currently there is no direct replacement of SMALLCAT. Discussions are ongoing on whether WP:MFN should act as a sort of replacement of SMALLCAT, but it seems clear MFN should be further developed first before being approved as a new categorisation guideline.

1. Proposed base text of WP:MFN

  • A category with very few items may be upmerged to its parent category/categories for now, with no prejudice against re-creation in the future if it can be more properly populated.

Rationale of proposed base text


These texts are based on how the criterion has often been variously formulated in established WP:CFD practice:

  • Upmerge for now with no / without prejudice against / objection to re-creation / recreation in the future / at a later stage (if it can be populated with at least 3/5/10 items).

The important part is that we prefer upmerging these categories for now; they are not completely and permanently undesirable categories. We might consider this as a kind of WP:TNT convention, but applied to categories instead of articles. If someone starts over and re-creates the category at some point in the future, and can populate it with, say, 20 items, everyone who !voted Upmerge for now without prejudice etc. will not object to its re-creation. But there is no guarantee that anyone will re-create this subcategory ever again, nor that there will be a sufficient number of items to do so. WP:Write the article first is pertinent here. And so, Upmerging per this criterion of SMALLCAT is a temporary and possibly-but-not-necessarily indefinite measure. Upmerge for now without prejudice etc. !voters just prefer to have these items in the parent category/categories for the time being, mostly for navigational reasons, and regard saying so as good practice.

Outdated proposal; SMALLCAT has been deprecated.
  • Proposal A: have WP:MFN as addendum on WP:SMALLCAT, which in turn is part of the WP:OC guideline. Pro: they are related topics
  • Proposal B: have WP:MFN as a separate guideline. Pros: in contrast to WP:OC, this is not about deletion but about merging (not losing data), and in contrast to WP:OC, this is not about avoiding categories as a matter of principle.

3. The future of WP:SMALLCAT

Outdated proposal; SMALLCAT has been deprecated.

With the introduction of MFN, WP:SMALLCAT will probably become almost a dead letter, but there is no need to abolish it per se. It might even be improved along the same lines as MFN, with respect to proposals 3, 4, 5, and 6.

4. Introduce a cut-off


The alternative texts are written for WP:MFN, but we may apply them to SMALLCAT too.

Proposed text A (keep the base text, no numerical threshold)

  • A category with very few items may be upmerged to its parent category/categories for now, with no prejudice against re-creation in the future if it can be more properly populated.
    • Against a threshold: numerical thresholds may be more vulnerable to WP:GAMING.

Proposed text B (introduce numerical threshold of 5 items)

  • A category with fewer than 5 items may be upmerged to its parent category/categories for now, with no prejudice against re-creation in the future if it can be more properly populated.
    • Favouring a threshold: the lack of a numerical threshold may lead to endless discussions about how small "small" is.

Proposed text C (compromise)

  • A category with very few items may be upmerged to its parent category/categories for now, with no prejudice against re-creation in the future if it can be more properly populated. Historically, many editors used "fewer than 5 articles" as a cut-off.

A tentative proposal is to introduce a numerical threshold of at least 5 items for each category at all times. This will establish an objective criterion for (up)merging for now by default in scenarios where only 0 to 4 items are present in a category, and there appears to be no way of populating it to 5+ items without inappropriately "stuffing" it beyond its purpose and scope (merely to secure a Keep closure by people who just like the category in the absence of valid arguments, or make a mistake). This has been an informal rule of thumb for many years at CFD in practice, even though it was never formalised, and never part of WP:SMALLCAT (see SmallCat Dispute FoF #1). One of the main reasons no numerical threshold was maintained was the risk of WP:GAMING by "stuffing" or "ECOOPing". But both can be somewhat mitigated if all CFD regulars install the User:Nardog/CatChangesViewer script. (This script didn't exist back in December 2006, but it does now.) That way it can be detected more easily if anyone has been stuffing or ECOOPing a category in order to game the nomination. It may not be possible to prevent all gaming, but it will be easier to detect it. Both "stuffing" (by people who just like a category, or make a mistake) and "ECOOPing" (by people who just don't like a category, or make a mistake) have been issues the community has had to deal with anyway, often in SMALLCAT-related cases, and the script is a relatively adequate tool to address both.

5. Address "part of large scheme" ambiguity

Outdated proposal; SMALLCAT has been deprecated.
  • Proposal A: keep the base text and do not include the large overall accepted sub-categorization clause in WP:MFN
  • Proposal B: expand the base text by a large overall accepted sub-categorization scheme clause but with a much better explanation of what this means, for example per the text below
  • Proposal C: expand the base text by the same large overall accepted sub-categorization scheme clause as currently in SMALLCAT

Expanded text proposal (elaboration of proposal B)


(This expanded text is written for MFN, but may be applied to SMALLCAT too.)

A large overall accepted sub-categorization scheme fits at least one of the following conditions:

  • It has been a fully diffused scheme for years, with only broad overview articles in the parent category. For example Category:Rivers by country has been fully diffused for years and Category:Rivers does not contain any article that is specific to a single country. An RFC is required to end the status of a fully diffused scheme by this criterion.
  • Most potential subcategories already exist and most existing categories are not small. This would exclude e.g. Category:Rivers by city because the number of cities is almost infinite. It would also exclude Category:River ports by country because very few of its potential subcategories exist so far. It would also exclude Category:Cheeses by country in its current state because the tree contains a great many subcategories of 1, 2, 3 or 4 articles.

Presumably the text below can be shorter by quite a bit, after the above addition.

There are two solutions to resolving the ambiguity of the phrase a large overall accepted sub-categorization scheme:

  1. The phrase should be amended to be clear, and to work as intended as part of the WP:SMALLCAT guideline, in order to prevent navigational disruptions. Proponents have claimed its intention is to prevent the deletion or merger of certain important/helpful categories, threatened by arbitrary nominations which fail to consider the consequences (e.g. breaking navigational routes readers and editors are likely to use). If so, the text should make that clear, and explain how it does so, or change the rules in order to actually make it do so.
  2. The phrase should be removed from the WP:SMALLCAT guideline for serving no apparent useful purpose, in order to prevent obstruction of the CFD process. Critics have claimed that it can be employed as a pretext to oppose the deletion or merger of any category whatsoever, no matter how unimportant/unhelpful that category is in practice. Concerns of unforeseen consequences are held to be overblown; these are held in check by other policies such as WP:SUBCAT, and there is always the option of a WP:DRV. Especially in "merge for now" (MFN) scenarios, the connection between the items is preserved in another, broader category, and the nominated category may still be re-created without prejudice in the future if more properly scoped and populated.

The text up to this point can be shortened.

6. Address "potential for growth" ambiguity

Outdated proposal; SMALLCAT has been deprecated.

In practice, from 2006 to the present, WP:SMALLCAT has been invoked for many categories which supposedly had "no realistic potential for growth" (and were therefore deleted), on the basis of inappropriate WP:CRYSTAL assumptions. These assumptions had more to do with the limited imagination of certain editors, who believed there could never be more than [number] of [items], until we found out there actually could.

  • Proposal A: keep the base text and do not include the potential for growth clause in WP:MFN
  • Proposal B: expand the base text by a potential for growth clause but with a much better explanation of what this means.

Hypothetically there could be a proposal C to expand the base text by the same (vague) potential for growth clause as currently in SMALLCAT, but that would go entirely against the spirit of MFN.

There are two solutions to resolving the ambiguity of the phrase a category which does have realistic potential for growth, such as a category for holders of a notable political office, may be kept even if only a small number of its articles actually exist at the present time.

  1. The phrase should be amended to be clear, and to work as intended as part of the WP:SMALLCAT guideline, in order to prevent demolishing a category (tree) which is still under construction. Proponents have claimed its intention is to prevent the deletion or merger of certain important/helpful categories, threatened by arbitrary nominations which fail to take into account that the category (tree) has only just been created, and the creator and wider community haven't been given the appropriate time to populate the category properly, invoking Wikipedia:Don't demolish the house while it's still being built (WP:DEMOLISH). This would be disrespectful to the work done by editors, even if it's not quite finished yet.
  2. The phrase should be removed from the WP:SMALLCAT guideline for serving no apparent useful purpose, in order to prevent obstruction of the CFD process. Critics have claimed that it can be employed as a pretext to oppose the deletion or merger of any category whatsoever, no matter how unimportant/unhelpful that category is in practice, and no matter how long ago the category was created and how long it has remained under-populated.

Post-deprecation


Much of the above talks about fixing SMALLCAT; SMALLCAT is no longer a guideline.

HouseBlaster's thoughts

  1. If we are clear to the wider community that CfD regulars have been using five articles as a rule of thumb (i.e., it is the status quo), I think it will receive a good deal of support per WP:AINTBROKE.
    Something like: Categories with fewer than five members are usually upmerged into a parent category, in the understanding that it can be recreated in the future if needed
  2. Keep the a large overall accepted sub-categorization scheme bit, but shorten to an large overall accepted sub-categorization scheme. We can give some examples, but make it clear that CfD is the place to determine whether something is an accepted sub-categorization scheme. Maybe create a talk page template that says something along the lines of this is part of [sub-categorization scheme]?
  3. Get rid of no potential for growth. I do not think there are categories that are useful for navigation that are neither sufficiently large nor part of a broader categorization scheme. Merge them for now, and recreate them if and when they actually grow.


Footnotes
