    • Task
    • ·Closed
    Deepcat documentation is missing. Please fix the redlinks at https://wikitech.wikimedia.org/wiki/Nova_Resource:Catgraph/Deepcat. They go to a subpage of [[https://de.wikipedia.org/w/index.php?title=User:Christoph Fischer (WMDE)]] Requested per [[help talk:cirrusSearch]] https://www.mediawiki.org/wiki/Topic:Tk1o95fvoqgkneol
    • Task
    • ·Closed
    • Task
    Unlike other search parameters, //morelike// works only on its own. //Morelike// already creates a unique //morelike area// comparable in usefulness to a category or a [[//en.wikipedia.org/wiki/wp:outline|subject outline]], but it returns many more pages, and their characteristics are unspecified. A //morelike area// might find new use as a stepping-off point for targeting title, template, category, file, and other page characteristics in that unique area of interest. Also, this standalone trait is awkward to rationalize in Search documentation. The cool thing about //morelike areas// is that they're not manual, and they're not biased. Compared to categories and outlines, //morelike areas// are more complete; they list missing and orphaned articles. They're easier; they are not hampered by the problems of subcategories (T37402) or of input pagenames (finding category pagenames and outline pagenames). The area of interest is definitely fast to create, just not subsequently workable the way it is with //incategory//. A workaround for wikiprojects, or for a user with their own unique area of interest, is to cut and paste a large reference set of words into the query box, and then filter this with other search parameters to get a meaningful page count and a reusable search link. The [[//en.wikipedia.org/wiki/Wikipedia:Help_desk/Archives/2016_January_31#Looking_for_a_Bot_to_list_imageless_articles_in_a_given_category|need was raised at]] the Wikipedia help desk.
    • Task
    • ·Closed
    For sites that install the [Translate extension](//mediawiki.org/wiki/Extension:Translate), searching translation-subpage content is problematic:
    * regex crawl through subpages
    * hastemplate usage-counts are skewed (if they worked, see T125926)
    * namespace counts are skewed. For example, on MediaWiki.org there are not 5100 uniquely defined template //usage// definitions ([Template:](//mediawiki.org/wiki/Special:Search/template:))
    * //intitle// results are drowned out
    * //prefix// results are drowned out
    * where translations are in-progress, word and phrase searches get Search noise, and whatLinksHere gets [Template:TNTN](//mediawiki.org/wiki/Template:TNTN) noise
    **See also:**
    * {T121826}
    • Task
    The [[//mediawiki.org/wiki/Extension:Translate|Translate extension]] comes with [[//mediawiki.org/wiki/Template: translatable template|{{Translatable template}}]], but a translated template is no longer searchable with the CirrusSearch hastemplate search parameter. Hastemplate can find template usage where the target is a secondary template, and this ability should also extend to finding where the target template is passed as a parameter. Currently hastemplate doesn't recognize the parameter list as "a place for template names", as it does in template code, where it finds the target template as a secondary. The only workaround is a set of case-insensitive regex searches for all possible combinations of aliases, and even after all that work, it still sacrifices the visibility of secondary templates. For example, these two queries should have the same count, but hastemplate is way off:
    * [hastemplate: ApiEx](//mediawiki.org/wiki/Special:Search/all:hastemplate: ApiEx)
    * [insource:/TNT *\| *ApiEx/i insource: "tnt apiex"](//mediawiki.org/w/index.php?search=all:insource:/Tnt *\| *ApiEx/i+insource:+"tnt+apiex"&title=Special:Search&go=Go)
    For a target template with two aliases, six queries are required, and even then no secondary template-usage is found:
    * [hastemplate:Documentation](//mediawiki.org/wiki/Special:Search/all:hastemplate: Documentation)
    * [insource:/TNT *\| *Documentation/i insource: "tnt documentation"](//mediawiki.org/w/index.php?search=all:insource:/TNT+*\|+*Documentation/i+insource:+"tnt+documentation"&title=Special:Search&go=Go)
    * [insource:/TNT *\| *Doc/i insource: "tnt doc"](//mediawiki.org/w/index.php?search=all:insource:/TNT+*\|+*Doc/i+insource:+"tnt+doc"&title=Special:Search&go=Go)
    * [insource:/TNT *\| *Template doc/i insource: "tnt template doc"](//mediawiki.org/w/index.php?search=all:insource:/TNT+*\|+*Template+doc/i+insource:+"tnt+template+doc"&title=Special:Search&go=Go)
    * [insource:/Translatable template *\| *Documentation/i insource: "Translatable template documentation"](//mediawiki.org/w/index.php?search=all:insource:/Translatable template+*\|+*Documentation/i+insource:+"Translatable template+documentation"&title=Special:Search&go=Go)
    * [insource:/Translatable template *\| *Doc/i insource: "Translatable template doc"](//mediawiki.org/w/index.php?search=all:insource:/Translatable template+*\|+*Doc/i+insource:+"Translatable template+doc"&title=Special:Search&go=Go)
    * [insource:/Translatable template *\| *Template doc/i insource: "Translatable template template doc"](//mediawiki.org/w/index.php?search=all:insource:/Translatable template+*\|+*Template+doc/i+insource:+"Translatable template+template+doc"&title=Special:Search&go=Go)
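The six insource workaround queries listed above are the product of two wrapper names (TNT, Translatable template) and three names for the target template. A minimal Python sketch of that enumeration follows; the function name `tnt_insource_queries` is hypothetical, and the query shape is copied from the link labels in this report rather than from any CirrusSearch documentation.

```python
from itertools import product

def tnt_insource_queries(wrappers, names):
    """Build one 'regex plus indexed phrase' query per wrapper/name pair."""
    queries = []
    for wrapper, name in product(wrappers, names):
        # Case-insensitive regex: wrapper, optional spaces, pipe, optional spaces, name.
        regex = "insource:/" + wrapper + " *\\| *" + name + "/i"
        # Indexed phrase filter that narrows the pages the regex has to scan.
        phrase = 'insource: "' + (wrapper + " " + name).lower() + '"'
        queries.append(regex + " " + phrase)
    return queries

wrappers = ["TNT", "Translatable template"]
names = ["Documentation", "Doc", "Template doc"]
for query in tnt_insource_queries(wrappers, names):
    print(query)
```

Each additional alias on either side multiplies the number of queries rather than adding to it, which is why the workaround scales poorly.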
    • Task
    • ·Closed
    Because CirrusSearch does not always index words separated solely by a colon, such words are hardly searchable. An exact phrase search or an insource search cannot find the words at all; they can only find the one token that is the two words plus the colon. Because it treats an unspaced colon as an alphanumeric character, the following common use cases cannot in general be searched in wikitext: file links, template usage, namespace or interwiki linkage, parser function usage, and category usage. Anything with a colon after its name is not indexed unless the optional space is put after it. For example, the wikitext `[[special:preferences]]` will not index //special// or //preferences//, and so those words cannot be found in this sandbox:
    * ["special:preferences"](//mediawiki.org/w/index.php?search="special:preferences"+prefix:user:cpiral) (A)
    * [insource: special:preferences](//mediawiki.org/w/index.php?search=insource:+special:preferences+prefix:user:cpiral) (A)
    * [special](//mediawiki.org/w/index.php?search=special+prefix:user:cpiral) (A, B, S)
    * ["special"](//mediawiki.org/w/index.php?search="special"+prefix:user:cpiral) (Nothing)
    * [preferences](//mediawiki.org/w/index.php?search=preferences+prefix:user:cpiral) (A)
    * ["preferences"](//mediawiki.org/w/index.php?search="preferences"+prefix:user:cpiral) (Nothing)
    * [insource: special](//mediawiki.org/w/index.php?search=insource:+special+prefix:user:cpiral) (Nothing)
    * [insource: "special"](//mediawiki.org/w/index.php?search=insource:+"special"+prefix:user:cpiral) (Nothing)
    * [insource: preferences](//mediawiki.org/w/index.php?search=insource:+preferences+prefix:user:cpiral) (Nothing)
    * [insource: "preferences"](//mediawiki.org/w/index.php?search=insource:+"preferences"+prefix:user:cpiral) (Nothing)
    Neither the insource searches nor the searches with quotation marks (exact phrase searches) ever found the words.
    When the option to omit the space is taken, as is usual, for example `{{namespace:pagename}}` (instead of the perfectly valid `{{namespace: pagename}}`), those two words become lost because of the analyzer. For example, `File:` is lost in `File:siamese cat`, which to any general search query becomes equal to filesiamese. So the following class of questions is unanswerable: Where are any file links? Namespace links? Interwiki links? External links built by parser functions (T121379)? Any parser function usage is in the dark, for example "Where is urlencode used?" Insource cannot say, because the unspaced colon morphs the word away. Existing usage on the wiki of files, templates, namespaces, and parser functions is in the dark unless 1) we run bare regular expressions (bare, meaning no regex filter, no possible insource filter to provide the indexed-search-provided search domain), 2) we ask for new parameters like `hasfile:` and `hasparserfunction:`, or 3) we use external tools. IMHO, none of these are advisable, but we must advise the workaround "run bare regexp". Insource is especially missed:
    * No finding counts or existence of usage of file:pagename, urlencode:url, category:pagename, namespace:pagename, or template:pagename.
    * Regex searches are missing their ideal companion filter, so most Search magic is snuffed out.
    * One may not answer simple questions like "Do any incategory:dogs lack insource:[[files?"
    * Cannot find external links that use parser functions
    • Task
    [[//mediawiki.org/wiki/Help:CirrusSearch#Words.2C_phrases.2C_and_modifiers|Greyspace phrases]] (words connected by greyspace characters), function as search terms and are accepted by incategory, hastemplate, and intitle. Only intitle handles them improperly, and only sometimes. For three-word phrases, it's OK: [intitle:back_door_album](//en.wikipedia.org/w/index.php?search=intitle:back_door_album) (Proper behavior.) [back_door_album](//en.wikipedia.org/w/index.php?search=back_door_album) (Proof intitle works with greyspace phrases.) [intitle:green_tree_frog](//en.wikipedia.org/w/index.php?search=intitle:+green_tree_frog) (7 pages all proper) [intitle:"green tree frog"](//en.wikipedia.org/w/index.php?search=intitle:+%22green+tree+frog%22) (any way you phrase it) [intitle: greenTreeFrog](//en.wikipedia.org/w/index.php?search=intitle:+greenTreeFrog) (any way you phrase it) [green_tree_frog](//en.wikipedia.org/w/index.php?search=~green_tree_frog) (96 pages otherwise) But for two-word phrases: [intitle:door_back](https://en.wikipedia.org/w/index.php?search=intitle:door_back) (Three mismatches.) [intitle:"door back"](https://en.wikipedia.org/w/index.php?search=intitle:%22door back%22) (Zero, as expected.) [intitle:frog_tree](//en.wikipedia.org/w/index.php?search=intitle:frog_tree) (Another mismatch, but again, all words must also be found in the title) [intitle:"frog tree"](//en.wikipedia.org/w/index.php?search=intitle:%22frog tree%22) (Zero)
    • Task
    The truth logic of boolean searches using both AND and OR on simple word and phrase terms makes no sense to me. (When AND or OR are used alone on simple word or phrase searches, they do make sense.) I have double-checked the data in the table below. It has links to actual searches so you too can verify, in one minute, that the table is accurate. Then you too can confidently explore the reality of this table and conclude "no pattern". There are four basic search terms (word or phrase), shown in bare and in Boolean searches. There are six pages in the search domain: S, A, B, C, E, and F. Each row marks the pages that the search returned.
    | Search | S | A | B | C | E | F |
    |--------------|---|---|---|---|---|---|
    | [1](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=1+prefix:user:cpiral&fulltext=Search) | S | A | | C | E | |
    | [2](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=9+prefix:user:cpiral&fulltext=Search) | | A | B | C | | F |
    | [3](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search="the number"+prefix:user:cpiral&fulltext=Search) | | | | | E | F |
    | [4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=number+prefix:user:cpiral&fulltext=Search) | S | A | | | E | F |
    | [3 OR 4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=number+OR+"the+number"+prefix:user:cpiral&fulltext=Search) | S | A | | | E | F |
    | [3 AND 4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=number+AND+"the+number"+prefix:user:cpiral&fulltext=Search) | | | | | E | F |
    | [1 AND 2 OR 3](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=1+AND+9+OR+"the+number"+prefix:user:cpiral&fulltext=Search) | S | A | | C | E | |
    | [2 AND 1 OR 3](https://www.mediawiki.org/w/index.php?title=Special:Search&profile=default&search=9+AND+1+OR+"the+number"+prefix:user:cpiral&fulltext=Search) | | A | B | C | | F |
    | [1 OR 2 AND 3](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=1+OR+9+AND+"the+number"+prefix:user:cpiral&fulltext=Search) | | | | | | F |
    | [2 OR 1 AND 3](https://www.mediawiki.org/w/index.php?title=Special:Search&profile=default&search=9+OR+1+AND+"the+number"+prefix:user:cpiral&fulltext=Search) | | | | | E | |
    | [2 OR 1 AND 4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=9+OR+1+AND+number+prefix:user:cpiral&fulltext=Search) | S | A | | | E | |
    | [3 OR 1 AND 2](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search="the+number"+OR+1+AND+9+prefix:user:cpiral&fulltext=Search) | | A | | C | | |
    | [3 OR 2 AND 1](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search="the+number"+OR+9+AND+1+prefix:user:cpiral&fulltext=Search) | | A | | C | | |
    | [1 AND 2 OR 3 AND 4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=1+AND+9+OR+number+AND+"the+number"+prefix:user:cpiral&fulltext=Search) | | | | | E | |
    | [1 OR 2 AND 3 OR 4](//mediawiki.org/w/index.php?title=Special:Search&profile=default&search=1+OR+9+AND+number+OR+"the+number"+prefix:user:cpiral&fulltext=Search) | | A | B | C | | F |
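To make the "no pattern" claim easy to check, here is a small Python sketch that evaluates two of the mixed queries from the table above under two conventional readings: AND binding tighter than OR, and strict left-to-right evaluation. The page sets are copied from the four single-term rows; the function names are hypothetical, and neither reading is claimed to be what CirrusSearch actually implements.

```python
# Page sets for the four basic terms, copied from the single-term rows of the table.
TERMS = {
    "1": {"S", "A", "C", "E"},
    "2": {"A", "B", "C", "F"},
    "3": {"E", "F"},
    "4": {"S", "A", "E", "F"},
}

def and_binds_tighter(expr, terms=TERMS):
    """Evaluate e.g. '1 AND 2 OR 3' with AND taking precedence over OR."""
    result = set()
    for group in expr.split(" OR "):
        pages = None
        for name in group.split(" AND "):
            pages = set(terms[name]) if pages is None else pages & terms[name]
        result |= pages
    return result

def left_to_right(expr, terms=TERMS):
    """Evaluate the same expression strictly left to right, with no precedence."""
    tokens = expr.split()
    result = set(terms[tokens[0]])
    for op, name in zip(tokens[1::2], tokens[2::2]):
        result = result & terms[name] if op == "AND" else result | terms[name]
    return result

# Results reported in the table for two of the mixed queries.
OBSERVED = {
    "1 AND 2 OR 3": {"S", "A", "C", "E"},
    "1 OR 2 AND 3": {"F"},
}

for expr, seen in OBSERVED.items():
    print(expr)
    print("  AND first:    ", sorted(and_binds_tighter(expr)))
    print("  left to right:", sorted(left_to_right(expr)))
    print("  observed:     ", sorted(seen))
```

For both queries the observed row differs from both readings, while the single-operator rows (3 OR 4, 3 AND 4) agree with plain set union and intersection.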
    • Task
    • ·Closed
    When two line-break tags, <br /><br />, are used in wikitext, then for smaller font sizes the line break produces too large a vertical spacing between the lines. The effect doesn't depend on which Wikipedia skin is in effect. Here are some examples to try on the wiki: `;Example 1` `This is a paragraph with default font size.<br /><br />Note the acceptable paragraph spacing between this line, started with two line-breaks, and the previous line; both inside a normal font. And I'm hoping if I add enough text this line will wrap to show that two paragraphs really can be in one line.` `;Example 2` `<font size="-3">This is a paragraph with a &lt;font size="-3"&gt;.<br /><br />Note the unacceptably large spacing between this line, started with two line-breaks, and the previous line. It's the same spacing as in the first example. It should have proportionally smaller vertical spacing because the font is smaller.</font>`
    • Task
    • ·Closed
    The search box //for the search results page// needs to be wider. It is only 43 characters wide.
    * regexes [are easily 80 characters](//en.wikipedia.org/wiki/Template:Val/units/test), plus they need filters
    * in addition to words and phrases, there are [nine search parameters](//mediawiki.org/wiki/Help:CirrusSearch), some of which take several page names
    * these all need a place for refining the search results
    "[Widen the search box in the Vector skin](//en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-gadgets)" widens the search box //for the skin// from 45 to 55 search characters. The old Search relied mostly on (automated) page ranking software, but with eight new parameters, we are able to specify entirely new kinds of searches, the kind that need more work area. The user should be able to set the search-box width.
    • Task
    If there are two spaces in wikitext where one font is wider than the other, the //narrow// space is always chosen automatically, but the //wider// space should always be chosen instead (automatically). For publications I try to remember to manually add &nbsp; on both sides of the entry, because currently the inline entries look cramped to me. Please preview this on the wiki: `See how this<kbd> phrase is staged </kbd>and again, how it looks<kbd>&nbsp;with manually added&nbsp;</kbd>overrides that could be made effortless.` Manually added whitespace properly //stages// such entries. Reported at Village pump "[[//en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)/Archive_142#Fixed_font_transitions_don.27t_space_correctly|Fixed font transitions don't space correctly]]"
    • Task
    • ·Closed
    `deepcat:"Classical mechanics stubs‎"` reports correctly, but randomly reports zero. 'CatGraph could not find the category 'Classical_mechanics_stubs‎'. The following reproduces the problem: `deepcat:"Classical mechanics stubs‎"` `deepcat:Physics` `deepcat: "Disambiguation pages"` `deepcat:"Classical mechanics stubs‎"` And then these also begin to do the same (begin to report zero) `deepcat: "Optics stubs"` `deepcat: Physics` And then this fixes "Physics", but not "Classical Mechanics stubs" `deepcat: "Disambiguation pages"` `deepcat: Physics`
    • Task
    It's already true that the "empty bullet in wikitext" correctly disappears on the screen or in print. Thanks. But there is still an "empty bullet in print" problem to address. * [Many thousands of template intitle](//en.wikipedia.org/w/index.php?title=Special:Search&search=all:+hastemplate:+"in+title"+insource:/\*[+']*\{\{+*[Ii]n+*title/&ns0=1&fulltext=Search). * [Many more thousands of template lookfrom](//en.wikipedia.org/w/index.php?title=Special:Search&search=all:+hastemplate:+"in+title"+insource:/\*[+']*\{\{+*[Ii]n+*title/&ns0=1&fulltext=Search). * [More than ten new instances per day](https://en.wikipedia.org/wiki/Template_talk:In_title#How_to_alter_this_template_and_the_wiki). * [Sanctioned practice on Wikipedia](//en.wikipedia.org/wiki/WP:Manual_of_Style/Disambiguation_pages#Ordering). * There are more than five discussions involving this on Wikipedia, cooperating with this phabricator report, waiting for a response. Please, //is it feasible to expect a software resolution to the "empty bullet in print"//? Thanks.
    • Task
    The ability to locate internal HTML links is valuable, but currently, if that link uses URL-related parser functions, it's hardly possible.
    * Knowing what links //to// a section allows us to change the section title.
    * Finding any bare URL in ref tags is a style-correction issue.
    * Related functionality, LinkSearch, was a top concern in the [[//meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Search#Improve_Special:LinkSearch|recent Wikimedia survey]] (["to search for external links to pages on this site..."](//en.wikipedia.org/wiki/Special:LinkSearch|stated function of LinkSearch) ).
    * WhatLinksHere cannot find internal URLs.
    * The //linksto// search parameter cannot find internal URLs. (Linksto tracks [[square brackets]], but only if they are not redirects.)
    * The //insource// search parameter is the only way. It's nearly impossible because the only approach is running a large set of queries.
    Currently a probability is the best we can do when reporting URL internal linkage.
    **Background**
    To link, we can create a URL-style internal link to one point in such a //generous// number of ways that there can be literally hundreds of them, each different enough that Search can need a hundred queries to find that one URL internal link. To start the picture, here are just five of the many text patterns that can link to a point:
    # `[//wikipedia.org/wiki/namespace:pagename]`
    # `{{canonicalurl:namespace:pagename}}`
    # `[{{fullurl:namespace:pagename}}]`
    # `[{{SERVER}}{{localurl:namespace:pagename/}}]`
    # `[{{SERVER}}/wiki/namespace:pagename/]`
    It is not only the generous number of [[//mediawiki.org/wiki/help:magic_words|magic words]] that confounds Search but their interplay, whitespace, and letter case. Here is an example of how one parser function accepts spacing.
    # `{{fullurl:namespace:pagename}}`
    # `{{fullurl: namespace:pagename}}`
    # `{{fullurl:namespace: pagename}}`
    # `[{{fullurl: namespace: pagename}} link label]`
    That single parser-function characteristic alone quadruples the number of insource queries. But there are many more like it, in that they each multiply the number of queries needed to find a link, not simply add to them. Each of the many queries needed to find a single link would also require a regexp to verify, for example, that only the last one is an actual link. There has to be yet another insource query for all of the following different text patterns, all multipliers of the number of queries needed to find a single URL:
    * [[//mediawiki.org/wiki/help:magic_words#Namespaces|Namespace]] can be given as subjectspace, articlespace, or talkspace.
    * [[//mediawiki.org/wiki/help:magic_words#Page_names|Pagenames]] can be given as basepagename/subpagename, articlepagename, subjectpagename, talkpagename, rootpagename, or pagename. These six can also take a `:fullpagename` as a parameter, so that's a multiplier of twelve.
    * equal magic word name for server, servername, or scriptpath. That's three more.
    * most of these can equally name content by using a `:fullpagename` colon parameter.
    * several magic words (*URL ones) take parameters like `|path` or `|wiki`
    * "Fullurl" and "canonicalurl" also accept "urlencode" or "anchorencode" forms.
    * many take an "EE" form
    * There are also things like `{{NS:{{NAMESPACENUMBER}}}}` that need to be searched to see if they go to the page name in question.
    * The URL can equally be in HTTP POST or HTTP GET forms, so many of these can equally name content by using a `|query` on the path: server/w/query where query is `index.php?title=` or `index.php?pageid=`
    Given a page name, each of these must be tried. That's hundreds of insource queries given a single page name. So we probably do five queries instead and report what "probably" links there.
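The multiplication described above can be made concrete with a short Python sketch. It enumerates only the optional space after each colon for a given parser function, then counts how the total grows with function names and namespace aliases. The helper names and the sample aliases and page name are illustrative assumptions, not an inventory of what MediaWiki accepts.

```python
from itertools import product

def spacing_variants(function, namespace, pagename):
    """Every spelling of {{function:namespace:pagename}} with an optional space after each colon."""
    variants = []
    for gap1, gap2 in product(("", " "), repeat=2):
        variants.append("{{" + function + ":" + gap1 + namespace + ":" + gap2 + pagename + "}}")
    return variants

def count_queries(functions, namespace_aliases, pagename):
    """Count the distinct literal patterns an insource search would have to cover."""
    return sum(
        len(spacing_variants(function, alias, pagename))
        for function in functions
        for alias in namespace_aliases
    )

# Four spacings for a single function and namespace name.
for variant in spacing_variants("fullurl", "namespace", "pagename"):
    print(variant)

# Two parser functions times three names for one namespace times four spacings.
print(count_queries(["fullurl", "canonicalurl"], ["Wikipedia", "WP", "Project"], "Sandbox"))
```

Even this restricted case already needs 24 literal patterns, before letter case, the other magic words, or the bracket check are considered.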
    **Foreground**
    Spacing and case can be significant for insource, and a varying regexp is required for each query to match multiple patterns. So we can search for URLs in only an //ungenerous// way. For each single external link construct, Search is narrow and specific. Each of these characteristics multiplies the number of searches required manyfold:
    * Search is camelCase sensitive, but namespace names and parser functions are not.
    * Insource treats an unspaced colon : character like:this as a letter, so the non-indexed strings "like" and "this" cannot be found except with a regex. Insource is the only option, because page visibility of the sought construct is, although possible, not likely. To find non-indexed strings, a regex needs a filter. As just explained, there is no filter possible. Each search would need its own separate regex for verification purposes. For an insource search the non-spaced colon is no different from a letter or a number. If there is a space after it, the alternative without the space will not match.
    * A namespace with two aliases triples the number of insource searches.
    * You really need yet another variable, a /regex/ just to look for the opening [ bracket. It would need to accompany each one in its own distinct, unique form. Yet //still// it could not prove a closing ] bracket existed, because the dot **.** metacharacter represents any character //including a newline//.
    * Insource doesn't take OR.
    It is a series of searches few could understand. A template could try to offer to report URLs to a given canonical name of any of: a section, a fullpagename, a prefix, or a namespace. There is no closer-to-singular process available, no way for end-users to find URLs. What is a workaround, please?
    • Task
    On Wikipedia, km² is impossible to target in a search, yet Google reports [[//www.google.com/search?q="km²"+site:en.wikipedia.org | "km²" ]] on well over 120,000 pages. But Unicode **digits**
    - **have not been normalized**. Basic search `"mm3"` or `"km2"` find no normalized ² or ³ character in the index.
    - **are treated like punctuation**. Basic search `"mm³"` finds `mm`.
    - **fail in regex strings greater than two chars**. `/mm³/` or `/km²/` are missing out.
    Major templates such as Convert and Val support unicode digits in either form, `km²` or `km2`. In mainspace, 5% of pages that use `<sup>2` also use `²`. Confusingly, km² is recognized by the **highlighter**, but when you remove the //actual// matches (single unicode strings) `²|³`... nothing. For example, [see `insource:/²|³|km²/ prefix:Chem`](https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²|³|km²/+prefix:Chem&fulltext=Search). Also the //typeahead analyzer// works fine for mm³ or km². To see how two is ok but three fails, and without running bare regex on millions of pages, [here's a small domain with some /²|³/ hits.](https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²|³/+prefix:Chem&fulltext=Search) T41501 says unicode quotes are not normalized, and this one says ² and ³ are not normalized. But //digits are indexed// and quotes are not. T95849 considers analyzers, filtering, and fields, and shows enwiki page mapping properties while troubleshooting the unicode ★ character. But the black star, although not found in indexed searches, [is not impossible to find using regex](//en.wikipedia.org/w/index.php?search=insource:/"{{Unicode|★}}+||+U%2B2605"/+prefix:Miscellaneous&title=Special:Search&go=Go), and other unicode characters are also found in regex strings.
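For reference, this is what "normalized" would mean for these characters under Unicode compatibility normalization. The sketch below uses Python's standard `unicodedata` module; it only illustrates the character mapping and makes no claim about how the CirrusSearch analyzers are configured. The helper name `fold_superscripts` is hypothetical.

```python
import unicodedata

def fold_superscripts(text):
    """Apply NFKC, which maps compatibility characters such as ² and ³ to plain digits."""
    return unicodedata.normalize("NFKC", text)

for unit in ("km²", "mm³"):
    print(unit, "->", fold_superscripts(unit))
```

With a mapping like this applied at both index and query time, `km2` and `km²` would land on the same token.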
    • Task
    • ·Closed
    # [[https://en.wikipedia.org/w/index.php?title=Special:Search&profile=advanced&search=phenomenal+inhouse&fulltext=Search&profile=advanced| Search for the two words `phenomenal inhouse']] on Wikipedia. # On the search results page, set Advanced to all namespaces. You get six hits, in five different namespaces. See data on the template composition below. # Insert <code>boost-templates:"Resolved|200%"</code> in numerous and various ways, forms and positions. Try the front, the back, the number, the percent sign, the pipe, the quotes. # Nothing happens to the order of the search results. Here is the data on the six results. # William H. Stewart ## Date: 16:02, August 5, 2015 ## Words: 1,326, inhouse:1, phenomenal:1 ## Templates: Authority control, Birth date, DEFAULTSORT, US Surgeons General, cite news, cite web, death date and age, other people ## Links: 85 # Wikipedia:Reliable sources/Noticeboard/Archive 2 ## Words: 37,953, inhouse:1, phenomenal:1, phenomena:2 ## Date: 09:22, September 22, 2010 ## Templates: resolved:3, tl:7, user, verify credibility ## Links: 884 # File:West Pointer 09JUN11.pdf ## Date: 10:09, November 9, 2012 # Talk:New anti-Semitism/Archive 9 ## Words: 38,470, inhouse:1, phenomenal:1, phenomenon:23, phenomena:2, phenom:2 ## Date: 05:27, October 12, 2010 ## Templates: talkarchive, user:2 ## Links: 849 # File:A Dictionary of Christ and the Gospels Volume 2.pdf ## Date: 04:04, July 24, 2015 # User talk:Jimbo Wales/Archive 44 ## Date: 12:38, February 26, 2009 ## Words: 37,200, inhouse:1, phenomenal:1 ## Templates: user, archive-navigation, ec:2, hidden-begin, hidden-end, talkarchive ## Links: user talk:403, user:444
    • Task
    • ·Closed
    Not only do the namespace prefixes "all:" "All:" and "All: " all behave differently, but they also differ from the "All" setting at Special:Search Advanced.
    # [[https://en.wikipedia.org/w/index.php?title=Special:Search&profile=advanced&search=phenomenal+inhouse&fulltext=Search&profile=advanced| Search for the two words `phenomenal inhouse']] on Wikipedia.
    # On the search results page, set Advanced to All namespaces, and press Search. You get six hits, in five different namespaces.
    # In the search box insert "all:" with or without a space. You lose the two from the //File// namespace.
    # Now try "All: " with capital "A" and a space. You lose article space.
    # Now try "All:" with //no// space. You lose your query.
    Should this be documented, or is it unintended?
    • Task
    • ·Closed
    The following Wikipedia project pages are out of service for search suggestions:
    - CAT: for shortcuts to the Category namespace: 80 pp.
    - H: for shortcuts to the Help namespace: 462 pp.
    - MOS: for shortcuts to the Manual of Style: 640 pp.
    - P: for shortcuts to the Portal namespace: 747 pp.
    - T: for shortcuts to the Template namespace: 74 pp.
    That's two thousand helpful search-box hints not being suggested. Auto-suggestions that drop down from the search box have two telling flaws:
    # All shortcuts to project pages are out of service.
    # Prefixing "WP:" or "Wikipedia" puts the shortcut names in service.
    To see this, pick a [[https://en.wikipedia.org/wiki/Wikipedia:Namespace#Pseudo-namespaces| Wikipedia pseudo-namespace]], say the Wikipedia Manual of Style:
    # Go to [[https://en.wikipedia.org/w/index.php?title=Special:PrefixIndex&prefix=MOS:&namespace=0| Special:All pages with prefix/MOS:]] on Wikipedia. You will see many shortcuts to the Manual of Style.
    # Type in a partial prefix, say MOS:DA. That should obviously drop down several suggested pagenames, but it is silent.
    # Now type WP:MOS:DA into the search box. You get pagenames suggested. Select one. You go to a non-existent page.
    Shortcuts are the pagenames of the most-needed, most often used project documentation for editors and administrators. Suggestions are synergistic with shortcuts, amplifying the intention of shortcuts, which is to facilitate productivity tasks.
    • Task
    Please requisition a new document, "How to edit a page with translations". Current documentation is oriented towards translation administrators, Translate extension managers, and translation applications for translators.
    - It should be compact and concentrate on the syntax and grammar of <translate>, <tvar>, <!--T:##-->, and other translation markup.
    - It should focus on the needs of editors. Perhaps a section in Help:Editing. Translation markings on a page discourage editing. There is no documentation for translations whose audience is editors.
    - It should bring pages such as Help:CirrusSearch back to life, "edited by anyone, mercilessly" as the proven saying goes.
    Help:CirrusSearch is a ghost town, yet Search is a hot technology -- you can get a college degree in it -- and Elastica is a hot open source movement. Help:Searching on Wikipedia has little or nothing on CirrusSearch, perhaps as a result of Help:CirrusSearch paralysis. As an editor at Help:CirrusSearch I struggled with 1) the documentation on translations and 2) unresponsive translations "editors" for months before becoming fed up with it and removing it. Documenting how to edit a page with translations will prove that translations and translators care about editing, not just translations.
    • Task
    A titles-only search result, with zero extra post-processing (no page ranking, no snippet, no highlighting, no page info, no parenthetical saying it matched the category or the redirect or the section) seems like an attractive, win-win feature. The old search depended on page ranking, offering only intitle and incategory for parameters, and pouring out myriad results. Now we have many new Search features which eschew page ranking, enabling us to finally target a specific set of pages with precision for the purpose of definitely handling each and every one. Providing this would encourage:
    * post-processing those pages, targeting those titles in AWB
    * presenting those titles elsewhere, adding markup for wikilinks or checklists
    * comparing one query to another. A quick diff or cmp helps learn Search, diagnose Search, develop Search documentation
    Other attractions are
    * five times more information per page
    * clean, relevant, uncluttered results that compete with SpecialPages in general, and with PrefixIndex, WhatLinksHere, and Categories in particular
    * an intitle search with titles only
    I realize that end users who are post-processing can do it themselves. But currently there is no obvious way to grep titles from Search results. (Personally I have [javaScript add an edit tag](//meta.wikimedia.org/wiki/User:Cpiral/global.js) to each title, and can grep on that.) Currently the end user must do the search, go to the bottom of the page and select a higher number of results (than the twenty), have the search done again, and then undo 1) the snippet, 2) the document size and date (or category member counts), and 3) the parenthetical saying whether it matched in a category box, or matched in a redirect.
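As a stopgap for the post-processing use case, titles can be pulled from the MediaWiki web API instead of from the results page. This is a minimal Python sketch, not a substitute for the requested feature: it assumes the standard `action=query&list=search` module, the helper names are hypothetical, and fetching the URL is left to the reader.

```python
import json
from urllib.parse import urlencode

def search_api_url(query, limit=50, api="https://en.wikipedia.org/w/api.php"):
    """Build a search request URL for the MediaWiki API (list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,   # same query syntax as the search box
        "srlimit": limit,    # number of results to return
        "format": "json",
    }
    return api + "?" + urlencode(params)

def titles_only(response_text):
    """Return just the page titles from a list=search JSON response."""
    data = json.loads(response_text)
    return [hit["title"] for hit in data.get("query", {}).get("search", [])]

# Illustrative response body, shaped like the API's JSON output.
sample = '{"query": {"search": [{"title": "Green tree frog"}, {"title": "Tree frog"}]}}'
print(search_api_url('hastemplate:"Val" intitle:unit'))
print(titles_only(sample))
```

The resulting list is plain text, so two queries can be compared with an ordinary diff.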
    • Task
    Highlighting should work //with// the query, showing why the page matched, **not** //in addition to// the query, also showing where the word is stemmed even though the search query syntax explicitly requested no stemming. When stemming is turned off by placing a word in double quotes, the pages are correctly listed, but on the search results page, stems in the snippet are sometimes still indicated in bold. This only happens when turning off stemming of the //base// word. Matching works perfectly when turning off stemming for a word that is //not// a base word. Searching for `"clouds"` (in quotation marks) gives cloud clouded and **clouds**, as it should. But searching for `"cloud"` (in quotation marks) shows **cloud clouded** and **clouds**, as seen at [[https://www.mediawiki.org/wiki/special:search/%22cloud%22%20prefix:user:cpiral | "cloud" prefix:user:cpiral ]]. Because match highlighting is used not only for location, but also for learning and teaching, documenting and bug-finding, this will eventually need fixing in order to teach what "exact phrase" means. Match highlighting in general is especially important for trials in a sandbox.
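To make the expected behavior concrete, here is a small Python sketch of exact-word highlighting. It is only an illustration of the requested outcome, not how CirrusSearch or Elasticsearch highlight internally; `highlight_exact` is a hypothetical helper, and the `**` markers stand in for the bold in the snippet.

```python
import re


def highlight_exact(snippet, word):
    """Wrap only exact, whole-word occurrences of `word` in ** markers."""
    # \b on both sides keeps "clouded" and "clouds" from matching "cloud".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: "**" + m.group(0) + "**", snippet)


print(highlight_exact("cloud clouded clouds", "cloud"))
print(highlight_exact("cloud clouded clouds", "clouds"))
```

With a quoted query, only the first word would be marked in the first call and only the last word in the second.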
    • Task
    In the search box at Wikipedia this should work, but does not: `insource:/\{[Ii]nfobox unit.*inunits1 *= *(\{\{)?[^#<>[\] {}]+\|?[^|][0-9]/ prefix:A` However, 1) it works if the prefix is at least `prefix:As` (for "Astronomical unit"), and 2) it also works if `hastemplate:"infobox unit"` is added. The two things that make it work don't necessarily make sense, do they?
    • Task
    On Wikipedia: `hastemplate:"Val" insource:/\{\{[Vv]al\|[^}]*m\// prefix: :` runs in a snap. Now add an "s" at the end, for `m\/s`, and it takes so long that it times out, consistently saying "There were no results matching the query." That is a problem, because the message is obviously in error: there are plenty of pages that match the search without the "s", many of them showing the "s" clearly in the match. It should report a timeout error or some other kind of error. Now add one more character to the pattern, so that we no longer have the "/s at the end" problem: `hastemplate:"Val" insource:/\{\{[Vv]al\|[^}]*m\/s[|}]/ prefix: :` and it runs in a snap again. Mainspace has 1275 pages with template Val. Userspace has 626 pages (change to `prefix:User:`). That can't possibly be too many pages to "crawl character-by-regexp-character" through. Userspace has 10 pages that match the target, Val .*m/s. Mainspace has 73 pages that match the failed search. (It works by adding another character to the pattern.) This seems to be a throttle limit, because it takes a long time (it's slow), but most telling, it works for prefix:User:. The only difference is the number of pages. It barely works for user space. But then why not an error message? I refer you to https://discuss.elastic.co/t/how-to-protect-an-es-cluster-from-searches-that-would-kill-it/25487
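The three patterns can be compared offline with a rough Python sketch. Python's `re` module is not the Lucene regex dialect that `insource:` uses, so this only approximates the matching logic, and the sample wikitext is made up for illustration. It shows that a page matching the longest pattern necessarily matches the middle one too, which is why "no results" for the middle pattern cannot be right.

```python
import re

# Hypothetical sample wikitext, not taken from any real page.
SAMPLE = "The speed is {{val|343|u=m/s}} at sea level."

# The three insource patterns from the report, minus the surrounding slashes.
PATTERNS = {
    "m/": r"\{\{[Vv]al\|[^}]*m\/",
    "m/s": r"\{\{[Vv]al\|[^}]*m\/s",
    "m/s[|}]": r"\{\{[Vv]al\|[^}]*m\/s[|}]",
}


def matching_patterns(wikitext):
    """Return the labels of all patterns found anywhere in the wikitext."""
    return [label for label, pattern in PATTERNS.items()
            if re.search(pattern, wikitext)]


print(matching_patterns(SAMPLE))
```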
    • Task
    The searches `Prefix:-`, `prefix:_`, and `prefix:'` all produce the same results: the set of all titles that start with - or '. For example, the search `Prefix:_F` on Wikipedia: https://en.wikipedia.org/w/index.php?title=Special:Search&profile=all&search=prefix:_F&fulltext=Search It's also impossible to find titles that begin with double quotes: `prefix:"`. (Yet I can create such a page, and such pages do exist.) All the other characters work fine.
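For side-by-side comparison of these cases, a small Python sketch can generate the Special:Search links with the special characters percent-encoded. The URL shape is copied from the example above; `prefix_search_url` and its `host` parameter are hypothetical names used only for this illustration.

```python
from urllib.parse import urlencode


def prefix_search_url(prefix, host="en.wikipedia.org"):
    """Build a Special:Search link for a prefix: query, percent-encoding the prefix."""
    params = {
        "title": "Special:Search",
        "profile": "all",
        "search": "prefix:" + prefix,
        "fulltext": "Search",
    }
    return "https://" + host + "/w/index.php?" + urlencode(params)


# One link per problem character, for manual comparison of the result sets.
for char in ["-", "_", "'", '"']:
    print(prefix_search_url(char))
```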
    • Task
    • ·Closed
    There is no way to filter insource:/"<big></big>"/, for example. Filters for regex are very important to have. The search "big big" will not look insource, so it cannot act as a filter. (But it can find repeating words.) The searches insource:"big" and insource:"big big big" are equivalent, and a weak filter. Unlike "big big big", which uses proximity zero, insource turns ''off'' proximity. If insource at least had proximity zero, it could find phrases, which would greatly serve the filtering needs of regex. As it is, the quotes are misleading, as insource cannot even find two words next to each other. Insource:"big big" is the same as insource:big insource:big. The problem is the inability to find repeating words. Maybe it's just a feature request, but we're talking about supporting a rare public tool, regex. These searches would run much, much faster if they had the ability to filter on word phrases. Insource should at least set proximity to zero.
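The difference between the two behaviors, as this report describes them, can be sketched in Python. This is a toy model with a naive tokenizer, not the actual CirrusSearch implementation: `matches_all_words` stands in for the reported insource behavior (every word present, anywhere), and `matches_phrase` stands in for the requested proximity-zero behavior (words adjacent and in order).

```python
import re


def tokenize(text):
    """Naive tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def matches_all_words(text, query):
    """Reported behavior: every query word occurs somewhere in the text."""
    tokens = set(tokenize(text))
    return all(word in tokens for word in tokenize(query))


def matches_phrase(text, query):
    """Requested behavior: query words occur adjacent and in order."""
    tokens = tokenize(text)
    wanted = tokenize(query)
    n = len(wanted)
    return n > 0 and any(tokens[i:i + n] == wanted
                         for i in range(len(tokens) - n + 1))


page = "<big>one</big> word"
print(matches_all_words(page, "big big"))  # the weak filter lets this page through
print(matches_phrase(page, "big big"))     # a phrase filter would reject it
```

A phrase filter of this kind would hand far fewer candidate pages to the regex stage.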
    • Task
    • ·Closed
    It turns out that when a template's edit tab says "View source" instead of "Edit", you have to read the /doc page directly if you want to edit ''sections'' of what you read. If you just want to edit the entire /doc, you can do that from the main page by pressing "edit" in the doc box; otherwise, to edit sections you have to press "view" in the doc box and read the /doc directly. So it's a bug. Great!