May 25 - May 29
- Had a meeting with Nikerabbit and Nemo_bis, mainly to discuss T98665.
- Work in progress on T100175 (reuse TUX elements) and case-sensitive search (T100013).
- Investigating a better approach to displaying translated and untranslated messages:
  - Fetch the whole result set at once, without the size parameter: this makes displaying translated messages easy, but may prove inefficient as the number of documents grows. If the required results are found within the first 100 hits or so, fetching the rest is wasted work.
  - Fetch 100 results (apply the size parameter); if a minimum number of translations (e.g. 25) is not found, search for the next 100, and so on: this avoids unnecessary fetches, but may require quite a number of search iterations in the worst case.
  - Data denormalization, which looks better than the two approaches above: add a new field to each document listing the languages for which translations exist. The same field can be used with a missing query to find untranslated messages. This would also require code changes in the translation update path.
- As per discussion, review in progress for submitted patch T97961.
- Sorted the group list by search result count instead of alphabetically (T100393).
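The second approach listed above (fetch 100 results at a time until enough translations are found) can be sketched roughly as follows. This is only an illustration in Python, not the code in the Translate extension: `collect_translated`, `fake_search` and the `translated` flag on each hit are hypothetical names, and the in-memory list stands in for a search backend that accepts an offset and a size.

```python
from typing import Callable, Dict, List

Hit = Dict[str, str]


def collect_translated(
    search: Callable[[int, int], List[Hit]],
    is_translated: Callable[[Hit], bool],
    batch_size: int = 100,
    minimum: int = 25,
) -> List[Hit]:
    """Fetch hits batch by batch until `minimum` translated hits are found."""
    translated: List[Hit] = []
    offset = 0
    while len(translated) < minimum:
        batch = search(offset, batch_size)
        if not batch:
            break  # no more results: return whatever was found
        translated.extend(hit for hit in batch if is_translated(hit))
        offset += batch_size
    return translated[:minimum]


# Toy data (assumption): every third message has a translation.
HITS: List[Hit] = [
    {"localid": f"msg-{i}", "translated": "yes" if i % 3 == 0 else "no"}
    for i in range(250)
]


def fake_search(offset: int, size: int) -> List[Hit]:
    """Stand-in for a search call with offset and size parameters."""
    return HITS[offset:offset + size]


def has_translation(hit: Hit) -> bool:
    return hit["translated"] == "yes"


result = collect_translated(fake_search, has_translation)
```

With this toy data the first batch already contains enough translated hits, so only one search is issued. Raising `minimum` shows the worst case mentioned above, where the loop keeps searching until the results run out.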
June 1 - June 5
- Added support for translated and untranslated messages in Special:SearchTranslations. Although this approach was efficient, indexing the extra translated and untranslated language codes for each message consumes too much space, so it is not an option for now.
- We should try to use MessageCollection if we cannot come up with a better approach.
June 8 - June 12
- The procedure mentioned above would require updating more than 300 documents for a single translation, compared with one document update.
- Discussed a better solution with the mentors: if we are unable to retrieve the required results from the documents indexed in ES, we may use MessageCollection (which requires a database hit) as a last resort to collect the data.
- Algorithm currently being worked on to filter translated and untranslated messages (no extra fields are required):
  - Step 1: use a "Filtered query" to search for a string in the source language and collect the 'localid' and 'scores' values (the scores are kept so that the documents do not lose their relevance).
  - Step 2: find the translated messages among the collected 'localid' values and use a function score script to replace the scores of the returned documents with the scores from step 1.
  - Step 3: the ids from step 1 that are not found in the list from step 2 are the untranslated messages.
- Fuzzy messages are counted among the untranslated messages here, because we currently do not keep indexed documents for fuzzy messages. To keep track of fuzzy messages, we therefore have to separate them out of the untranslated messages retrieved in step 3. We can achieve this with MessageCollection->filterFuzzy(). The main problem is efficiency: retrieving fuzzy messages involves steps 1, 2 and 3 in sequence.
- I think what might work is to keep fuzzy messages in the index instead of deleting them and, to help us filter fuzzy messages out of the translated ones, add an extra field fuzzy with no data. (If we are totally against adding an extra field because of space constraints, we may try a different format for localid, e.g. fuzzy-MediaWiki:Config, though I am not sure about this, as it may break some functionality.)
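The filtering algorithm described above (search in the source language, look up translations by 'localid', then treat the remaining matches as untranslated) can be sketched in memory as follows. This is a simplified illustration, not the ElasticSearch queries themselves: the `language` and `content` field names, the toy scoring, and all function names are assumptions made for the example; only 'localid' comes from the notes above.

```python
from typing import Dict, List, Set, Tuple

Document = Dict[str, str]

# Toy index (assumption): one document per message per language.
INDEX: List[Document] = [
    {"localid": "MediaWiki:Config", "language": "en", "content": "Configuration settings"},
    {"localid": "MediaWiki:Config", "language": "fi", "content": "Asetukset"},
    {"localid": "MediaWiki:Save", "language": "en", "content": "Save all settings"},
    {"localid": "MediaWiki:Help", "language": "en", "content": "Help"},
]


def search_source(index: List[Document], query: str, source: str) -> Dict[str, float]:
    """Step 1: map localid -> score for source-language documents matching the query."""
    scores: Dict[str, float] = {}
    for doc in index:
        if doc["language"] != source:
            continue
        words = doc["content"].lower().split()
        hits = words.count(query.lower())
        if hits:
            scores[doc["localid"]] = hits / len(words)  # toy relevance score
    return scores


def find_translated(
    index: List[Document], scores: Dict[str, float], target: str
) -> List[Tuple[str, float]]:
    """Step 2: translations of the step 1 matches, ranked by the step 1 scores."""
    found = [
        (doc["localid"], scores[doc["localid"]])
        for doc in index
        if doc["language"] == target and doc["localid"] in scores
    ]
    return sorted(found, key=lambda pair: pair[1], reverse=True)


def find_untranslated(
    scores: Dict[str, float], translated: List[Tuple[str, float]]
) -> Set[str]:
    """Step 3: step 1 matches that have no document in the target language."""
    return set(scores) - {localid for localid, _ in translated}


scores = search_source(INDEX, "settings", "en")
translated = find_translated(INDEX, scores, "fi")
untranslated = find_untranslated(scores, translated)
```

In this sketch a fuzzy message would have no document in the target language, so it would end up in `untranslated`, which is the limitation discussed above.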
June 15 - June 21
- Code review in progress for T97943.
- Read documentation for MessageCollection and its usage.
- Submitted a patch to support cross-language search, displaying translated messages using MessageCollection: https://gerrit.wikimedia.org/r/#/c/218859/
June 22 - June 28
- An important unplanned task was completed. https://gerrit.wikimedia.org/r/#/c/219388/
- API for search translations T100176
- Allow wildcard search T100345
- Support for untranslated messages using MessageCollection https://gerrit.wikimedia.org/r/#/c/220447/
- Investigating parent/child mapping to find a better way to index related documents and to provide user-friendly faceted navigation.
June 29 - July 3
- Midterm evaluation T103155
- Integrate search features here
- Deploy code in Labs-instance.
- User Interface enhancements.
- Read the Elasticsearch documentation.
July 4 - July 10
- Updated patch to include URL parameter 'sourcelanguage'. https://gerrit.wikimedia.org/r/#/c/222074/
- Advertised Search Translations T105140
- Toggle group filter. https://gerrit.wikimedia.org/r/#/c/223733/
- Allow search for messages that contain all the words of the search string, i.e., AND operator. T100346
- Designed a mockup (here) of an interface for the different search parameters.
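The AND behaviour listed above for T100346 (a message matches only if it contains every word of the search string) can be illustrated with a small sketch. This shows the matching semantics only, on plain strings; it is not the query used in the patch, and `matches_all_words` and `search_all_words` are hypothetical helper names.

```python
from typing import List


def matches_all_words(query: str, message: str) -> bool:
    """Return True if every word of the query occurs as a word in the message."""
    words = set(message.lower().split())
    return all(term in words for term in query.lower().split())


def search_all_words(query: str, messages: List[str]) -> List[str]:
    """Keep only the messages that contain all the query words (AND semantics)."""
    return [message for message in messages if matches_all_words(query, message)]


messages = ["Save the page", "Save settings", "Page settings saved"]
found = search_all_words("save page", messages)
```

Here only "Save the page" is kept, because the other two messages each lack one of the query words as a whole word.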
July 13 - July 19
- Allow autocompletion for search operators T98559
- Get outdated messages T101221
- Customized facets based on translated and untranslated messages https://gerrit.wikimedia.org/r/#/c/225277/
July 20 - July 26
- Allow highlighting for multi-fields, merged with the wildcard search patch https://gerrit.wikimedia.org/r/#/c/220097/
- Search messages containing exact phrase. https://gerrit.wikimedia.org/r/#/c/226289/
- Search with exact title match T62570
- Updated Search Translations
July 27 - Aug 2
- Review in progress for T100345
- Developed API module for translated, untranslated and outdated messages T106931
- Worked on UI improvement T106319
Aug 3 - Aug 9
- Merged patch for T100393
- Review in progress for https://gerrit.wikimedia.org/r/#/c/220447/
- Learned to create dependencies between changes in Gerrit and updated all the patches that depend on others.
- Code submission for T100175
Aug 10 - Aug 16
- Created CrossLanguageTranslationSearchQuery class for cross language search. https://gerrit.wikimedia.org/r/#/c/230769/
- Merged patches for T101220, T62570
- Improved code for various patches.
- Rebased patches.
Aug 17 - Aug 23
- Had meetings with @Nikerabbit and @Nemo_bis to improve the user interface.
- Developed the feature to allow users to search for results containing all query words. T100346
- User interface improvements T98560
- Added 'filter' search operator T97944
- Merged patch for T100175, T106931
- Documentation added at https://www.mediawiki.org/wiki/Help:Extension:Translate/Search
- Writing wrap-up report
- Updating Labs-instance
- Merged patch for T100013
- Small fix: https://gerrit.wikimedia.org/r/#/c/233097/ to show the selected language at the top left, even when it has no results.