Page MenuHomePhabricator

Scale AFT tables to support 10% deploy
Closed, ResolvedPublic

Description

Author: tchay

Description:
Worried about the load on the DB for something that will generate as much traffic as the revision text. Currently revision text data scales only because old revisions get archived into External Storage (http://www.mediawiki.org/wiki/External_Storage).

First we need to estimate how much load/data we are generating currently. To do that Asher needs the following numbers:

  • proposed site %: 10% on July 3
  • current % of site that has AFTv5 on
  • current rate of new feedback creation
  • an estimate of the % of feedbacks that have comments > 255 characters

Since it was determined to 255bytes is not an acceptable comment input limit that can be surfaced in the UI. We consider the following possibilities

  1. If the rate is small enough, it may be possible to close this bug without addressing the issue at all
  2. Most likely, we should scale therefore we propose the following changes be added to the code before July 3 ramp-up
  • Add table af_article_answer_text with id's aa_article_answer_text_id
  • Change the schema of aft_article_answer to add a row aa_article_answer_text_id of type integer unsigned (foreign key into above). is_null is allowed
  • Change type of aa_response_text to varchar(255)
  • Change code to do a join against the above
  • When saving, if text > 255 bytes, save to new table and add link plus empty aa_response_text field to the db
  • (Optional: migrate/upgrade old text files > 255 bytes)
  1. If (2) isn't good enough then we're s.o.l. until August when sharding becomes available.

Version: unspecified
Severity: normal

Details

Reference
bz37707

Event Timeline

bzimport raised the priority of this task from to Unbreak Now!.Nov 22 2014, 12:21 AM
bzimport set Reference to bz37707.

tchay wrote:

I'm adding a dependency on Bug #37616 because that bug can go out first, but this one should be addressed before July 3.

tchay wrote:

I'm adding a dependency on Bug #37616 because that bug can go out first, but this one should be addressed before July 3.

Thanks, Terry, much appreciated!

Here are some quick answers to your questions (I will also add them to the Bugzilla ticket)

  • current % of site that has AFTv5 on: 0.65% (22,000 articles on en-wiki)
  • current rate of new feedback creation: about 20,000 feedback posts per month

I am looping Dario in, so he can give us his estimate of the % of feedback posts that have comments that exeed 255 characters.

According to Dario, 2% of article feedback posts include a comment that exceeds 255 characters.

Here are some estimates of overall volume for the next two releases, based on this data.

Limited Release:
July 3, 2012
10% of en-wiki
200,000 posts / month
140,000 posts w/ comments
4,000 comments over 255 characters

Full Release:
Sep. 4, 2012
100% of en-wiki
2,000,000 posts / month
1,400,000 posts w/ comments
40,000 comments over 255 characters

What scalability plan do you recommend, based on these estimates?

"- When saving, if text > 255 bytes, save to new table and add link plus empty
aa_response_text field to the db"
-> I didn't put an empty response_text field, but truncated the text to 255 so we can also get a small part in that column.
This would allow us to easily drop the join to this new table on overview pages, where we don't really need the full description.

Other than that, executed as suggested (incl. update of existing large texts); on Gerrit (https://gerrit.wikimedia.org/r/#/c/12385/) & prototype.

tchay wrote:

That's a great idea.

afeldman wrote:

Agree, that's a great idea.

Can we also get profiler hooks added to the extension around major functions? MobileFrontend is a good example of a well profiled extension, or see the FileBackendStore class in core.

Thanks, Matthias!

I notice that on the feedback page, some comments exceeding 255 characters are truncated in mid-sentence, which is a bit confusing.

For the next deployment, could we add "... More >" after the 255th character for these long comments?

The 'More' link would link to the permalink page (just like 'Details'), which will make it easier for the reader to see the complete comment (they might not think to click on 'Details', which is several lines above the comment and not as obvious as it once was).

Other than that, I am glad we were able to fit this in before the wider deployment. Thanks for doing it!

tchay wrote:

Apparently AFTv5 currently has the More link at 500 characters so it needs to be moved up so it actually kicks in on the list view. :-)