Improve GUC database handling
Closed, Resolved · Public

Description

Ref T182916: Database error: Unable to connect to s7.web.db.svc.eqiad.wmflabs

  • Ensure all PDO statement objects are dereferenced after use so that early close is actually possible.
  • Track connection count and log it.
  • Add method for closing a connection.
  • Add early close where possible
    • Use s1 for meta_p.wiki query (documented to be available on all db hosts)
    • Add explicit close at end of web request.
    • Add early close for CentralAuth db, only used for one query.
    • Sort all-wiki queries by slice so that we can open/close per batch and never have a connection open to more than one slice at a time. Not done for now, given that the IP-based solution has already reduced connections from 9 to 1. This can be revisited later if the minimum number of required connections rises above 1.
  • Figure out a way to make centralauth use s7, without hardcoding it. https://gerrit.wikimedia.org/r/408751
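
The first four items (release statement handles, count connections, provide a close method, close early) can be sketched as below. This is only an illustration in Python, with the standard library's sqlite3 standing in for the replica databases; the tool itself uses PDO, and the names ConnectionTracker and fetch_one are hypothetical, not taken from the GUC code base.

```python
import sqlite3


class ConnectionTracker:
    """Hypothetical helper: opens connections lazily, counts them, closes on request."""

    def __init__(self, connect=lambda name: sqlite3.connect(":memory:")):
        self._connect = connect   # assumption: factory taking a logical db name
        self._open = {}           # name -> live connection
        self.opened_total = 0     # every connection ever opened
        self.peak_open = 0        # highest number open at the same time

    @property
    def open_count(self):
        return len(self._open)

    def get(self, name):
        conn = self._open.get(name)
        if conn is None:
            conn = self._connect(name)
            self._open[name] = conn
            self.opened_total += 1
            self.peak_open = max(self.peak_open, len(self._open))
        return conn

    def close(self, name):
        # Closing a name that is not open is a no-op.
        conn = self._open.pop(name, None)
        if conn is not None:
            conn.close()

    def close_all(self):
        for name in list(self._open):
            self.close(name)


def fetch_one(tracker, name, sql, params=()):
    """Run one query and release the cursor before returning the row."""
    cursor = tracker.get(name).execute(sql, params)
    try:
        return cursor.fetchone()
    finally:
        cursor.close()  # no statement handle outlives the call


tracker = ConnectionTracker()
row = fetch_one(tracker, "centralauth", "SELECT 1")
tracker.close("centralauth")  # only one query needed, so close right away
tracker.close_all()           # explicit close at the end of the request
```

Because fetch_one never hands a cursor back to the caller, nothing can keep the connection alive after close() is called, which is the point of the first item in the list.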

Event Timeline

Krinkle renamed this task from Improve database connection handling to Improve GUC database handling. Feb 4 2018, 5:02 AM
Krinkle triaged this task as High priority.

Change 408025 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Consistently dereference PDO statement objects after use

https://gerrit.wikimedia.org/r/408025

Change 408025 merged by jenkins-bot:
[labs/tools/guc@master] Consistently dereference PDO statement objects after use

https://gerrit.wikimedia.org/r/408025

Change 408719 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Implement App::closeDD() method and use to close 'centralauth' conn early

https://gerrit.wikimedia.org/r/408719

Change 408722 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Add debug logging for connection counts

https://gerrit.wikimedia.org/r/408722

Change 408719 merged by jenkins-bot:
[labs/tools/guc@master] Implement App::closeDD() method

https://gerrit.wikimedia.org/r/408719

Change 408722 merged by jenkins-bot:
[labs/tools/guc@master] Add debug logging for connection counts

https://gerrit.wikimedia.org/r/408722

Current worst-case connection stats (e.g. for a user who has non-zero edits on at least one wiki in each section):

Connections opened: 9
Highest connection count: 9
Finish
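
For readers unfamiliar with the two numbers, here is a rough sketch of how "Connections opened" and "Highest connection count" differ. It is a Python illustration only; ConnectionStats and simulate are hypothetical names, and the loop merely imitates nine connections rather than talking to any database.

```python
class ConnectionStats:
    """Hypothetical counter for opened and concurrently open connections."""

    def __init__(self):
        self.opened = 0    # total connections opened during the request
        self.current = 0   # connections open right now
        self.peak = 0      # highest value `current` ever reached

    def on_open(self):
        self.opened += 1
        self.current += 1
        self.peak = max(self.peak, self.current)

    def on_close(self):
        self.current -= 1

    def report(self):
        return [
            f"Connections opened: {self.opened}",
            f"Highest connection count: {self.peak}",
        ]


def simulate(connection_count, close_after_each):
    """Open `connection_count` connections, optionally closing each before the next."""
    stats = ConnectionStats()
    for _ in range(connection_count):
        stats.on_open()
        if close_after_each:
            stats.on_close()
    return stats


held_open = simulate(9, close_after_each=False)   # the worst case reported above
closed_early = simulate(9, close_after_each=True)
```

With nothing closed early, both counters reach 9, matching the stats above. Closing each connection before opening the next leaves the opened total at 9 but brings the peak down to 1.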

Let's start optimising :)

Change 408724 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Close 'centralauth' connection early

https://gerrit.wikimedia.org/r/408724

Change 408724 merged by jenkins-bot:
[labs/tools/guc@master] Close 'centralauth' connection early

https://gerrit.wikimedia.org/r/408724

Change 408725 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Explicitly try to close connections before end of response

https://gerrit.wikimedia.org/r/408725

Change 408726 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Close connections earlier (after getData, instead of after output)

https://gerrit.wikimedia.org/r/408726

Change 408725 merged by jenkins-bot:
[labs/tools/guc@master] Explicitly try to close connections before end of response

https://gerrit.wikimedia.org/r/408725

Change 408726 merged by Krinkle:
[labs/tools/guc@master] Close connections earlier (after getData, instead of after output)

https://gerrit.wikimedia.org/r/408726

Krinkle updated the task description.

Change 408745 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Omit USE/$dbname for multi-wiki and 'centralauth' queries

https://gerrit.wikimedia.org/r/408745

Change 408745 merged by jenkins-bot:
[labs/tools/guc@master] Omit USE/$dbname for multi-wiki and 'centralauth' queries

https://gerrit.wikimedia.org/r/408745

Change 408751 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Re-use connections based on IP instead of hostname

https://gerrit.wikimedia.org/r/408751

Change 408751 merged by jenkins-bot:
[labs/tools/guc@master] Re-use connections based on IP instead of hostname

https://gerrit.wikimedia.org/r/408751
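
The idea behind this change can be sketched as follows: if several hostnames resolve to the same address, keying the connection cache by IP lets them share one connection. This is a Python illustration under stated assumptions, not the tool's code. IpKeyedConnections, the example hostnames, and the 192.0.2.x addresses (a documentation-only range) are all hypothetical, and the resolver is injected so the example runs without network access.

```python
import socket


class IpKeyedConnections:
    """Hypothetical cache that re-uses one connection per resolved IP address."""

    def __init__(self, connect, resolve=socket.gethostbyname):
        self._connect = connect   # assumption: callable taking an IP string
        self._resolve = resolve   # hostname -> IP; real DNS lookup by default
        self._by_ip = {}

    def get(self, hostname):
        ip = self._resolve(hostname)
        if ip not in self._by_ip:
            self._by_ip[ip] = self._connect(ip)
        return self._by_ip[ip]

    def __len__(self):
        return len(self._by_ip)


# Made-up DNS table: two names share a host, a third lives elsewhere.
FAKE_DNS = {
    "s1.db.example": "192.0.2.10",
    "s7.db.example": "192.0.2.10",
    "s3.db.example": "192.0.2.11",
}

opened = []


def fake_connect(ip):
    opened.append(ip)     # record each real connection attempt
    return {"ip": ip}     # stand-in for a database handle


pool = IpKeyedConnections(fake_connect, resolve=FAKE_DNS.__getitem__)
a = pool.get("s1.db.example")
b = pool.get("s7.db.example")   # same IP as s1, so the handle is shared
c = pool.get("s3.db.example")   # different IP, so a second connection opens
```

When every section hostname resolves to the same address, a cache like this opens a single connection for the whole request, which is consistent with the drop from 9 connections to 1 noted in the task description.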

Change 408752 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Improve logging about closed connections

https://gerrit.wikimedia.org/r/408752

Change 408752 merged by jenkins-bot:
[labs/tools/guc@master] Improve logging about closed connections

https://gerrit.wikimedia.org/r/408752

Change 408755 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[labs/tools/guc@master] Fix bug with overly eager connection closing

https://gerrit.wikimedia.org/r/408755

Change 408755 merged by Krinkle:
[labs/tools/guc@master] Fix bug with overly eager connection closing

https://gerrit.wikimedia.org/r/408755

253523 renamed this task from Improve GUC database handling to ☬ʄທയஆടஷະ☬のລັບ☠ ☬ ʄທയஆടஷະ┅─ ★͜͡ ʄທയஆടஷະ┅─. Feb 12 2018, 10:28 AM
253523 reopened this task as Open.
253523 raised the priority of this task from High to Needs Triage.
253523 set the point value for this task to 408751.
Aklapper renamed this task from ☬ʄທയஆടஷະ☬のລັບ☠ ☬ ʄທയஆടஷະ┅─ ★͜͡ ʄທയஆടஷະ┅─ to Improve GUC database handling. Feb 12 2018, 1:25 PM
Aklapper closed this task as Resolved.
Aklapper triaged this task as High priority.
Aklapper removed the point value for this task.