Tue, May 21
Mon, May 20
Sun, May 19
Thu, May 16
Wed, May 15
@Gilles done, let me know if you need more help.
@Krinkle sorry for the delay in merging this. LGTM now, please reopen if there are issues.
Tue, May 14
Mon, May 13
I am stalling this for now until we see how T220811 pans out.
@Krinkle After a little more digging, it looks like nginx on hassium is using HTTP/1.0 when forwarding requests to mwdebug*. My theory is that when HHVM returns a 500 and the protocol version is HTTP/1.0, it adds the Content-Length: 0 header. Varnish in turn sees Content-Length: 0, so it returns its own generic error, since it believes there is no data to serve.
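A rough sketch of that theory as a toy model. This is not the actual HHVM or Varnish logic: the function names, the `GENERIC_ERROR` placeholder, and the exact conditions are assumptions made up for illustration only.

```python
# Toy model of the theory above. All names here are hypothetical;
# nothing below is taken from HHVM, nginx, or Varnish source code.

GENERIC_ERROR = "generic cache error page"  # placeholder for the cache's own error


def backend_headers(status, http_version):
    """Assumed backend behaviour: a 5xx answer to an HTTP/1.0 request
    is sent with an explicit Content-Length: 0 header."""
    headers = {}
    if status >= 500 and http_version == "HTTP/1.0":
        headers["Content-Length"] = "0"
    return headers


def cache_response(status, headers, body):
    """Assumed cache behaviour: an error response that declares an empty
    body is replaced with the cache's own generic error page."""
    if status >= 500 and headers.get("Content-Length") == "0":
        return GENERIC_ERROR
    return body


def fetch_through_cache(status, http_version, body):
    """Chain the two assumed behaviours together."""
    headers = backend_headers(status, http_version)
    return cache_response(status, headers, body)
```

Under these assumptions, the same 500 with the same body only reaches the client when the proxy speaks HTTP/1.1 to the backend; with HTTP/1.0 the client gets the generic error instead.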
Fri, May 10
After some debugging on hassium, I found some interesting things:
We have upgraded php7 on beta, so it now looks like async jobs are running. We will leave it as is until next week, when we will assess whether it worked out.
@Dzahn THANK YOU! 😍
Thu, May 9
What is a bit weirder is that this works as expected with PHP7.
Wed, May 8
It looks like deployment-prep has an older php7.2 version than production, which is something we should fix as well.
Tue, May 7
@Gilles All packages have been rebuilt and added to buster-wikimedia main repo. Please reopen if we have any issues.
Mon, May 6
Sat, May 4
Fri, May 3
@Lucas_Werkmeister_WMDE We will look into it. It only happened on a single server, so for now we believe it may not be related to the change per se.
Thu, Apr 25
Wed, Apr 24
We have pushed https://gerrit.wikimedia.org/r/502986 and (its update) https://gerrit.wikimedia.org/r/505383/ to production. Are we going to proceed with session and mysql settings or should we mark this task as resolved?
@Dzahn I need to talk with our team before I green-light this, as also mentioned in T221132. Is it possible to revisit this a week from now? Thank you!
Apr 23 2019
All traffic is served by haproxy. If we have any issues, this can be easily reverted. Closing for now.
@Gilles ping me when you think we are ready to build packages for buster; we would do it soon either way.
I am happy to do it, but I am afraid it will have to wait until next week, as this is a short week for me :)
@Krinkle I have stopped and disabled xenon-log, excimer-log, and apache on mwlog* servers, and I have removed the arclamp-generate-svgs cron job. Let me know if there is anything else related to change 503675.
Apr 22 2019
@Gilles thank you! I added the relevant codfw ones
Apr 21 2019
I am afraid I do not know enough about the services on this server either to perform any actions. @Krinkle Is there something we can do for the time being? What problems are we having while this server is in this state?
Apr 20 2019
Apr 19 2019
@Gilles This is fixed now, though I will revert to nginx for the weekend. We do have data from today that we can work with.
Apr 16 2019
@CDanis Thank you! I am resolving this for now.
Apr 15 2019
@JoKalliauer Since last week, all servers have been using librsvg 2.40.20-3, which fixes some SVG rendering issues (and possibly introduces others).
Apr 10 2019
@Krinkle yeah, we will wait for sure; meanwhile, we are exploring :)