Added pana to cron jobs on dev and prod; kicked off a run on prod to get it populated. Deployed the new index page to production. Check back in a few hours (both @MBinder_WMF and @JAufrecht) to see if it works.
Tue, Jul 17
Mon, Jul 16
Duplicate isn't a separate status in Phabricator; it's just "resolved" as far as the status is concerned.
On Dev, the report hasn't run since 2016. On Prod, the report last ran Jul 15, 2018, the velocity chart goes back to Oct 2017 (in Quarter view), and the database contains data back only to 2017-10-02. A lot of the charts are broken, but a) that's a separate issue and b) it's probably due to lack of chartable data as configured, because the 'show hidden' charts, with more data, do work.
Debugging notes: Task 182319 has status resolved for all dates. The only sources for status in the data are a March 7 2018 transaction changing it from open to resolved, and status_at_load = resolved in the maniphest_task row. Interpretation: the range of dates being loaded doesn't include the creation transaction, so Phlogiston has no record of the task's status prior to March 7, and so it backfills the missing status with status_at_load, which is the wrong value for this purpose.
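One way to picture the backfill problem described above is the sketch below. It is not Phlogiston's code: the function name, the tuple layout (date, old value, new value), and the choice to backfill from the earliest transaction's old value are all assumptions made for illustration. The idea is that when the creation transaction is outside the loaded range, the old value of the first known status transaction is better evidence for earlier dates than status_at_load.

```python
from datetime import date, timedelta


def reconstruct_status(transactions, status_at_load, start, end):
    """Return {day: status} for every day from start to end inclusive.

    transactions: iterable of (day, old_value, new_value) tuples
    (a hypothetical layout, not Phlogiston's schema).
    """
    txns = sorted(transactions, key=lambda t: t[0])
    if txns:
        # Before the first known change, assume the task held that
        # transaction's old value rather than the status at load time.
        current = txns[0][1]
    else:
        # No status transactions at all: status_at_load is the only evidence.
        current = status_at_load

    history = {}
    i = 0
    day = start
    while day <= end:
        # Apply every transaction dated on or before this day.
        while i < len(txns) and txns[i][0] <= day:
            current = txns[i][2]
            i += 1
        history[day] = current
        day += timedelta(days=1)
    return history
```

With a single open-to-resolved transaction on March 7 2018 and status_at_load = resolved, this sketch reports open for dates before March 7 and resolved from March 7 onward.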
Fri, Jun 29
May 30 2018
May 4 2018
May 2 2018
Apr 30 2018
Confirming: automated Phlogiston run from last night has fresh data.
Apr 25 2018
@Dzahn The automated nightly dump doesn't seem to be working, as the file didn't update Tuesday or Wednesday:
Apr 24 2018
@Dzahn I see the fresh date in the file; starting a data run now to test the contents. Thanks.
Apr 23 2018
passed basic test
Re-testing showed it was not a data processing error. Code investigation identified the problem: the maniphest_task table had been moved from reconstruction to load, but the load tables are dropped and rebuilt on each load, whereas maniphest_task is a cumulative reconstruction table. Moving the table back from load to reconstruction fixed the problem; confirmed after testing.
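The difference between a load table and a cumulative table can be illustrated with the sketch below. It uses Python's built-in sqlite3 purely as a stand-in database, and the table names task_load and task_cumulative are hypothetical, not Phlogiston's schema. The load table is dropped and rebuilt on every run, so it only ever holds the latest batch; the cumulative table is created once and keeps accumulating rows.

```python
import sqlite3


def run_load(conn, tasks):
    """Simulate one load run. tasks is a list of (id, status) tuples."""
    # Load-stage table: dropped and rebuilt on each run, so earlier
    # batches are lost.
    conn.execute("DROP TABLE IF EXISTS task_load")
    conn.execute("CREATE TABLE task_load (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO task_load VALUES (?, ?)", tasks)

    # Cumulative table: created only if missing, then updated in place,
    # so rows from earlier runs survive.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_cumulative "
        "(id INTEGER PRIMARY KEY, status TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO task_cumulative VALUES (?, ?)", tasks)
    conn.commit()
```

After two runs with different batches, task_load contains only the second batch while task_cumulative contains both, which is why a cumulative table placed in the load stage loses its history.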
The dump hasn't been updated since April 1, 2018. Perhaps it wasn't automated?
Apr 19 2018
- is because there are no transactions of interest to Phlogiston (edge changes, status changes)
- has been narrowed down to the SQL function fix_status(), and appears to be because of sloppy syntax on a SQL query.
At least two different problems here:
- 190686 has no transactions loaded in phlogiston, although there are transactions in the dump
- All of 190686's task_on_date entries have status = resolved, even though there are no transactions and the status on load is 'open'
Apr 18 2018
After multiple rebuilds and the fixing of unrelated bugs, 190765 now shows as open, but 190686 is still falsely shown as resolved.
Apr 13 2018
Check on dev again.
This seems to be an accident of partial data processing on the dev server rather than a data-deleting bug. Re-doing complete runs to confirm.
Apr 5 2018
Apr 4 2018
These all appear to be fixed.
Apr 2 2018
I think this fix has passed enough spot testing to be generally considered resolved. Thanks, @mmodell.
This task appears to be Screep but is listed as In-Scope: https://phabricator.wikimedia.org/T175877
My understanding in this case is that anything added to the Kanbanana board after 2017-12-14 should show as Screep, and this was added 2017-12-20.
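The rule as understood above can be written down as a small check. This is only a sketch of that understanding; the function name and the label strings are assumptions, not Phlogiston's actual scope logic.

```python
from datetime import date


def classify_scope(added_on, cutoff):
    """Tasks added to the board after the cutoff date count as Screep."""
    return "Screep" if added_on > cutoff else "In-Scope"


# The task in question: added 2017-12-20, cutoff 2017-12-14.
label = classify_scope(date(2017, 12, 20), date(2017, 12, 14))
```

Under this reading, label is "Screep" for the task above, which is why the In-Scope listing looks wrong.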
It shouldn't make any difference; the code works with both formats. I haven't yet identified anything that can't be done with the new format, but I'm still checking.
I've fixed a number of obvious bugs in the new code; still quality-testing the results.
Mar 27 2018
Mar 26 2018
I found the missing indexes and put them in and tested over the weekend; processing time is back to what it was before the change. Some of the data looks normal on dev and some doesn't, but this may be due to project-specific norms. Production ran but the data doesn't look right at all. Will be following up with active users to debug.
Mar 23 2018
I believe I have it correctly importing on the dev server, handling both forms of edge transaction. A total of three transactions don't fit the pattern and I'll follow up on those separately. I am as yet unable to validate through to the end because several of the post-processing steps now take 10 to 100x longer for no obvious reason, so I'm working on optimizing again.
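Handling both forms in one place might look roughly like the sketch below. The two shapes are assumptions for illustration: one form is taken to be a JSON object keyed by project PHID, the other a plain JSON list of PHIDs. The function name is hypothetical and this is not the code that was deployed.

```python
import json


def project_phids(raw_value):
    """Return the project PHIDs named in one edge transaction value."""
    value = json.loads(raw_value) if isinstance(raw_value, str) else raw_value
    if not value:
        # Empty or null content: no projects can be recovered.
        return []
    if isinstance(value, dict):
        # Assumed dict form: keys are the PHIDs.
        return list(value.keys())
    if isinstance(value, list):
        # Assumed list form: the items themselves are the PHIDs.
        return [item for item in value if isinstance(item, str)]
    # Anything else does not fit either pattern and needs manual follow-up.
    raise ValueError(f"unrecognized edge value: {value!r}")
```

Raising on unrecognized shapes keeps the handful of transactions that don't fit the pattern visible instead of silently dropping them.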
Mar 20 2018
Mar 19 2018
Incorrectly closed. Programs do have objectives, but not a table like this. Let's consider this.
This was completed by the deadline.
Mar 16 2018
It's much easier to change Phlogiston than the dump, because it runs outside of any security profile and doesn't require an admin, so I would vote to do all that in Phlogiston.
If you can throw in the new column, I can modify the script to work with either kind of data.
Mar 15 2018
For an experiment, I tweaked the load so that it parsed all of the "bad" transactions, treating the list of PHIDs as a list of projects the task belonged to at that date (which was true for at least the sample task 187512). Phlogiston ran to completion and generated a report on dev; I used Reading-Web as the sample:
You probably can't find it in the code because ... it's temporary error-handling code that I added to the dev server and never committed. On production, the load log looks like:
Mar 14 2018
Steps to Reproduce:
- Download a current dump from https://docs.google.com/spreadsheets/d/1yVx3Fmmxccu2noU18IBztcKs88TDRt2E57XGAFYLb6c/edit#gid=327722724
- Run a standard Phlogiston data loading and reporting run. Alternately, examine the dump file as described below.
Mar 13 2018
What kind of information would be helpful? I could examine the dump file to give more information beyond what's above. I could look to see how the apparent 100,000 cutoff manifests in the dump. I could try to define a filter that would cut the number of tasks for reporting below 100,000 to see if the problem then (temporarily) goes away.
Mar 9 2018
cancelled - no demand.
done in other work threads
Mar 7 2018
Closed in the absence of a Team Practices Group.
Mar 2 2018
Feb 28 2018
Working with Lynette on Phab basics, we edited the title for clarity and assigned it to Rachel.
All of the current reports are failing every day because they basically have no data and Phlogiston (or R) doesn't have enough error handling to continue functioning. The chart above was captured two weeks after the apparent change but before Phlogiston gave up; you can see that the task counts dropped from ~300 to maybe two or three tasks. When I reported the task count of 179,180, that was a literal count of objects, so there are still >100,000 tasks in the dump, but most of the edge data seems to be missing. It's hard to be more precise because I don't have a "before" dump handy, but the first thing that caused Phlogiston to outright crash was that some of the transactional core:edge data started showing up empty. Is that helpful or should I try to provide more precise information about what's missing in the dump?
Feb 27 2018
I still see 179180 tasks in the dump, which is presumably all of the public tasks. What does seem to be missing in the dumps is some, but not all, of the core:edge transaction content. For the missing ones, there's still a record for the transaction, but the content of the transaction is missing. I haven't spotted other things specifically missing from the dump yet.
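A quick way to quantify "some, but not all" would be to count edge transactions whose content is empty. The sketch below assumes each transaction has already been parsed into a dict with type, old, and new keys; those field names and the function name are hypothetical, not the dump's actual layout.

```python
def count_empty_edge_transactions(transactions):
    """Return (total core:edge transactions, how many have no content)."""
    total = 0
    empty = 0
    for txn in transactions:
        if txn.get("type") != "core:edge":
            continue
        total += 1
        # A record exists, but neither the old nor the new value has content.
        if not txn.get("old") and not txn.get("new"):
            empty += 1
    return total, empty
```

Comparing the two numbers across dumps from different dates would show when the content started disappearing.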
Feb 26 2018
Have dug in deeper; pretty sure this is a change in the data files provided to Phlogiston, and not a code problem introduced recently in Phlogiston alone.
Feb 23 2018
I also checked in new code on Feb 15, which is a superficial clue. I will continue investigating. Do you think the schema change affected the dumps?
A preliminary inspection suggests that the dumps are still coming and still fresh, but are missing all edges (i.e., which tasks belong to which project). While I investigate further, a question: Did anything change in the dumps around Feb 15th?
Feb 20 2018
Feb 9 2018
Test task: T182319. Status is blank, but in Phab it is open.
Feb 8 2018
Pivoting since Oct 2018 date is postponed.
Test case T182639 had no parent in get_status_report. This was due to a design conflict. The stored procedure determines parenthood by checking the maniphest_blocked table for the final date of the report. However, the maniphest_blocked table, which is regenerated with each daily load, does not reconstruct historical blocking relationships from transactions. Instead, it gets the current parent state from the task data, and loads that into maniphest_blocked with the load date. So these relationships will only work in Status Reports if the report is run on the most recent day. This was fixed by having the stored procedure get_status_report not filter by date, so that it does get the parent match.
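The effect of the date filter can be shown with a small sketch. It is not the stored procedure itself: the row layout (date, parent, child), the function name, and the parent id used in the usage note are assumptions for illustration. Because the rows only carry the most recent load date, filtering on any earlier report date finds nothing, while dropping the filter finds the parent.

```python
from datetime import date


def find_parent(blocked_rows, task_id, report_date=None):
    """Return the parent of task_id, optionally restricted to one date.

    blocked_rows: list of dicts with hypothetical keys
    "date", "parent", and "child".
    """
    for row in blocked_rows:
        if row["child"] != task_id:
            continue
        if report_date is None or row["date"] == report_date:
            return row["parent"]
    return None
```

For example, with one row stamped with a load date of 2018-02-08, a lookup for a report ending 2018-01-31 returns None, while the unfiltered lookup returns the parent.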
Feb 7 2018
More test data: T182197 has Scope = Unknown bug in report but it does have a status in the report.