Updated table list and definitions.
https://www.mediawiki.org/wiki/Phlogiston/Data_Model
Jul 25 2018
One hypothesis for why one of the counts (either the burnup line or the backlog areas) is wrong is that the recently_closed table may be mixing data. It has one column for date, one column for (first day of) week, one column for (first day of) month, and one column for (first day of) quarter, plus two dependent data fields. However, the week, month, and quarter should each have a separate row, since they each represent different time periods and presumably have different aggregate data. So there could be mingling of week, month, and quarter data, which would explain the variation in the discrepancy.
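A minimal sketch of the separate-row layout described above, to make the hypothesis concrete. The column names (`points`, `count`) and the sample values are assumptions for illustration only; this is not the real recently_closed schema or Phlogiston code.

```python
from collections import defaultdict
from datetime import date

# Hypothetical wide rows: (date, week_start, month_start, quarter_start, points, count).
# Names and values are illustrative, not the actual recently_closed contents.
wide_rows = [
    (date(2018, 7, 2), date(2018, 7, 2), date(2018, 7, 1), date(2018, 7, 1), 3, 1),
    (date(2018, 7, 10), date(2018, 7, 9), date(2018, 7, 1), date(2018, 7, 1), 5, 2),
    (date(2018, 8, 1), date(2018, 7, 30), date(2018, 8, 1), date(2018, 7, 1), 2, 1),
]


def to_long(rows):
    """Aggregate into one row per (period_type, period_start)."""
    totals = defaultdict(lambda: [0, 0])
    for _day, week, month, quarter, points, count in rows:
        for period_type, start in (("week", week), ("month", month), ("quarter", quarter)):
            totals[(period_type, start)][0] += points
            totals[(period_type, start)][1] += count
    return {key: tuple(value) for key, value in totals.items()}


long_rows = to_long(wide_rows)
```

With one row per period, the week, month, and quarter totals are computed and stored independently, so a chart reading one period type cannot pick up aggregates that belong to another.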
Implemented on dev, tested with vec, and rolled to production. Documented here:
Added to cron on dev and prod. Full run on prod, looks fine. Abbreviated run on dev, which needs to be rerun when nothing else is happening, but I don't think that is a blocker.
Initiated full runs on dev and prod yesterday, and added to cron. Looks good this morning; please confirm and close.
Jul 24 2018
Checking today, this task is correctly categorized as TR1: Releases, so presumably some other fix took care of this as well.
Is this request still valid? And do I understand it correctly:
This is by design, because those reports are not currently being generated. Is there a need for those reports?
Is this still a live request?
No clear need or sponsor.
Updated configuration for WMF FY18–19. Need to do a complete rebuild when dev server is available.
Is there any need for this or any related FR-Tech Phab report? If not, can we close this?
These reports are up and stable and have been for most of the last year.
Is there any value/is this the right time to update Phlogiston reporting for Search? E.g., to track a new FY19 (Jul 2018–Jun 2019) Annual Goal? If so, we need details: which Phab project/column/parent tasks are relevant? If not, let's close this.
Fixed on production.
Jul 23 2018
Still appears fixed.
Correction: Duplicate is a separate status in Phabricator, and Phlogiston does preserve this.
Debugging notes.
This was fixed on dev via a full database rebuild of 'cot'. Now initiating same on prod.
Ran Sunday night on both dev and prod, has data. I think you can close it.
Jul 20 2018
Added pana to cron jobs on dev and prod; kicked off a run on prod to get it populated. Deployed the new index page to production. Check back in a few hours (both @MBinder_WMF and @JAufrecht) to see if it works.
Jul 17 2018
Jul 16 2018
Duplicate isn't a separate status in Phabricator; it's just "resolved" as far as the status is concerned.
On Dev, the report hasn't run since 2016. On Prod, the report last ran Jul 15, 2018, the velocity chart goes back to Oct 2017 (in Quarter view), and the database contains data back only to 2017-10-02. A lot of the charts are broken, but a) that's a separate issue and b) it's probably due to lack of chartable data as configured, because the 'show hidden' charts, with more data, do work.
Debugging notes: Task 182319 has status resolved for all dates. The only sources in the data for status are a March 7, 2018 transaction changing it from open to resolved, and status_at_load = resolved in the maniphest_task row. Interpretation: the range of dates being loaded doesn't include the creation transaction, so Phlogiston has no record of the task's status prior to March 7, and so it backfills the missing status with status_at_load, which is the wrong value for this purpose.
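A minimal sketch of one way to avoid that backfill problem, assuming each status transaction records both the old and the new value (as the March 7 one does). The function name and data shapes are hypothetical; this is not Phlogiston's actual reconstruction code.

```python
from datetime import date, timedelta


def status_by_date(transactions, status_at_load, start, end):
    """Return {date: status} for each day in [start, end].

    transactions: list of (date, old_status, new_status), in any order.
    Days before the first transaction take that transaction's old_status;
    status_at_load is used only when there are no transactions at all.
    """
    txns = sorted(transactions)
    current = txns[0][1] if txns else status_at_load
    result = {}
    day = start
    i = 0
    while day <= end:
        # Apply every transaction dated on or before this day.
        while i < len(txns) and txns[i][0] <= day:
            current = txns[i][2]
            i += 1
        result[day] = current
        day += timedelta(days=1)
    return result


history = status_by_date(
    [(date(2018, 3, 7), "open", "resolved")],
    status_at_load="resolved",
    start=date(2018, 3, 5),
    end=date(2018, 3, 8),
)
```

Here the days before March 7 come out as open rather than resolved, because the earliest known transaction says what the status was before it changed.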
Jun 29 2018
May 30 2018
done.
May 4 2018
May 2 2018
Apr 30 2018
Confirming: automated Phlogiston run from last night has fresh data.
Apr 25 2018
@Dzahn The automated nightly dump doesn't seem to be working, as the file didn't update Tuesday or Wednesday:
Apr 24 2018
@Dzahn I see the fresh date in the file; starting a data run now to test the contents. Thanks.
Apr 23 2018
passed basic test
Re-testing showed it was not a data processing error. Code investigation identified the problem: the maniphest_task table was moved from reconstruction to load, but the load tables are dropped and rebuilt on each load. The maniphest_task table is a cumulative reconstruction table. Moving the table back from load to reconstruction fixed the problem; confirmed after testing.
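A minimal sketch of why dropping a cumulative table on each load loses data. It uses an in-memory SQLite database as a stand-in; the `load_task` table, the column names, and the two-load scenario are assumptions for illustration and not the real Phlogiston schema or scripts.

```python
import sqlite3


def run_load(conn, dump_rows, drop_cumulative):
    """Simulate one load: load-stage tables are always dropped and rebuilt."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS load_task")  # hypothetical load-stage table
    cur.execute("CREATE TABLE load_task (id INTEGER PRIMARY KEY, title TEXT)")
    if drop_cumulative:
        # The bug: the cumulative table is treated like a load table.
        cur.execute("DROP TABLE IF EXISTS maniphest_task")
    cur.execute(
        "CREATE TABLE IF NOT EXISTS maniphest_task (id INTEGER PRIMARY KEY, title TEXT)"
    )
    cur.executemany("INSERT INTO load_task VALUES (?, ?)", dump_rows)
    # Merge this load into the cumulative table.
    cur.execute("INSERT OR REPLACE INTO maniphest_task SELECT id, title FROM load_task")
    conn.commit()


def simulate(drop_cumulative):
    """Run two loads and return how many tasks the cumulative table holds."""
    conn = sqlite3.connect(":memory:")
    run_load(conn, [(1, "a"), (2, "b")], drop_cumulative)
    run_load(conn, [(3, "c")], drop_cumulative)  # a later load with different rows
    count = conn.execute("SELECT COUNT(*) FROM maniphest_task").fetchone()[0]
    conn.close()
    return count
```

In this sketch `simulate(True)` keeps only the rows from the latest load, while `simulate(False)` keeps everything accumulated so far.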
The dump hasn't been updated since April 1, 2018. Perhaps it wasn't automated?
Apr 19 2018
- is because there are no transactions of interest to Phlogiston (edge changes, status changes)
- has been narrowed down to the SQL function fix_status(), and appears to be because of sloppy syntax on a SQL query.
At least two different problems here:
- 190686 has no transactions loaded in Phlogiston, although there are transactions in the dump
- All of 190686's task_on_date entries have status = resolved, even though there are no transactions and the status on load is 'open'
Apr 18 2018
After multiple rebuilds and the fixing of unrelated bugs, 190765 now shows as open, but 190686 is still falsely shown as resolved.
Apr 13 2018
Check on dev again.
This seems to be an accident of partial data processing on the dev server rather than a data-deleting bug. Re-doing complete runs to confirm.
Apr 5 2018
Apr 4 2018
These all appear to be fixed.
Apr 2 2018
I think this fix has passed enough spot testing to be generally considered resolved. Thanks, @mmodell.
This task appears to be Screep but is listed as In-Scope: https://phabricator.wikimedia.org/T175877
My understanding in this case is that anything added to the Kanbanana board after 2017-12-14 should show as Screep, and this was added 2017-12-20.
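A minimal sketch of the rule as I understand it, assuming "after" means strictly later than the cutoff date. The function name is hypothetical, and this is not the actual categorization code.

```python
from datetime import date

SCOPE_CUTOFF = date(2017, 12, 14)  # cutoff date quoted in the comment above


def scope_category(added_to_board):
    """Tasks added strictly after the cutoff count as Screep (assumed boundary)."""
    return "Screep" if added_to_board > SCOPE_CUTOFF else "In-Scope"


category = scope_category(date(2017, 12, 20))
```

Under that reading, a task added 2017-12-20 falls on the Screep side of the cutoff.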
It shouldn't make any difference; the code works with both formats. I haven't yet identified anything that can't be done with the new format, but I'm still checking.
I've fixed a number of obvious bugs in the new code; still quality-testing the results.
Mar 27 2018
Mar 26 2018
I found the missing indexes and put them in and tested over the weekend; processing time is back to what it was before the change. Some of the data looks normal on dev and some doesn't, but this may be due to project-specific norms. Production ran but the data doesn't look right at all. Will be following up with active users to debug.
Mar 23 2018
I believe I have it correctly importing on the dev server, handling both forms of edge transaction. A total of three transactions don't fit the pattern and I'll follow up on those separately. I am as yet unable to validate through to the end because several of the post-processing steps became 10 to 100x longer for no obvious reason, so I'm working on optimizing again.
Mar 20 2018
Mar 19 2018
Incorrectly closed. Programs do have objectives, but not a table like this. Let's consider this.
This was completed by the deadline.
Mar 16 2018
It's much easier to change Phlogiston than the dump, because it runs outside of any security profile and doesn't require an admin, so I would vote to do all that in Phlogiston.
If you can throw in the new column, I can modify the script to work with either kind of data.
Mar 15 2018
For an experiment, I tweaked the load so that it parsed all of the "bad" transactions, treating the list of PHIDs as a list of projects the task belonged to at that date (which was true for at least the sample task 187512). Phlogiston ran to completion and generated a report on dev; I used Reading-Web as the sample:
You probably can't find it in the code because ... it's temporary error-handling code that I added to the dev server and never committed. On production, the load log looks like:
Mar 14 2018
Steps to Reproduce:
- Download a current dump from https://docs.google.com/spreadsheets/d/1yVx3Fmmxccu2noU18IBztcKs88TDRt2E57XGAFYLb6c/edit#gid=327722724
- Run a standard Phlogiston data loading and reporting run. Alternately, examine the dump file as described below.
Mar 13 2018
What kind of information would be helpful? I could examine the dump file to give more information beyond what's above. I could look to see how the apparent 100,000 cutoff manifests in the dump. I could try to define a filter that would cut the number of tasks for reporting below 100,000 to see if the problem then (temporarily) goes away.
Mar 9 2018
cancelled - no demand.
done in other work threads