Page MenuHomePhabricator

Running `arc diff` twice on the same commit creates another redundant diff within the same differential
Closed, DeclinedPublic

Description

So when one is not sure which parts have already sent for review, we can no longer push all of them, as Phabricator will no longer automatically filter doublettes.

Reported upstream at https://secure.phabricator.com/T4953 .

Details

Reference
fl228
TitleReferenceAuthorSource BranchDest Branch
Update mathjax-node and texvcjsrepos/mediawiki/services/mathoid!3physikerweltT137787main
Customize query in GitLab

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

flimport raised the priority of this task from to Low.Sep 12 2014, 1:32 AM
flimport set Reference to fl228.

Rush wrote on 2014-04-29 16:56:30 (UTC)

hey chris, can you help me understand what you mean?

Is it...

a) branch foo

b) commit1...commit2...arc diff master = differential 1

c) no changes

d) arc diff master = differential 2

?

that seems like a bug if so. I don't believe it limits how you structure commits in a differential, but in that case if you have 2 commits with a differential and you do another arc diff that includes them. Typically with no changes it's a noop, say you have a third commit in this branch and you do arc diff master it should update the existing differential unless you specifically tell it to create a new one?

I'm not sure what the behavior you are seeing is

qchris wrote on 2014-05-01 16:02:31 (UTC)

@Rush

Since we both seem to have problems making sense of the other's post, let me give an example that created D26 and its /two/ Differential Revisions, although I did not change the code between the two runs of arc diff:


christian@nepomuk // jobs: 1 // time: 16:59:40 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
git checkout -b foo15
Switched to a new branch 'foo15'

_________________________________________________________________
christian@nepomuk // jobs: 1 // time: 16:59:52 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
echo "Foo15" >foo15.txt

_________________________________________________________________
christian@nepomuk // jobs: 1 // time: 16:59:59 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
git add foo15.txt

_________________________________________________________________
christian@nepomuk // jobs: 1 // time: 17:00:02 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
git commit -a -m "Add foo15.txt"
[foo15 4ac6e5d] Add foo15.txt
 1 file changed, 1 insertion(+)
 create mode 100644 foo15.txt

_________________________________________________________________
christian@nepomuk // jobs: 1 // time: 17:00:28 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
arc diff
Waiting for Emacs...
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Updating commit message...
Created a new Differential revision:
        Revision URI: http://fab.wmflabs.org/D26

Included changes:
  A       foo15.txt

_________________________________________________________________
christian@nepomuk // jobs: 1 // time: 17:00:53 // exit code: 0
cwd: ~/tmp/arcanist-demo-repo/fabtest
arc diff
Waiting for Emacs...
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Updated an existing Differential revision:
        Revision URI: http://fab.wmflabs.org/D26

Included changes:
  A       foo15.txt

So although the arc diff invocations occur directly one after the other, Phabricator creates two separate Differential Revisions for them in D26. I'd expect to see only a single Differential Revision in D26 as they should be the same commit/diff.

Rush wrote on 2014-05-01 16:16:26 (UTC)

I finally understand what you mean :D

Sometimes I'm slow on the uptake.

Ok so in this case, the lingo confused me. When you say it created two Differential Revisions I thought you meant it was creating two Differentials over the same commits, like D26 then no changes then D27.

In this case, you mean it is creating another revision base within the D26 differential even though you made no changes. I hope?

So...I guess this could be considered a bug as it should be an arc noop? But I don't actually understand the harm? It isn't creating a duplicate differential for review, so running arc diff over the same commits isn't harming anything, it's just adding useless base revisions to the existing differential, which I guess you could say you are telling it to do actually?

What behavior do you expect here? No new changes so it just kicks back with a 'no changes?'

qchris wrote on 2014-05-01 18:33:55 (UTC)

Ok so in this case, the lingo confused me.

Sorry. The Phabricator terms are still new to me.
Given your response, it seems that not the “Diff 1” within D26 but the
whole of D26 would be called a “Differential Revision”. And from

[...] you mean it is creating another revision base [...]

I take that the groups of commits that are labelled „Base”, „Diff 1”,
and „Diff 2” underneath “Revision Update History” in D26 are called
“revision base”?

In this case, you mean it is creating another revision base within the D26 differential even though you made no changes. I hope?

After revisiting the terms I used ... I guess: Yes.

So...I guess this could be considered a bug as it should be an arc noop?

I'd hope it would be a noop. Yes.

But I don't actually understand the harm?

It would send unwarranted emails to reviewers.
Reviewers would look at he new base revision, although nothing
changed.
So it would just cost time and add noise.

Besides, you always have to remember whether or not you uploaded this
or that commit already. If it would be a noop, you could just try
pushing it. If it has not been uploaded before, a Differential Revision
would be created, if it has been uploaded before, nothing would change.

What behavior do you expect here? No new changes so it just kicks back with a 'no changes?'

Yes. I'd at least expect no new base revisions getting added.

Just like for example like git push-ing a ref's current value to it
again does not amend the ref. Or like git rebase also does not
create unneeded new revisions: Running git rebase master for

. * 8fb60ad (HEAD, demo2) +demo2
. * 766c965 (demo1) +demo1
. * 7d3a59b (origin/master, origin/HEAD, master) Add foo17.txt

does not change the commit hashes for the commits pointed by demo1 and
demo2. They stay the same. After the git rebase master, you're still
at the same

. * 8fb60ad (HEAD, demo2) +demo2
. * 766c965 (demo1) +demo1
. * 7d3a59b (origin/master, origin/HEAD, master) Add foo17.txt

.

mattflaschen wrote on 2014-05-01 20:39:21 (UTC)

Rush:

What behavior do you expect here? No new changes so it just kicks back with a 'no changes?'

Yes, and git-review does this with Gerrit.

Rush wrote on 2014-05-02 15:42:14 (UTC)

I gotcha now, I have no idea how to solve this, most definitely an upstream issue, but I think you guys are going to hate me....because I like this behavior :D

No intention of convincing anyone they should like it too. But to help explain...a situation I recently got myself into with gerrit as an example, and I'm sure there is some much smarter way to do this. I have a branch I needed to push that once merged would be on the same branch in the repo, but initially I did this wrong and it was showing it would merge into master. I tried a bunch of stuff to fix it and update gerrit but ran into a problem with the differential.

my chat was something like:

14:37 chasemp: have brach debian, git-review, makes a changeset https://gerrit.wikimedia.org/r/#/c/129835/1
14:37 chasemp: oops, says branch master, need to update .gitreview for this branch
14:37 chasemp: abandon bad change, update .gitreview
14:37 chasemp: git-review
14:38 chasemp: doesn't see new changeset update, just leaves abandoned one
14:38 chasemp: so remote gerrit side did nothing
14:38 chasemp: even though technically now it was new (metadata wise)
14:38 chasemp: it would be branch debian

Eventually I did:

14:44 chasemp: so....I rebase -i and amended nothing onto the commit
14:44 chasemp: and then it seemed to work
14:44 chasemp: https://gerrit.wikimedia.org/r/#/c/130145/
14:44 chasemp: that is a wtf for me, but thanks man!

I'm really not trying to call out gerrit for bad behavior here, it was a mess I got myself into by not understanding things.

But, for me, and in the case of I am "arc diffing" over an already existing range of commits, I am explicitly telling arc to update no matter if git thinks there are changes or not. Meaning, I want to tell the tool, not have the tool tell me. So this behavior I am ok with. But I understand if it's unintuitive to others, I do feel like this isn't major enough to be a blocker for implementation?

I guess the editorial note would be I have never had a situation where I was lost on, have I made this a differential yet? It is for me easy to look and see and/or if I did force it to update to be sure I guess I wouldn't mind in that case.

Maybe @epriestly could weigh in

qchris wrote on 2014-05-02 17:32:07 (UTC)

@Rush

I have no idea how to solve this, most definitely an upstream issue, [...]

Harr :-(

[ Gerrit and pushing same commit to different branches ]

That's one of the Gerrit bugs that bother me a lot.
You can find it in different formulations all around the net.

Using the browser, you can work around it by using the “Cherry Pick To” button.

Using only git, the easiest workaround is to get a new commit hash (which it
seems you did, as the Patch Set 1 of the changes you linked have different
commit hashes) and then push that one.

qgil wrote on 2014-05-05 20:18:31 (UTC)

From the report upstream:

This is intended: it doesn't do any damage, and the user explicitly told us to do it. They may reasonably want to pick up side effects in the environment which are not captured in the commit hash (for example, a change to configuration, unit tests, or lint). They may also want to test Herald rules. In particular, I do this routinely during development.

While we shouldn't privilege development workflows over user workflows, I can't recall regular users ever reporting difficulty with this.

We could issue a warning ("You didn't change anything, are you sure you want to update with the same changes?") but I'm not sure when a user would try to do this without understanding what they were doing. Can you provide some context? Specifically, where in your workflow would you run arc diff on changes which might or might not have been sent for review already without knowing?

greg lowered the priority of this task from Low to Lowest.Nov 4 2015, 7:16 PM
greg moved this task from To Triage to Backlog on the Differential board.

I personally don't see this as a blocker for migration. Is it annoying in some cases? maybe. Is it debatable of what "correct" is here? Yeppers.

greg added a subscriber: epriestley.

Also from upstream on this issue, in response to why you might actually want this to be the case:

am not 100% sure why you'd want to resubmit in the 'environment side-effects' case.

Specifically, the workflow I imagine is:

  • You run arc diff.
  • Unit tests fail because of some error in the harness itself.
  • You explain "I'm pretty sure this test failure isn't my fault, I'm going to go talk to Joe and see if he can help me fix this. Does the rest of this look OK?" when prompted.
  • You go talk to Joe, and he commits a fix to a separate library (libunittests) which fixes the test harness.
  • Back on your machine, you update libunittests to pick up Joe's fix and run arc diff again on the exact same changes.
  • This time, unit tests pass.
  • You write "Update with clean test run, failure wasn't related to my change (see rXyyy for discussion)."

Remember, arc diff sends along the results of the local unit test run that show up on the Differential summary, so this is something you indeed may want to do: re-run arc diff, without changing your code, and getting new info (unit test results) to show up in the Differential (eg Dwhatever).

Given:
A) there's a legitimate use case for the current behavior in arc/Differential, and
B) even if run accidentally the only harm is superfluous emails (iow: it shouldn't be harmful to anyone who is reasonable with their colleagues'/collaborators' mistakes), and
C) the only other argument (afaict) is "git-review/gerrit does it this way"

Therefore: I am going to remove this from the Gerrit-Migration project (ie: remove it as a blocker to the migration).

I'll leave it open as a bug in Differential in case we can come up with a sufficiently universal solution (ie: modifying arc diff slightly to prevent accidental re arc diff'ing but not preventing people from doing it in all cases (eg: the legitimate example from Evan above)).