Using Firefox 55.0.3 (64-bit), it is impossible to edit image captions without opening the Image dialog. In Chrome the caption can be edited as normal text, but in Firefox the cursor stays at the beginning of the caption.
When I tested this, I could actually edit the caption, but it behaved so strangely it was mostly unusable. I could type at the start of the caption, and cursor past the existing one, but not edit inside it. I could delete the whole caption, but it'd add a bunch of spaces where it shouldn't and then break things up onto multiple lines. After enough fiddling around, it wouldn't even move the cursor any more.
Looks like a browser bug: if we don't set -moz-user-select: none on ve-ce-focusableNode * and then reset it to text on ve-ce-activeNode, it works fine. Strangely, it works in standalone, where the CSS is very similar.
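For anyone following along, the pattern in question looks roughly like the sketch below. It is reconstructed from the description above, not copied from the VE stylesheet, so the exact selectors and any surrounding declarations are assumptions.

```css
/* Sketch only: selectors reconstructed from the comment above, not copied from VE. */

/* Disable text selection on everything inside a focusable node. */
.ve-ce-focusableNode * {
	-moz-user-select: none;
}

/* Re-enable text selection on the node that is currently being edited. */
.ve-ce-activeNode {
	-moz-user-select: text;
}
```

Removing both declarations is the experiment described above that makes caption editing work in Firefox.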
Looks like this is a regression caused by https://gerrit.wikimedia.org/r/#/c/366898/ (pinging @Arlolra), which changed figcaptions to be display:block inside display:table, instead of display:table-caption. We discussed the invalidity of this at the time, but it didn't appear to be problematic. Now it does...
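To make the structural change concrete, here is a minimal sketch of the two variants. The selectors are illustrative assumptions; only the display values come from the description above.

```css
/* Sketch only: selectors are hypothetical, display values are from the comment above. */

/* Common to both variants: the figure is laid out as a table box. */
figure {
	display: table;
}

/* Before the change: the caption is a proper table caption. */
figure > figcaption {
	display: table-caption;
}

/* After the change: the caption is a plain block box placed directly
   inside the table box (this later rule overrides the one above). */
figure > figcaption {
	display: block;
}
```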
If putting display:block inside display:table is going to trigger mystery browser bugs we should definitely avoid it. So I'd suggest we revisit that decision on the figure DOM structure, and attach the zoom icon back to the figcaption. IIRC the only downside to that was that the icon wouldn't show if the caption was empty, but that could either be ignored or worked around.
I haven't looked into this yet, but as a general first comment, that patch is about the border-bottom, not the magnify link.
The reason for not using table-caption, as before, is that it implies the styling for the border-bottom needs to go on the figcaption and is therefore not present when that's omitted, as the commit message says. A solution there would be to always emit a figcaption, even when empty.
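A rough sketch of that trade-off, with placeholder selectors and values (none of this is taken from the actual skin styles; caption-side: bottom is also an assumption):

```css
/* Sketch only: selectors, caption-side, and the border value are hypothetical. */
figure {
	display: table;
}

/* With table-caption, the bottom border has to live on the caption itself,
   so it only appears when a figcaption element is actually emitted. */
figure > figcaption {
	display: table-caption;
	caption-side: bottom;
	border-bottom: 1px solid; /* placeholder value */
}
```

If the figcaption were always emitted, even when empty, the border would still be drawn for images without a caption.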
I don't think there's anything inherent in the figcaption being displayed as a block nested in a table that's preventing this from working. An isolated test case of just that works fine. I think it's the opposite: there's something special in it being displayed as a table-caption that gets it to work at all. My hunch is that because the table-caption is displayed outside the table, it isn't subject to some inherited property of some box above it. Firefox seems to have issues with inheritance of the -moz-user-select properties, judging from some comments in the VE source.
If you remove those -moz-user-selects entirely, both the none and text, it also works.
Anyways, above is a patch that gets it to work. Not sure how acceptable it is though. I can dig further next week.
MozText, // Like TEXT, except that it won't get overridden by ancestors having ALL.
So, that kind of points at the problem.
If block were really the issue, then switching to, say, table-cell or table-row (something ostensibly more semantically correct) for the figcaption would resolve it, but it doesn't. Along the lines of the hunch above, the magic of table-caption is that it probably changes what's considered an ancestor.
A less hacky solution might be to look at where -moz-user-select: all is being set (grep tells me there are a bunch of places) and deduce from there.
Okay, found the steps: it happened when I tried to add a caption to an existing image. It is a bit difficult to reproduce at the moment because of T175943, but I found an older revision of my user page with an image with no caption on Beta Cluster, and it was happening there too. So this particular case of editing an empty caption field is happening on both Beta Cluster and test2.
Change 379613 abandoned by Arlolra:
Guard against empty nodes
Yup, confirmed this is fixed by https://github.com/wikimedia/VisualEditor/commit/3a64969b9a3ee595c915549b38c7312c4034f441