"Unexpected object type from git cat-file" errors in various imported Gerrit repositories
Open, Low, Public

Description

Thanks to https://we.phorge.it/T15661, deployed in https://gitlab.wikimedia.org/repos/phabricator/phabricator/-/commit/cb556bb27d70ee1f926670e90a1f08e4c597baca, we now know which Git repositories mirrored into Diffusion trigger exceptions in our Phabricator error_log, and which would thus welcome fixing.

Upstream source code: https://we.phorge.it/source/phorge/browse/master/src/applications/diffusion/query/lowlevel/DiffusionLowLevelResolveRefsQuery.php

I think nearly all of them are in Gerrit and not (yet) in GitLab, hence the Gerrit tag.

git fsck and git gc might come in handy.
Further debugging in Phabricator would come in handy too, see hashar's comments below.

Event Timeline

Aklapper created this task.
hashar subscribed.

[2024-01-26 06:11:01] Unexpected object type from git cat-file in rMW: tag: 1.33.3 missing,

It is definitely in Gerrit:

$ ssh gerrit1003.wikimedia.org sudo -u gerrit2 git -C /srv/gerrit/git/mediawiki/core.git/ cat-file tag 1.33.3
object c56e4c5bd4b558c818578408accaeee65cff56a7
type commit
tag 1.33.3
tagger Reedy <reedy@wikimedia.org> 1585245864 +0000

Tagging 1.33.3
-----BEGIN PGP SIGNATURE-----

iF0EABECAB0WIQQdmIZ+gpgsj+Crwl+babMQnTu3sAUCXnzurQAKCRCbabMQnTu3
sAFAAKDVTypBE44brFvJVT8iWnF6slb6fwCfbYl5nB3jYRIv0riEoKrfNebPt2k=
=uxtS
-----END PGP SIGNATURE-----

[2024-01-30 14:56:11] Unexpected object type from git cat-file in rEADL: tag: 0.3 missing, referer: https://phabricator.wikimedia.org/diffusion/EADL/browse/tag%253A%25200.3/

The tag is present, though that is an unannotated / unsigned tag, so that is really just the reference refs/tags/0.3 and it is there as well:

$ sudo -u gerrit2 git -C /srv/gerrit/git/mediawiki/extensions/AdminLinks.git/ show-ref refs/tags/0.3
8b5b9a72b0e318f0a352e9a9e2d8304dee7f4e9b refs/tags/0.3

[2024-01-28 20:39:17] Unexpected object type from git cat-file in rMEXT: HEAD -> master missing,

https://phabricator.wikimedia.org/source/extensions/branches/ marks the master branch to indicate that it is the default branch / HEAD. So I don't get what that message is about.

I am moving this task from #gerrit to #phabricator.

[2024-01-21 15:50:48] Unexpected object type from git cat-file in rMW: Omega Pixel Cube missing,

I am wondering what is feeding the code with Omega Pixel Cube. Something is calling DiffusionLowLevelResolveRefsQuery->withRefs() with some list of references which is funky. I could not find anything matching Omega Pixel Cube, I don't know from where that might come.

Aklapper renamed this task from Fix "Unexpected object type from git cat-file" errors in various Gerrit repositories to "Unexpected object type from git cat-file" errors in various imported Gerrit repositories. Mar 20 2024, 9:54 AM
Aklapper updated the task description.

For the message HEAD -> master missing, my guess is that it comes from the output of git branch -r, possibly run by Diffusion when it is trying to list the branches of a repository. On mediawiki/core that would start with:

origin/HEAD -> origin/master
origin/REL1_23
origin/REL1_25

The output in cgit is in builtin/branch.c:

strbuf_addf(&local, "%%(refname:lstrip=2)%s%%(if)%%(symref)%%(then) -> %%(symref:short)%%(end)",
            branch_get_color(BRANCH_COLOR_RESET));
strbuf_addf(&remote, "%s%%(refname:lstrip=2)%s%%(if)%%(symref)%%(then) -> %%(symref:short)%%(end)",
            quote_literal_for_format(remote_prefix),
            branch_get_color(BRANCH_COLOR_RESET));

The -> shows up when the reference (eg origin/HEAD) is a symbolic reference (%(symref) is set).

Being a symbolic reference rather than a plain branch name, that output is not recognized by the code in DiffusionLowLevelResolveRefsQuery and lands in $unresolved, which is then passed as-is to git cat-file, which causes the error message.

It is a bit magic, HEAD is a special branch :]
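
To illustrate the guess above, here is a minimal Python sketch of how git branch -r style lines could be split into plain branch names and symbolic references, so that a line such as origin/HEAD -> origin/master is never handed to git cat-file as a ref name. This is not the Phorge implementation; the function name parse_remote_branches is hypothetical and the sample input is just the three lines quoted above.

```python
def parse_remote_branches(output):
    """Split `git branch -r` style output into (branches, symrefs).

    Hypothetical helper for illustration only; not part of Phorge or git.
    """
    branches = []
    symrefs = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # git prints a symbolic ref as "<name> -> <target>"
        name, arrow, target = line.partition(" -> ")
        if arrow:
            symrefs[name] = target
        else:
            branches.append(name)
    return branches, symrefs


sample = """\
  origin/HEAD -> origin/master
  origin/REL1_23
  origin/REL1_25
"""
branches, symrefs = parse_remote_branches(sample)
print(branches)
print(symrefs)
```

With that split, only the entries in branches would need resolving, and the symbolic reference could be followed to its target instead of being treated as an unknown object.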

@Aklapper where do those messages come from? I can't find an error_log file on phab1004. If we had a stacktrace, that might help as well.

that output is not recognized by the code in DiffusionLowLevelResolveRefsQuery and lands in $unresolved

Thanks, that's helpful!

where do those messages come from?

https://logstash.wikimedia.org/app/dashboards#/view/AWt2XRVF0jm7pOHZjNIV

From a stacktrace:

2024-03-22 23:44:16
Unexpected object type from `git cat-file` in rEMEM: HEAD -> master missing 
...
referer https://phabricator.wikimedia.org/diffusion/EMEM/browse/HEAD%2520-%253E%2520master/
HEAD%2520-%253E%2520master
HEAD%20-%3E%20master
HEAD -> master
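
The three forms listed above are the same string with two layers of percent-encoding peeled off one at a time. A short Python sketch reproduces the chain using the standard library's urllib.parse.unquote (the variable names are only for illustration):

```python
from urllib.parse import unquote

# Path segment as it appears in the referer (double-encoded).
raw = "HEAD%2520-%253E%2520master"

once = unquote(raw)    # first layer: "%25" becomes "%"
twice = unquote(once)  # second layer: "%20" and "%3E" are decoded

print(once)
print(twice)
```

The first decode yields the url.path form and the second yields the string that ends up in the error message.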

I then went to the Apache 2 accesslog dashboard, looking for url.path:*EMEM*:

url.path: /diffusion/EMEM/browse/HEAD%20-%3E%20master/
referrer: https://phabricator.wikimedia.org/diffusion/EMEM/browse/HEAD%2520-%253E%2520master/

Which is certainly forged. Maybe the method can be given a hint that the looked up string comes from an arbitrary source and might well not exist. In which case it is fine to display an error message, but there is probably not much point in logging a server side error.

So I guess it is purely logspam.
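
The "hint" idea could look roughly like the Python sketch below: the lookup still fails for an unknown name, but only logs a server-side error when the name did not come from user input. Everything here is hypothetical (resolve_ref, the from_user_input flag, RefNotFound, the logger name and the placeholder hash); it only models the idea and says nothing about how DiffusionLowLevelResolveRefsQuery is actually structured.

```python
import logging

logger = logging.getLogger("diffusion.sketch")  # hypothetical logger name


class RefNotFound(Exception):
    """Raised when a ref name cannot be resolved."""


def resolve_ref(known_refs, name, from_user_input=False):
    """Look up `name` in `known_refs` (dict of ref name -> commit hash).

    Hypothetical sketch: a missing ref always raises, but it is only
    logged as an error when the name was generated internally.
    """
    try:
        return known_refs[name]
    except KeyError:
        if not from_user_input:
            # An internally generated ref that is missing hints at a real problem.
            logger.error("Unexpected missing ref: %s", name)
        raise RefNotFound(name) from None


refs = {"refs/heads/master": "0" * 40}  # placeholder hash, not a real commit
print(resolve_ref(refs, "refs/heads/master"))
```

A request for /browse/HEAD%20-%3E%20master/ would then call resolve_ref(refs, "HEAD -> master", from_user_input=True): the page can still show an error to the visitor, but nothing lands in the error log.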

Another one I have investigated:

[2024-02-02 15:06:01] Unexpected object type from git cat-file in rEBTX: tag: 3.1.8 missing, referer: https://phabricator.wikimedia.org/diffusion/EBTX/browse/tag%253A%25203.1.8/

If you head to rEBTX, then to the Tags tab, and browse down to 3.1.8, there is a little browse button which points to https://phabricator.wikimedia.org/diffusion/EBTX/browse/master/;3.1.8 .

So maybe tag:xxx is an obsolete scheme or that got fixed upstream?