Hello,
So far we don’t have a full understanding of why the problem happens, but we have some ideas about it.
What happens?
When you merge a merge request, a “git push” happens inside GitLab. A new commit is created (e.g., 5341d6bace0087fa4df7629d879e679cfd7b64ce). Then SubGit walks through all new commits (each time it’s just one commit, e.g. 5341d6bace0087fa4df7629d879e679cfd7b64ce) and tries to resolve the SVN revision for each of them in order to distinguish between old and new commits (old commits have SVN revisions attached).
The SVN revisions are stored externally in Git notes format, but under the refs/svn/map reference. Only the latest commit under refs/svn/map is used, and this commit has a tree that looks like this (level 1):
040000 tree 2c6f59d7ef2e97b4b89712a88460c67b9d6211c1 00
040000 tree bec321fcd00c34920a34e0b57d548c9375793482 01
040000 tree cf731982dfe7a9e09fd89250bd8225787079d256 02
040000 tree ce6bc1b304c15395edd5069b65990705f0dbcec6 03
040000 tree 4b543061ef143c974dbc9b845d1b8ffd1ac86bc5 04
...
040000 tree ec1ac8deaac4d7eef5b4c466d7c2097e59d3f341 fa
040000 tree f3d70df766e8b37233f8cc7434b04ecc06b5954c fb
040000 tree 988203b784aa17cc24219633753fe388c0bf3a33 fc
040000 tree 77fff17f4d8a33117e411e410a4532786581247f fd
040000 tree 207c4dbdf8903660bb9219ea7150ff647fab8fa2 fe
040000 tree c743ea28588bfb2f592310915165d8b219e724c2 ff
The paths are just 2 hexadecimal digits corresponding to the first 2 digits of the commit SHA-1, and each entry points to another tree object of the following format (level 2):
100644 blob fa0346acf34120e7905f466013102a02eb6c2ed4 0b12238c3212e759f5b056ea1b1088f9160247
100644 blob efa25af5e5fb12eae4f8c5ad8c66878b6a1b8602 2400b60aedd7fe51069a9177c71a25ac1411da
100644 blob 6f8a2ac65c0501085d1447ff73fa38977996ec6b 2693ef30617d71f16b014d219acea0e2066f72
100644 blob 13f4060bfc686510e2aaadce1b62c2ccd8b08e9f 2c2061611ea8938f2ce8fc971c146acd35da48
100644 blob 9d14407b9d481e83b9b750563da3d6624b297383 3479f8c98a5eee70d6a528f2299d54277a1bfa
100644 blob 0ea40048b40f3fd4bb4c7b72868ec90324d73917 37adf5641e0e5c027e850a9be15fd94820d464
...
where the paths are the remaining 38 digits of the commit and the blobs contain content like
$ git show f17dbb7d
r5160 trunk/fw
i.e. each blob contains the SVN revision number and the name of the branch.
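To make the two-level layout concrete, here is a minimal sketch of how a commit SHA-1 maps to a path inside the refs/svn/map tree. The function name svn_map_path is made up for illustration and is not part of SubGit; it only restates the 2 + 38 digit split described above.

```shell
# Hypothetical helper (not a SubGit command): print the path under which
# a commit's record would be stored in the refs/svn/map tree, i.e. the
# first 2 hex digits, a slash, then the remaining 38 digits.
svn_map_path() {
  sha=$1
  rest=${sha#??}           # SHA-1 without its first two characters
  prefix=${sha%"$rest"}    # the first two characters only
  printf '%s/%s\n' "$prefix" "$rest"
}

# Possible usage from inside the Git repository directory, assuming a
# record exists for <commit>:
#   git cat-file -p "refs/svn/map:$(svn_map_path <commit>)"
```

For a commit that was already translated, the command in the usage comment should print a record of the form shown above (revision and branch); for a newly pushed commit no such path is expected to exist yet.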
In your case SubGit tried to resolve SVN revision for newly pushed commit 5341d6bace0087fa4df7629d879e679cfd7b64ce. To do that it looked inside refs/svn/map tip commit to find a record corresponding to “53” (the first 2 digits of the commit SHA-1) – this corresponds to level 1:
...
040000 tree 540c504c8bd266e7d16948a050bf79849b8c5b81 53
...
Then it tried to load the tree 540c504c8bd266e7d16948a050bf79849b8c5b81 to find there the rest of the SHA-1 (41d6bace0087fa4df7629d879e679cfd7b64ce, i.e. 5341d6bace0087fa4df7629d879e679cfd7b64ce without “53”). And surprisingly SubGit cannot find this object 540c504c8bd266e7d16948a050bf79849b8c5b81 in the repository.
This is strange because the object is present in the output of the “git cat-file --batch-check --batch-all-objects” command, and all other commands succeed (they would fail if the repository were inconsistent). So the object is clearly present in your repository, at least according to the Git commands, but SubGit just doesn’t see it.
Why does this happen?
We don’t know for sure, but we suspect this could be related to GitLab’s deduplication feature. Maybe the objects were moved to other repositories (possibly the so-called “pool repositories”), and these repositories could be linked to this one through the objects/info/alternates file. One more argument in favor of this guess: this “53” directory (the 540c504c8bd266e7d16948a050bf79849b8c5b81 tree) could have existed for a while without changes, and such long-unchanged objects are usually better candidates for archiving than recently added ones.
So I would ask you to check:
- Whether you have an objects/info/alternates file and, if so, what it contains.
- Whether you have the 540c504c8bd266e7d16948a050bf79849b8c5b81 object on the filesystem, i.e. an objects/54/0c504c8bd266e7d16948a050bf79849b8c5b81 file inside the Git repository directory.
- If you don’t have it, whether it is packed. To check, run this one-liner from the Git repository directory:
for file in $(find objects/pack -name '*.idx'); do git verify-pack -v "$file" | grep 540c504c8bd266e7d16948a050bf79849b8c5b81 && echo "$file"; done
The script will print the name of the pack index (.idx file) where the object resides. If it doesn’t find anything, the object is not on the filesystem, and you have alternates, it would make sense to look the object up in the alternate object directory. We just want to find out where the object actually is in order to understand why SubGit doesn’t see it.
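If it is more convenient, the three checks above can be combined into one script. This is only a sketch under some assumptions: the function names find_in_objdir and locate_object are made up for illustration, the script is meant to be run from the Git repository directory (the one that contains the objects directory), and relative entries in objects/info/alternates are resolved against the objects directory, which is how Git documents them.

```shell
# Hypothetical helper: look for an object in one object directory,
# first as a loose file, then inside each pack index.
find_in_objdir() {
  objdir=$1
  sha=$2
  rest=${sha#??}           # remaining 38 digits
  prefix=${sha%"$rest"}    # first 2 digits
  if [ -f "$objdir/$prefix/$rest" ]; then
    echo "loose: $objdir/$prefix/$rest"
    return 0
  fi
  for idx in "$objdir"/pack/*.idx; do
    [ -f "$idx" ] || continue    # no packs: the glob stays unexpanded
    if git verify-pack -v "$idx" | grep -q "^$sha "; then
      echo "packed: $idx"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: check the repository's own objects directory,
# then every directory listed in objects/info/alternates.
locate_object() {
  sha=$1
  find_in_objdir objects "$sha" && return 0
  [ -f objects/info/alternates ] || return 1
  while IFS= read -r alt; do
    case $alt in
      ''|'#'*) continue ;;         # skip empty lines and comments
      /*) ;;                       # absolute path: use as is
      *) alt="objects/$alt" ;;     # relative to the objects directory
    esac
    find_in_objdir "$alt" "$sha" && return 0
  done < objects/info/alternates
  return 1
}

# Possible usage:
#   locate_object 540c504c8bd266e7d16948a050bf79849b8c5b81 || echo "not found"
```

The output line starts with "loose:" or "packed:" followed by the file that contains the object; a path outside the repository's own objects directory would mean that the object is reachable only through alternates.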
Once you find out where the object resides and whether alternate object directories are used, we could proceed to resolve the issue.
If you know anything relevant about your GitLab settings related to deduplication or “pool repositories”, or if you know that this repository was forked or that you have a backup system that could enforce deduplication, please share it; this could be helpful as well.