Mirroring fails with error "Short read of block"

This has been going on for a few days. I don’t know if something bad happened in svnmirror or in the svn server, but we’re a bit stuck. We had to turn mirroring off to continue.

When mirroring is turned on, and someone tries to commit, they get the message:

[mn@gpu-P1080-3 clean]$ git push
Enumerating objects: 236, done.
Counting objects: 100% (169/169), done.
Delta compression using up to 4 threads.
Compressing objects: 100% (21/21), done.
Writing objects: 100% (40/40), 4.62 KiB | 4.62 MiB/s, done.
Total 40 (delta 34), reused 24 (delta 19)
remote: error: Short read of block.
remote:
remote: Fetching revisions from SVN repository:
To http://githost:7990/scm/dp/d2s_sw.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'http://mn@githost:7990/scm/dp/d2s_sw.git'

The “Short read of block” shows up many times in the log files, too:

2019-02-04 05:39:02,110 sync - ######### ERROR REPORT ############
2019-02-04 05:39:02,110 sync - file ‘/mnt/ssd-raid/var-atlassian/application-data/bitbucket/shared/data/repositories/13/subgit/error’ does not exist
2019-02-04 05:39:02,110 sync - ######### GIT TO SVN FAILURE REPORT END ############
2019-02-04 05:39:02,110 sync - about to refresh repository
2019-02-04 05:39:02,110 sync - Posix: Non-JNA platform is chosen.
2019-02-04 05:39:02,110 sync - svn: E204900: Short read of block. org.tmatesoft.translator.util.f: svn: E204900: Short read of block.
at org.tmatesoft.translator.util.f.c(SourceFile:109)
at org.tmatesoft.translator.util.f.b(SourceFile:75)
at org.tmatesoft.translator.m.ad.c(SourceFile:974)
at org.tmatesoft.translator.m.ad.a(SourceFile:994)

I’ll attach the relevant svnmirror.log, in case that helps. Any advice for what to look at would be deeply appreciated.

–Harold Z.

Hello Harold,
Thanks for the log. According to it, the error happened while reading one of the Git object files: the add-on read the object’s header (which contains the object length), but when it tried to read the body, the body turned out to be shorter than the length declared in the header.

First of all, I would recommend running the “git fsck” command:
{code}
git fsck --full --strict
{code}
Then also try
{code}
git fsck --full --strict --unreachable
{code}
The idea is to find and delete the broken object. The SVN Mirror add-on creates unreachable objects first and only later creates references to them, so some (often many) unreachable objects are expected, and one of them might be broken. It’s important to remove the broken object, because otherwise it won’t be overwritten: neither Git nor the add-on ever overwrites an existing object.

If all reachable objects are ok (that is, the first command reports no errors), the following two commands remove all(!) unreachable objects and repack the rest. Run them one after another, but first stop all processes working on the Git repository (ideally, disable the add-on temporarily).

{code}
git -c gc.autoDetach=0 -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc --prune --aggressive

git -c gc.autoDetach=0 -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now prune
{code}

A positive side effect of these commands is that they make the repository much more compact (often 4x more compact).

After that, you will have only valid reachable objects (the unreachable objects are removed by these long commands).

I would like to understand why the problem happens, so I’ll ask you several questions:

  1. What SVN Mirror add-on version are you using?

  2. Are there any big (> 40 MB) files among these:
    {code}
    ======
    r46295
    M trunk/fermi/exefermi/test/script/fermi_verification.checker
    A trunk/fermi/exefermi/test/micron_training
    A trunk/fermi/exefermi/test/micron_training/micron_training.oas
    A trunk/fermi/exefermi/test/fermiTest2_data/fermiTest2_pattern.oas
    A trunk/fermi/exefermi/test/fermiTest2_data
    M trunk
    ======
    {code}

  3. Is your Git repository on NFS? (I’m thinking of a network problem that caused only the header to be written, though theoretically it should have failed earlier.)

  4. Are you using Data Center or Standalone version of Bitbucket?

In what git tree do I run the “git fsck…” commands? It seems to me that running on my local cloned repository wouldn’t help anything on the server. I don’t know where (if anywhere) BitBucket stores its data as a normal git repository.

A1. SVN mirror 3.4.5 under BitBucket 5.11.1.
A2. Yes, there are bigger files there. fermiTest2_pattern.oas is about 100Mb.
A3. No, our git repository is local to our BitBucket/git host.
A4. We’re not using the Data Center version, so I guess it’s the Standalone version.

You can find the path to the Git repository on the Git repository settings page of Bitbucket Server.

Okay.

[root@githost 13]# git fsck --full --strict
fatal: object 24ea29d1c7547e30d1109366e8c733820ae71217 is corrupted
[root@githost 13]# git fsck --full --strict --unreachable
fatal: object 24ea29d1c7547e30d1109366e8c733820ae71217 is corrupted
[root@githost 13]#

I checked both this tree and my own local clone to see if I could find an object with that hash, but it wasn’t there:

[root@githost 13]# git ls-tree -r HEAD | grep 24ea29
[root@githost 13]#

The object in question is size 0. It looks like the standard procedure for such things is to just delete the object, right? I just rm the file?

That led to another zero-length file that was similarly corrupted, and then another…
After removing a dozen or so of these empty object files, the git fsck is busily chugging away.

Yes, delete them. I wouldn’t call that a standard procedure because it’s not a standard situation. These files probably were created while translating an SVN revision to Git.
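
A minimal sketch of how the empty object files could be located in one pass instead of one fsck run at a time. It only lists candidates and deletes nothing. The helper name `list_empty_objects` and its argument are illustrative assumptions, not part of Git or the add-on; it assumes the usual loose-object layout (`objects/<2 hex characters>/<rest of the hash>`).

```shell
# Hypothetical helper: list zero-length loose object files in a repository.
# $1 is assumed to be the repository directory that contains "objects/".
list_empty_objects() {
    repo_path="$1"
    # Loose objects live in objects/<2 hex chars>/<remaining hash>.
    # The -path filter skips objects/pack and objects/info.
    find "$repo_path/objects" -type f -size 0 \
        -path "$repo_path/objects/[0-9a-f][0-9a-f]/*" | sort
}
```

It could be called as `list_empty_objects /path/to/repository`, with the add-on disabled, and the output reviewed before removing anything.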

This situation could happen if the server machine ran out of disk space (especially given the large files in this revision). If so, the two commands in my first reply should significantly reduce disk usage, unless you have something else to remove. But they are very sensitive to other activities on these 2 repositories.

Also, there’s a chance that the problem affected previous revisions. I believe what you did should help, but if some of the files you deleted were parts of previous revisions, there could be "missing unknown " errors. In that case you could either 1) re-create the Git repository from scratch, or 2) contact us again: we’re preparing a new build with the ability to rebuild the Git repository starting from a certain revision, so you could re-translate e.g. the latest 10 or 100 revisions.

The server machine is nowhere near being out of disk space (17% usage out of 1TB), so that’s not the problem.

The first fsck took about a half hour to run. It produced about 80 dangling blobs, 7 dangling trees, 4 dangling commits, and 2 missing trees.

The second fsck (done with a later version of git) produced one “invalid reflog entry” and about the same number of unreachable trees, commits, and blobs.

Poking around the net, I get the impression that dangling stuff is quite common in normal use. Missing trees is a bit odd… might those come from this svn issue?

BitBucket is supposed to run various git gc options itself. I take it that the extra garbage collection here is to clean up things that may be left behind by svnmirror?

These days, we’re making changes in git and propagating them back to svn. I’m not sure that anything is being done on the svn tree itself.

Okay… I finished everything, restarted the server, and reenabled svnmirror.

It keeps putting things into the Unsynced tab. If I try to merge things, as described, the git push usually complains about pre-receive hook declined, for one reason or another. I can then shut down svnmirror and successfully push the Unsynced changes back… but when I reenable svnmirror, a different set of Unsynced changes reappears and git isn’t working again.

Any other ideas?

Hello Harold,
Since you couldn’t find the 24ea29d1c7547e30d1109366e8c733820ae71217 object in your working tree, it probably comes from SVN, and I believe the other dangling objects come from SVN as well. Dangling objects are ok. The missing objects can be explained: you removed invalid objects, so now they are missing.

These missing objects can be a problem but also there’s a possibility that they don’t cause problems (e.g. if a dangling tree refers to a missing object — it would be removed anyway).

Regarding “git gc” (and also “git prune”): there are multiple ways to run these commands. The long commands above remove all dangling objects right away; otherwise it would take a month for them to be removed. But this “git gc” is purely optional; it’s just a way to free disk space.

Regarding the Unsynced issue: the answer depends on the current state of your repository. The easiest way to solve the problem is to re-translate the repository starting from some revision. There’s a tricky way to do that with the SubGit command-line tool (from subgit.com), but we’re preparing a build that would allow you to do that through a graphical interface. Would it be ok for you to wait for the build?

Otherwise, you could solve this unsynced issue by merging branches one by one. You wrote that the pre-receive hook declines pushes for one reason or another. Could you give us some examples (screenshots or logs)?

Finally, you could ignore those unsynced commits and re-apply the Git-only changes manually. The details are below.

Just to explain why the unsynced problem happens: if you disable the SVN Mirror add-on for a while and people commit independently to SVN and Git, then upon re-enabling, SVN Mirror cannot resolve the conflict automatically (e.g. it cannot decide in which order the commits should go or which of the conflicting changes to choose). What it does: it restores everything to the state reflecting the SVN repository and moves the Git-only changes away to the refs/subgit/unsynced/… namespace. Then you can merge or cherry-pick those Git-only changes on top of the state corresponding to SVN. Until you do so, you have a state fully corresponding to SVN, without the Git-only changes. Alternatively, you can abandon these unsynced changes and push them again from local working trees.

I’ve gone through this sequence a number of times:

o Enable repository syncing.
o Read a message about unsynced changes.
o Follow the instructions to merge the changes into my local tree (fetch, merge).
o Do a “git push”, which gives me an error like this:

harold@gpu-P1080-1:~/d2s_sw$ git push
Enumerating objects: 1666, done.
Counting objects: 100% (688/688), done.
Delta compression using up to 4 threads.
Compressing objects: 100% (200/200), done.
Writing objects: 100% (468/468), 6.02 MiB | 17.35 MiB/s, done.
Total 468 (delta 334), reused 356 (delta 255)
remote: Resolving deltas: 100% (334/334), completed with 112 local objects.
remote: error: Cannot create 'shelves/release': it already exists in the repository
remote: 
remote: Fetching revisions from SVN repository:
remote:   up to date
remote: Sending commits to SVN repository:
To http://githost:7990/scm/dp/d2s_sw.git
 ! [remote rejected]   master -> master (pre-receive hook declined)
error: failed to push some refs to 'http://harold@githost:7990/scm/dp/d2s_sw.git'
harold@gpu-P1080-1:~/d2s_sw$ 

o Disable synchronization in the GUI.
o Do the “git push” again, which now works:

harold@gpu-P1080-1:~/d2s_sw$ git push
Enumerating objects: 1666, done.
Counting objects: 100% (688/688), done.
Delta compression using up to 4 threads.
Compressing objects: 100% (200/200), done.
Writing objects: 100% (468/468), 6.02 MiB | 17.50 MiB/s, done.
Total 468 (delta 334), reused 356 (delta 255)
remote: Resolving deltas: 100% (334/334), completed with 112 local objects.
To http://githost:7990/scm/dp/d2s_sw.git
   5f9ad540..919080f8  master -> master
harold@gpu-P1080-1:~/d2s_sw$ 

How long will it be to wait for the build? I can’t seem to turn svn mirroring back on in the meantime.

Hello Harold,
Due to some issues we have postponed the build (it’s not stable enough yet). But I can propose another workaround.

Instead of following instructions for Unsynced commits, you could mark them resolved (so the conflicts will disappear without resolution). But note: this means that Git changes that are not present in SVN will be lost and you’ll have to push them again from your local working copies. So create a backup of the Git repository first.

So the solution would be to have a Git repository that perfectly matches the SVN repository and then to push all other changes from working copies on top of that.

As for “Cannot create ‘shelves/release’: it already exists in the repository”: please attach the logs (the global and the per-repository svnmirror.log). Otherwise it is difficult to understand what’s wrong. It would be even better if, before attaching the logs, you could go to the SVN Mirror global settings (Administration | SVN Mirror) and enable debug logging there. Then the logs will be more verbose.

Attached are the two logs in question.

Hello Harold,
Thanks for the logs. In them I also see a “Permission denied” error on creating a file:
{code}
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:1012)
at org.eclipse.jgit.util.FS_POSIX.createNewFile(SourceFile:367)
at org.eclipse.jgit.internal.storage.file.LockFile.lock(SourceFile:174)
at org.eclipse.jgit.internal.storage.file.RefDirectoryUpdate.tryLock(SourceFile:86)
at org.eclipse.jgit.lib.RefUpdate.updateImpl(SourceFile:708)
at org.eclipse.jgit.lib.RefUpdate.update(SourceFile:582)
at org.eclipse.jgit.lib.RefUpdate.update(SourceFile:563)
at org.eclipse.jgit.lib.RefUpdate.forceUpdate(SourceFile:543)
{code}

So I believe something is fundamentally wrong with your filesystem, and this was probably also the cause of the zero-size files. The SVN Mirror add-on expects all files in the repository to be readable and writable by the same system user that the add-on (i.e. Bitbucket Server) runs as.

Is it an option for you to re-create the Git repository from scratch and switch to it? This would be the most straightforward workaround if you haven’t fixed the problem yet.

I see the “Permission denied” log entries, but the most recent was from February 5th. It’s not part of the current (what was Feb. 8th) problem.

I’d already run into the “Permission denied” issues and fixed them. When you told me to do all the “git gc” and other modifications on the local repository, I did so in the window I was working in, which was a root login. Obviously (in hindsight), those interactions should have been done as the atlbitbucket user. I have subsequently chown’d the repository back to atlbitbucket, and I’m no longer seeing those sorts of errors.
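
A minimal sketch of how this kind of ownership drift can be spotted: it lists every path under the repository that is not owned by the expected service account. The helper name `list_foreign_owned` and both of its arguments are illustrative assumptions; in this thread the account would be atlbitbucket.

```shell
# Hypothetical helper: list paths under a repository that are NOT owned by
# the expected service account. Both arguments are supplied by the caller.
list_foreign_owned() {
    repo_path="$1"
    service_user="$2"   # user name or numeric uid, e.g. atlbitbucket
    # "! -user" selects everything whose owner differs from service_user.
    find "$repo_path" ! -user "$service_user" -print
}
```

An empty result from `list_foreign_owned /path/to/repository atlbitbucket` would suggest that no files were left behind under another owner.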

I don’t think there’s a filesystem error. We don’t have any unexplained oddities in the system that I’m aware of.

Recreating the git repository is not a great option. My company has thirty developers that have been using BitBucket for 8 months. The svnmirror was originally critical during the transition, but now it’s mainly to keep alive a couple of scripts that haven’t been updated to work with git. I don’t want to lose the git transaction history by rebuilding from svn.

I tried looking in the repository. I found two directories named “shelves/release”:

bash-4.1$ ls -al svn/refs/svn/root/shelves/release/
total 8
drwxr-x---  2 atlbitbucket atlbitbucket   58 Sep 13 13:24 .
drwxr-x--- 48 atlbitbucket atlbitbucket 4096 Feb 11 11:57 ..
-rw-r-----  1 atlbitbucket atlbitbucket 3408 Feb  1 14:07 .rev_map.bcff3c60-1c3d-11dd-a115-e45b1b6ae10f

and

bash-4.1$ ls -al refs/svn/attic/shelves/release
total 4
drwxr-xr-x  2 atlbitbucket atlbitbucket    6 Feb 11 11:59 .
drwxr-x--- 30 atlbitbucket atlbitbucket 4096 Feb 11 11:59 ..
bash-4.1$ 

I tried removing both of them and redoing the “Enable repository” step mentioned above. There was no change from my past attempts; it created unsynced changes, and when I tried to fix them, I went through the same error and cleanup sequence mentioned above.

Hello Harold.

In this case, the best option is indeed to rebuild the repository: not wholly, but starting from a slightly earlier revision (3 or a few more revisions back). Here are the steps to perform the rebuild with the SubGit command-line tool:

  1. Make sure the mirror is disabled.

  2. Create a backup of that repository (a regular copy of the repository directory).

  3. Make additional backups of the following directories; we’ll need to restore them later on:

     cp -R REPO_PATH/subgit subgit_backup
     cp -R REPO_PATH/hooks hooks_backup
    

    REPO_PATH here stands for the path to the Bitbucket repository on the filesystem; it can be found on the Repository Settings - Repository Details page in the UI.

  4. Edit the following configuration file:

     edit REPO_PATH/subgit/.run/config
    

    a) Make sure there’s svn.auth = default option in this file:

     [svn]
     ...
     auth = default
    

    b) Edit [auth] section, so it looks as follows:

     [auth "default"]
       passwords = subgit/passwd
    

    c) Edit daemon.classpath option:

     [daemon]
     ...
     classpath = subgit/lib
    
  5. Copy the REPO_PATH/subgit/.run/config file to REPO_PATH/subgit/config.

  6. After that, add the Subversion credentials to the passwd file:

     edit REPO_PATH/subgit/passwd
    

    This file uses a simple format; each line has the following form:

     svnUserName svnUserPassword
    
  7. Download and unpack the latest version of the SubGit standalone tool from http://subgit.com/download

  8. Run the following command:

     SUBGIT_DIR/bin/subgit install --rebuild-from-revision <REV_NO> REPO
    

    This command will reset the mirror to REV_NO and then fetch all the missing revisions committed after that revision.

  9. When the command completes, keep the installation log at some location outside of REPO/subgit directory (it makes sense to review it afterward):

     mv subgit/logs/subgit-install-XYZ.zip subgit-install-XYZ.zip
    
  10. Restore subgit and hooks directories we’ve copied at step 3:

    rm -rf REPO/subgit
    mv subgit_backup REPO/subgit
    
    rm -rf REPO/hooks
    mv hooks_backup REPO/hooks
    

    This way we remove adjustments made in subgit/config, subgit/.run/config and subgit/passwd files, as well as restore the standard Bitbucket hooks overwritten by SubGit.

  11. That’s it. At this point you can re-enable the mirror through the Bitbucket UI, and the sync should recover.

I get stuck at the subgit install step. I don’t have a password for my account on svn, and there’s no easy way for me to change it, since the svn admin isn’t available. Is there a way to get subgit to work with a specific username with no password?

Okay, the svn admin appeared after all. Now I get this message at the install step:

bash-4.1$ subgit-3.3.5/bin/subgit install --rebuild-from-revision 46292 13
SubGit version 3.3.5 ('Bobique') build #4042

About to shut down background translation process.
Background translation process is not running.

INSTALLATION FAILED

error: Could not update 'refs/svn/root/branches/2017/MDP.2017.04' to ea47fd7f32f987820b8f666cc3e228b28aeaa82d
error: Unexpected error has occurred; please report along with the logs ('/mnt/ssd-raid/var-atlassian/application-data/bitbucket/shared/data/repositories/subgit-install-20190212-113158.zip')
error:   to http://issues.tmatesoft.com/, thank you!

I’ll restore the original repository while I await your reply.

Hello Harold,
SVN Mirror add-on tries to update refs/svn/root/branches/2017/MDP.2017.04 reference but fails. I see 2 possible reasons:

  • there’s a .lock file (/mnt/ssd-raid/var-atlassian/application-data/bitbucket/shared/data/repositories/13/refs/svn/root/branches/2017/MDP.2017.04.lock);
  • there’s a directory with the same name (/mnt/ssd-raid/var-atlassian/application-data/bitbucket/shared/data/repositories/13/refs/svn/root/branches/2017/MDP.2017.04), so the file can’t be created.

Could you please check whether either of these is the case?
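
A minimal sketch of that check, reporting the two conditions listed above for a single reference. The helper name `check_ref_blockers` and its arguments are illustrative assumptions; it looks only at loose files on disk, not at packed-refs.

```shell
# Hypothetical helper: report a leftover lock file or a directory that
# occupies the place where the loose ref file should be created.
check_ref_blockers() {
    repo_path="$1"   # repository directory on the server
    ref_name="$2"    # e.g. refs/svn/root/branches/2017/MDP.2017.04
    ref_file="$repo_path/$ref_name"
    if [ -e "$ref_file.lock" ]; then
        echo "lock file present: $ref_file.lock"
    fi
    if [ -d "$ref_file" ]; then
        echo "directory in the way: $ref_file"
    fi
}
```

No output from `check_ref_blockers /path/to/repository refs/svn/root/branches/2017/MDP.2017.04` would mean that neither condition was found on disk.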
There’s also a way to manually check that the reference can be updated:

  1. Get the old value:
    {code}
    git rev-parse refs/svn/root/branches/2017/MDP.2017.04
    {code}

  2. Try to update the reference to some ID.
    {code}
    git update-ref refs/svn/root/branches/2017/MDP.2017.04 SOME_COMMIT_SHA_1
    {code}

  3. Update it back (to the SHA-1 value from step 1):
    {code}
    git update-ref refs/svn/root/branches/2017/MDP.2017.04 ORIGINAL_SHA_1
    {code}

It’s better to do that with SVN Mirror add-on disabled because it constantly scans the references and sends the difference between
{code}
refs/heads/release/2017/MDP.2017.04
{code}
and
{code}
refs/svn/root/branches/2017/MDP.2017.04
{code}
to SVN (when someone pushes, refs/heads/… changes, while refs/svn/root/… always reflects the SVN state). In general, one cannot update these references manually without corrupting the repository. But if the add-on is down and you return the reference to its original value, it’s ok.

Note that I had to put the “13” repository back in service after the last e-mail. That being said, there is nothing currently in the …/13/refs/svn/root/branches/2017 directory:

bash-4.1$ ls -a 13/refs/svn/root/branches/2017/
.  ..
bash-4.1$ 

Also, the git reference stuff doesn’t work:

bash-4.1$ git rev-parse refs/svn/root/branches/2017/MDP.2017.04
refs/svn/root/branches/2017/MDP.2017.04
fatal: ambiguous argument 'refs/svn/root/branches/2017/MDP.2017.04': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
bash-4.1$ pwd
/mnt/ssd-raid/var-atlassian/application-data/bitbucket/shared/data/repositories/13
bash-4.1$