Checksum mismatch when fetching from SVN

Hello Nikita,

thank you for the logs, we’ll try to find out what happened there. May we also have the SVN repository log (svn log -v output)?

Hello.

Sorry, but for security reasons, we can’t share svn logs. What exactly do you want to check there? Maybe we can check it ourselves.

Hello Nikita,

I’m afraid I cannot say for sure what to look for in the logs. The “Checksum mismatch” error itself means that SubGit expected to find a specific hash for a file, but the actual hash differs from the expected one. It’s not clear which of the two is correct or how the other one was changed; it can be caused by SVN database corruption, for example, but of course that’s not the only possible reason. It’s also hard to tell where exactly (at which revision) it was changed, and thus hard to tell which part of the log is of particular interest and which revision to rebuild the repository from (if any). This issue is pretty complex: there are a number of possible causes, and finding the right one requires a thorough investigation, which is what we were hoping to start once we had the logs.
Another possible approach is to have a meeting with our devs to check both repositories together. I’m not sure if that is acceptable from the security point of view, but I’m afraid we would hardly be able to find the cause having only the SubGit logs.
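
To illustrate what a checksum mismatch means in general, here is a minimal sketch that compares a file's actual SHA-1 with an expected value. It is not SubGit's internal check; the function name `verify_checksum` is hypothetical, and the sketch assumes the GNU coreutils `sha1sum` tool is available.

```shell
# Hypothetical helper, for illustration only: compare a file's SHA-1
# against an expected value, the same kind of comparison that fails
# when a "Checksum mismatch" is reported.
# Usage: verify_checksum FILE EXPECTED_SHA1
verify_checksum() {
    _file=$1
    _expected=$2
    # sha1sum prints "<hash>  <filename>"; keep only the hash.
    _actual=$(sha1sum "$_file" | cut -d ' ' -f 1)
    if [ "$_actual" = "$_expected" ]; then
        echo "OK"
    else
        echo "Checksum mismatch: expected $_expected, actual $_actual"
        return 1
    fi
}
```

As in the real error, a mismatch only shows that the two values differ; it does not say which side is the correct one.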

Hello,

I have to come back to you for help because we didn’t manage on our own. Do you need the full output of the command svn log -v, or will the logs of the revisions that caused the mismatch be enough (plus, for example, the neighboring ones: one revision before and one after)?
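
A log limited to one revision and its neighbors could be requested roughly as in the sketch below. The function name `neighbor_log_command`, the revision number and the repository URL are placeholders, not values from this thread; the function only prints the `svn log` command so it can be reviewed before running.

```shell
# Hypothetical helper: print the svn log command for a revision
# and its immediate neighbors (one before, one after).
# Usage: neighbor_log_command REVISION REPOSITORY_URL
neighbor_log_command() {
    _rev=$1
    _url=$2
    printf 'svn log -v -r %s:%s %s\n' "$(($_rev - 1))" "$(($_rev + 1))" "$_url"
}

# Placeholder revision and URL, for illustration only.
neighbor_log_command 1000 https://svn.example.com/repo
```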

Hello Nikita,

we would need the full output of svn log -v; I’d also like to note that we may ask you for additional information if the need arises during the investigation.
And do I understand correctly that you know the revision number that causes the mismatch? Could you please advise how you came to this conclusion? Also, do I understand correctly that the repository is still unsynchronized? Have you tried to rebuild it from a revision a bit earlier than the problematic one you found? If yes, what was the result?

Okay, I will get back to you about the full svn log -v in a while.

Yes, you understood almost correctly. But it isn’t one revision; it’s many revisions that appear every few days, mostly in one specific branch.
About this conclusion and the state of synchronization: rebuilding from a revision didn’t help, so I googled how to skip a revision and found the answer here. So, the conclusion was: "If branches-maxRev and tags-maxRev contain the last successful revision (I mean the last one that was successfully synchronized), then the number of the corrupted revision = branches-maxRev + 1 = tags-maxRev + 1", and it seems that I was right, because it helped.
But, as I already said, we get new corrupted revisions every few days, so we have to skip them to keep synchronization alive, and that is obviously not good.
About the corrupted revisions: they are no different from those that synchronized successfully. For example, I even checked the EOLs of these files. The only thing that stands out is that it is almost always the same branch, but this branch also has a lot of successfully synchronized commits.
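
The heuristic above can be sketched as a small shell function. This only restates the poster's own rule and is not documented SubGit behavior; the function name `suspect_revision` is hypothetical, and both maxRev values are assumed to have been read manually from SubGit's metadata (the file format is not shown in this thread).

```shell
# Sketch of the heuristic described above, not an official SubGit tool.
# Usage: suspect_revision BRANCHES_MAXREV TAGS_MAXREV
suspect_revision() {
    _branches_max=$1
    _tags_max=$2
    # The rule only applies when both values point at the same
    # last successfully synchronized revision.
    if [ "$_branches_max" -ne "$_tags_max" ]; then
        echo "maxRev values differ; the heuristic does not apply" >&2
        return 1
    fi
    echo "$(($_branches_max + 1))"
}

# Placeholder values, for illustration only.
suspect_revision 1000 1000   # prints 1001
```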

Hi Nikita,

thank you for the detailed explanation.
But “new corrupted revisions every few days” looks strange to me; could you please describe this in a little more detail? Do I understand correctly that you are receiving a new “Checksum mismatch” error every few days? Could you please describe your setup: is that a regular SubGit ‘SVN repo <–> Git repo’ mirror, or are there any additional repositories/hooks/scripts?

Hi!

Yes, you understood correctly. Our setup was done exactly according to this documentation guide.
Sorry, I don’t know how to describe the situation in more detail than in the previous answer. Do you have any other guesses (or leading questions)?

Hi Nikita,

this is exactly what I needed to know. If the steps in the documentation were followed exactly, then I have no more questions about the setup itself, yet I would like to ask you about the GitLab server: is that just a regular server with SubGit installed on it, or is that a Docker container, for example? Is SubGit installed on the same machine (or in the same container) as GitLab?
Additionally, may I ask you to collect fresh SubGit logs, especially the hooks and daemon logs?

We use GitLab CE installed on a regular server (CentOS 8). SubGit is installed on the same machine.
Here are the fresh logs: subgit-logs.zip (3.0 MB), but they may not contain useful information, since synchronization was restored recently and there have been no errors in it yet.

Hi Nikita,

thank you for the logs.
Could you please also advise which revision number you tried to rebuild the repository from? And is there any news about the svn log?

Hi!

We tried several different revisions; I don’t remember exactly which, but starting from about r140000.
About the full svn log: it has almost 1,800,000 lines, which is why I have to ask once again: are you sure you need the full one?

A few hours ago there was another synchronization failure, so I can provide you with fresh logs, including daemon, hooks, fetch, and svn (there are 2 revisions that caused the desynchronization). I hope this will help us.
logs-after-crash.zip (2.5 MB)

Hi Nikita,

thank you for the logs.
Regarding the ‘svn log -v’ output – may we have the output for the “5.1.115.128” branch?
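
A per-branch log like the one requested above could be collected roughly as follows. This is only a sketch: the repository URL is a placeholder, the function name `branch_log_command` is hypothetical, and the `branches/` path assumes the standard trunk/branches/tags layout, which is not confirmed in this thread.

```shell
# Placeholder URL; the real repository location is not given in the thread.
SVN_URL="https://svn.example.com/repo"
BRANCH="5.1.115.128"

# Hypothetical helper: print the svn log command for a single branch.
# Assumes the branch lives under <repository>/branches/<name>.
branch_log_command() {
    printf 'svn log -v %s/branches/%s\n' "$1" "$2"
}

branch_log_command "$SVN_URL" "$BRANCH"

# To actually collect the log, run the printed command and redirect
# its output to a file, for example:
#   svn log -v "$SVN_URL/branches/$BRANCH" > branch-log.txt
```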

Hi, would it be okay if I sent it to you by email? (To avoid this file appearing on the forum.)
And if that’s okay, could you give me the email address?

Hi Nikita,

sure, email is OK; it is best to send it to support@subgit.com.

I just sent an email with the same subject as the title of this topic.

Hi Nikita,

yes, we have received the log, thank you for that.
I’d like to ask you about the rebuild attempts you made: could you please advise what exactly happened when you tried to rebuild the repository, especially when you tried to rebuild it from revision 140000? What was the exact command you used, what was the result, and which messages/errors appeared along the way?

Hello.

I’ve already described what happened after I tried the rebuild from a revision in the 7th reply. I think I’ll try one more time right now and will let you know what happens.

BTW, after I run the rebuild from the revision, what will happen to the revisions we skipped with the .metadata file? Will they still be missing, or will SubGit try to restore them?

Hello Nikita,

sorry for asking that question too often; I just had the impression that the rebuild attempts you mentioned in the recent reply were not those you mentioned earlier.
If the rebuild starts from a revision earlier than the skipped one, then SubGit will re-download the skipped revision during the rebuild and thus restore it. And that is actually what we would recommend doing in this case: skipping a revision in a running mirror is not a recommended approach, especially for resolving “Checksum mismatch” issues, as it may cause other issues like this in the future. The most reliable way is to rebuild the repository, at least from r157964, or from an earlier revision, like the r140000 you mentioned before. Use the following command for the rebuild:

subgit install --rebuild-from-revision 157964 /path/to/git/repository