Hook not always executed

dlkmp · May 23, 2019, 12:38pm

We’re using subgit together with gitlab. We’ve set up a user-post-receive-hook on the git side, which reads the SVN_LAST_FETCHED_REVISION-environment variable from subgit and does some work based on this. So if there is an individual commit in SVN, it is transferred to git by subgit and the custom hook is called afterwards, which is what we expect. However, if there are two commits in a very small time range (a couple seconds), we observe that one of these commits “wins” and the hook is executed only once for either the first or second commit. This seems to happen consistently every time when commits follow each other sufficiently quick.

Is there any way to guarantee hook execution in above setup? Thanks for any advice.

ildarhm · May 23, 2019, 2:45pm

Hi David

we would need to investigate that issue a bit more to answer. Do I understand correctly that a commit that “fails” never appears neither in SVN nor in Git? Could it be possible to provide subgit logs (daemon and hooks logs) from the repository where it happened?

dlkmp · May 24, 2019, 6:21am

Hi Ildar,

it’s not the commit, it’s the post-receive hook that is failing, the commit itself is visible in both SVN and git. Here is an example, we have two commits in SVN (Rev. 199603 and 199604), 12 seconds apart:

SVN-Log

Both of these are successfully transferred to git, here is the output of cat subgit/logs/daemon.1.log | grep <rev>:

199603:

[2019-05-23 18:00:37.056][daemon][22] Revisions: <fetched=199602; last=199603>
[2019-05-23 18:00:40.043][daemon][22] SET_PATH '' 199603 not empty depth=infinity
[2019-05-23 18:00:40.048][daemon][22] fetching: branch = refs/svn/root/trunk, revision = 199603, receivedFileCount=0
[2019-05-23 18:00:40.203][daemon][22] fetched: hash = e948026109e42a809e3738b47ce765e79baeee64, branch = refs/svn/root/trunk, revision = 199603
[2019-05-23 18:00:40.205][daemon][22] Updating latest fetched revision for svn-remote "svn" to r199603
[2019-05-23 18:00:40.206][daemon][22] SET_PATH 'trunk' 199603 not empty depth=infinity
[2019-05-23 18:00:40.313][daemon][22] Executing [/var/opt/gitlab/git-data/repositories/<myrepo>/custom_hooks/user-post-receive]; environmentVariables={SVN_LAST_FETCHED_REVISION=199603, GIT_DIR=.};workingDirectory=/var/opt/gitlab/git-data/repositories/<myrepo>

199604:

[2019-05-23 18:00:40.206][daemon][22] SET_PATH '' 199604 not empty depth=infinity
[2019-05-23 18:00:40.212][daemon][22] fetching: branch = refs/svn/root/trunk, revision = 199604, receivedFileCount=3
[2019-05-23 18:00:40.245][daemon][22] fetched: hash = 2667e31fe4ae567c4c4582e932f9379784cf1b4d, branch = refs/svn/root/trunk, revision = 199604
[2019-05-23 18:00:40.246][daemon][22] Updating latest fetched revision for svn-remote "svn" to r199604
[2019-05-23 18:00:53.132][daemon][22] Revisions: <fetched=199604; last=199604>

As the log clearly shows, the hook is executed for the first commit but not the second. This only happens if the time between the two commits is sufficiently short.

Edit: I’m realizing just now that the issue is even worse. When looking at the timestamps, you can see that the hook for rev. 199603 is triggered after rev. 199604 has been fetched. Now the hook is called with the git-hash corresponding to svn rev. 199604 (2667e31f) while the environment variable ‘SVN_LAST_FETCHED_REVISION’ is still set to 199603. This causes our hook to produce inconsistent data as the environment variable does not match the git hash given as input argument.

semyon.vadishev · May 24, 2019, 8:05pm

Hello David,

Thank you for the details on the issue.

Please note that by default SubGit polls Subversion server once every minute. It may happen that multiple revisions appear in SVN repository within this time frame and those revisions may update a few different branches or tags. In this case, SubGit may fetch multiple revisions at once and apply the new changes to a number of refs of the synchronized Git repository.

When svn.triggerSvnPostReceive option is set to true, SubGit may fetch multiple revisions, but it triggers user-post-receive hook only once for all the updated refs and it passes all the updated Git refs into stdin of the hook in this case.

This behavior is similar to the post-receive hook mechanism: Git triggers the hook once for all the updated refs rather than running the hook for every single commit pushed to a Git repository.

I believe this is what happens in your case: SubGit fetches two revisions: r199603 and r199604, creates two commits: e9480261 and 2667e31f and then triggers user-post-receive hook with the following input:

2667e31f refs/heads/master

By design, SubGit has to pass 199604 as SVN_LAST_FETCHED_REVISION in your scenario, however, due to a bug
SubGit passes 199603. This happens when a new revision gets committed right after the start of the sync operation, so it seems to be a race condition (in most cases SubGit should pass a correct revision as the environment variable).

We’re going to fix this bug in the upcoming version, but I have to discuss the issue with my colleagues before giving you any estimates on the bugfix release date. Please expect more details early next week.

Kind regards,
Semyon