Restarting daemon: `subgit install` works; `subgit fetch` fails

manualdidact · May 29, 2023, 12:54am

It’s been a long time since I last experimented with SubGit, and I’m pleased that the svn+ssh problems I was having a couple of years back have (apparently) been resolved. I’m once again trying to set up a GitLab server, and I want to use SubGit to provide ongoing bidirectional mirroring while users transition. We’ll be operating this way for quite some time, so I need a reliable solution.

What I have now actually works on my test server, and I’m pretty impressed with it. Commits are mirrored in both directions and though there is a delay, it’s not unmanageable for our environment.

The problem happens when I reboot. The daemons don’t restart, and when searching for a solution I’ve discovered that I have to do this manually, or write a script or startup routine to do it. We have many repositories here (over 100) so this is something I’ll have to work on.

I’ve found an existing SO thread that suggests using `subgit fetch --async`, and another thread here that suggests plain `subgit fetch` without `--async` (the forum doesn’t let me include more than two links; the post slug is how-to-auto-start-mirroring-service-on-linux-host). Both of these fail on my test server with the message:

Scheduling sync… error: Failed to launch background translation process: timeout waiting for pid file

I’ve found another thread (cannot-commit-timeout-waiting-for-pid-file) on this error, which suggests setting `launchTimeout` under `[daemon]` to a large number. I’ve tried setting it to 600 (10 minutes) and it hasn’t helped: eventually I still get the timeout when running `subgit fetch`. The system should be idle during this time (it’s a test system and no one else is using it), and observing it with htop and other tools confirms very low CPU utilization and memory consumption.

Curiously, `subgit install` always succeeds, though it takes a seemingly random amount of time to do so, which the CPU and memory utilization in the meantime don’t explain. I’ve run it multiple times after rebooting and after editing the config file, and have not seen it time out or produce any other error. The longest it’s taken (by its own report) is 66 seconds and the shortest is 7; usually it’s less than 20. The daemon is always running afterward (checked with `ps -ef | grep subgit`) and it always successfully mirrors commits to either repository.

I suppose I could just write a startup routine that invokes `subgit install` (instead of `subgit fetch`) for each repository after boot, and as long as `idleTimeout = infinity` is set that should work, assuming the daemons don’t crash (maybe there’s some systemd wizardry I could apply to watch and restart them…). Still, I am not sure this is the right strategy, and I’d like to know if I’m doing something wrong.
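
For concreteness, here is a rough sketch of the kind of startup routine I have in mind. It is only a sketch under assumptions: the `REPO_ROOT` path and the function names are hypothetical, it assumes a mirrored repository can be recognised by a `subgit/config` file inside it, and it assumes `subgit` is on the PATH of whatever user runs it at boot.

```python
#!/usr/bin/env python3
"""Re-run `subgit install` for every mirrored repository after boot (sketch)."""
import subprocess
import sys
from pathlib import Path

# Hypothetical location of the Git repositories; adjust for the real server.
REPO_ROOT = Path("/srv/git/repositories")


def find_mirrored_repos(root):
    """Return, in sorted order, the directories under root that contain subgit/config."""
    return sorted(p.parent.parent for p in Path(root).glob("**/subgit/config"))


def reinstall_all(repos, run=subprocess.run):
    """Run `subgit install <repo>` for each repository and return those that failed."""
    failed = []
    for repo in repos:
        # Repositories are handled one at a time, since each install can take a while.
        result = run(["subgit", "install", str(repo)])
        if result.returncode != 0:
            failed.append(repo)
    return failed


if __name__ == "__main__":
    failures = reinstall_all(find_mirrored_repos(REPO_ROOT))
    for repo in failures:
        print(f"subgit install failed for {repo}", file=sys.stderr)
    if failures:
        # Non-zero exit so whatever launches this at boot can notice the failure.
        sys.exit(1)
```

Something like this could be started once at boot (from a systemd oneshot unit or an `@reboot` cron entry, for example). It does not watch the daemons afterwards, so it would not cover the case where one crashes later.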

The `subgit fetch` error message suggests uploading a logging zip file, which I’ll attempt to attach here.

subgit-fetch-20230529-001914.zip (4.3 KB)