SubGit install on Git repo (Java heap space)

Hello,

I contacted you about an issue I had when running the subgit install command.

The translation reached 58%, and then I got the message:
INSTALLATION FAILED

error: Java heapspace

I can send you the log file if needed.

I modified my configuration to set httpSpooling to true, as requested by one of your colleagues, and ran subgit install again on the same Git repository to continue the translation, but I still get the same error message afterwards: Java heap space.

Thank you in advance for your assistance.

Best regards,
Fahd

Hello Fahd,

Since spooling is enabled, this requires a deeper investigation. Could you please collect the install log from the affected repository and upload it to this topic?

Hello,

Thank you for your reply.

Here is the link to the install log file on my drive: https://drive.google.com/drive/folders/1Qx-xtlfDwHYzDIrYl8twDfJMOjw1-JLU?usp=sharing

Thanks in advance.

Best regards,
Fahd

Hello Fahd,

Thank you for the logs.

I see that svn.httpSpooling is set to true, but it turns out you’re using the SVN protocol, not HTTP, so this setting does not actually affect the translation.
Most probably the out-of-memory error happens when SubGit tries to load a whole big file into memory, so the possible ways to overcome the issue are either to import the file as a stream instead of loading it whole, or not to load it at all. It may also be worth using the Git binary to load big blobs instead of loading them by internal means (JGit), which may also help to avoid those OOMEs. I found errors like the following in the log:

[2022-04-12 12:03:49.662][subgit-install][1] Unable to load blob 845bbb202b8d3cf5bbce4497851047c3d5921f63 with Git executable, loading the blob with JGit
com.syntevo.svngitkit.core.b.i: Cannot run program "git" (in directory "/src/gitlab/git-data/repositories/@hashed/81/17/811786ad1ae74adfdd20dd0372abaaebc6246e343aebd01da0bfc4c02bf0106c.git"): error=2, Aucun fichier ou dossier de ce type

Those errors (error=2, "No such file or directory") indicate that SubGit is unable to find the git program and is therefore using JGit to load those large blobs, which is less efficient. To resolve this issue, set the core.gitPath setting in the SubGit configuration:

[core]
    …
    gitPath = <path to Git binary>
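
To find the value to put in gitPath, a small script along these lines may help. It is only a sketch: it looks up git on the PATH of whichever user runs it, so it should be run as the same user that runs SubGit, and the helper names find_git_binary and git_path_setting are made up for this illustration.

```python
import shutil
from typing import Optional


def find_git_binary(name: str = "git") -> Optional[str]:
    """Return the absolute path of `name` as found on PATH, or None if absent."""
    return shutil.which(name)


def git_path_setting(path: str) -> str:
    """Format the [core] gitPath lines for the SubGit configuration."""
    return "[core]\n    gitPath = " + path


if __name__ == "__main__":
    found = find_git_binary()
    if found is None:
        # Same situation as in the log above: no git executable is visible on PATH.
        print("git was not found on PATH; install Git or look up its full path manually")
    else:
        print(git_path_setting(found))
```

If the script reports that git is not on PATH, that matches the "Cannot run program" error in the log, and the full path to the Git binary has to be found by other means.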

It may also be worth setting an upper limit on the size of files that are loaded into memory on import; too high a limit may also cause the OOME:

[core]
    …
    streamFileThreshold = 20971520
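
For reference, the value above works out to 20 MiB if the setting is expressed in bytes (an assumption here, since the unit is not stated in this thread). A short sketch for converting other limits; the helper names are hypothetical:

```python
def mib_to_bytes(mib: int) -> int:
    """Convert a size in MiB to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


def bytes_to_mib(size: int) -> float:
    """Convert a size in bytes back to MiB."""
    return size / (1024 * 1024)


if __name__ == "__main__":
    # Assumes streamFileThreshold is given in bytes.
    print(mib_to_bytes(20))        # 20971520
    print(bytes_to_mib(20971520))  # 20.0
```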

Additionally, it may be worth disabling EOL translation; this not only speeds up the translation but also reduces memory consumption:

[translate]
    eols = false

If the above changes do not help and the OOME happens again despite those settings, a possible workaround is to exclude the problematic branch so that it is not imported and thus does not trigger the problem. The OOME occurs during the translation of the _done branch, which can be excluded by setting the mapping configuration as follows:

[svn]
    …
    trunk = trunk:refs/heads/master
    branches = branches/*:refs/heads/*
    tags = tags/*:refs/tags/*
    excludeBranches = branches/feature_*
    excludeBranches = branches/_done

Another possible workaround is to use another protocol that is less prone to OOMEs, such as HTTP (which allows spooling) or svn+ssh, or even the file protocol. In the latter case, though, the translation will only work on the computer where SubGit resides, as the file protocol can only work with the local filesystem. So either the translation is performed on the SVN server (and the target Git repository is then moved to GitLab), or the SVN repository is first moved or copied to the GitLab server.
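
As a quick way to check which protocol a given SVN URL uses, and therefore whether httpSpooling can have any effect, something like the following could be used. This is a rough sketch: the function names and example URLs are hypothetical, and treating https the same as http is an assumption rather than something confirmed in this thread.

```python
from urllib.parse import urlparse


def svn_url_scheme(url: str) -> str:
    """Return the protocol part of an SVN URL, e.g. 'svn', 'svn+ssh', 'http', 'file'."""
    return urlparse(url).scheme.lower()


def http_spooling_applies(url: str) -> bool:
    """True only for HTTP-based URLs (https is included here as an assumption)."""
    return svn_url_scheme(url) in {"http", "https"}


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for url in ("svn://svn.example.com/repo", "http://svn.example.com/repo"):
        print(url, http_spooling_applies(url))
```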

Hello,

Thank you for this analysis and all this information.

I will try all the suggestions you have given and get back to you.

Thanks again.

Best regards,
Fahd