The memory usage depends on the SVN repository layout and history, so I think the best solution in this case is to increase the heap size. To do this, edit the SubGit launch script (\bin\subgit.bat) and add ‘-Xmx1024m’ to the ‘EXTRA_JVM_ARGUMENTS’ line:
set EXTRA_JVM_ARGUMENTS=-Djava.util.logging.config.file="%BASEDIR%\conf\logging.properties" -Dsun.io.useCanonCaches=false -Djava.awt.headless=true -Djna.nosys=true -Dsvnkit.http.methods=Digest,Basic,NTLM,Negotiate -Xmx1024m
It would also be worth adding the ‘--trunk PATH’ option to the ‘configure’ command to set the trunk path explicitly.
It’s also possible to set the mapping configuration manually. To do this, run ‘configure’ without ‘--layout auto’.
Then edit <REPO_PATH>\subgit\config and set the ‘trunk’, ‘branches’ and ‘tags’ options to reflect the SVN repository layout; let me know if you need any help creating the mapping configuration.
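For reference, a manual mapping for a repository with the conventional single-project layout (trunk, branches, tags at the root) might look roughly like the fragment below. This is only a sketch: the URL is a placeholder, and the paths on the left-hand side are assumptions that have to be adjusted to the real layout, especially for a repository that holds many projects.

```
[svn]
    # Placeholder URL; replace with the real SVN project URL
    url = http://example.com/svn/repos/project
    # Assumed standard layout; the left side is the SVN path, the right side the Git ref
    trunk = trunk:refs/heads/master
    branches = branches/*:refs/heads/*
    tags = tags/*:refs/tags/*
```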
I feel I didn’t emphasize some of my points enough in my answer, so I’d like to do so now. The absence of the “--trunk” option in the command may lead to issues like this, so I’d recommend adding it as the first step toward resolving the issue. The second point I’d like to emphasize is that auto-configuration is not the only way to create the config; it can also be done manually. If you don’t intend to investigate the ‘configure’ problem and just want to proceed with the import/mirror, let me know your SVN repository layout and which branches/tags from SVN you want in Git, and I will develop a suitable configuration for you.
I’m trying a subgit configure on a similar repo (400000+ commits, with --trunk specified, I might add), and I’m seeing the same thing.
For me, giving subgit more memory (I needed 12GB to get past the “Building branches layouts…” stage) did help.
With less memory, I could see in Java Mission Control that it would start to thrash the GC. Eventually it would throw an OOME or a “GC overhead limit exceeded” exception.
I would provide the logs, but the process is not yet finished.
It’s been “Generating SVN to Git mapping…” for two days now. The repository contains well over 100 projects, that each have 1…400+ branches and tags. I can see this takes a while. :-)
I’ll get back to you, and file an issue if it fails.
@carsonlee.blizzard Any indication as to how long this took?
Mine is still “Generating SVN to Git mapping…”, almost a week now (but I did suspend my machine during the weekend).
There’s another way besides logs to find out what’s happening and why the process takes so long: collecting thread dumps. Could you run:
jstack -l PID > threads-X.txt
several times, e.g. once per second, to produce several thread dumps. The thread dumps may help if SubGit is spending most of its time on unexpected activities.
You can get the process ID (PID) using the ‘jps’ command. Both the ‘jstack’ and ‘jps’ commands are included in the JDK.
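If running the command by hand several times is inconvenient, a small loop can do it. The sketch below assumes a POSIX shell (on Windows, something like Git Bash) with ‘jstack’ on the PATH; the function name collect_dumps and the JSTACK override variable are made up for this example and are not part of SubGit or the JDK.

```shell
# Hypothetical helper: write COUNT thread dumps of process PID,
# INTERVAL seconds apart, to threads-1.txt, threads-2.txt, ...
# Usage: collect_dumps PID [COUNT] [INTERVAL]
collect_dumps() {
    pid="$1"
    count="${2:-10}"      # default: 10 dumps
    interval="${3:-1}"    # default: 1 second between dumps
    i=1
    while [ "$i" -le "$count" ]; do
        # JSTACK is an assumed override so a different jstack binary can be used
        "${JSTACK:-jstack}" -l "$pid" > "threads-$i.txt" || return 1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, ‘collect_dumps 12345 10 1’ would take ten dumps one second apart, where 12345 stands for the PID reported by ‘jps’.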
I’m sorry, I gave up for now.
I could see in JProfiler that SubGit was performing a lot of IO during mapping, but only took 10% CPU or so.
I’ll try again in a couple of weeks using a ramdisk. That might make things a bit quicker.
We are going to release the next SubGit version, 3.3.4, which contains some changes to this algorithm, so my suggestion is to try that version. I will let you know as soon as it is released.
I’m trying again, using SubGit 3.3.4. It still takes a long time, but I can see CPU usage is anywhere between 30% and 100% consistently and memory usage is down significantly. I’ll let this run for a while and update this issue.