How to modify / use a custom .gitignore file?

After a long initial translation, we started working with the mirrored Git repository.
Our repo has over 200k commits and its working copy in SVN has a size of around 20 GB.

We noticed a .gitignore file was created automatically in the Git repo, but not in the SVN repo. It seems like it was generated out of svn:ignores.
That’s fine.

However, we need our own .gitignore file, and its contents should not be translated back to SVN.

When we tried to modify the .gitignore file and push the committed changes, the push would never finish. It hangs on the following message:

$ git push
Remote "origin" does not support the LFS locking API. Consider disabling it with:
  $ git config lfs.<unknown>.locksverify false
Enumerating objects: 7, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 12 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 3.85 KiB | 3.85 MiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Checking connectivity: 3, done.
remote: Fetching revisions from SVN repository:
remote:   up to date
remote: Sending commits to SVN repository:

During that time, one or two Java processes use about 100% CPU on the server.
We didn’t find anything being logged on the server. In the bare repo, we looked inside logs/refs/heads but found nothing.

After digging around a bit, we found out that the “translation” of the .gitignore file can be disabled, so we did the following:

  • In the subgit/config file, we set svn.excludePath property to /.gitignore (and later we appended a second excludePath = .gitignore just to be sure).
  • In the subgit/config file, we also set translate.ignores to false.
  • We killed all java processes and ran subgit install our-repo.
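A minimal sketch for double-checking that the edited values are really present in subgit/config before running subgit install. It assumes the file uses Git-style [section] / key = value syntax, as the excerpts in this thread suggest; cfg_get is a hypothetical helper name, not part of SubGit.

```shell
# cfg_get FILE SECTION KEY
# Print every value of KEY inside [SECTION] of a Git-style config file.
# Commented-out lines are skipped; names are matched case-insensitively.
# (cfg_get is a hypothetical helper for this sketch, not a SubGit command.)
cfg_get() {
    awk -v want_section="$2" -v want_key="$3" '
        /^[ \t]*[#;]/ { next }                  # skip comment lines
        /^[ \t]*\[/ {                           # section header, e.g. [svn]
            name = $0
            sub(/^[ \t]*\[/, "", name)
            sub(/\].*$/, "", name)
            in_section = (tolower(name) == tolower(want_section))
            next
        }
        in_section {
            eq = index($0, "=")
            if (eq == 0) next                   # not a key = value line
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (tolower(key) == tolower(want_key)) print val
        }
    ' "$1"
}
```

For example, cfg_get myrepo.git/subgit/config translate ignores should print false, and cfg_get myrepo.git/subgit/config svn excludePath should print one line per configured exclude pattern.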

This doesn’t seem to have any effect. Trying to push a changed .gitignore file, or even trying to push the deletion of the .gitignore file seems to send Subgit into some infinite loop.

Then, we versioned the .gitignore file in SVN, hoping for Subgit to translate that SVN commit and give all developers the correct .gitignore file. But to no avail.
We do see the respective commit on the Git side, but it comes without any changes (according to git log --stat).

Hence the question: What are our options to modify/replace the .gitignore file?

The initial translation was very frustrating (Java processes would constantly run OOM, the VM would become unstable because we were giving Java too much heap space, …). Therefore, we must avoid having to re-translate everything.

Many thanks.

Hello.

By default, SubGit translates svn:ignore properties to a .gitignore file in Git and, vice versa, translates any added .gitignore into svn:ignore properties during mirroring. If you don’t want the ignores to be synchronized, the translate.ignores SubGit setting should be set to false; and if you don’t need .gitignore (as a file) in SVN, then it should indeed be excluded. So the steps are completely correct, yet I assume you haven’t run subgit install after changing the configuration. SubGit does not apply configuration file changes on the fly; it requires subgit install to be invoked against the repository. Running subgit install will not cause a re-translation: on a configured repository it just checks the configuration, applies the changes if possible, and starts the mirroring daemon.

Thank you for the very quick answer. Unfortunately, I’m afraid that’s what we already did.

I repeated the steps again, however, just to be sure:

[root@myserver]# cat myrepo.git/subgit/config | grep 'Path = '
        #     excludePath = PATTERN
        #     includePath = PATTERN
        excludePath = /.gitignore
        excludePath = .gitignore
[root@myserver]# cat myrepo.git/subgit/config | tail
        # translation process startup overhead.
        idleTimeout = infinity

        # Explicit translation process classpath or path to the directory
        # that contains jars that have to be on the process classpath.
        classpath = subgit/lib

[translate]
        timezone = Europe/Berlin
        ignores = false
[root@myserver]# subgit install myrepo.git
SubGit version 3.3.11 ('Bobique') build #4408

About to shut down background translation process.
Shutdown request sent to background translation process (pid 22826).
Background translation process (pid 22826) has received shutdown request and will exit NOW.

SHUTDOWN SUCCESSFUL

Translating Subversion revisions to Git commits...

    Subversion revisions translated: 200896.
    Total time: 5 seconds.

INSTALLATION SUCCESSFUL

After this, I went on a clean master branch on a client PC and did the following:

➜  myrepo git:(master) ✗ git status | head
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        ...
        ...

➜  myrepo git:(master) ✗ echo "an-example-to-be-ignored" >> .gitignore
➜  myrepo git:(master) ✗ git status | head
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   .gitignore

Untracked files:
  (use "git add <file>..." to include in what will be committed)
➜  myrepo git:(master) ✗ git add .gitignore
warning: LF will be replaced by CRLF in .gitignore.
The file will have its original line endings in your working directory
➜  myrepo git:(master) ✗ git commit -m "feat(git): test a change to the .gitignore file"
[master 5d7396358] feat(git): WINGUI-11 test a change to the .gitignore file
 1 file changed, 1 insertion(+)
➜  myrepo git:(master) ✗ git pull
remote: Enumerating objects: 207, done.
remote: Counting objects: 100% (207/207), done.
remote: Compressing objects: 100% (128/128), done.
remote: Total 132 (delta 102), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (132/132), 59.66 KiB | 5.97 MiB/s, done.
Resolving deltas: 100% (102/102), completed with 64 local objects.
From ssh://myhost:/myrepo
   2961e5c94..ccaddba5b  master     -> origin/master
Successfully rebased and updated refs/heads/master.
➜  myrepo git:(master) ✗ git push
Remote "origin" does not support the LFS locking API. Consider disabling it with:
  $ git config lfs.<unknown>.locksverify false
Enumerating objects: 7, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 12 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 360 bytes | 360.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Fetching revisions from SVN repository:
remote:   up to date
remote: Sending commits to SVN repository:

But the result is the same; the command on the client side never returns, and on the server side the java processes must be killed (if we invoke subgit shutdown myrepo, we just run into a timeout).

Here’s a screenshot of the htop on the server (the server is a windows server with Subgit and SSH running in WSL1).

Our only “way out of this” is to kill and install: killall -9 java && subgit install myrepo.git

I don’t see the “starting” part of the subgit install output. This command first shuts the daemon down and then starts it again. It looks like you don’t give SubGit a chance to finish applying the settings before killing its processes; changing the ignores on a big repository may take time. Could you please check whether anything is being logged in the SubGit logs (they reside in /subgit/logs)? If not, please also check the home directory of the user that runs SubGit; there should be a .subgit/logs directory where SubGit stores temporary logs.

Regarding the starting part of the install output, I’m not certain what to look for.
If I run subgit install again, you can see that it cleanly shuts down the daemon and then prints the regular output. After that command completes, there are some java processes running, so I assume subgit install also starts the background process.

[root@myserver]# subgit install myrepo.git
SubGit version 3.3.11 ('Bobique') build #4408

About to shut down background translation process.
Shutdown request sent to background translation process (pid 23853).
Background translation process (pid 23853) has received shutdown request and will exit NOW.

SHUTDOWN SUCCESSFUL

Translating Subversion revisions to Git commits...

    Subversion revisions translated: 200902.
    Total time: 9 seconds.

INSTALLATION SUCCESSFUL

As for the logs, thank you for the tip, I had a look in ~/.subgit/logs and found the files “subgit-help.0.log” and “subgit-null.0.log”.

I don’t see anything remarkable in the log files though:

[root@10myserver]# tail -n20 subgit-help.0.log
[2021-11-10 11:44:59.705][subgit-help][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-10 11:44:59.705][subgit-help][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 11:50:36.035][subgit-help][1] Initialized memory cache logger.
[2021-11-10 11:50:36.040][subgit-help][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 11:50:36.040][subgit-help][1] Posix: Non-JNA platform is chosen.
[2021-11-10 11:50:36.041][subgit-help][1] Registered cancel hook.
[2021-11-10 11:50:36.062][subgit-help][1] Command name: help
[2021-11-10 11:50:36.062][subgit-help][1] Command argument: daemon
[2021-11-10 11:50:36.068][subgit-help][1] Posix: Non-JNA platform is chosen.
[2021-11-10 11:50:36.070][subgit-help][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-10 11:50:36.070][subgit-help][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 12:29:57.429][subgit-help][1] Initialized memory cache logger.
[2021-11-10 12:29:57.433][subgit-help][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 12:29:57.433][subgit-help][1] Posix: Non-JNA platform is chosen.
[2021-11-10 12:29:57.434][subgit-help][1] Registered cancel hook.
[2021-11-10 12:29:57.455][subgit-help][1] Command name: help
[2021-11-10 12:29:57.455][subgit-help][1] Command argument: install
[2021-11-10 12:29:57.464][subgit-help][1] Posix: Non-JNA platform is chosen.
[2021-11-10 12:29:57.466][subgit-help][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-10 12:29:57.466][subgit-help][1] SubGit version 3.3.11 ('Bobique') build #4408
[root@10myserver]# tail -n20 subgit-null.0.log
[2021-11-10 11:49:17.845][subgit-null][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-10 11:49:17.845][subgit-null][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 11:51:52.704][subgit-null][1] Initialized memory cache logger.
[2021-11-10 11:51:52.710][subgit-null][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-10 11:51:52.710][subgit-null][1] Posix: Non-JNA platform is chosen.
[2021-11-10 11:51:52.711][subgit-null][1] Registered cancel hook.
[2021-11-10 11:51:52.734][subgit-null][1] Command name: null
[2021-11-10 11:51:52.734][subgit-null][1] Command argument: --help
[2021-11-10 11:51:52.743][subgit-null][1] Posix: Non-JNA platform is chosen.
[2021-11-10 11:51:52.754][subgit-null][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-10 11:51:52.755][subgit-null][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-15 15:09:53.134][subgit-null][1] Initialized memory cache logger.
[2021-11-15 15:09:53.141][subgit-null][1] SubGit version 3.3.11 ('Bobique') build #4408
[2021-11-15 15:09:53.141][subgit-null][1] Posix: Non-JNA platform is chosen.
[2021-11-15 15:09:53.142][subgit-null][1] Registered cancel hook.
[2021-11-15 15:09:53.166][subgit-null][1] Command name: null
[2021-11-15 15:09:53.166][subgit-null][1] Command argument: --help
[2021-11-15 15:09:53.179][subgit-null][1] Posix: Non-JNA platform is chosen.
[2021-11-15 15:09:53.182][subgit-null][1] Initialized file logger, level: ALL, logs directory is: 'null'.
[2021-11-15 15:09:53.182][subgit-null][1] SubGit version 3.3.11 ('Bobique') build #4408

I also noticed your remark: “changing the ignores on big repository may take time”.

Please do note that we do not want to change the svn:ignores properties. Therefore, I don’t understand what would take much time.

Basically, Subgit simply needs to allow us committing a .gitignore file, just as if it were any other file. Whether these changes are translated to SVN or not is not important, as long as we can change the .gitignore file. But the svn:ignores properties must remain untouched.

Another idea which I’m thinking of:

Would it be a viable workaround if we temporarily shutdown Subgit, remove all hooks, push the .gitignore changes, and then restore the hooks and restart Subgit?

Would this be doable, or would Subgit get confused or consider the repository corrupted/out-of-sync and force us to start over with an initial translation of everything?

I’m afraid I don’t follow here: what .gitignore changes do you intend to push? My impression was that the intent was to switch off the synchronization between svn:ignore and .gitignore, but that cannot be done with a push; SubGit must apply the translate.ignores setting instead. That is in fact the only right way, as otherwise (while translate.ignores is true) SubGit won’t treat the file as a regular file and will synchronize svn:ignore <–> .gitignore instead.

It is possible, of course, to shut down the daemon and remove the hooks; you would then be able to push to the SubGit repository. A subsequent subgit install will restore the hooks and start the mirror without forcing you to start over, but it will recognize the new commit in Git and try to send it to SVN, which is in fact the same as pushing the commit while the SubGit mirror is running.

Judging from the output, though, the installation was successful and the setting should already have been applied. So the .gitignore file should be treated as a regular file, but it’s not clear what is going on in the repository and what SubGit is doing that takes so many resources. The logs you’ve sent do not seem to be related, as they end on Nov 15. Does it log anything more, especially when it starts to consume resources? Could you please try setting core.logLevel to finer or even finest and run subgit install again?

I’ll try increasing the logging and report back. I did not notice yesterday that the logging was from the day before.

But just to be sure we have a clear understanding:

We do not want svn:ignores to be modified by Subgit under any circumstance. Subgit must not try to “translate” .gitignore into svn:ignores. This is a requirement that blocks general adoption.

In the Subgit documentation, this is what is described for translate.ignores:

ignores = [true|false]

a boolean value, can be set to true or false. When true, .gitignore in Git is translated to svn:ignore and vice versa. When false .gitignore from Git is translated to .gitignore file in SVN as any other normal file. Default is true.

In our config, this property is set to false, but apparently, Subgit still tries to translate .gitignore into svn:ignore (I assume so due to the high CPU usage).

So unless I misunderstand the documentation, the property does not have the intended effect.

Hello,

That understanding is correct: when translate.ignores is false, SubGit treats the .gitignore file as a regular file and does not try to reflect the ignores in svn:ignore. But I’m not sure that is what SubGit is trying to do; the recent subgit install output clearly stated INSTALLATION SUCCESSFUL, which means all the settings have been applied. It is not clear what exactly SubGit is doing that takes so many resources, and that is what I was hoping to find out from the logs.

Indeed, I was also hoping for the logs to give us some information, but I don’t see anything being logged.

I followed your advice and set logLevel = finer under [core] (and then ran subgit install myrepo).
Then, I tried pushing .gitignore changes again.
But it seems like nothing more is being logged (unless I’m not looking at the correct locations).

In our repo, under myrepo/logs, there is some basic logging (changing the log level didn’t change this):

[root@myserver]# l myrepo.git/logs/refs/heads
total 132K
drwxrwxrwx 1 root root 4.0K Nov 11 09:01 .
drwxrwxrwx 1 root root 4.0K Oct 28 08:48 ..
-rwxrwxrwx 1 root root  318 Nov 11 10:59 feat-winrt
-rwxrwxrwx 1 root root 129K Nov 17 09:45 master
[root@myserver]# cat myrepo.git/logs/refs/heads/master| tail
26f4ae976e4b7872949e466911b5e179efe99702 2bfbb6b22e8d92aac45259b59505d2da4f113932 subgit <support@subgit.com> 1637079104 +0000  Reference was updated to reflect
 SVN state: forced-update
2bfbb6b22e8d92aac45259b59505d2da4f113932 13c388e850480410877bbc4bff9eaf3349845453 subgit <support@subgit.com> 1637080731 +0000  Reference was updated to reflect
 SVN state: forced-update
13c388e850480410877bbc4bff9eaf3349845453 67108d6404d77d07111f15728102ec38e28fe42d subgit <support@subgit.com> 1637081095 +0000  Reference was updated to reflect
 SVN state: forced-update
67108d6404d77d07111f15728102ec38e28fe42d 4f3c2a3d443a7debf87e0938513dc448038daf72 subgit <support@subgit.com> 1637082722 +0000  Reference was updated to reflect
 SVN state: forced-update
4f3c2a3d443a7debf87e0938513dc448038daf72 1b021e113cf97b40ca064dd0654595b8bf6952f3 subgit <support@subgit.com> 1637089937 +0000  Reference was updated to reflect
 SVN state: forced-update
1b021e113cf97b40ca064dd0654595b8bf6952f3 8e104892715a10e63a23019c3b74381c79e06f3d subgit <support@subgit.com> 1637124505 +0000  Reference was updated to reflect
 SVN state: forced-update
8e104892715a10e63a23019c3b74381c79e06f3d 2226f64d05eabc49789c5fb8d728b3c76550fa9c subgit <support@subgit.com> 1637126620 +0000  Reference was updated to reflect
 SVN state: forced-update
2226f64d05eabc49789c5fb8d728b3c76550fa9c eb61b4659663666f8a30b5d77b6df0b31b8fc0cb subgit <support@subgit.com> 1637132996 +0000  Reference was updated to reflect
 SVN state: forced-update
eb61b4659663666f8a30b5d77b6df0b31b8fc0cb 4b202e53e83bc5c2206775f712a8b0a045ef388f subgit <support@subgit.com> 1637138647 +0000  Reference was updated to reflect
 SVN state: forced-update
4b202e53e83bc5c2206775f712a8b0a045ef388f 9a911894ff53eea512b767dff5ede2998853452b subgit <support@subgit.com> 1637138710 +0000  Reference was updated to reflect
 SVN state: forced-update

But under ~/.subgit/logs, nothing changes:

[root@myserver]# l ~/.subgit/logs
total 16K
drwxr-xr-x 1 root root 4.0K Nov 15 16:09 .
drwxr-xr-x 1 root root 4.0K Nov  4 02:22 ..
-rw-r--r-- 1 root root 5.4K Nov 10 13:29 subgit-help.0.log
-rw-r--r-- 1 root root 4.9K Nov 15 16:09 subgit-null.0.log

Is there anything else we can do to configure the logging? Maybe Subgit supports a log4j configuration?

If SubGit does not write to the logs, especially those in the home directory, then I’m afraid the only way to investigate is to get the information out of the process itself. One way is to run jstack for that:

jstack -l PID

for all the SubGit processes. It would also be worth connecting to the JVM with a profiler and gathering thread dumps for investigation.
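A minimal sketch of how such dumps could be collected in one go, assuming jstack from the JDK is on the PATH and the process IDs are already known. The function name dump_threads and the JSTACK and DUMP_INTERVAL variables are hypothetical and exist only for this example; taking several dumps a few seconds apart makes it easier to see which threads stay busy.

```shell
# dump_threads OUTDIR PID...
# Save three `jstack -l` dumps per PID, a few seconds apart, so that
# consecutive dumps can be compared to spot threads that remain busy.
# JSTACK and DUMP_INTERVAL are hypothetical overrides for this sketch only.
dump_threads() {
    outdir=$1; shift
    mkdir -p "$outdir" || return 1
    for round in 1 2 3; do
        for pid in "$@"; do
            # One file per process and round, e.g. jstack-22826-1.txt
            "${JSTACK:-jstack}" -l "$pid" > "$outdir/jstack-$pid-$round.txt" 2>&1
        done
        # Pause between rounds, but not after the last one
        [ "$round" -lt 3 ] && sleep "${DUMP_INTERVAL:-5}"
    done
    return 0
}

# Example: dump every process whose executable is named "java".
# dump_threads /tmp/subgit-dumps $(pgrep -x java)
```

Running it once while the server is idle and once while a push hangs gives two sets of files that can be compared directly.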

Additionally, could you please tell us whether the issue arises immediately after invoking subgit install, or whether the installation goes well and the issue starts only when a new push comes in? Which Java version are you using, and would it be possible to test SubGit with another version (8 or 11 preferred)?

Thank you for the info. I’ll try with jstack.

After running subgit install myrepo.git, everything works fine.

The issue arises as soon as we push a commit in which .gitignore was changed or deleted. Then we have to kill the processes and run subgit install myrepo.git again (to restart the processes).

If we push any other change (which does not include .gitignore changes), there is no problem: The change lands in SVN in that case.

OK, thanks for the clarification. My assumption is that it may be connected to the .gitignore being present in SVN, but it is definitely worth taking a look at the logs and dumps before drawing any conclusions. Would it be possible to collect the jstack output and thread dumps?

Thank you for the ongoing support!

We added the .gitignore file later on in SVN, after we noticed the problem with Subgit. The problem also occurred without .gitignore file in SVN.

I ran jstack as you suggested:

  • First, I ran it before pushing .gitignore changes; CPU usage was near to 0%
    • See “idle.txt”
  • Then, I pushed .gitignore changes. As expected, CPU went up and Subgit became unresponsive. I ran jstack against the same process, which was now at high CPU usage.
    • See “gitignore-being-pushed.txt”
  • I waited for about 15 seconds, and then I ran jstack again.
    • See “gitignore-being-pushed-2.txt”

At the moment, we’re using OpenJDK 17.0.1.u12-1 on Arch Linux in WSL1 on Windows Server 2019 Build 17763 (this is a temporary setup for the proof-of-concept phase).

jstack-logs.zip (8.1 KB)

I also just found Subgit log files! I was looking under myrepo.git/logs, but the log files are under myrepo.git/subgit/logs !
Here are all the *.log files I found in the subdirectory (leaving out the older and bigger “daemon.1.log” and “daemon.2.log” files):

subgit-logs.zip (355.8 KB)

I had to put the log files in a ZIP because I can only upload two files.

EDIT: Had to anonymize the logs.
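For anyone else looking for these files, a minimal sketch that lists the most recently written logs under the repository's subgit/logs directory. The helper name newest_logs is hypothetical, and the sketch assumes the log file names contain no newlines.

```shell
# newest_logs REPO [COUNT]
# List the COUNT most recently modified *.log files (default 5) under
# REPO/subgit/logs, newest first. Prints nothing if no logs are found.
# (newest_logs is a hypothetical helper for this sketch.)
newest_logs() {
    ls -t "$1"/subgit/logs/*.log 2>/dev/null | head -n "${2:-5}"
}
```

Calling newest_logs myrepo.git 3 right after reproducing the hanging push shows which files SubGit touched last.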

Hello,

thank you for the logs and clarification!

We have involved the dev team in this issue; they are working on it right now. At first sight, it looks like the high-CPU issue is indeed connected to the ignores and attributes, and it also looks like you have quite a big .gitattributes file, is that correct? In any case, the dev team is working on the case, and I’ll let you know as soon as they come to a conclusion.

Thanks a lot! Looking forward to what the dev team says.

Indeed, we have a pretty huge .gitattributes file, with 116033 lines (9,3 MB).
It looks like this file was generated by Subgit and, if I understand correctly, contains a line like the following for (almost?) every file in the repository:

Base/.nuget/packages.config -text
Base/3rdParty.nuspec -text
Base/Base.sln -text

I’m not certain if this file is useful or really required…
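A minimal sketch for getting an overview of such a large .gitattributes file: it counts how many patterns carry each attribute set. The helper name attr_summary is hypothetical, and the sketch assumes one pattern per line with no whitespace inside the path (quoted patterns are not handled).

```shell
# attr_summary FILE
# Count how many lines of a .gitattributes file carry each attribute set
# and print the counts, most frequent first, as "COUNT<TAB>ATTRIBUTES".
# (attr_summary is a hypothetical helper for this sketch.)
attr_summary() {
    awk '
        /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
        {
            attrs = $0
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", attrs)   # drop the leading path pattern
            count[attrs]++
        }
        END { for (a in count) printf "%d\t%s\n", count[a], a }
    ' "$1" | sort -rn
}
```

If nearly all of the lines end up in a single -text bucket, that would support the impression that the file was generated per path rather than written by hand.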

Hello!

I’m glad to inform you that we managed to trace and fix the issue! The fix hasn’t been added to the recent release 3.3.12 (which, by the way, has just been published), but we made an interim build for you (a deb package and an operating-system-agnostic zip package):

https://teamcity.tmatesoft.com/repository/download/build_from_svn_tag/25691:id/subgit-3.3.12-snapshot20211118160200-package.zip
https://teamcity.tmatesoft.com/repository/download/build_from_svn_tag/25691:id/subgit_3.3.12-snapshot20211118160200_all.deb

It will ask for a login; choose “login as guest” at this step.
The Debian package upgrade is straightforward: just install the package on the system. For the zip package, unpack it at the location in $PATH where the old SubGit lived, so that the upgraded SubGit is invoked by the subgit command. After that, upgrade SubGit in every mirrored repository:

subgit install <REPO>

and that completes the upgrade.

Hi Ildar, thank you very much for the snapshot. We tested it and it works very well, we could change the .gitignore file, push the changes and the other users could pull the changed file!

One thing that surprised me is that some kind of invisible commit seems to be created in SVN. It seems to only be a cosmetic issue, which I don’t mind:

➜  myrepo git:(master) git push
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
remote: Fetching revisions from SVN repository:
remote:   up to date
remote: Sending commits to SVN repository:
remote:   f6dfd7c => r201190
remote: Sync completed successfully
To ssh://myserver:/myrepo.git
   2c7b9f95d..f6dfd7cb4  master -> master

In SVN:

r201191 | ... | ... | 3 lines

fix(...): ...
------------------------------------------------------------------------
r201189 | ... | ... | 5 lines

fix(...): ...
------------------------------------------------------------------------

It’s a bit weird, but if SVN doesn’t have an issue with it (seems OK so far), I’m fine with it too.

Hi!

Glad to know it worked and the mirror is now back to normal)

As for the empty revisions in SVN, I assume those are revisions that only change properties; could that be the case?