On arm64 Linux, all files checked out as executable

We are indirect users of SVNKit through the Jenkins Subversion plugin (https://plugins.jenkins.io/subversion/). Recently we added a number of arm64 Linux slaves to our builds, and found out that when Subversion repositories are checked out, all files got their executable bits set, not only the ones marked svn:executable. Since this results in issues with our builds, we effectively cannot use the Jenkins Subversion plugin, but have to run the regular svn command line client instead.

After some digging I found that the problem is due to SVNLinuxUtil.getFileModeOffset(), which assumes that on all 64-bit Linuxes the offset of the st_mode field in struct stat is 24. However, this only applies to 64-bit x86, e.g. amd64 (aka x86_64, but Java’s os.arch calls it amd64). On arm64 the offset is 16, like it is on e.g. 32-bit x86, but also on 32-bit arm. (There is a whole history behind the ugly juggling with 32 and 64 bit stat structures, but I won’t bore you with that. :) )

Here is a possible patch for this issue. I added a check for os.arch being equal to “amd64”, and use this to get the correct offsets for getFileModeOffset() and getFileUserIDOffset(). (This automatically also applies to getFileGrooupIDOffset() too.)

fix-linux-stat-offsets-1.diff (2.9 KB)

Hello Dimitry,
thanks you very much for the patch, patches are always welcome!

To be honest, we haven’t a possibility to test SVNKit on arm64, so the case of arm64 could be handled completely wrong. On the other hand, we have tested it on x86_64 and x86 and I wouldn’t like change the behaviour for those cases.

I looked through your patch, and I’ve see a suspicious place, so I would like to discuss it with you before applying it.

After the change getFileUserIDOffset() method doesn’t use modeOffset variable anymore for the case of Linux.

With the old code for x86_64 modeOffset=24, so getFileUserIDOffset() returns modeOffset+4=28. With the new code it returns just 4.

With the old code for x86 modeOffset=16, so getFileUserIDOffset() returns modeOffset+8=24. With the new code it returns just 8.

I guess you meant

    private static int getFileModeOffset() {
        int modeOffset = getFileModeOffset();
        if (SVNFileUtil.isLinux) {
            return modeOffset + (SVNFileUtil.isAMD64 ? 4 : 8);
        }
        ...
    }

right? I don’t know what’s the correct value for arm64, so I can’t check that.

Do you have a possibility to test your changes on x86_64 architecture? It’s important for us not to break the correct behaviour for existing x86_64 users.

By the way, have you looked into getFileLastModifiedOffset()? Is it correct for arm64? Incorrect behaviour of this method might not be obvious right away, but result into performance issues (comparing many files byte-by-byte where comparing last modified time would be enough — if it were correct). But it also can lead into wrong files comparison when it says that the last modified time of files is the same when it’s not — this prevents comparing files byte-by-byte when this should normally happen.

In either case, thank you very much for your contribution to the project!

That’s completely understandable. We run our arm64 builds on Amazon EC2 Graviton (AWS Graviton - Amazon Web Services), but it is also possible to use Docker to run arm64 images on Intel hosts.

Yes, this was indeed a mistake, it was meant to return modeOffset + (SVNFileUtil.isAMD64 ? 4 : 8).

I have updated the patch to fix that, and also changed the isAMD64 detection to check for os.arch being equal to “amd64” or “x86_64” (all Linux distributions I tested used the “amd64” name, but I know there are some others out there that use “x86_64” instead).

I made a small overview of the struct stat member offsets on various Linux architectures:

linux aarch64: dev=0 ino=8  mode=16 nlink=20 uid=24  gid=28 rdev=32 size=48 blksize=56  blocks=64  atim=72 mtim=88 ctim=104
linux armv7l : dev=0 ino=12 mode=16 nlink=20 uid=24  gid=28 rdev=32 size=44 blksize=48  blocks=52  atim=56 mtim=64 ctim=72
linux x86_64 : dev=0 ino=8  mode=24 nlink=16 uid=28  gid=32 rdev=40 size=48 blksize=56  blocks=64  atim=72 mtim=88 ctim=104

You can see that on x86_64 (aka amd64) getFileModeOffset() should return 24, getFileUserIDOffset() should return 28 (modeOffset + 4), getFileGroupIDOffset() should return 32, and getFileLastModifiedOffset() should return 88.

On aarch64 (aka arm64) getFileModeOffset() should return 16, getFileUserIDOffset() should return 28 (modeOffset + 4), getFileGroupIDOffset() should return 32, and getFileLastModifiedOffset() should return 88.

This was printed with the following small C program, which should compile on most distributions:

#include <sys/types.h>
#include <sys/stat.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
  printf(
     "dev=%zu"
    " ino=%zu"
    " mode=%zu"
    " nlink=%zu"
    " uid=%zu"
    " gid=%zu"
    " rdev=%zu"
    " size=%zu"
    " blksize=%zu"
    " blocks=%zu"
    " atim=%zu"
    " mtim=%zu"
    " ctim=%zu"
    "\n"
    , offsetof(struct stat, st_dev)
    , offsetof(struct stat, st_ino)
    , offsetof(struct stat, st_mode)
    , offsetof(struct stat, st_nlink)
    , offsetof(struct stat, st_uid)
    , offsetof(struct stat, st_gid)
    , offsetof(struct stat, st_rdev)
    , offsetof(struct stat, st_size)
    , offsetof(struct stat, st_blksize)
    , offsetof(struct stat, st_blocks)
    , offsetof(struct stat, st_atim)
    , offsetof(struct stat, st_mtim)
    , offsetof(struct stat, st_ctim)
  );
  return 0;
}

I can compile the SVNKit project with IntelliJ just fine, but I need to find some way to convince Jenkins to use my svnkit-1.10.4-SNAPSHOT.jar instead of its own copy. (Upstream Jenkins gathers all their jar files into a remoting.jar which it sends to all Jenkins build slaves, and then uses that for its remote operations, such as checking out Subversion repositories.)

I might also try to run the SVNKit test suite, but it is probably a lot more work to get right… :)

Yes, arm64 and amd64 both use 88 for the st_mtim field, and also for the other time-related fields. Interestingly, for aarch64 the offsets of struct stat fields after st_size are the same as for x86_64, e.g.:

linux aarch64: dev=0 ino=8  mode=16 nlink=20 uid=24  gid=28 rdev=32 size=48 blksize=56  blocks=64  atim=72 mtim=88 ctim=104
linux x86_64 : dev=0 ino=8  mode=24 nlink=16 uid=28  gid=32 rdev=40 size=48 blksize=56  blocks=64  atim=72 mtim=88 ctim=104

Note that the mode and nlink fields are swapped around, but this is apparently for historical reasons.

fix-linux-stat-offsets-2.diff (3.0 KB)

Thank you very much for your research!

I’ve created an issue on our issue tracker:
https://issues.tmatesoft.com/issue/SVNKIT-762

and also I’ve applied your patch at revision r10814.

You can build SVNKit from trunk sources directly with

./gradlew svnkit:clean svnkit:build -x svnkit:javadoc -x svnkit:test svnkit-cli:clean svnkit-cli:build svnkit-distribution:clean svnkit-distribution:build

command or wait for the next official release (1.10.4).

Okay, I succeeded but had to apply my patches on top of Jenkins’s SVNKit version, which is 1.10.1 r10777, otherwise Jenkins would get confused. (It also has to be completely shutdown before replacing the svnkit.jar file.)

I tested the resulting Subversion SCM plugin checkouts on Ubuntu 20.04 for arm64 and for x86_64, and both resulted in the correct permission bits for checked out files. I also tried Alpine 3.14 for x86_64, which is interesting because it doesn’t use glibc but musl libc, and it als works correctly.

Very nice, thank you! If the release is out, I will create a ticket in the Jenkins issue tracker, so they can hopefully pick it up in their next Subversion plugin update.

For future reference: [JENKINS-67273] On arm64 Linux, all files checked out as executable - Jenkins Jira