Standard Procedure to validate a migrated repository

Hi Team,

I have migrated a few repositories from SVN to Bitbucket. I would like to know whether there is a standard procedure to validate a migrated repository, so that I can confirm that all the code, branches, commits and tags are intact after the migration.

I would also like to know if you can help with merging a few repositories into one repository on Bitbucket.

Can you please let us know your thoughts on these?

Thanks,
Sitaram

Hi Sitaram,

Basically, the SVN Mirror add-on verifies the data during the migration. If the initial import finishes successfully, all the data covered by the mapping configuration is in the target Bitbucket repository, so no extra procedure is needed to check the migrated data.

I’m afraid the SVN Mirror add-on has no functionality to merge Bitbucket repositories; it always mirrors SVN repositories to Bitbucket repositories one to one. However, it is possible to combine repositories by embedding them as modules with our new product called Git X-Modules:

TMate Software
Atlassian Marketplace

Git X-Modules lets you include one or more repositories as modules (subdirectories) in another repository and keeps each module updated as its source repository changes. Would that satisfy your merging needs?

Hi Ildar,

I noticed that the revisions of one SVN repository do not match the commits in the migrated repository.

There are 1011 revisions on the trunk branch in the SVN repository, but the migrated repository’s master branch has just 474 commits.

I compared the revision counts of several other branches in this SVN repository with the commit counts of the corresponding branches in the migrated repository, and they do not match either.

Branch A on SVN - 1011 revisions → Branch A on migrated repository has 474 commits
Branch B on SVN - 1012 revisions → Branch B on migrated repository has 475 commits
Branch C on SVN - 1011 revisions → Branch C on migrated repository has 474 commits
Branch D on SVN - 1007 revisions → Branch D on migrated repository has 471 commits

Can you please help us understand why the commits went missing in the migrated repository?

This is a critical issue as the integrity of the migration itself is at stake.

Thanks,
Sitaram

Hi Sitaram,

Such a difference between the svn log and git log outputs does not necessarily indicate a fault; it may simply come from the differences between SVN and Git as version control systems. It may also be an expected result of the mapping configuration, since the SVN Mirror add-on does not necessarily import all the data and entities from SVN. Additionally, it may be a history translation issue that affects only the history shown by the git log <branch> command, not the commits and data themselves. Such a situation can be expected under certain circumstances, for example when an SVN branch was renamed not by SVN means but with the OS’s mv command.
In any case, it is very hard to say for sure where the difference comes from; that requires a deeper investigation. First, could you please tell us exactly how you counted the revisions in SVN and in Bitbucket, namely which commands you ran? Then, could you please share the mapping configuration along with the SVN Mirror logs from the affected repository? Also, would it be possible to share the 'svn log -v' and 'git log' outputs for the affected branches?
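
For anyone following along, here is a minimal sketch of counting from saved outputs rather than from a live pipe, so the same files can also be shared for a commit-by-commit comparison. It assumes the logs were first saved with something like 'svn log -v <branch URL> > svn-log.txt' and 'git log <branch> > git-log.txt'. The file names and both function names are hypothetical and are not part of SVN Mirror.

```shell
# Count revisions in a saved `svn log` output (hypothetical helper).
# Revision header lines look like: "r9920 | ChuA | 2008-06-01 ..."
count_svn_revisions() {
    grep -cE '^r[0-9]+ \| ' "$1"
}

# Count commits in a saved `git log` output (hypothetical helper).
# Commit header lines look like: "commit e38b5ec3dc62..."
count_git_commits() {
    grep -cE '^commit [0-9a-f]+' "$1"
}

# Example usage, assuming the two files exist:
#   count_svn_revisions svn-log.txt
#   count_git_commits git-log.txt
```

The counts alone only show that a difference exists; the saved files are what allow the two histories to be compared entry by entry.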

Hi Ildar,

The mapping configuration that we used to import the SVN repository into Bitbucket is as follows.

trunk = trunk:refs/heads/master
branches = branches/*:refs/heads/*
tags = tags/*:refs/tags/*
shelves = shelves/*:refs/shelves/*
triggerGitGC = false

Please find below the commands run against both SVN and Bitbucket.

The revision count from SVN repository:

svn log -v http://xxxxxxxx/svn/build/branches/B_6_1/ -q | grep ^r
r9920 | ChuA | 2008-06-01 01:13:59 -0700 (Sun, 01 Jun 2008)
r9919 | ChuA | 2008-06-01 01:12:25 -0700 (Sun, 01 Jun 2008)
r9918 | ChuA | 2008-06-01 01:11:01 -0700 (Sun, 01 Jun 2008)
r2 | buildserver | 2008-05-31 17:45:24 -0700 (Sat, 31 May 2008)
r1 | buildserver | 2008-05-31 16:45:43 -0700 (Sat, 31 May 2008)
(only a few are included here; the total is 1012 revisions)

svn log -v http://xxxxxxxx/svn/build/branches/B_6_1/ -q | grep ^r | wc -l
1012

The commands from Git repository, same branch:

git rev-list --count HEAD
475

git log | grep ^commit
commit e38b5ec3dc625035410dfafa75801aaad02c21c9
commit 59c3b2a80918b9e1b67e877ab7bf522726ed4812
commit 4595d35c858f499300c23fd613cd1abac0a74671
commit 7d5b9b0f08a3aca1343bf2424de449c0ea92b92a
commit 98029d7e90e37e14714cd4b600431ec56f84c209
commit cd1d631b5510551e3347414411e26088843154c4

git log | grep ^commit | wc -l
475

The difference in the count of commits is seen in every branch in this repository.

Please find below the same commands run in SVN and Git for some other branches.

svn log -v http://xxxxxxxx/svn/build/branches/B_5_8/ -q | grep ^r | wc -l
1000

git checkout B_5_8
Switched to a new branch 'B_5_8'
git log | grep ^commit | wc -l
463
git rev-list --count HEAD
463

svn log -v http://xxxxxxxx/svn/build/branches/B_5_7/ -q | grep ^r | wc -l
998

git checkout B_5_7
Switched to a new branch 'B_5_7'
git rev-list --count HEAD
461

svn log -v http://xxxxxxxx/svn/build/branches/B_5_6/ -q | grep ^r | wc -l
993

git checkout B_5_6
Switched to a new branch 'B_5_6'
git rev-list --count HEAD
457

svn log -v http://xxxxxxxx/svn/build/branches/B_5_5/ -q | grep ^r | wc -l
987

git checkout B_5_5
Switched to a new branch 'B_5_5'
git rev-list --count HEAD
450

Note: The commit history of the other SVN repositories matches the commit history of their respective Bitbucket repositories. The issue is seen in this specific repository only.

The first revision on trunk in SVN was made on 31st May 2008, whereas the first commit on the master branch of the Bitbucket repository is dated 19th May 2011. The commit history between May 2008 and May 2011 is completely missing.

Please let us know if you need any other information.

Thanks,
Sitaram

Hello Sitaram,

Thank you for your response and the information provided.

I’m afraid, though, that these outputs give us very little clue about what is happening there. We would need the full command outputs to be able to compare the logs commit by commit and understand the situation. In addition, please collect the output of the following command from the affected Bitbucket repository:

GIT_NOTES_REF=refs/svn/map git log --format="%H %N" --all --tags ^refs/svn/map

and also collect the SVN Mirror logs in the add-on UI.

The difference in the date of the first commit between SVN and Git may indicate that there was a non-SVN move or rename in the history. In that case the SVN Mirror add-on has no means to find out that the commits belong to the same branch, which is why the history in Git starts from the very first commit after the rename. The earlier commits are still there; the add-on imports them to Git as well, but they are not shown in the git log output. To fix such a history issue, both the old and the new names must be present in the mapping, so that the add-on is able to glue the branch history together.
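
To look for the kind of history break described above, one option is to scan a saved 'svn log -v' output of the repository root for revisions that add, delete or replace a branch root such as /trunk. The following is only a minimal sketch: find_root_changes and the file name are hypothetical, and it assumes paths without spaces.

```shell
# Print "<revision> <action> <path>" for every revision in a saved
# `svn log -v` output that adds (A), deletes (D) or replaces (R) the given
# path itself. find_root_changes is a hypothetical helper name.
find_root_changes() {
    root=$1       # path to watch, e.g. /trunk
    log_file=$2   # saved `svn log -v` output
    awk -v root="$root" '
        # Revision header, e.g. "r21598 | user | date | 1 line"
        /^r[0-9]+ \| / { rev = $1 }
        # Changed-path line, e.g. "   D /trunk" or "   A /trunk (from /x:12)"
        /^   [ADR] \// && $2 == root { print rev, $1, $2 }
    ' "$log_file"
}

# Example usage, assuming the log was saved beforehand:
#   find_root_changes /trunk svn-log-root.txt
```

Each revision it reports is a candidate point where the Git history of the mapped branch may start over.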

Hi Ildar,

The command fails to run in the affected Git repository with the following error.

GIT_NOTES_REF=refs/svn/map git log --format="%H %N" --all --tags ^refs/svn/map

fatal: bad revision '^refs/svn/map'

Do we need to add anything else to this command?

Also, this migration happened 6 months ago, and I am afraid I no longer have the logs.

I also realised that the trunk branch was deleted in one revision and added back in the next.

Let us consider the trunk branch revisions here. The total number of revisions on trunk according to svn log is 1011.

There were 582 revisions up to the point where trunk was deleted in revision r21598.

Trunk was added back in the next revision, r21599. There are 427 revisions in SVN starting from r21599 (after trunk was added back).

However, the Bitbucket repository has 475 commits, starting with the commit that maps to SVN revision r21599 (when trunk was added back).

The total of 475 commits on Bitbucket does not match the 427 revisions in the SVN repository since trunk was added back.

What are the possible ways to debug the count mismatch?

Thanks,
Sitaram

Hi Sitaram,

It seems I forgot to mention that this command should be run in the repository directory on the Bitbucket server itself; it won’t work in a working copy directory. My apologies for this mistake.
Removing a branch and adding it back is another action through which the SVN Mirror add-on is not able to trace the history, so it is expected that the history in Git starts from the commit that matches r21599. As for the difference in the commit counts, it might come from the difference between SVN and Git logs, but it is hard to say for sure. We would definitely need the svn log and git log outputs, along with the output of the additional git log command mentioned in my previous message.
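
A plain clone normally fetches only branches and tags, so refs/svn/map would be absent there, which is consistent with the "bad revision" error above. Below is a minimal sketch of running the command against the server-side repository directory and saving the full output. dump_svn_map, the output file name and the repository path are hypothetical; the actual location of the repository on the Bitbucket server has to be looked up for your installation.

```shell
# Hypothetical helper: run the notes-aware log inside the server-side
# repository directory and save the full output to a file.
dump_svn_map() {
    repo_dir=$1   # repository directory on the Bitbucket server (assumed path)
    out_file=$2   # where to save the output
    # Stop early if the ref used by the command is not present here.
    if ! git -C "$repo_dir" rev-parse --verify --quiet refs/svn/map >/dev/null; then
        echo "refs/svn/map not found in $repo_dir" >&2
        return 1
    fi
    GIT_NOTES_REF=refs/svn/map git -C "$repo_dir" log \
        --format="%H %N" --all --tags ^refs/svn/map > "$out_file"
}

# Example usage (the path is a placeholder):
#   dump_svn_map /path/to/bitbucket/repository svn-map.txt
```

Saving to a file keeps the complete output available for sharing, rather than only a count.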

Hi Ildar,

The given command (GIT_NOTES_REF=refs/svn/map git log --format="%H %N" --all --tags ^refs/svn/map) shows a total of 2937 commits. Is it supposed to show all the commits of the whole repository? If so, there is a huge difference in the total number of commits.

The actual revision count from the SVN repository, obtained with the command
svn log -r HEAD:1 http://xxxxxxxx/svn/build/ | grep "^r[0-9]"
is 6924 revisions in total, and I have verified these revisions.

My second question is whether the history shown by your command can be made visible in the commits section of the repository. If so, how can we get that done?

Thanks,
Sitaram

Hello Sitaram,

This command is supposed to show the git log together with the Git notes that SVN Mirror writes during the SVN-to-Git translation, which is very helpful information for the investigation. The commands we mentioned are not expected to return commit counts that match other numbers; they are all meant to collect information that helps us investigate how the import went and what the current situation is. I’m afraid it is hardly possible to draw any conclusions from the commit counts alone; we need the outputs themselves to investigate.
The commits section of the repository UI is generated by Bitbucket itself, and I’m not aware of any way to customise this output.
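
To compare the outputs by content rather than by count, one possibility is to list the SVN revision numbers that never appear in the notes output. This is only a minimal sketch: it assumes, without confirmation from the add-on documentation, that each note contains the revision as a separate token of the form r<number>. missing_revisions and the file names are hypothetical.

```shell
# Hypothetical helper: print SVN revisions that appear in a saved `svn log`
# output but not in the saved output of the notes-aware `git log` command.
missing_revisions() {
    svn_log=$1      # saved `svn log` output
    notes_dump=$2   # saved output of the notes-aware `git log` command
    tmp_dir=$(mktemp -d) || return 1
    # Revision numbers from svn log header lines ("r123 | author | ...").
    awk '/^r[0-9]+ \| / { print substr($1, 2) }' "$svn_log" \
        | sort -u > "$tmp_dir/svn.txt"
    # Revision numbers from any whitespace-separated token like "r123"
    # (ASSUMED note format, not confirmed by the thread).
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^r[0-9]+$/) print substr($i, 2) }' \
        "$notes_dump" | sort -u > "$tmp_dir/git.txt"
    # Keep the numbers found only in the SVN list, in numeric order.
    comm -23 "$tmp_dir/svn.txt" "$tmp_dir/git.txt" | sort -n | sed 's/^/r/'
    rm -rf "$tmp_dir"
}

# Example usage, assuming both files were saved beforehand:
#   missing_revisions svn-log.txt svn-map.txt
```

A revision in this list is not necessarily lost: as noted earlier in the thread, the add-on imports only what the mapping configuration covers.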