How to Clone Git Repositories with JGit
Whatever you plan to do with an existing repository, first a clone has to be created. Whether you plan to contribute or just want to peek at its history, a local copy of the repository is needed.
While cloning a repository with JGit isn’t particularly difficult, there are a few details that might be worth noting. And because there are few online resources on the subject, this article summarizes how to use the JGit API to clone from an existing Git repository.
Cloning Basics
To make a local copy of a remote repository, the CloneCommand needs at least to be told where the remote is to be found:
Git git = Git.cloneRepository()
  .setURI( "https://github.com/eclipse/jgit.git" )
  .call();
The Git factory class has a static cloneRepository() method that returns a new instance of a CloneCommand. setURI() tells it where to clone from and, as with all JGit commands, the call() method actually executes the command.
Though remote repositories – like the name suggests – are usually stored on a remote host, setURI() can also specify a path to a local resource.
If no more information is given, JGit will choose the directory in which the cloned repository will be stored for you. Based on the current directory and the repository name that is derived from its URL, a directory name is built. In the example above it would be ‘/path/to/current/jgit’.
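As a rough illustration of how such a directory name can be derived from a URL, here is a minimal sketch in plain Java. It is not JGit's actual implementation (JGit has its own URI handling that covers more cases); the class and method names are hypothetical and only the simple cases are handled.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, NOT part of JGit: approximates how a default
// clone directory could be derived from a repository URL.
final class DefaultCloneDirectory {

  /** Returns the last path segment of the URL without a trailing ".git". */
  static String repositoryName(String url) {
    String trimmed = url;
    // Treat ".../jgit.git/" like ".../jgit.git"
    while (trimmed.endsWith("/")) {
      trimmed = trimmed.substring(0, trimmed.length() - 1);
    }
    // The name starts after the last '/' or, for scp-like URLs, after the last ':'
    int lastSeparator = Math.max(trimmed.lastIndexOf('/'), trimmed.lastIndexOf(':'));
    String name = trimmed.substring(lastSeparator + 1);
    if (name.endsWith(".git")) {
      name = name.substring(0, name.length() - ".git".length());
    }
    return name;
  }

  /** Combines the current directory with the derived repository name. */
  static Path defaultDirectory(Path currentDirectory, String url) {
    return currentDirectory.resolve(repositoryName(url));
  }

  public static void main(String[] args) {
    Path current = Paths.get("/path/to/current");
    System.out.println(defaultDirectory(current, "https://github.com/eclipse/jgit.git"));
  }
}
```

If the resulting location matters to your application, it is usually simpler to set the directory explicitly than to rely on the derived name.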
But usually, you would want to have more control over the destination directory and explicitly state where to store the local clone.
The setDirectory() method specifies where the working directory should be, and with setGitDir() the location of the metadata directory (.git) can be set. If setGitDir() is omitted, the .git directory is created directly underneath the working directory.
The example below
Git git = Git.cloneRepository()
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setDirectory( new File( "/path/to/repo" ) )
  .call();
will create a local repository whose work directory is located at ‘/path/to/repo’ and whose metadata directory is located at ‘/path/to/repo/.git’.
However the destination location is chosen, explicitly through your code or by JGit, the designated directory must either be empty or must not exist. Otherwise, an exception will be thrown.
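If you would rather detect an unsuitable destination before starting a clone, a pre-check along these lines may help. This is a sketch in plain Java with hypothetical names, not something JGit provides. It only mirrors the rule stated above (the directory must be empty or must not exist) and does not replace handling the exception, since the directory could still change between the check and the clone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, NOT part of JGit.
final class CloneDestinationCheck {

  /** Returns true if the path does not exist or is an empty directory. */
  static boolean isUsableCloneDestination(Path directory) {
    if (Files.notExists(directory)) {
      return true;
    }
    if (!Files.isDirectory(directory)) {
      return false; // a regular file is in the way
    }
    // Files.list() must be closed, hence the try-with-resources block
    try (Stream<Path> entries = Files.list(directory)) {
      return !entries.findAny().isPresent();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path destination = Paths.get(args.length > 0 ? args[0] : "/path/to/repo");
    System.out.println(destination + " usable: " + isUsableCloneDestination(destination));
  }
}
```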
The settings for setDirectory(), setGitDir() and setBare() (see below) are forwarded to the InitCommand that is used internally by the CloneCommand. More details on these settings are therefore explained in Initializing Git Repositories with JGit.
The Git instance that is returned by CloneCommand.call() provides access to the repository itself (git.getRepository()) and can be used to execute further commands targeting this repository. When finished using the repository it must be closed (git.close()) or otherwise the application may leak file handles.
To later regain a Repository (or Git) instance, the path to the work directory or .git directory is sufficient. The article How to Access a Git Repository with JGit has detailed information on the subject.
Upstream Configuration
As a last step, the clone command updates the configuration file of the local repository to register the source repository as a so-called remote.
When looking at the configuration file (.git/config) the remote section looks like this:
[remote "origin"]
  url = https://github.com/eclipse/jgit.git
  fetch = +refs/heads/*:refs/remotes/origin/*
If no remote name is given, the default ‘origin’ is used. To have the CloneCommand register the remote repository under a particular name, use setRemote().
The refspec given by ‘fetch’ determines which branches are fetched from the remote repository by default and under which local names they are stored.
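To make the mapping expressed by such a refspec more tangible, the following plain-Java sketch translates a ref name on the remote side into the tracking ref on the local side. It is a simplified stand-in written for illustration, with hypothetical names; JGit ships its own RefSpec class, which should be preferred in real code.

```java
// Hypothetical illustration, NOT JGit's RefSpec class.
final class RefSpecSketch {

  /**
   * Returns the local ref that the given remote ref is stored under,
   * or null if the refspec does not match the remote ref.
   */
  static String mapToTrackingRef(String refSpec, String remoteRef) {
    // A leading '+' permits non-fast-forward updates; it does not affect the names
    String spec = refSpec.startsWith("+") ? refSpec.substring(1) : refSpec;
    int colon = spec.indexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("Expected <source>:<destination> but got " + refSpec);
    }
    String source = spec.substring(0, colon);
    String destination = spec.substring(colon + 1);
    if (source.endsWith("*") && destination.endsWith("*")) {
      String sourcePrefix = source.substring(0, source.length() - 1);
      if (!remoteRef.startsWith(sourcePrefix)) {
        return null;
      }
      // Whatever the wildcard matched is appended to the destination prefix
      String matched = remoteRef.substring(sourcePrefix.length());
      return destination.substring(0, destination.length() - 1) + matched;
    }
    return source.equals(remoteRef) ? destination : null;
  }

  public static void main(String[] args) {
    String fetch = "+refs/heads/*:refs/remotes/origin/*";
    System.out.println(mapToTrackingRef(fetch, "refs/heads/master"));
  }
}
```

With the default refspec, every branch under refs/heads/ on the remote ends up as a remote-tracking branch under refs/remotes/origin/, while tags and other refs are not matched by it.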
Cloning Branches
By default, the clone command creates a single local branch. It looks at the HEAD ref of the remote repository and creates a local branch with the same name as the remote branch referenced by it.
But the clone command can also be told to clone and checkout certain branch(es). Assuming that the remote repository has a branch named ‘extra’, the following lines will clone this branch.
// singleton() is statically imported from java.util.Collections
Git git = Git.cloneRepository()
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setDirectory( new File( "/path/to/repo" ) )
  .setBranchesToClone( singleton( "refs/heads/extra" ) )
  .setBranch( "refs/heads/extra" )
  .call();
With setBranchesToClone(), the command clones only the specified branches. Note that the setBranch() directive is necessary to also checkout the desired branch. Otherwise, JGit would attempt to checkout the ‘master’ branch. While this isn’t a problem from a technical point of view, it is usually not what you want.
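The example above passes full ref names such as refs/heads/extra. If your application works with short branch names, a small conversion step like the following may be handy before handing the names to setBranchesToClone(). This is a sketch in plain Java; the class and its methods are hypothetical and not part of JGit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of JGit.
final class BranchRefNames {

  // Prefix under which Git stores branches
  static final String HEADS_PREFIX = "refs/heads/";

  /** Leaves full ref names untouched and prefixes short branch names. */
  static String toBranchRef(String name) {
    return name.startsWith("refs/") ? name : HEADS_PREFIX + name;
  }

  /** Converts a list of short or full names into full ref names. */
  static List<String> toBranchRefs(List<String> names) {
    List<String> refs = new ArrayList<>();
    for (String name : names) {
      refs.add(toBranchRef(name));
    }
    return refs;
  }

  public static void main(String[] args) {
    System.out.println(toBranchRefs(Arrays.asList("extra", "refs/heads/master")));
  }
}
```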
If all branches of the remote repository should be cloned, you can advise the command like so:
Git git = Git.cloneRepository()
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setDirectory( new File( "/path/to/repo" ) )
  .setCloneAllBranches( true )
  .call();
To prevent the current branch from being checked out at all, the setNoCheckout() method can be used.
Listing Remote Branches
If you want to know which branches a remote repository has to offer, the LsRemoteCommand comes to the rescue. To list all branches of the JGit repository, use Git’s lsRemoteRepository() as shown below.
Collection<Ref> remoteRefs = Git.lsRemoteRepository()
  .setHeads( true )
  .setRemote( "https://github.com/eclipse/jgit.git" )
  .call();
To list tags as well, advise the command with setTags( true ).
For reasons I’d rather not know, JGit requires a local repository for certain protocols in order to be able to list remote refs. In this case, Git.lsRemoteRepository() will throw a NotSupportedException. The workaround is to create a temporary local repository and use git.lsRemote() instead of Git.lsRemoteRepository(), where git wraps the temporary repository.
Cloning Bare Repositories
If the local repository does not need a work directory, the clone command can be instructed to create a bare repository.
By default, non-bare repositories are created, but with setBare( true ) a bare repository is created as shown below:
Git git = Git.cloneRepository()
  .setBare( true )
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setGitDir( new File( "/path/to/repo" ) )
  .call();
Here the destination directory is specified via setGitDir() instead of using setDirectory().
The resulting repository’s isBare() will return true, getGitDir() will return /path/to/repo and since there is no work directory getWorkTree() will throw a NoWorkTreeException.
Note that ‘bare’ here only applies to the destination repository. Whether the source repository is bare or not doesn’t make a difference when cloning.
Cloning Submodules
If the remote repository is known to have submodules or if you wish to include submodules in case there are any, the clone command can be instructed to do so:
Git git = Git.cloneRepository()
  .setCloneSubmodules( true )
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setDirectory( new File( "/path/to/repo" ) )
  .call();
The above example advises the clone command to also clone any submodule that is found.
If setCloneSubmodules( true ) wasn’t specified while cloning the repository, you can catch up on the missing submodules later. For more details see the article How to manage Git Submodules with JGit.
Cloning with Authentication
Of course, JGit also allows accessing repositories that require authentication. Common protocols like SSH and HTTP(S) and their authentication methods are supported. A detailed explanation on how to use authentication support can be found in the JGit Authentication Explained article.
What’s Next
If you are wondering what to do next with a repository, you may want to read Getting Started with JGit. The tutorial explains the most commonly used Git commands and their respective JGit counterparts. It walks through the steps to create a repository, fetch contents from a remote, add and remove files to/from the history, inspect the history, and finally push back the changes to the originating repository.
Concluding How to Clone Git Repositories with JGit
For almost all features of the native Git clone command, there is a counterpart in JGit. There is even a progress monitor, which may be useful when JGit is embedded in interactive applications. And for the missing mirror option, a workaround apparently exists. Only the often requested shallow clones (e.g. git clone --depth 2) aren’t yet supported by JGit at the time of writing.
The snippets shown throughout this article are excerpts from a learning test that illustrates the common use cases of the CloneCommand. The full version can be found here:
https://gist.github.com/rherrmann/84089f0e38d9eb875601
In order to help with setting up the development environment, you may want to also read An Introduction to the JGit Sources. If you still have difficulties or further questions, please leave a comment or ask the friendly and helpful JGit community for assistance.
In my local git repository, I have more than one commit for a file. How can I get the status/action that has been performed in a commit? I mean, how can I know, for a particular commit, whether that file was newly added, modified or labeled? I want the correct Eclipse API to get the above details.
I am afraid I don’t quite understand your question.
In order to examine what a commit has changed compared to its parent commit, use the DiffCommand. Its call() method returns a list of DiffEntries that describe each changed file.
Sorry, I didn’t get your point correctly… My requirement is: for a particular commit, what action has been performed on a particular file? So far I got the log message through RevCommit rev=git.log().call(). From that rev, I am able to get the author name and the commit date and time. Now I want to get the action for that particular file. Consider I have added the file for the first time, so I should get the action as Added. Can you help with this?
Did you look at the DiffCommand? It will tell you what files have been added, deleted and changed in a commit compared to its ancestor commit.
How do you do a sparse checkout programmatically using the JGit API? I am able to do it using the commands below:
git config core.sparseCheckout true
echo "SpecificSubFolder" >> .git/info/sparse-checkout
appreciate any examples
According to this enhancement request, sparse checkout is not yet implemented in JGit.
I have two questions so far:
1) Does git need to be installed in the machine where JGit runs?
2) How soon will JGit include the implementation for shallow clones?
btw, thank you for this awesome library and for providing tutorials like this one.
Hi Daniela,
though my stake in providing JGit is minimal, I am nonetheless glad you like the tutorials. JGit is a pure Java library implementing the Git version control system (see https://eclipse.org/jgit/). Therefore native Git need not be installed. However, be aware that since JGit is an independent implementation of Git there are (small) gaps and differences between the two.
I can’t tell how soon a certain feature will be available in JGit. It is best to track the respective bug report and state your interest there or on the mailing list.
Best
Rüdiger
Thank you Rüdiger for always replying in stackoverflow. Nice work here!
I’d like to clone only if the remote repository is bare (since, if I understood correctly, only bare repo can be pushed to). Is it possible to know if the remote repo is bare ?
AFAIK it is not possible to check if a repository is bare – as long as you cannot access the repository locally.
However, according to this post (http://stackoverflow.com/questions/1764380/push-to-a-non-bare-git-repository), pushing to a non-bare remote is possible. And at least in theory, the bare state of a repository may change.
I wouldn’t pro-actively prevent cloning a non-bare repository but rather transform the result when pushing to a non-bare repository fails into a meaningful error message.
Thanks for the article!
I was wondering, how can it be done if the source URL is local and you want to make sure there are no hardlinks?
In git, we’d normally use: git clone --no-hardlinks /path/to/source /path/to/dest
Any idea how it can be done?
PS: We’ve been testing the file:// protocol but for bigger repos (~10 GB), it’s much slower than the local --no-hardlinks above
..forgot to add:
For comparison on the git clone operations done on our big repo via shell:
– git clone file:///path/to/source/repo /path/to/dest/repo # takes about 25-30 minutes
– git clone --no-hardlinks /path/to/source/repo /path/to/dest/repo # takes less than 2 minutes
I suspect that the reason why the file:// protocol is much slower is because it goes through the compression/send/decompression phase while the other one doesn’t
Eric,
thank you for your feedback. From what I see, the --no-hardlinks option is not supported by JGit. If you find this should be supported, please file an enhancement request: https://eclipse.org/jgit/support/
Regarding the performance observations, you may want to forward these to the JGit mailing list https://dev.eclipse.org/mailman/listinfo/jgit-dev
– Rüdiger
Hi Rüdiger,
Thanks for the reply. Too bad “--local --no-hardlinks” isn’t supported.
I will look into the suggestions you’ve provided.
Cheers!
is it possible to read the git repo files without cloning?
In short, no! You need to first clone a repository before you can access its contents.
Hi,
Thank you for your blog!
Maybe you can give me advice with my issue. If I clone repository like:
Git git = Git.cloneRepository()
.setURI(repositoryURI)
.setCredentialsProvider(new UsernamePasswordCredentialsProvider(credentials.getUserName(), credentials.getPassword()))
.setDirectory(localRepositoryDir)
.call();
And after this call push command like
PushCommand pushCommand = repositoryGit
.push()
.setCredentialsProvider(new UsernamePasswordCredentialsProvider(credentials.getUserName(), credentials.getPassword()))
.setForce(true)
.setPushAll();
All works fine. But after this I am trying to just checkout and push, and it does not send data to the server. No errors, but no results either.
Thank you for help!
If you call the PushCommand right after the CloneCommand, it will do nothing, because your local repository is in sync with the remote repository.
A checkout does not change the situation either, unless there are new commits on the checked out branch that have not been published to the remote repository yet.
To find out what the PushCommand did, you can evaluate the PushResult returned by call(). It provides very detailed information about every ref that participated in the push operation.
For a general understanding of what push does, you may also want to read this SO post.
Hi,
Thank you for help!
I found solution. Problem was in incorrect checkout branch name.
Regards,
Aleksandr.
Hi, I have a requirement to clone a git repository and look for pom.xml and read it, is there a way I can do it without cloning whole repository in the directory?
As commented earlier, you need to clone a Git repository before you can read its content. See also here: https://stackoverflow.com/questions/19414568/viewing-file-from-git-using-jgit-remotely-without-creating-local-repo
Once there is support for shallow clones in JGit, you can reduce the bandwidth needed to clone the repository.
I have similar requirement
– to clone a git repository and look for pom.xml and read it, is there a way I can do it without cloning whole repository in the directory?
Did you get a solution?
Thanks,
is it possible to do
“git remote rename origin old-origin”
using jgit
JGit does not provide a ready-to-use command to rename a remote repository. However, with the help of other APIs, you should be able to do that.
I would start by renaming all remote branches. With the ListBranchCommand you can obtain a list of all branches, filter the relevant remote branches and use RenameBranchCommand to change the remote part of their name (e.g. refs/remotes/old-origin/foo -> refs/remotes/new-origin/foo).
Thereafter, only the configuration section should remain with the old name. repository.getConfig() gives you access to the repository’s configuration and provides API to copy the remote config section and delete the old one. Finally, save the changes with StoredConfig::save.
Have I forgotten anything that git remote rename does while renaming a remote?
Can you provide me the whole source code of cloning repository from github because it will very much useful for me.
David,
cloning a repository hosted on GitHub should be no different than cloning any other repository. The article lists various examples that actually clone a GitHub repository, namely https://github.com/eclipse/jgit.git.
If you are stuck with a specific problem, have you looked on StackOverflow? Over the years, it has become a popular place for JGit questions and answers.
Hi,
Getting java.io.EOFException: Packfile is truncated, while cloning repo, from java code.
Can you please help.
I am afraid I can’t help given the little information that you provide.
Someone on SO suggests that the exception may be related to a slow network connection: https://stackoverflow.com/questions/28528596/packfile-is-truncated-error-while-cloning-git-repository
Is it possible to clone only a specific directory from a Git repository??
Why would you want to do that?
Hi Rüdiger,
I used maven scm plugin to clone repositories. I plan to use JGit now. I see that JGit clones lot more data than the plugin as the .git directory is pretty big now. Any suggestion ? I need to clone one branch only.
You can restrict the number of branches that are cloned with setBranchesToClone. For example:
Git git = Git.cloneRepository()
.setBranchesToClone(Arrays.asList("refs/heads/master"))
.setDirectory(...)
.setURI(...)
.call();
Presumably, the main reason that JGit clones a lot more data is that it clones the entire history. Until JGit has support for shallow clones (see https://bugs.eclipse.org/bugs/show_bug.cgi?id=475615), I recommend staying with C Git.
Oh i see ! that’s the reason then.
Yeah i used the same code but still.
Thanks for the insight.
Hello Experts,
I wonder whether it is possible to have paging in JGit. For example, I may have hundreds of files checked out / modified. How can I limit them so that I can show, say, 10 at a time, when I run the StatusCommand in JGit.
Regards,
-Abdur
Hello Abdur,
JGit is a Java implementation of Git. It has no built-in support for paging. You will have to handle this aspect in your application. For example, the HTTP endpoint that serves the list of changes would run the StatusCommand and return only the requested slice of entries. If you see that holding the results of the command in memory is an issue, you can use the lower-level APIs (those that the StatusCommand uses itself) to stream the result. Looking into the source code of the respective command should get you started.
HTH
Rüdiger
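The slicing described in this reply could be sketched as follows in plain Java. The sketch assumes that the changed paths have already been collected from the status result into a collection of strings; the class and method names are hypothetical, and nothing here is provided by JGit itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical application-side helper, NOT part of JGit.
final class StatusPaging {

  /**
   * Returns the requested page of paths. Paths are sorted first so that
   * pages are stable across calls. Pages are zero-based.
   */
  static List<String> page(Collection<String> paths, int pageIndex, int pageSize) {
    if (pageIndex < 0 || pageSize <= 0) {
      throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
    }
    List<String> sorted = new ArrayList<>(paths);
    Collections.sort(sorted);
    // Use long arithmetic to avoid int overflow for large page indices
    int from = (int) Math.min((long) pageIndex * pageSize, sorted.size());
    int to = (int) Math.min((long) from + pageSize, sorted.size());
    return new ArrayList<>(sorted.subList(from, to));
  }

  public static void main(String[] args) {
    List<String> modified = Arrays.asList("src/C.java", "src/A.java", "src/B.java");
    System.out.println(page(modified, 0, 2));
  }
}
```

Note that this approach still holds the complete result in memory and merely limits what is handed out per request.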
Hello Rüdiger,
Thank you very much for your detailed reply and the tips for possible implementation. I got to be involved in another more urgent task and could not get back to this task yet.
I’ll look into the implementation of StatusCommand and see how I can implement the paging with status command.
My apologies for the delayed reply.
I appreciate your valuable feedback very highly.
Regards,
-abdur
Hi Rüdiger Herrmann,
I wanted to know how I can clone ALL repositories along with all branches present on git using JGit.
For example, I have a git repositories as follows
Repo A: (Having two branches)
Repo B: (Having three branches)
Repo C:(Having five branches)
Basically, I wanted to traverse all repos with branches and clone them in my local.
Thanks in advance
As described in the article, you can use setCloneAllBranches to tell the CloneCommand to clone all branches.
In order to clone multiple repositories, you will need to traverse over the list of repository URLs and call the CloneCommand for each of the entries.
Thanks for the article. I would like to know if there is some java lib that enables us to clone git repo in memory rather than on the file system. I didn’t get any relevant documentations or articles specific to in-memory git operations. I also read some of your answers on SO.
As far as I know, there is no such library. Though JGit provides an InMemoryRepository, it cannot be used as a destination for the clone command.
See also https://stackoverflow.com/questions/31271278/clone-a-git-repository-into-an-inmemoryrepository-with-jgit
Hi , thanks a lot for the article. I had a question regarding jgit, would be really grateful if you could help. If i have cloned some repo using user1 credentials then i want to push the changes with user2 credentials . both the users have access to the remote repo. Is it possible ? It’s like multi-user scenario .
You should be able to use different CredentialsProviders with the PushCommand to pass each user’s credentials.
How do i create a feature branch from master branch using java code any suggestion ?
This question has already been answered here: https://stackoverflow.com/q/55089678/2986905