Getting Started with JGit

Posted on December 15, 2015, in Eclipse, with 39 comments

If you have ever wondered how basic Git commands like git init, git checkout and so on are executed in JGit, read on.

This tutorial gives an overview of the most commonly used git commands and their counterparts in JGit. It walks through the steps to create a repository, fetch contents from a remote, add and remove files to/from the history, inspect the history, and finally push back the changes to the originating repository.

JGit provides an API that is similar to the Git high-level commands. Instead of

git commit -m "Gabba Gabba Hey"

on the command line, you would write

git.commit().setMessage( "Gabba Gabba Hey" ).call();

in JGit.

All JGit commands have a call() method that, after setting up the command, is used to actually execute it. The classes are named after the respective Git command with the suffix Command. While some commands offer a public constructor, it is recommended to use the Git factory class to create command instances, as shown in the above example.

Getting the Library

But before diving further into the JGit API, let’s get hold of the library first. The most common way to get JGit is probably from the Maven repository. But if you prefer OSGi bundles, then there is also a p2 repository for you. The download page lists the necessary information to integrate the library.

For the scope of this article, it is sufficient to integrate what is referred to as the core library in project/bundle org.eclipse.jgit. If you are interested in what else there is in the JGit source code repository, I recommend reading the Introduction to the JGit Sources.

Creating a Repository

To start with, we need a repository. To get hold of one, we can either initialize a new repository or clone an existing one.

The InitCommand lets us create an empty repository. The following line

Git git = Git.init().setDirectory( new File( "/path/to/repo" ) ).call();

will create a repository with a work directory at the location given to setDirectory(). The .git directory will be directly underneath in /path/to/repo/.git. For a detailed explanation of the InitCommand please read the Initializing Git Repositories with JGit article.

An existing repository can be cloned with the CloneCommand:

Git git = Git.cloneRepository()
  .setURI( "https://github.com/eclipse/jgit.git" )
  .setDirectory( new File( "/path/to/repo" ) )
  .call();

The code above will clone the JGit repository into the local directory ‘/path/to/repo’. All options of the CloneCommand are explained in depth in How to Clone Git Repositories with JGit.
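As a minimal sketch of a complete clone, the class below wraps the CloneCommand in a small helper. To stay runnable without network access it clones a throwaway local repository instead of the JGit repository; the class name, the helper methods, the temporary directory names and the 'Jane Doe' identity are made up for illustration.

```java
import java.io.File;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;

class CloneExample {

  // Clones the repository found at 'uri' into 'directory' and returns
  // the location of the clone's .git directory.
  static File cloneInto( String uri, File directory ) throws Exception {
    try( Git git = Git.cloneRepository()
        .setURI( uri )
        .setDirectory( directory )
        .call() ) {
      return git.getRepository().getDirectory();
    }
  }

  // Creates a throwaway repository with a single empty commit to clone from.
  static File createSourceRepository() throws Exception {
    File directory = Files.createTempDirectory( "jgit-source" ).toFile();
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      git.commit()
        .setMessage( "Initial commit" )
        .setAuthor( "Jane Doe", "jane@example.com" )
        .setCommitter( "Jane Doe", "jane@example.com" )
        .call();
    }
    return directory;
  }

  public static void main( String[] args ) throws Exception {
    File source = createSourceRepository();
    File target = Files.createTempDirectory( "jgit-clone" ).toFile();
    // A remote URL such as https://github.com/eclipse/jgit.git could be
    // passed instead of the local path.
    System.out.println( "Cloned into " + cloneInto( source.getAbsolutePath(), target ) );
  }
}
```

The target directory must either not exist yet or be empty, which is why a fresh temporary directory is used here.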

If you happen to have an existing local repository already that you wish to use, you can do so as described in How to Access a Git Repository with JGit.


Close Git When Done
Note that commands that return an instance of Git like the InitCommand or CloneCommand may leak file handles if they are not explicitly closed (git.close()) when no longer needed.

Fortunately, Git implements AutoCloseable so that you can use the try-with-resources statement.
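A short sketch of how this might look for the InitCommand follows. The class and method names and the temporary directory are assumptions made for the example.

```java
import java.io.File;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;

class InitExample {

  // Initializes a repository in the given directory. The Git instance is
  // closed automatically when the try block is left, even on failure.
  static File initRepository( File directory ) throws Exception {
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      return git.getRepository().getDirectory();
    }
  }

  public static void main( String[] args ) throws Exception {
    // A temporary directory stands in for a real project location.
    File directory = Files.createTempDirectory( "jgit-init" ).toFile();
    System.out.println( "Created " + initRepository( directory ) );
  }
}
```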

Populating a Repository

Now that we have a repository, we can start populating its history. But to commit a file, we first need to add it to the so-called index (aka staging area). The commit command will only consider files that are added to (or removed from) the index.

The JGit command for this is, you guessed it, the AddCommand.

DirCache index = git.add().addFilepattern( "readme.txt" ).call();

The above line adds the file readme.txt to the index. It is noteworthy that the actual contents of the file are copied to the index. This means that later modifications to the file will not be contained in the index unless they are added again.

The path given to addFilepattern() must be relative to the work directory root. If a path does not point to an existing file, it is simply ignored.

Though the method name suggests that patterns are also accepted, JGit's support for them is limited. Passing a ‘.’ will add all files within the working directory recursively. But file globs (e.g. *.java) as they are available in native Git are not yet supported.

The index returned by call(), in JGit named DirCache, can be examined to verify that it actually contains what we expect. Its getEntryCount() method returns the total number of files and getEntry() returns the entry at the specified position.
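The following sketch puts these steps together: it writes a file, adds it and lists the paths that the returned DirCache holds. The class, the helper names and the file content are illustrative assumptions.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.dircache.DirCache;

class AddExample {

  // Writes 'content' to 'path' below the work directory, adds the file to
  // the index and returns the paths of all entries the index then holds.
  static List<String> writeAndAdd( Git git, String path, String content ) throws Exception {
    File file = new File( git.getRepository().getWorkTree(), path );
    Files.write( file.toPath(), content.getBytes( StandardCharsets.UTF_8 ) );
    DirCache index = git.add().addFilepattern( path ).call();
    List<String> paths = new ArrayList<>();
    for( int i = 0; i < index.getEntryCount(); i++ ) {
      paths.add( index.getEntry( i ).getPathString() );
    }
    return paths;
  }

  // Runs the steps in a fresh temporary repository.
  static List<String> demo() throws Exception {
    File directory = Files.createTempDirectory( "jgit-add" ).toFile();
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      return writeAndAdd( git, "readme.txt", "Hello JGit" );
    }
  }

  public static void main( String[] args ) throws Exception {
    System.out.println( "Index entries: " + demo() );
  }
}
```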

Now everything is prepared to use the CommitCommand in order to store the changes in the repository.

RevCommit commit = git.commit().setMessage( "Create readme file" ).call();

At least the message must be specified; otherwise call() will complain with a NoMessageException. An empty message, however, is allowed. The author and committer are taken from the configuration unless they are specified with the accordingly named methods.

The returned RevCommit describes the commit with its message, author, committer, time stamp, and of course a pointer to the tree of files and directories that constitute this commit.
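To illustrate, here is a small sketch that sets author and committer explicitly and then reads a few attributes from the returned RevCommit. The identity 'Jane Doe', the class name and the temporary directory are placeholders chosen for the example.

```java
import java.io.File;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

class CommitExample {

  // Commits the current index with an explicit author and committer so
  // that the result does not depend on the local Git configuration.
  static RevCommit commit( Git git, String message ) throws Exception {
    return git.commit()
      .setMessage( message )
      .setAuthor( "Jane Doe", "jane@example.com" )
      .setCommitter( "Jane Doe", "jane@example.com" )
      .call();
  }

  // Creates a commit in a fresh temporary repository.
  static RevCommit demo() throws Exception {
    File directory = Files.createTempDirectory( "jgit-commit" ).toFile();
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      return commit( git, "Create readme file" );
    }
  }

  public static void main( String[] args ) throws Exception {
    RevCommit commit = demo();
    System.out.println( "id:      " + commit.getName() );
    System.out.println( "message: " + commit.getShortMessage() );
    System.out.println( "author:  " + commit.getAuthorIdent().getName() );
    System.out.println( "tree:    " + commit.getTree().getName() );
  }
}
```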

In the same way that new or changed files need to be added, deleted files need to be removed explicitly. The RmCommand is the counterpart of the AddCommand and can be used in the same way (with the contrary result of course).

DirCache index = git.rm().addFilepattern( "readme.txt" ).call();

The above line will remove the given file again. Since it is the only file within the repository, the returned index will return zero when asked for the number of entries in it.

Unless setCached( true ) was specified, the file will also be deleted from the work directory. Because Git does not track directories, the RmCommand also deletes empty parent directories of the given files.
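The effect of setCached() can be tried out with a sketch like the one below. It adds a file, removes it again and reports whether the file is still present in the work directory. Class and method names are assumptions for illustration only.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.dircache.DirCache;

class RemoveExample {

  // Removes 'path' from the index. With cached == true the file in the
  // work directory is left alone, otherwise it is deleted as well.
  static DirCache remove( Git git, String path, boolean cached ) throws Exception {
    return git.rm().addFilepattern( path ).setCached( cached ).call();
  }

  // Adds a file in a fresh temporary repository, removes it again and
  // reports whether the file still exists in the work directory.
  static boolean fileSurvivesRemoval( boolean cached ) throws Exception {
    File directory = Files.createTempDirectory( "jgit-rm" ).toFile();
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      File file = new File( directory, "readme.txt" );
      Files.write( file.toPath(), "Hello JGit".getBytes( StandardCharsets.UTF_8 ) );
      git.add().addFilepattern( "readme.txt" ).call();
      DirCache index = remove( git, "readme.txt", cached );
      System.out.println( "Entries left in index: " + index.getEntryCount() );
      return file.exists();
    }
  }

  public static void main( String[] args ) throws Exception {
    System.out.println( "File kept with setCached( true ): " + fileSurvivesRemoval( true ) );
    System.out.println( "File kept with setCached( false ): " + fileSurvivesRemoval( false ) );
  }
}
```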

An attempt to remove a non-existing file is ignored. But unlike the AddCommand, the RmCommand does not accept wildcards in its addFilepattern() method. All files to be removed need to be specified individually.

And with the next commit, these changes will be stored in the repository. Note that it is perfectly legal to create an empty commit, i.e. one for which no files have been added or removed beforehand. Though I’m not aware of a decent use case.

State of a Repository

The status command lists files that differ between the index and the current HEAD commit, files that differ between the working directory and the index, and files that are not tracked by Git.

In its simplest form, the StatusCommand collects the status of all files that belong to the repository:

Status status = git.status().call();

The getters of the Status object should be self-explanatory. They return the set of file names which are in the state that the method name describes. For example, after the readme.txt file was added to the index as shown previously, status.getAdded() would return a set that contains the path to the just added file.

If there are no differences at all and no untracked files either, Status.isClean() will return true. And as its name implies, Status.hasUncommittedChanges() returns true if there are uncommitted changes.

With addPath(), the StatusCommand can be configured to show only the status of certain files. The given path must either name a file or a directory. Non-existing paths are ignored and regular expressions or wildcards are not supported.

Status status = git.status().addPath( "documentation" ).call();

In the above example, the status of all files recursively underneath the ‘documentation’ directory will be computed.
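The sketch below shows both variants side by side: the status of the whole repository and the status restricted to a directory with addPath(). The class, its helper methods and the file names are made up for this example.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Set;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.Status;

class StatusExample {

  // Returns the paths that are staged as new files, optionally limited
  // to the given file or directory (null means the whole repository).
  static Set<String> addedFiles( Git git, String path ) throws Exception {
    Status status = path == null
      ? git.status().call()
      : git.status().addPath( path ).call();
    return status.getAdded();
  }

  // Creates a temporary repository with two staged but uncommitted files.
  // The caller is responsible for closing the returned Git instance.
  static Git createRepository() throws Exception {
    File directory = Files.createTempDirectory( "jgit-status" ).toFile();
    Git git = Git.init().setDirectory( directory ).call();
    write( new File( directory, "readme.txt" ) );
    write( new File( directory, "documentation/guide.txt" ) );
    git.add().addFilepattern( "." ).call();
    return git;
  }

  // Writes some content to the file, creating parent directories if needed.
  static void write( File file ) throws Exception {
    file.getParentFile().mkdirs();
    Files.write( file.toPath(), "Hello JGit".getBytes( StandardCharsets.UTF_8 ) );
  }

  public static void main( String[] args ) throws Exception {
    try( Git git = createRepository() ) {
      Status status = git.status().call();
      System.out.println( "clean: " + status.isClean() );
      System.out.println( "uncommitted changes: " + status.hasUncommittedChanges() );
      System.out.println( "added: " + addedFiles( git, null ) );
      System.out.println( "added below documentation: " + addedFiles( git, "documentation" ) );
    }
  }
}
```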

Exploring a Repository

Now that the repository has a (small) history we will look into the command to list existing commits.

The simplest form of JGit's git log counterpart lists all commits that are reachable from the current HEAD.

Iterable<RevCommit> iterable = git.log().call();

The returned iterable can be used to loop over all commits that are found by the LogCommand.
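A loop over the result might look like the following sketch, which collects the first line of each commit message. The class name, the helper methods and the 'Jane Doe' identity are assumptions for illustration.

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

class LogExample {

  // Collects the first line of the message of every commit that is
  // reachable from HEAD, starting with the most recent commit.
  static List<String> messages( Git git ) throws Exception {
    List<String> messages = new ArrayList<>();
    for( RevCommit commit : git.log().call() ) {
      messages.add( commit.getShortMessage() );
    }
    return messages;
  }

  // Creates an empty commit with a fixed author and committer.
  static void commit( Git git, String message ) throws Exception {
    git.commit()
      .setMessage( message )
      .setAuthor( "Jane Doe", "jane@example.com" )
      .setCommitter( "Jane Doe", "jane@example.com" )
      .call();
  }

  // Creates two commits in a fresh temporary repository and lists them.
  static List<String> demo() throws Exception {
    File directory = Files.createTempDirectory( "jgit-log" ).toFile();
    try( Git git = Git.init().setDirectory( directory ).call() ) {
      commit( git, "First" );
      commit( git, "Second" );
      return messages( git );
    }
  }

  public static void main( String[] args ) throws Exception {
    System.out.println( demo() );
  }
}
```

Note that the LogCommand requires at least one commit to exist; the sketch therefore creates its commits before asking for the log.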

For more advanced use cases I recommend using the RevWalk API directly, the same class that is also used by the LogCommand. Apart from providing more flexibility it also avoids a possible resource leak that occurs because the RevWalk that is used internally by the LogCommand is never closed.

For example, its markStart() method can be used to also list commits that are reachable from other branches (or more generally speaking from other refs).

Unfortunately, only ObjectIds are accepted, and therefore the desired refs need to be resolved first. An ObjectId in JGit encapsulates a SHA-1 hash that points to an object in Git's object database. Here, ObjectIds that point to commits are required, and resolving in this context means obtaining the ObjectId that a particular ref points to.

Putting it all together, it looks like the snippet below:

Repository repository = git.getRepository();
try( RevWalk revWalk = new RevWalk( repository ) ) {
  ObjectId commitId = repository.resolve( "refs/heads/side-branch" );
  revWalk.markStart( revWalk.parseCommit( commitId ) );
  for( RevCommit commit : revWalk ) {
    System.out.println( commit.getFullMessage() );
  }
}

The commit id to which the branch ‘side-branch’ points is obtained and then the RevWalk is instructed to start iterating over the history from there. Because markStart() requires a RevCommit, RevWalk’s parseCommit() is used to resolve the commit id into an actual commit.

Once the RevWalk is set up, the snippet loops over the commits to print the message of each commit.
The try-with-resources statement ensures that the RevWalk will be closed when done. Note that it is legal to call markStart() multiple times to include multiple refs in the traversal.

A RevWalk can also be configured to filter commits, either by matching attributes of the commit object itself or by matching paths of the directory tree that it represents. If known in advance, uninteresting commits and their ancestry chain can be excluded from the output. And of course, the output can be sorted, for example by date or topologically (all children before parents). These features are outside the scope of this article but may be covered in a future article of their own.

Exchanging with a Remote Repository

Often a local repository was cloned from a remote repository. And the changes that were made locally should ultimately be published to the originating repository. To accomplish this, there is the PushCommand, the counterpart of git push.

The simplest form will push the current branch to its corresponding remote branch.

Iterable<PushResult> iterable = local.push().call();
PushResult pushResult = iterable.iterator().next();
RemoteRefUpdate.Status status
  = pushResult.getRemoteUpdate( "refs/heads/master" ).getStatus();

The command returns an iterable of PushResults. In the above case the iterable holds a single element. To verify that the push succeeded, the pushResult can be asked to return a RemoteRefUpdate for a given branch.

A RemoteRefUpdate describes in detail what was updated and how it was updated. But it also has a status attribute that summarizes the outcome. And if the status returns OK, we can rest assured that the operation succeeded.

Even though the command works without any further configuration, it has plenty of options; only the more commonly used ones are listed here. By default, the command pushes to the default remote called ‘origin’. Use setRemote() to specify the URL or name of a different remote repository. If branches other than the current one should be pushed, refspecs can be specified with setRefSpecs(). Whether tags should also be transferred can be controlled with setPushTags(). And finally, if you are uncertain whether the outcome is desired, there is a dry-run option (setDryRun()) that allows simulating a push operation.
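Some of these options are combined in the sketch below. So that it can run without a server, it pushes from a temporary repository to a temporary bare repository on the local disk, first as a dry run and then for real. The class, the helper methods, the directory names and the 'Jane Doe' identity are assumptions for illustration.

```java
import java.io.File;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.transport.PushResult;
import org.eclipse.jgit.transport.RefSpec;
import org.eclipse.jgit.transport.RemoteRefUpdate;

class PushExample {

  // Pushes 'branch' (a full ref name) to the same ref of the given remote
  // and returns the status reported for that ref.
  static RemoteRefUpdate.Status push( Git local, String remote, String branch, boolean dryRun )
    throws Exception
  {
    Iterable<PushResult> results = local.push()
      .setRemote( remote )
      .setRefSpecs( new RefSpec( branch + ":" + branch ) )
      .setDryRun( dryRun )
      .call();
    PushResult result = results.iterator().next();
    return result.getRemoteUpdate( branch ).getStatus();
  }

  // Pushes one commit from a temporary repository to a temporary bare
  // repository and returns the status of the real (non dry-run) push.
  static RemoteRefUpdate.Status demo() throws Exception {
    File remoteDir = Files.createTempDirectory( "jgit-remote" ).toFile();
    File localDir = Files.createTempDirectory( "jgit-local" ).toFile();
    try( Git remote = Git.init().setBare( true ).setDirectory( remoteDir ).call();
         Git local = Git.init().setDirectory( localDir ).call() ) {
      local.commit()
        .setMessage( "Initial commit" )
        .setAuthor( "Jane Doe", "jane@example.com" )
        .setCommitter( "Jane Doe", "jane@example.com" )
        .call();
      // The name of the initial branch depends on the configuration,
      // hence it is queried instead of hard-coded.
      String branch = local.getRepository().getFullBranch();
      String url = remoteDir.getAbsolutePath();
      System.out.println( "Dry run: " + push( local, url, branch, true ) );
      return push( local, url, branch, false );
    }
  }

  public static void main( String[] args ) throws Exception {
    System.out.println( "Push: " + demo() );
  }
}
```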

Now that we have seen how to transfer local objects to a remote repository, we will look at how the opposite direction works. The FetchCommand can be used much like its push counterpart and also succeeds with its default settings.

FetchResult fetchResult = local.fetch().call();
TrackingRefUpdate refUpdate
  = fetchResult.getTrackingRefUpdate( "refs/remotes/origin/master" );
RefUpdate.Result result = refUpdate.getResult();

Without further configuration, the command fetches changes from the branch that corresponds to the current branch on the default remote.

The FetchResult provides detailed information about the outcome of the operation. For each affected branch, a TrackingRefUpdate instance can be obtained. Most interesting probably is the return value of getResult() that summarizes how the update turned out. In addition, it holds information about which local ref (getLocalName()) was updated with which remote ref (getRemoteName()) and to which object id the local ref pointed before and after the update (getOldObjectId() and getNewObjectId()).
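The sketch below prints these details for a fetch between two temporary repositories on the local disk: it clones a repository, adds a commit to the original and fetches it into the clone. Class and method names, directory names and the 'Jane Doe' identity are assumptions made for the example.

```java
import java.io.File;
import java.nio.file.Files;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.transport.FetchResult;
import org.eclipse.jgit.transport.TrackingRefUpdate;

class FetchExample {

  // Formats the details that a TrackingRefUpdate holds.
  static String describe( TrackingRefUpdate update ) {
    return update.getRemoteName() + " -> " + update.getLocalName()
      + ": " + update.getOldObjectId().getName()
      + " -> " + update.getNewObjectId().getName()
      + " (" + update.getResult() + ")";
  }

  // Creates an empty commit with a fixed author and committer.
  static void commit( Git git, String message ) throws Exception {
    git.commit()
      .setMessage( message )
      .setAuthor( "Jane Doe", "jane@example.com" )
      .setCommitter( "Jane Doe", "jane@example.com" )
      .call();
  }

  // Clones a temporary repository, adds a commit to the original and
  // fetches it into the clone. Returns the update of the only branch.
  static TrackingRefUpdate demo() throws Exception {
    File remoteDir = Files.createTempDirectory( "jgit-remote" ).toFile();
    File localDir = Files.createTempDirectory( "jgit-local" ).toFile();
    try( Git remote = Git.init().setDirectory( remoteDir ).call() ) {
      commit( remote, "First" );
      try( Git local = Git.cloneRepository()
          .setURI( remoteDir.getAbsolutePath() )
          .setDirectory( localDir )
          .call() ) {
        commit( remote, "Second" );
        FetchResult fetchResult = local.fetch().call();
        return fetchResult.getTrackingRefUpdates().iterator().next();
      }
    }
  }

  public static void main( String[] args ) throws Exception {
    System.out.println( describe( demo() ) );
  }
}
```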

If the remote repository requires authentication, the PushCommand and FetchCommand can be prepared in the same way as all commands that communicate with remote repositories. A detailed discussion can be found in the JGit Authentication Explained article.

Concluding Getting Started with JGit

Now it is your turn to take JGit for a spin. The high-level JGit API isn’t hard to understand. If you know what git command to use, you can easily guess which classes and methods to use in JGit.

While not all subtleties of the Git command line are available, there is solid support for the most often used functionality. And if there is something crucial missing, you can often resort to the lower-level APIs of JGit to work around the limitation.

The snippets shown throughout the article are excerpts of a collection of learning tests. The full version can be found here:
https://gist.github.com/rherrmann/433adb44b3d15ed0f0c7

If you still have difficulties or questions, please leave a comment or ask the friendly and helpful JGit community for assistance.

Rüdiger Herrmann

39 Comments so far:

  1. renga says:

    I tried to use Status status = git.status().addPath( "documentation" ).call(); but I am unable to get addPath() after status. Can you help me with this?

    • Rüdiger Herrmann says:

      I am afraid I can’t help you. I don’t have the slightest idea what ‘get addPath() after status’ means.

  2. renga says:

    I tried to use Status status = git.status().addPath( "documentation" ).call(); and got an error that says: The method addPath() is undefined for the type StatusCommand

    • Rüdiger Herrmann says:

      addPath() is available since JGit v3.1. It looks like you need to upgrade to a newer version of JGit.

  3. gihan says:

    can i commit a file to a remote repository without cloning the base to my local machine?

    • Rüdiger Herrmann says:

      In short, no. You need to first create a local clone of the remote repository. Changes can only be committed to the local copy. Finally, by pushing to the originating repository, the local changes are made available on the remote repository for others to fetch.

  4. Uday Soni says:

    Hi ,

    I want to build utility which takes remote URL of git repository as a argument 1 and search a given string ( argument 2 ) on the code base.

    Do we need to clone and checkout the remote repository in code to use the git command like

    TreeWalk,RevWalk, RevCommit etc .

    Or we can simply search on remote URL using git grep.

    Please advise.

    • Rüdiger Herrmann says:

      Almost all Git commands require a local repository and searching its contents is no exception to this. RevWalk and friends as well as git grep require a local clone of the repository to search. In order to search a repository’s history you don’t necessarily need to checkout the work directory.

      HTH
      Rüdiger

  5. Anand says:

    Very helpful article.
    I am trying to perform continuous Integration / continuous deployment from GIT based Repo to Azure Cloud without involving Local repo creation or interference in between.
    Logic being if any push is performed by any user in that Repo then it has to get deployed in Azure Cloud.

    Please help me with necessary code snippets or how to connect my GIT repo with Azure.

    Thanks
    Anand

    • Rüdiger Herrmann says:

      Anand,

      I am glad this article helped. I am afraid I neither understand what you are trying to achieve nor what code snippets are ‘necessary’. If you have a specific problem, you may want to search Google or post a question on SO. If that doesn’t serve your needs, please contact me in private to receive an offer for professional services.

      Best,
      Rüdiger

  6. renganathan says:

    I am having a local repository ,Which has sub module repository also.If i try to access the repository from two different instance of eclipse third party tool.Will git prevent the access for second third party tool if that eclipse repository is being by used by first third eclipse party tool? if git does not restrict the second third party tool how to do that restriction.User is same for all third party tool

    • Rüdiger Herrmann says:

      Git as well as JGit use lock files to protect against multiple threads or processes writing to the same repository at the same time.

      JGit’s API signals a lock failure through corresponding return values. For example, if a ref cannot be updated because another process holds the lock, RefUpdate::update() returns LOCK_FAILURE.

  7. nihel says:

    Thank you for your article. It is really detailed and well explained. But I didn’t understand how to use JGit on just one file without creating a repository. I am just looking to print the commit history of one file without defining a new repository (my file already exists in a Git repository but I don’t know how to refer to it) from my actual work.
    thanks
    Nihel

    • Rüdiger Herrmann says:

      Nihel,
      you cannot ‘use JGit … without creating a repository’. By design, distributed versioning systems require you to create a local copy in order to access or modify its contents. In Git, this is called a clone. Hence you need to first create a local copy of a remote repository as outlined above.

      Afterwards, you can use JGit or Git to access its history.

      HTH
      Rüdiger

  8. Heikki Doeleman says:

    So in order to use JGit you need a local installation of Git, too ?

    This would then introduce a dependency on which version of Git you have installed, I think ? Where can we find information like this ?

  9. Carlos says:

    I am doing, pull, commit, push in my method.

    I would expect exceptions when something fails.
    For what I see, in JGit I need for example when performing a “push” to iterate over the results and check the status?

    In JGit commands might fail and no exception is thrown, right?

  10. Rüdiger Herrmann says:

    Admittedly, the way JGit commands report their result is inconsistent in some areas. As you pointed out, the PushCommand returns a result that needs to be examined in order to ensure that the command succeeded. If not, the result gives you detailed information about what went wrong.

    The reason to return such information instead of throwing an exception might be that – for example – it is not exactly a failure to attempt to push a non-fast-forwardable commit.

    However, as said earlier, not all commands behave the same way. Connection errors, for example, are reported as exceptions, and a checkout that cannot be performed because of conflicting files in the working directory also causes an exception.

    All JGit commands return a result that, if applicable, should be examined to make sure the command succeeded.

  11. Madhuri says:

    Hi Rüdiger Herrmann

    This is so helpful article.

    I have tried to execute below statement from eclipse.
    Git git = Git.init().setDirectory(file).call();
    I have included all required jars to my project build path. I am getting below exception each time.

    Exception in thread “main” java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
    at org.eclipse.jgit.lib.Repository.(Repository.java:117)
    at org.eclipse.jgit.lib.BaseRepositoryBuilder.build(BaseRepositoryBuilder.java:612)
    at org.eclipse.jgit.api.InitCommand.call(InitCommand.java:120)
    at Prep.main(Prep.java:113)
    Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    … 4 more

    • Rüdiger Herrmann says:

      Glad that you found this article helpful. The NoClassDefFoundError is not JGit specific. Most likely slf4j is missing from your class path.

  12. Fallou says:

    Very helpful article! Is there any methods in jGit matching the ‘git for-each-ref’ command?

    • Rüdiger Herrmann says:

      Repository::getAllRefs returns a Map of all refs. Key is the full name of the ref as a String, value is the corresponding Ref object.

      With Repository::getRefDatabase you get access to the ref database that manages the refs/ directory and its sub-directories. If you are interested in ref-like elements like FETCH_HEAD et al, RefDatabase::getAdditionalRefs will return them.

  13. Zhang says:

    How to execute the remote instruction after the local library init

    • Rüdiger Herrmann says:

      I am afraid, I don’t understand your question. Which remote instruction do you refer to? And what is a local library init? Do you mean local repository init? Some more context and code that you tried would also help understand your problem.

  14. Sai Chaitanya says:

    Hi, Is there any way that we can do git svn clone with jgit ?

    • Rüdiger Herrmann says:

      JGit does not and will not support git svn commands (see https://bugs.eclipse.org/bugs/show_bug.cgi?id=343067)

      If you have access to the Subversion repository, you may install SubGit which will create a Git interface for Subversion.

      Ideally, you would convert the Subversion repository to Git and spare yourself a lot of the trouble that emulations always bring.

  15. Saif Ahmad says:

    Hi, could you please help me with converting a byte array to a String? It has an encoding problem. Could you please let me know what I am doing wrong here? My code is below.

    ObjectId objectId = treeWalk.getObjectId(0);
    ObjectLoader loader = repository.open(objectId);
    byte [] byteData= loader.getBytes();
    String sData = new String(byteData, “UTF-8”);

    The below code execution output is,
    æ?⛲ÑÖCK‹)®wZØÂäŒS‘100644 Version.h

  16. Pulkit says:

    How do I find the names of all the files that were committed between 2 specific dates( YYYY/MM/DD hh:mm:ss) ?

  17. Tebogo says:

    Hi

    After i commit a file on local repository, when i try to read the file back using the commit id, i just get commit ids ,tree ids etc written on the file.

    public void init() throws IOException, GitAPIException {

    Git git = Git.open(new File(“/C:/temp/testing/”));
    File file = new File(“C:/temp/testing/”,”helloworld.txt”);

    git.add().addFilepattern(file.getName()).call();
    RevCommit test_commit = git.commit().setMessage(“add text file just”).call();

    get(file.getName(), test_commit.getId().toObjectId().getName(), git.getRepository(),test_commit.getId());
    System.out.println(test_commit.toString());

    }

    public void get(String file, String commid, Repository repository, ObjectId id) {
    try {

    System.out.println(“Looking up file {} in revision {}” + file + commid);

    ObjectId obj = ObjectId.fromString(commid);
    byte[] filecontent = repository.open(obj).getCachedBytes();
    File files = getLocalFileForByteArray(filecontent,file);
    System.out.println(“done”);

    } catch (Exception e) {
    }
    }

    public static File getLocalFileForByteArray(final byte[] fileContents, final String fileName) throws IOException {
    final Path tempDirectory = Files.createTempDirectory(“import”);
    final Path localTempFile = Files.createFile(Paths.get(tempDirectory.toString(), fileName));
    final ByteArrayInputStream byteInputStream = new ByteArrayInputStream(fileContents);
    FileCopyUtils.copy(byteInputStream, Files.newOutputStream(localTempFile));
    return localTempFile.toFile();

    }

  18. Martin d'Anjou says:

    There is a typo:
    “`
    DirCache index = git.add().addFilePattern( “readme.txt” ).call();
    “`

    Should be:
    “`
    DirCache index = git.add().addFilepattern( “readme.txt” ).call(); // lowercase pattern
    “`

  19. Rüdiger Herrmann says:

    Good catch! Thank you very much, Martin. I’ve updated the post.