Git (software)

Infobox Software
name = Git

author = Linus Torvalds
developer = Junio Hamano, Linus Torvalds
latest release version = [cite mailing list |mailinglist=git |author=Junio C Hamano |url= |title=ANNOUNCE GIT |date=2008-09-13]
latest release date = September 13 2008
programming language = C, Bourne Shell, Perl [git/git.git/tree]
operating system = POSIX
genre = Revision control system
license = GNU General Public License v2

Git is a free distributed revision control system, also described as a software source code management project, with an emphasis on speed. Git was initially created by Linus Torvalds for Linux kernel development.

Every Git working directory is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server.

Several high-profile software projects now use Git for revision control, [cite web |url= |title=Projects that use Git for their source code management |accessdate=2008-02-20] most notably the Linux kernel, Server, Qt (toolkit), One Laptop per Child (OLPC) core development, [cite web |url= |title=Project hosting |author=OLPC wiki |accessdate=2008-02-20] the Ruby on Rails web framework, [cite web |url= |title= "Rails is moving from SVN to Git" |accessdate=2008-04-03] VLC, and Wine.

Git's current software maintenance is overseen by Junio Hamano.


Linus Torvalds has quipped about the name "git", which is British English slang for a stupid or unpleasant person: [cite journal |url= |title=After controversy, Torvalds begins work on git |journal=InfoWorld |issn=0199-6649 |date=2005-04-19 |accessdate=2008-02-20] "I'm an egotistical bastard, and I name all my projects after myself. First Linux, now git." This self-deprecation is certainly tongue-in-cheek, insofar as Torvalds did not, in fact, name Linux after himself (see History of Linux).

The official Git wiki also gives a number of alternative explanations for the name. [GitFaq: Why the 'git' name?]


Git's design was inspired by BitKeeper and Monotone. [cite mailing list |mailinglist=linux-kernel |author=Linus Torvalds |url= |title=Re: ANNOUNCE Git wiki |date=2006-05-05 "Some historical background" on git's predecessors] [cite mailing list |mailinglist=linux-kernel |url= |author=Linus Torvalds |title=Re: Kernel SCM saga |date=2005-04-07] Git was originally designed only as a low-level engine that others could use to write front ends such as Cogito or StGIT. [cite mailing list |author=Linus Torvalds | url= |title=Re: Kernel SCM saga |date=2005-04-08 |mailinglist=linux-kernel |accessdate=2008-02-20] However, the core Git project has since become a complete revision control system that is usable directly. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=Re: Errors GITtifying GCC and Binutils |date=2006-03-23]

Git's design is a synthesis of Torvalds's experience maintaining a large distributed development project, his intimate knowledge of file system performance, and an urgent need to produce a working system in short order. (See the history section for details.) These influences led to the following implementation choices:

* Strong support for non-linear development. Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history. A core assumption in Git is that a change will be merged more often than it is written, as it is passed around various reviewers.
* Distributed development. Like Darcs, BitKeeper, Mercurial, SVK, Bazaar and Monotone, Git gives each developer a local copy of the entire development history, and changes are copied from one such repository to another. These changes are imported as additional development branches, and can be merged in the same way as a locally developed branch.
* Repositories can be published via HTTP, FTP, SSH, rsync, or a Git protocol. Git also has a CVS server emulation, which enables the use of existing CVS clients and IDE plugins to access Git repositories.
* Subversion and SVK repositories can be used directly with git-svn.
* Efficient handling of large projects. Torvalds has described Git as being very fast and scalable, [cite mailing list | author=Linus Torvalds |url= |title=Re: VCS comparison table |date=2006-10-19 |mailinglist=git] and performance tests done by Mozilla showed it was an order of magnitude faster than other revision control systems, and two orders of magnitude faster on some operations. [Citation |author=jst | url= | title=bzr/hg/git performance |journal=Jst's Blog |last=Stenback |first=Johnny |date=2006-11-30 |accessdate=2008-02-20, benchmarking "git diff" against "bzr diff", and finding the former 100x faster in some cases.] [cite web | author=Roland Dreier |url= |title=Oh what a relief it is |date=2006-11-13, observing that "git log" is 100x faster than "svn log" because the latter has to contact a remote server.]
* Cryptographic authentication of history. The Git history is stored in such a way that the name of a particular revision (a "commit" in Git terms) depends upon the complete development history leading up to that commit. Once it is published, it is not possible to change the old versions without it being noticed. (Mercurial and Monotone also have this property.)
* Toolkit design. Git was designed as a set of programs written in C, and a number of shell scripts that provide wrappers around those programs. [cite mailing list |author=Linus Torvalds |title=Re: VCS comparison table, describing Git's script-oriented design] Although most of those scripts have been rewritten in C as part of an ongoing effort to port it to Microsoft Windows, the design remains, and it is easy to chain the components together to do other clever things. [cite web |title=Git rocks!, praising Git's scriptability]
* Pluggable merge strategies. As part of its toolkit design, Git has a well-defined model of an incomplete merge, and it has multiple algorithms for completing it, culminating in telling the user that it is unable to complete the merge automatically and manual editing is required.
* Garbage accumulates unless collected. Aborting operations or backing out changes will leave useless dangling objects in the database. These are generally a small fraction of the continuously growing history of wanted objects, but reclaiming the space using git-gc --prune can be slow. [cite web |title=Git User's Manual]
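
The "cryptographic authentication of history" property in the list above can be illustrated with a toy hash chain. This is a minimal sketch, not Git's actual commit format: the `commit_name` and `build_history` helpers and the serialization they use are hypothetical. Only the idea that a commit's name depends on its content and on the names of its parents is taken from the text.

```python
import hashlib


def commit_name(snapshot: bytes, parents: list[str]) -> str:
    """Toy commit name: a SHA-1 over the parents' names plus the snapshot.

    The serialization is invented for illustration and is not Git's
    real commit encoding.
    """
    h = hashlib.sha1()
    for parent in parents:
        h.update(b"parent " + parent.encode("ascii") + b"\n")
    h.update(b"\n" + snapshot)
    return h.hexdigest()


def build_history(snapshots: list[bytes]) -> list[str]:
    """Name a linear history; each commit has the previous one as its parent."""
    names: list[str] = []
    for snap in snapshots:
        parents = names[-1:]  # empty list for the first (root) commit
        names.append(commit_name(snap, parents))
    return names


original = build_history([b"v1", b"v2", b"v3"])
tampered = build_history([b"v1 (altered)", b"v2", b"v3"])

# Altering the oldest snapshot changes the name of every later commit,
# even though the later snapshots themselves are unchanged.
changed = [a != b for a, b in zip(original, tampered)]
```

Because each name covers the parents' names, anyone holding the name of the latest commit can detect a change anywhere earlier in the chain.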

One property of Git is that it snapshots directory trees of files. The earliest systems for tracking versions of source code, SCCS and RCS, worked on individual files and emphasized the space savings to be gained from delta encoding the (mostly similar) versions. Later revision control systems maintained this notion of a file having an identity across multiple revisions of a project.

Torvalds rejected this concept; [cite mailing list |author=Linus Torvalds |title=Re: more git updates..] consequently, Git does not explicitly record file revision relationships at any level below the source code tree. This has some significant consequences:

* It is slightly more expensive to examine the change history of a single file than the whole project. [cite mailing list |author=Bruno Haible |title=how to speed up "git log"?] To obtain a history of changes affecting a given file, Git must walk the global history and then determine whether each change modified that file. This method of examining history does, however, let Git produce with equal efficiency a single history showing the changes to an arbitrary set of files. For example, a subdirectory of the source tree plus an associated global header file is a very common case.
* Renames are handled implicitly rather than explicitly. A common complaint with CVS is that it uses the name of a file to identify its revision history, so moving or renaming a file is not possible without either interrupting its history, or renaming the history and thereby making the history inaccurate. Most post-CVS revision control systems solve this by giving a file a unique long-lived name (a sort of inode number) that survives renaming. Git does not record such an identifier, and this is claimed as an advantage. [cite mailing list |author=Linus Torvalds |title=Re: impure renames / history tracking] [cite mailing list |author=Junio C Hamano |title=Re: Errors GITtifying GCC and Binutils] Source code files are sometimes split or merged as well as simply renamed, [cite mailing list |author=Junio C Hamano |title=Re: Errors GITtifying GCC and Binutils] and recording this as a simple rename would freeze an inaccurate description of what happened in the (immutable) history. Git addresses the issue by detecting renames while browsing the history of snapshots rather than recording them when making the snapshot. [cite mailing list |author=Linus Torvalds |title=Re: git and bzr, on using git-blame to show code moved between source files] (Briefly, given a file in revision N, a file of the same name in revision N−1 is its default ancestor. However, when there is no like-named file in revision N−1, Git searches for a file that existed only in revision N−1 and is very similar to the new file.) The trade-off is that this requires more CPU-intensive work every time history is reviewed, and a number of options exist to adjust the heuristics.
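
The rename heuristic described in the parenthetical above can be sketched as follows. This is an illustration under stated assumptions, not Git's implementation: the `find_ancestor` helper, the representation of a revision as a plain dictionary, the use of Python's `difflib` as the similarity measure, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher


def find_ancestor(path, new_tree, old_tree, threshold=0.5):
    """Return the path in old_tree most likely to be the ancestor of `path`.

    Trees are plain dicts mapping path -> file content (str).
    Returns None when no plausible ancestor is found.
    """
    # Default ancestor: a file of the same name in the previous revision.
    if path in old_tree:
        return path
    # Otherwise consider only files that existed in the old revision
    # but are gone from the new one.
    candidates = [p for p in old_tree if p not in new_tree]
    best_path, best_score = None, threshold
    for candidate in candidates:
        score = SequenceMatcher(None, old_tree[candidate], new_tree[path]).ratio()
        if score >= best_score:
            best_path, best_score = candidate, score
    return best_path


old = {"util.c": "int add(int a, int b) { return a + b; }\n",
       "main.c": "int main(void) { return 0; }\n"}
new = {"math.c": "int add(int a, int b) { return a + b; }\n/* moved */\n",
       "main.c": "int main(void) { return 0; }\n"}

ancestor = find_ancestor("math.c", new, old)  # the vanished, similar file
```

Nothing about the rename is stored in either revision; the relationship is recomputed each time it is asked for, which is where the extra CPU cost mentioned above comes from.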

Additionally, people are sometimes upset by the storage model:

* Periodic explicit object packing. Git stores each newly created object as a separate file. Although individually compressed, this takes a great deal of space and is inefficient. This is solved by the use of "packs" that store a large number of objects in a single file (or network byte stream), delta-compressed among themselves. Packs are compressed using the heuristic that files with the same name are probably similar, but do not depend on it for correctness. Newly created objects (newly added history) are still stored singly, and periodic repacking is required to maintain space efficiency. Git does periodic repacking automatically but manual repacking is also possible with the git-gc command.

Git implements several merging strategies; a non-default can be selected at merge time: [cite web | author=Linus Torvalds | url= | title=git-merge(1) | date=2007-07-18 ]

* resolve: the traditional 3-way merge algorithm.
* recursive: the default when pulling or merging one branch, and a variant of the 3-way merge algorithm. "When there are more than one common ancestors that can be used for 3-way merge, it creates a merged tree of the common ancestors and uses that as the reference tree for the 3-way merge. This has been reported to result in fewer merge conflicts without causing mis-merges by tests done on actual merge commits taken from Linux 2.6 kernel development history. Additionally this can detect and handle merges involving renames." [cite web | author=Linus Torvalds | url= | title=CrissCrossMerge | date=2007-07-18 ]
* octopus: the default when merging more than two heads.
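
The idea behind a 3-way merge can be sketched at whole-file granularity. This is a simplified illustration, not the code behind any of Git's strategies: the `three_way_merge` function and the dictionary representation of a tree are hypothetical, and real merges also combine changes within a single file, which this sketch does not attempt.

```python
def three_way_merge(base, ours, theirs):
    """Merge two trees (dict: path -> content) against a common ancestor.

    Returns (merged, conflicts). A path missing from a tree is treated as None.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (unchanged, or same change)
            result = o
        elif o == b:      # only "theirs" changed this path
            result = t
        elif t == b:      # only "ours" changed this path
            result = o
        else:             # both changed it differently: manual editing needed
            conflicts.append(path)
            continue
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "2", "c.txt": "3"}
ours = {"a.txt": "1 ours", "b.txt": "2", "c.txt": "3 ours"}
theirs = {"a.txt": "1", "b.txt": "2 theirs", "c.txt": "3 theirs", "d.txt": "4"}

merged, conflicts = three_way_merge(base, ours, theirs)
```

The common ancestor is what lets the merge tell "changed on one side" apart from "changed on both sides"; the final branch corresponds to the case where the user is told that manual editing is required.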

Early history

Git development began after many Linux kernel developers were forced to give up access to the proprietary BitKeeper system (see BitKeeper - Pricing change). The ability to use BitKeeper free of charge had been withdrawn by the copyright holder Larry McVoy after he claimed Andrew Tridgell had reverse engineered the BitKeeper protocols in violation of the BitKeeper license. At Linux.Conf.Au 2005, Tridgell demonstrated during his keynote that the reverse engineering process he had used was simply to telnet to the appropriate port of a BitKeeper server and type "help". [Citation |author=Jonathan Corbet |journal=Linux Weekly News |url= |title=How Tridge reverse engineered BitKeeper |date=2005-04-20]

Torvalds wanted a distributed system that he could use like BitKeeper, but none of the available free systems met his needs, particularly his performance needs. From an e-mail he wrote on April 7 2005 while writing the first prototype: [cite mailing list |mailinglist=linux-kernel |author=Linus Torvalds |url= |title=Re: Kernel SCM saga.. |date=2005-04-07]

However, the SCMs I've looked at make this hard. One of the things (the main thing, in fact) I've been working at is to make that process really "efficient". If it takes half a minute to apply a patch and remember the changeset boundary etc. (and quite frankly, that's "fast" for most SCMs around for a project the size of Linux), then a series of 250 emails (which is not unheard of at all when I sync with Andrew, for example) takes two hours. If one of the patches in the middle doesn't apply, things are bad bad bad.

Now, BK wasn't a speed demon either (actually, compared to everything else, BK "is" a speed deamon [sic] , often by one or two orders of magnitude), and took about 10–15 seconds per email when I merged with Andrew. HOWEVER, with BK that wasn't as big of an issue, since the BK<->BK merges were so easy, so I never had the slow email merges with any of the other main developers. So a patch-application-based SCM “merger” actually would need to be "faster" than BK is. Which is really really really hard.

So I'm writing some scripts to try to track things a whole lot faster. Initial indications are that I should be able to do it almost as quickly as I can just apply the patch, but quite frankly, I'm at most half done, and if I hit a snag maybe that's not true at all. Anyway, the reason I can do it quickly is that my scripts will "not" be an SCM, they'll be a very specific “log Linus' state” kind of thing. That will make the linear patch merge a lot more time-efficient, and thus possible.

(If a patch apply takes three seconds, even a big series of patches is not a problem: if I get notified within a minute or two that it failed half-way, that's fine, I can then just fix it up manually. That's why latency is critical—if I'd have to do things effectively “offline”, I'd by definition not be able to fix it up when problems happen).

Torvalds had several design criteria:
# Take CVS as an example of what "not" to do; if in doubt, make the exact opposite decision. To quote Torvalds, speaking somewhat tongue-in-cheek:
#: “For the first 10 years of kernel maintenance, we literally used tarballs and patches, which is a much superior source control management system than CVS is, but I did end up using CVS for 7 years at a commercial company [presumably Transmeta] and I hate it with a passion. When I say I hate CVS with a passion, I have to also say that if there are any SVN (Subversion) users in the audience, you might want to leave. Because my hatred of CVS has meant that I see Subversion as being the most pointless project ever started. The slogan of Subversion for a while was ‘CVS done right’, or something like that, and if you start with that kind of slogan, there's nowhere you can go. There is no way to do CVS right.” [cite video |people=Linus Torvalds |year=2007 |date=05-03 |url= |title=Google tech talk: Linus Torvalds on git |time=02:30 |accessdate=2007-05-16]
# Support a distributed, BitKeeper-like workflow
#: “BitKeeper was not only the first source control system that I ever felt was worth using at all, it was also the source control system that taught me why there's a point to them, and how you actually can do things. So Git in many ways, even though from a technical angle it is very very different from BitKeeper (which was another design goal, because I wanted to make it clear that it wasn't a BitKeeper clone), a lot of the flows we use with Git come directly from the flows we learned from BitKeeper.”
# Very strong safeguards against corruption, either accidental or malicious [cite mailing list |author=Linus Torvalds |authorlink=Linus Torvalds |mailinglist=git |date=2007-06-10 |title=Re: fatal: serious inflate inconsistency |url= A brief description of Git's data integrity design goals.]
# Very high performance

The first three criteria eliminated every pre-existing version control system except for Monotone, and the fourth excluded everything. So, immediately after the 2.6.12-rc2 Linux kernel development release, he set out to write his own.

The development of Git began on April 3 2005. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=Re: Trivia: When did git self-host? |date= 2007-02-27] The project was announced on April 6, [cite mailing list |mailinglist=linux-kernel |author=Linus Torvalds |url= |title=Kernel SCM saga.. |date= 2005-04-06] and became self-hosting as of April 7. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=Re: Trivia: When did git self-host? |date= 2007-02-27] The first merge of multiple branches was done on April 18. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=First ever real kernel git merge! |date=2005-04-17] Torvalds achieved his performance goals; on April 29, the nascent Git was benchmarked recording patches to the Linux kernel tree at the rate of 6.7 per second. [cite mailing list |mailinglist=git |author=Matt Mackall |url= |title=Mercurial 0.4b vs git patchbomb benchmark |date=2005-04-29] On June 16, the kernel 2.6.12 release was managed by Git. [cite mailing list |mailinglist=git-commits-head |author=Linus Torvalds |url= |title=Linux 2.6.12 |date=2005-06-17]

While strongly influenced by BitKeeper, Torvalds deliberately attempted to avoid conventional approaches, leading to a very novel design. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=Re: VCS comparison table |date=2006-10-20 A discussion of Git vs. BitKeeper] He developed the system until it was usable by technical users, then turned over maintenance on July 26 2005 to Junio Hamano, a major contributor to the project. [cite mailing list |mailinglist=git |author=Linus Torvalds |url= |title=Meet the new maintainer... |date=2005-07-27] Hamano was responsible for the 1.0 release on December 21 2005, [cite mailing list |mailinglist=git |author=Junio C Hamano |url= |title=ANNOUNCE: GIT 1.0.0 |date=2005-12-21] and remains the maintainer as of April 2008.


Like BitKeeper, Git does not use a centralized server. However, Git's primitives are not inherently an SCM system. Torvalds explains: [cite mailing list |author=Linus Torvalds |title=Re: more git updates...]
In many ways you can just see git as a filesystem — it's content-addressable, and it has a notion of versioning, but I really really designed it coming at the problem from the viewpoint of a "filesystem" person (hey, kernels is what I do), and I actually have absolutely "zero" interest in creating a traditional SCM system.
(Note that his opinion has changed since then.) [cite mailing list |author=Linus Torvalds |title=Re: Errors GITtifying GCC and Binutils]

Git has two data structures, a mutable "index" that caches information about the working directory and the next revision to be committed, and an immutable, append-only "object database" containing four types of objects:
* A "blob" object is the content of a file. Blob objects have no names, timestamps, or other metadata.
* A "tree" object is the equivalent of a directory: it contains a list of filenames, each with some type bits and the name of a blob or tree object that is that file, symbolic link, or directory's contents. This object describes a snapshot of the source tree.
* A "commit" object links tree objects together into a history. It contains the name of a tree object (of the top-level source directory), a timestamp, a log message, and the names of zero or more parent commit objects.
* A "tag" object is a container that holds a reference to another object and can carry additional metadata about it. Most commonly it is used to store a digital signature of a commit object corresponding to a particular release of the data being tracked by Git.

The object database can hold any kind of object. An intermediate layer, the "index", serves as the connection point between the object database and the working tree.

Each object is identified by a SHA-1 hash of its contents. Git computes the hash, and uses this value for the object's name. The object is put into a directory matching the first two characters of its hash. The rest of the hash is used as the file name for that object.
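
The naming scheme above can be sketched in a few lines. Two details here are not stated in the text and come from Git's on-disk format: the hash covers a short header of the form `<type> <size>` followed by a NUL byte in addition to the content, and loose objects live under `.git/objects`. The helper names `git_object_name` and `loose_object_path` are hypothetical.

```python
import hashlib


def git_object_name(obj_type: str, content: bytes) -> str:
    """SHA-1 over a '<type> <size>' header, a NUL byte, and the content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(name: str) -> str:
    """The first two hex characters name the directory, the rest the file."""
    return f".git/objects/{name[:2]}/{name[2:]}"


name = git_object_name("blob", b"hello world\n")
path = loose_object_path(name)
```

Since the name is derived only from the type and content, two files with identical content map to the same blob object regardless of their file names.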

Git stores each revision of a file as a unique blob object. The relationships between the blobs can be found through examining the tree and commit objects. Newly added objects are stored in their entirety using zlib compression. This can consume a large amount of hard disk space quickly, so objects can be combined into packs, which use delta compression to save space, storing blobs as their changes relative to other blobs.
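
A minimal sketch of whole-object storage with zlib follows. The `encode_loose` and `decode_loose` helpers are hypothetical names, and the `<type> <size>` header with a trailing NUL byte is a detail of Git's format that the text does not spell out. Pack files and their delta compression are not modelled here.

```python
import zlib


def encode_loose(obj_type: str, content: bytes) -> bytes:
    """Compress a whole object (header plus content) with zlib."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return zlib.compress(header + content)


def decode_loose(data: bytes) -> tuple[str, bytes]:
    """Inverse of encode_loose: returns (type, content)."""
    raw = zlib.decompress(data)
    # The header contains no NUL, so the first NUL ends it.
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


# Two revisions of a file that differ by a single line are each
# stored in their entirety, as two independent objects.
v1 = b"line\n" * 1000
v2 = v1 + b"one more line\n"
stored = [encode_loose("blob", v) for v in (v1, v2)]
```

Each stored object can be decoded without reference to any other, which is simple and robust but repeats the shared content; that redundancy is what packs remove by storing blobs as changes relative to other blobs.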


Git is primarily developed on Linux, but can be used on other Unix-like operating systems including BSD, Solaris and Darwin. Git is extremely fast on POSIX-based systems such as Linux. [Citation |url= |journal=Jst's Blog |title=bzr/hg/git performance |date=2006-11-30 |first=Johnny |last=Stenback |accessdate=2008-02-20]

Git also runs on Windows. There are two variants:

* A native Microsoft Windows port, called msysgit (using MSYS from MinGW), is approaching completion. [cite web |url= |title=Git on MSYS] There are downloadable installers ready for testing (under the names "Git" and "msysgit", where "Git" is aimed at users). [cite web |url= |title=msysgit] While somewhat slower than the Linux version, [cite mailing list |mailinglist=git |author=Johannes Schindelin |url= |title=Re: Switching from CVS to GIT |date=2007-10-14 A subjective comparison of Git under Windows and Linux on the same system.] it is acceptably fast [cite mailing list |mailinglist=git |title=Re: Switching from CVS to GIT |url= |author=Martin Langhoff |date=2007-10-15 Experience running msysgit on Windows] and is reported to be usable in production, with only minor awkwardness. [cite mailing list |mailinglist=git |author=Johannes Sixt |url= |title=Re: Switching from CVS to GIT |date=2007-10-15] In particular, some commands are not yet available from the GUIs, and must be invoked from the command line. Many issues have been resolved, such as handling of CRLF line endings and Windows' lack of POSIX compatibility. [cite web |url= |title=POSIX fork()]

* Git also runs on top of Cygwin (a POSIX emulation layer), [cite mailing list |mailinglist=git |title=Re: VCS comparison table |author=Shawn Pearce |url= |date=2006-10-24] although it is noticeably slower, especially for commands written as shell scripts. [cite mailing list |mailinglist=git |author=Johannes Schindelin |url= |title=Re: PATCH Speedup recursive by flushing index only once for all |date=2007-01-01] This is primarily due to the high cost of the fork emulation performed by Cygwin. However, the recent rewriting of many Git commands implemented as shell scripts in C has resulted in significant speed improvements on Windows. [cite mailing list |mailinglist=git |author=Shawn O. Pearce |url= |title=PATCH 0/5 More builtin-fetch fixes |date=2007-09-18] Regardless, many people find a Cygwin installation too large and invasive for typical Windows use. [cite mailing list |mailinglist=git |author=Kevin Smith |url= |title=Re: git 0.99.7b doesn't build on Cygwin |date=2005-2005-06-28]

Other alternatives for running Git include:

* git-cvsserver (which emulates a CVS server, allowing use of Windows CVS clients)
* Eclipse IDE-based Git client, based on a pure Java implementation of Git's internals: egit
* NetBeans IDE support for Git is under development.
* A Windows Explorer extension (a project for a TortoiseCVS/TortoiseSVN lookalike has already been started): Git Cheetah

"Libifying" the lowest-level Git operations would in theory enable re-implementation of the higher-level components for Windows without rewriting the rest. [cite mailing list |mailinglist=git |url= |title=Re: windows problems summary |author=Johannes Schindelin |date=2006-03-02]


Git has been criticized for its usability, documentation, and design. ["More on Git, Mercurial, and Bzr" by John Goerzen] ["Git and hg" by Ted Tso] ["Some Thoughts On Git" by Paolo Capriotti]

Until recently, Git's Windows support was poor enough to make projects that support both POSIX and Windows look elsewhere, even if most of their developers use POSIX-based systems. Examples of projects that publicly ruled out any use of Git include Mozilla, [citation |url= |journal=preed's blah-blah-blahg |title=Version Control System Shootout Redux |last=Reed |first=J. Paul |date=2006-11-27 |accessdate=2008-02-20] Ruby, [cite mailing list |mailinglist=ruby-core |url= |title=merge YARV into Ruby |last=SASADA |first=Koichi |date=2006-11-07 |accessdate=2008-02-20] and FlightGear. [cite mailing list |mailinglist=FlightGear-Devel |url= |title=GIT; Was: _Sport Model_ |date=2008-08-20 |accessdate=2008-08-20]

See also

* Distributed revision control
* List of revision control software
* Comparison of revision control software


External links

* Git homepage
* Original Git Homepage
* Git User's Manual, also distributed with Git in Documentation/user-manual.txt
* Git - the project page at
* Kernel Hackers' Guide to git
* The guts of git, article by
* Git and WhatIsGit at the LinuxMIPS wiki
* Projects that use Git, from GitWiki
* Google Tech Talk: Randal Schwartz on Git, from
* Google Tech Talk: Linus Torvalds on git, from
* An introduction to git-svn for Subversion/SVK users and deserters, article by Sam Vilain
* Git for computer scientists explains how Git conceptually works
* Git from the bottom up is similar to "Git for computer scientists", but more thorough. For some high-level commands, it explains how low-level commands can be used to achieve the same effect.
* #git IRC channel on freenode
* git by example - simple walk-through of common git commands
* Git Magic - a comprehensive listing of Git tips & tricks, popularly referred to as "magic". Describes some of the lesser-known features of Git.

Wikimedia Foundation. 2010.
