Git hash-object

Resources

From a-plumbers-guide-to-git.

The hash-object command takes a path to a file, reads its contents, and computes the ID of the Git object that would hold them. With the -w flag, it also saves the contents to the Git object store. It prints a hex string: the ID of the object.

echo "An awesome aardvark admires the Alps" > animals.txt
git hash-object -w animals.txt
# a37f3f668f09c61b7c12e857328f587c311e5d1d

find .git/objects -type f
# .git/objects/a3/7f3f668f09c61b7c12e857328f587c311e5d1d

The first two characters of the hex string are the directory name; the remaining 38 characters are the filename inside that directory.

The object ID is derived from the contents of the object. Specifically, Git prepends a short header to the contents (for a file, the word "blob", a space, the size of the contents in bytes, and a null byte), then takes the SHA-1 hash of the result. This is how Git stores all of its objects: the content of an object determines its ID. The technical name for this is a content-addressable filesystem.
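As a rough sketch of that rule, the ID can be reproduced without Git by hashing the header and the contents together. The helper name blob_id is hypothetical. The snippet assumes a sha1sum binary is on the PATH (on macOS, shasum plays a similar role) and that the repository uses SHA-1 object IDs.

```shell
# Hypothetical helper: compute the ID Git would assign to a file's contents.
# Assumes sha1sum is installed and the repository uses SHA-1 object IDs.
blob_id() {
    # Size of the contents in bytes; the arithmetic expansion strips any
    # padding that some wc implementations print.
    size=$(( $(wc -c < "$1") ))
    # The header is "blob <size>" followed by a null byte, then the raw contents.
    { printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

echo "An awesome aardvark admires the Alps" > animals.txt
blob_id animals.txt   # should match the output of: git hash-object animals.txt
```

Note that echo appends a newline, and that newline is part of the hashed contents.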

Files with the same content

What if you save two files with the same contents, but different filenames? What do you see in .git/objects?

hash-object returns the same ID for files with the same content, and .git/objects contains only one object for them. This means that hash-object saves only the contents of the files; it saves nothing about their filenames. Each object ID is a pointer to some content, but that content isn't associated with a filename.
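One way to try the exercise, as a minimal sketch: the filenames first.txt and second.txt and the sample sentence are made up for illustration, and the commands run in a throwaway repository so that .git/objects starts out empty.

```shell
# Work in a throwaway repository so nothing else is in .git/objects.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two files with identical contents but different names (illustrative names).
echo "Big blue basilisks bawl in the basement" > first.txt
cp first.txt second.txt

# Both commands print the same ID, because the contents are identical.
id_first=$(git hash-object -w first.txt)
id_second=$(git hash-object -w second.txt)
echo "$id_first"
echo "$id_second"

# Only one object file exists, and its path mentions neither filename.
find .git/objects -type f
```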

Retrieving data from git objects

Use git cat-file with an object ID. The -p flag prints the contents of the object, and the -t flag prints its type.
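A short sketch of a round trip through the object store, again in a throwaway repository. The variable names repo and id are only for illustration.

```shell
# Work in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Save a file to the object store, then delete the working copy.
echo "An awesome aardvark admires the Alps" > animals.txt
id=$(git hash-object -w animals.txt)
rm animals.txt

# Inspect the object by ID.
git cat-file -t "$id"   # prints: blob
git cat-file -p "$id"   # prints: An awesome aardvark admires the Alps

# Restore the file from the object store.
git cat-file -p "$id" > animals.txt
```

The file comes back only because the ID was kept: the object itself records no filename.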