
Ever wondered what actually happens when you run git add or git commit? This deep-dive explores Git's internal architecture, demystifies the .git folder, and reveals how Git uses blobs, trees, and commits to track your code's history with cryptographic precision.
In my previous post, we learned how to use Git. But have you ever wondered what's actually happening under the hood? What is that mysterious .git folder doing? Why does Git use weird hexadecimal strings everywhere? How does it know exactly what changed in your files?
Today, we're going on a journey inside Git's brain. By the end of this post, you'll understand the elegant simplicity that powers one of the most important tools in software development. Trust me, once you "get" how Git works internally, the commands will make so much more sense.
Let's start with the elephant in the room - that .git folder that appears when you run git init.
The .git folder is Git's database. It's where Git stores everything it knows about your project's history - every commit, every file version, every branch, everything.
Here's the beautiful part: everything you need to know about your project's entire history is contained in this one folder. Delete your working files? No problem, Git can restore every committed version from .git. Your entire project's timeline, all branches, all commits - it's all right there.
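Think of it like this:
.git folder = A filing cabinet with perfect records of everything
Without the .git folder, Git would be useless. This folder is the reason Git can restore earlier versions of your files, keep track of every branch, and show you the full history of every commit.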
When you run git init, Git creates this folder and says: "I'm ready to track your project now!"
Run this in any Git repository:

    ls -la .git/

You'll see something like:

    .git/
    ├── HEAD          ← Points to your current branch
    ├── config        ← Repository settings
    ├── description   ← Repository description
    ├── index         ← Staging area (more on this later!)
    ├── hooks/        ← Scripts that run at certain Git events
    ├── info/         ← Repository info and exclude patterns
    ├── objects/      ← THE HEART: Where all your data lives
    ├── refs/         ← Pointers to commits (branches and tags)
    │   ├── heads/    ← Your branches live here
    │   └── tags/     ← Your tags live here
    └── logs/         ← History of where branches have pointed

Don't worry if this looks overwhelming. We'll break down the important parts piece by piece.
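If you would rather poke at a throwaway repository than a real project, here is a minimal sketch. The temporary directory and the variable names are only for illustration, and it assumes git is installed.

```bash
# Create a scratch repository in a temporary directory (illustrative location).
repo=$(mktemp -d)
cd "$repo"
git init -q

# List the top-level entries Git created inside .git
ls -1 .git

# HEAD is a plain text file; in a new repository it names the branch
# your first commit will land on.
head_content=$(cat .git/HEAD)
echo "HEAD contains: $head_content"
```

On a brand-new repository you will not see index or logs/ yet: index appears after the first git add, and logs/ after the first commit.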
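Here's where it gets fascinating. Git stores everything as objects in the objects/ folder. There are only four types of objects, and three of them do almost all the work: blobs (file contents), trees (directories), and commits (snapshots). Understanding them is the key to understanding Git.
(The fourth type is the "tag" object, used for annotated tags, but we'll focus on these three.)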
Let me explain each one with a real-world analogy.
A blob is how Git stores your actual file content. Just the content - no filename, no folder location, just the raw data.
Analogy: Think of a blob like a book without a cover. It has all the content (the pages), but no title or author information.
When you stage a file with git add, Git reads its content, computes a hash of that content, and stores a compressed copy in objects/ as a blob.
Key insight: Git doesn't store filenames in blobs! It stores them in trees (next section).
Example:

    # Let's say you have index.html with this content:
    #   <h1>Hello World</h1>
    # Git creates a blob object containing that exact text
    # The blob gets a unique hash like: 557db03de997c86a4a028e1ebd3a1ceb225be238

Mind-blowing fact: If you have the exact same file content in 100 different places in your project, Git only stores it once as a single blob. Git is smart about deduplication!
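To make the "hash of the content" idea concrete, here is a small sketch that reproduces a blob hash by hand. It assumes a default SHA-1 repository, the GNU sha1sum tool (on macOS, shasum plays the same role), and no line-ending conversion configured; the file names are only for illustration.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '<h1>Hello World</h1>\n' > index.html

# What Git computes for this file:
git_hash=$(git hash-object index.html)

# The same value by hand: SHA-1 over the header "blob <size>\0"
# followed by the raw file bytes.
size=$(( $(wc -c < index.html) ))
manual_hash=$( { printf 'blob %s\0' "$size"; cat index.html; } | sha1sum | cut -d' ' -f1 )

# A copy under a different name hashes to the same blob,
# because the filename is not part of the blob.
cp index.html copy.html
copy_hash=$(git hash-object copy.html)

echo "git:    $git_hash"
echo "manual: $manual_hash"
echo "copy:   $copy_hash"
```

All three lines should print the same 40-character value, which is why identical content only ever needs one blob.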
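A tree object represents a directory. It contains one entry for each item in that directory: the file mode, the object type (blob or tree), the hash of that object, and the name.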
Analogy: A tree is like a table of contents. It says: "In this directory, there's a file called index.html that points to this blob, and a subdirectory called css/ that points to another tree."
Structure:

    Tree Object for my-project/
    ├── blob 557db03  index.html
    ├── blob 8a3f2bc  README.md
    └── tree 4d5e6a7  css/
        └── blob 9f8e7d6  style.css

Example breakdown:

    # Your project folder:
    my-project/
    ├── index.html      ← Stored as blob 557db03
    ├── README.md       ← Stored as blob 8a3f2bc
    └── css/
        └── style.css   ← Stored as blob 9f8e7d6

    # Git creates:
    # - 3 blob objects (for the 3 files)
    # - 2 tree objects (for my-project/ and css/)

Key insight: Trees give context to blobs. They're what turn "random file content" into "this is index.html in the root directory."
Note: You can run the following command in any of your Git-tracked projects to see the tree structure:

    git ls-tree HEAD

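Here is a sketch that rebuilds the small my-project layout from this section in a scratch repository and lists its trees. The temporary directory and the "Demo User" identity are placeholders so the commit works on any machine.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir css
echo '<h1>Hello World</h1>' > index.html
echo '# My Project' > README.md
echo 'h1 { color: blue; }' > css/style.css
git add .
git commit -q -m "Add homepage and styling"

# Root tree: two blobs and one subtree (css).
root_listing=$(git ls-tree HEAD)
echo "$root_listing"

# The css/ directory is its own tree object with one blob inside.
css_listing=$(git ls-tree HEAD css/)
echo "$css_listing"
```

Each output line has the form "mode type hash name", which is exactly what a tree object stores.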
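A commit object is a snapshot of your entire project at a specific moment. It contains a pointer to the root tree, a pointer to its parent commit (or commits, for a merge), the author and committer with their timestamps, and the commit message.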
Analogy: A commit is like a photograph with metadata. The photo shows what your project looked like (the tree), and the metadata tells you when it was taken, who took it, and what the occasion was (commit message).
Structure:

    Commit Object a3f8b2c
    ├── tree: 4d5e6a7        ← Points to root tree
    ├── parent: 7b9e1f3      ← Points to previous commit
    ├── author: Your Name
    ├── committer: Your Name
    ├── date: 2025-12-30 14:32:01
    └── message: "Add homepage and styling"

The Commit Chain:

    C1 ← C2 ← C3 ← C4 (HEAD → main)
    │    │    │    │
    │    │    │    └─ tree_4
    │    │    └────── tree_3
    │    └─────────── tree_2
    └──────────────── tree_1

    Each commit points to its parent, creating a linked list of history

Key insight: Commits are immutable (unchangeable). Once created, a commit's hash never changes. This is why Git is so reliable - history can't be accidentally modified.
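To see a real commit object and its parent link, here is a short sketch in a scratch repository. The file name notes.txt and the "Demo User" identity are made up for the example.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'v1' > notes.txt
git add notes.txt
git commit -q -m "First commit"
first=$(git rev-parse HEAD)

echo 'v2' > notes.txt
git add notes.txt
git commit -q -m "Second commit"
second=$(git rev-parse HEAD)

# Print the raw commit object: tree, parent, author, committer,
# a blank line, then the message.
git cat-file -p "$second"

# The parent recorded in the second commit is the hash of the first.
parent=$(git rev-parse "$second^")
echo "parent of second commit: $parent"
```

The first commit has no parent line at all, which is how Git knows where history begins.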
Let me show you how these three object types work together with a real example.

    my-website/
    ├── index.html      # Contains: <h1>Welcome</h1>
    ├── about.html      # Contains: <h1>About Us</h1>
    └── css/
        └── style.css   # Contains: h1 { color: blue; }

After one commit, the objects are linked like this:

    COMMIT a3f8b2c
    │   message: "Initial website structure"
    │   author:  You
    │
    └── TREE 4d5e6a7 (root: my-website/)
        ├── blob 1a2b3c  index.html    → <h1>Welcome</h1>
        ├── blob 4d5e6f  about.html    → <h1>About Us</h1>
        └── tree 7g8h9i  css/
            └── blob 9j0k1l  style.css → h1{color: blue;}

What this means: the commit points to a single root tree, the root tree names two blobs and one subtree, and the css/ subtree names one blob. That is one commit, two trees, and three blobs in total.
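Notice all those weird strings like a3f8b2c and 1a2b3c? These are SHA-1 hashes - unique identifiers that Git generates. (The ones in this post are shortened, made-up placeholders. A real hash is 40 hexadecimal characters, using only 0-9 and a-f.)
How hashes work:

    # Git runs your content through a cryptographic function
    Content: "<h1>Welcome</h1>"
        ↓
    SHA-1 hash function
        ↓
    Result: 1a2b3c4d5e6f... (40 hexadecimal characters)

Properties of hashes: the same content always produces the same hash, different content produces a different hash (for all practical purposes), every hash has the same length no matter how large the file is, and the hash doubles as the object's address inside objects/.
Example:

    # If two people create files with identical content
    Person A: index.html → "Hello World"
    Person B: main.html  → "Hello World"

    # Git creates only ONE blob object because the content is identical
    # Both filenames point to the same blob hash
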
This is why Git is incredibly efficient with storage!
Remember the staging area from the basics guide? Let's see what it actually is.
The index (also called staging area) is a binary file at .git/index. It's a snapshot of your next commit.
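Think of it as a draft: you keep revising it with git add until you're happy with it, and git commit turns the draft into a permanent snapshot.
The index contains, for every tracked file, its path, the hash of its staged blob, and some metadata (permissions, size, timestamps).
Example:

    # Before git add
    Index: (empty or contains previous commit's state)

    # After: git add index.html
    Index:
    ├── index.html → blob 1a2b3c (staged, ready to commit)

    # After: git add about.html css/style.css
    Index:
    ├── index.html    → blob 1a2b3c
    ├── about.html    → blob 4d5e6f
    └── css/style.css → blob 9j0k1l
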
The index is Git's way of asking: "Are you sure you want to commit these specific changes?"
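The index is a binary file, so you can't just cat it, but git ls-files --stage prints its entries. A minimal sketch in a scratch repository (file name and location are illustrative):

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q

echo '<h1>Welcome</h1>' > index.html

# Nothing is staged yet, so the index has no entries.
before=$(git ls-files --stage)

git add index.html

# Each entry reads: <mode> <blob hash> <stage number> <path>
after=$(git ls-files --stage)
echo "$after"

# The hash recorded in the index is the blob hash of the file.
staged_hash=$(echo "$after" | awk '{print $2}')
file_hash=$(git hash-object index.html)
echo "staged: $staged_hash"
echo "file:   $file_hash"
```

The stage number is 0 for normal entries; other values only show up during merge conflicts.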
Let's trace exactly what happens when you run git add index.html.

    $ git add index.html

Git performs these steps:

    STEP 1: Read the file
      Git reads index.html from your working directory
      Content: "<h1>Welcome to my site</h1>"
            ↓
    STEP 2: Create a blob object
      Git compresses the content and generates a hash
      Hash: 1a2b3c4d...

      Git stores this as: .git/objects/1a/2b3c4d...
      (First 2 chars = directory, rest = filename)
            ↓
    STEP 3: Update the index
      Git updates .git/index with:
      - Filename: index.html
      - Blob hash: 1a2b3c4d...
      - Metadata: permissions, size, timestamps

Key takeaways: git add creates blob objects immediately (not during commit!), and they are stored in .git/objects/.
You can check this yourself:

    # After git add index.html
    # Find the blob object:
    git hash-object index.html
    # Output: 1a2b3c4d5e6f... (the blob hash)

    # Verify Git stored it:
    find .git/objects -type f
    # You'll see: .git/objects/1a/2b3c4d5e6f...

    # Read the blob content back (use the hash printed above):
    git cat-file -p 1a2b3c4d5e6f
    # Output: <h1>Welcome to my site</h1>

Mind blown? Git already saved your file content even before you commit!
Now let's see what happens when you run git commit -m "Add homepage".

    $ git commit -m "Add homepage"

Git performs these steps:

    STEP 1: Create tree objects
      Git looks at the index and creates tree objects
      representing your directory structure.
      For subdirectories, Git creates nested trees.

      Root tree (4d5e6a7):
      ├─ blob 1a2b3c  index.html
      ├─ blob 4d5e6f  about.html
      └─ tree 7g8h9i  css/

      css/ tree (7g8h9i):
      └─ blob 9j0k1l  style.css
            ↓
    STEP 2: Create commit object
      Git creates a commit object containing:

      tree 4d5e6a7            ← Root tree
      parent 7b9e1f3          ← Previous commit
      author Your Name <you@email>
      committer Your Name <you@email>
      timestamp 1735567921

      Add homepage            ← Your message

      Hash: a3f8b2c9d...
            ↓
    STEP 3: Update the branch reference
      Git updates .git/refs/heads/main to point to new commit

      Before: .git/refs/heads/main → 7b9e1f3 (old commit)
      After:  .git/refs/heads/main → a3f8b2c (new commit)
            ↓
    STEP 4: HEAD follows the branch
      HEAD points to your current branch, so the file itself
      does not change, but it now resolves to the new commit:

      .git/HEAD → ref: refs/heads/main → a3f8b2c

Result: a new commit object (a3f8b2c), new tree objects (4d5e6a7 and 7g8h9i), the blobs that already existed (created by git add), and a main branch that now points to the new commit.
Let's visualize the entire add → commit process:

    COMPLETE GIT ADD → COMMIT FLOW

    INITIAL STATE
      Working Directory : index.html (modified)
      Index             : (empty)
      Repository        : Commit C1

    AFTER: git add index.html
      Working Directory : index.html
      Index             : index.html → blob 1a2b
      Repository        : Commit C1
      .git/objects/     : 1a/2b3c...   (new blob)

    AFTER: git commit -m "Update homepage"
      Working Directory : index.html
      Index             : index.html → blob 1a2b
      Repository        : Commit C1 ← Commit C2 (new!) ◀── HEAD
      .git/objects/     : 1a/2b3c...   (blob)
                          4d/5e6a...   (tree)
                          a3/f8b2...   (commit)

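The commit steps described above can be reproduced by hand with Git's low-level "plumbing" commands. This is only a sketch of what git commit does for a first commit (no parent), in a scratch repository with a placeholder identity. The real command also handles hooks, the reflog message, and more.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo '<h1>Welcome</h1>' > index.html
git add index.html                      # blob is created, index is updated

# Step 1: turn the current index into a tree object.
tree=$(git write-tree)

# Step 2: create a commit object pointing at that tree.
# The message is read from standard input.
commit=$(echo "Add homepage" | git commit-tree "$tree")

# Step 3: make the branch that HEAD names point at the new commit.
git update-ref HEAD "$commit"

git log --oneline
```

After this, git log and git status behave exactly as if you had run git commit -m "Add homepage".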
This is where Git's design really shines. Let me show you how Git detects changes with extreme efficiency.
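When you run git status, Git doesn't need to compare file contents line by line to find out which files changed. Instead, it compares hashes. git diff then computes the line-level differences only for the files whose hashes differ.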
Process:

    $ git status

What Git does:

    1. Get committed version hash
       Git looks at HEAD commit's tree
       index.html → blob 1a2b3c
            ↓
    2. Get working directory hash
       Git hashes current index.html content
       Result: 4d5e6f
            ↓
    3. Compare hashes
       1a2b3c ≠ 4d5e6f
            ↓
       File has changed!

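Why this is brilliant: comparing two short hashes is far cheaper than comparing two files byte by byte, and if the hash of a whole directory's tree is unchanged, Git knows nothing inside it changed either.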
Git uses content-addressable storage - meaning content is stored and retrieved based on its hash.
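Traditional file systems:

    Filename     → Location on disk → Content
    "index.html" → /path/to/file    → <h1>Hello</h1>

Git's approach:

    Content        → Hash     → Storage location
    <h1>Hello</h1> → 1a2b3c4d → .git/objects/1a/2b3c4d

Benefits: identical content is stored only once (deduplication), corrupted or tampered content is detectable because it no longer matches its hash (integrity), and checking whether two files are identical only requires comparing their hashes (speed).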
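You can use Git as a bare key-value store to see content-addressable storage in action. The sketch below stores a string, derives the object's path from its hash, and reads the content back by hash alone. It assumes a fresh scratch repository, where new objects are stored "loose" (one file each).

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store content directly in the object database; the hash is the key.
hash=$(printf '<h1>Hello</h1>\n' | git hash-object -w --stdin)
echo "key: $hash"

# First 2 characters = directory, remaining 38 = filename.
object_path=".git/objects/$(echo "$hash" | cut -c1-2)/$(echo "$hash" | cut -c3-)"
ls -l "$object_path"

# Retrieve the content using nothing but the hash.
content=$(git cat-file -p "$hash")
echo "value: $content"
```

No filename was ever involved: the content alone decided where the object lives.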
Git actually performs three different comparisons:

    ┌─────────────────┐
    │ Working         │ ◀─── Compare 1: Modified?
    │ Directory       │      (Working vs Index)
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ Index           │ ◀─── Compare 2: Staged?
    │ (Staging Area)  │      (Index vs Repository)
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ Repository      │ ◀─── Compare 3: Committed?
    │ (HEAD commit)   │      (Current vs Previous)
    └─────────────────┘

Example:

    # Scenario
    1. Committed version: "Hello"        (blob: abc123)
    2. Staged version:    "Hello World"  (blob: def456)
    3. Working version:   "Hello World!" (blob: ghi789)

    # git status output
    Changes to be committed:
      modified: index.html   ← Index ≠ Repository

    Changes not staged for commit:
      modified: index.html   ← Working ≠ Index

Git compares hashes at each level to show you exactly what's different.
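Here is a sketch that recreates the three-version scenario and prints the hash Git sees at each level. The rev-parse forms HEAD:path (committed blob) and :path (staged blob) are standard Git revision syntax; the repository and identity are throwaway placeholders.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'Hello\n' > index.html
git add index.html
git commit -q -m "Committed version"

printf 'Hello World\n' > index.html
git add index.html                      # staged version

printf 'Hello World!\n' > index.html    # working version, not staged

committed=$(git rev-parse HEAD:index.html)   # blob in HEAD's tree
staged=$(git rev-parse :index.html)          # blob recorded in the index
working=$(git hash-object index.html)        # hash of the file on disk

echo "committed: $committed"
echo "staged:    $staged"
echo "working:   $working"

# Short status: first column = index vs HEAD, second = working vs index.
status=$(git status --porcelain)
echo "$status"
```

The status line starts with MM: modified in the index relative to HEAD, and modified again in the working directory relative to the index.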
Here's something that might blow your mind: branches in Git are just files containing a commit hash.
A branch is a lightweight movable pointer to a commit.
Example:

    # The main branch is literally just a file:
    $ cat .git/refs/heads/main
    a3f8b2c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5

    # That's it! Just a commit hash.

Creating a branch:

    $ git branch feature-login

    # Git creates: .git/refs/heads/feature-login
    # Content: (same hash as main)
    a3f8b2c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5

    Commit History:
    ───────────────
    C1 ← C2 ← C3 ← C4 ← C5
                        ↑
                        └─ main (points to C5)
                        └─ feature-login (also points to C5)

    .git/refs/heads/main:          a3f8b2c (C5's hash)
    .git/refs/heads/feature-login: a3f8b2c (C5's hash)
    .git/HEAD:                     ref: refs/heads/main

When you make a commit on feature-login:

    C1 ← C2 ← C3 ← C4 ← C5 ← C6
                        ↑    ↑
                        │    └─ feature-login (moved to C6)
                        └────── main (still at C5)

    .git/refs/heads/feature-login: 7g8h9i0 (C6's hash) ← Updated!

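Why this is amazing: creating a branch writes one tiny file instead of copying your project, so branching is instant and costs almost no disk space.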
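To convince yourself that a branch really is just a file, you can create one without using git branch at all. This sketch assumes Git's default file-based ref storage and a SHA-1 repository. In day-to-day work you would use git branch or git update-ref rather than writing into .git by hand.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'hello' > file.txt
git add file.txt
git commit -q -m "First commit"

head_hash=$(git rev-parse HEAD)
current=$(git symbolic-ref --short HEAD)     # main or master, depending on your Git

# Size of the branch file Git wrote: 40 hex characters plus a newline.
file_size=$(( $(wc -c < ".git/refs/heads/$current") ))
echo "branch file size: $file_size bytes"

# Create a branch by hand: write the commit hash into a new file.
echo "$head_hash" > .git/refs/heads/feature-login

git branch --list
branch_hash=$(git rev-parse feature-login)
echo "feature-login points to: $branch_hash"
```

Git lists feature-login as a perfectly normal branch, even though no Git command created it.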
HEAD is a special pointer that tells Git "which commit am I currently on?"

    $ cat .git/HEAD
    ref: refs/heads/main

    # HEAD points to the main branch
    # main points to commit a3f8b2c
    # So you're on commit a3f8b2c

Detached HEAD (when HEAD points directly to a commit):

    $ git checkout a3f8b2c
    # HEAD is now at a3f8b2c (detached)

    $ cat .git/HEAD
    a3f8b2c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5
    # HEAD points directly to a commit, not a branch

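A quick way to watch HEAD switch between the two forms, sketched in a scratch repository with a placeholder identity:

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'hello' > file.txt
git add file.txt
git commit -q -m "First commit"

# On a branch: HEAD is a symbolic reference to that branch.
attached=$(cat .git/HEAD)
echo "attached: $attached"

# Detach HEAD at the current commit.
git checkout -q --detach HEAD

# Detached: HEAD now holds a raw commit hash.
detached=$(cat .git/HEAD)
echo "detached: $detached"
```

Commits made while detached are not on any branch, so create a branch at that point if you want to keep them easy to find.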
Let's build a complete mental model of how everything works together.

    YOUR PROJECT

    WORKING DIRECTORY (files you see and edit)
        my-project/
        ├── index.html
        ├── about.html
        └── css/style.css
            │
            │  git add
            ▼
    INDEX (Staging Area), stored in .git/index
        Staged files:
        ├─ index.html     → blob 1a2b3c
        ├─ about.html     → blob 4d5e6f
        └─ css/style.css  → blob 9j0k1l
            │
            │  git commit
            ▼
    REPOSITORY (.git folder)
        COMMITS:
          C1 ← C2 ← C3
                    ↑
                    └─ main
        OBJECTS:
        ├─ Commits (C1, C2, C3)
        ├─ Trees (directory structures)
        └─ Blobs (file contents)
        REFS:
        ├─ heads/main → C3
        └─ HEAD → refs/heads/main

    USER ACTIONS               WHAT GIT DOES INTERNALLY
    ─────────────────          ──────────────────────────────
    1. Edit index.html         (Changes only in working directory)

    2. git add index.html  ──▶ • Hash content → 1a2b3c
                               • Create blob in objects/1a/2b3c
                               • Update .git/index

    3. git commit          ──▶ • Create tree object (4d5e6a)
                               • Create commit object (a3f8b2)
                               • Update refs/heads/main
                               • HEAD (via main) now resolves to the new commit

    4. git log             ──▶ • Read commit chain from HEAD
                               • Display commit metadata

    5. git status          ──▶ • Compare working vs index (hashes)
                               • Compare index vs HEAD (hashes)
                               • Report differences

Let's do a hands-on exercise to see Git's internals in action. The commands for both exercises are collected at the end of this post.
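What you learned: git add created a blob immediately, git commit created a tree and a commit object, every object is stored under .git/objects/, and you can inspect any of them with git cat-file -p <hash>.
The second exercise shows deduplication with two files that have identical content.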
Magic: Git stored the content only once, even though it appears in two different files!
Git uses hashes everywhere, and this provides incredible data integrity.
Scenario: Someone tries to modify an old commit

    Original commit C2:
    ├── tree: abc123
    ├── parent: xyz789
    ├── author: Alice
    ├── message: "Fix bug"
    └── hash: def456 (calculated from all above)

    Tampered commit C2:
    ├── tree: abc123
    ├── parent: xyz789
    ├── author: Bob ← Changed!
    ├── message: "Fix bug"
    └── hash: ???

    When Git recalculates the hash:
    Hash ≠ def456 → Corruption detected!

Result: Any modification to a commit changes its hash, which cascades up the chain, making it obvious something was tampered with.
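The Alice/Bob scenario can be simulated with git commit-tree, pinning every field except the author name. The names, emails, and the fixed timestamp below are arbitrary values chosen for the sketch. The environment variables themselves are the standard ones Git reads.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q

echo 'fix' > bug.txt
git add bug.txt
tree=$(git write-tree)

# Pin everything except the author name, so it is the only difference.
export GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Committer"
export GIT_COMMITTER_EMAIL="committer@example.com"
export GIT_AUTHOR_DATE="1735689600 +0000"
export GIT_COMMITTER_DATE="1735689600 +0000"

original=$(GIT_AUTHOR_NAME="Alice" git commit-tree "$tree" -m "Fix bug")
repeat=$(GIT_AUTHOR_NAME="Alice" git commit-tree "$tree" -m "Fix bug")
tampered=$(GIT_AUTHOR_NAME="Bob" git commit-tree "$tree" -m "Fix bug")

echo "Alice:       $original"
echo "Alice again: $repeat"
echo "Bob:         $tampered"
```

Identical inputs give an identical hash, while changing a single field produces a completely different one. You can also run git fsck in any repository to have Git re-check that stored objects still match their hashes.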
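Git's commit history is a Merkle tree (a hash tree, or more precisely a Merkle DAG): every object's hash covers the hashes of the objects it points to.

    Commit C3 (hash includes ↓)
    │
    ├─── tree T3 (hash includes ↓)
    │    ├─── blob B1
    │    └─── blob B2
    │
    └─── parent: C2 (hash includes ↓)
         │
         ├─── tree T2
         └─── parent: C1
              ├─── tree T1
              └─── parent: null

Why this matters: a single commit hash pins down the entire history behind it, so nobody can alter a file, a tree, or an earlier commit without changing every hash that depends on it.
This is the same hash-linking idea that blockchains are built on!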
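You can watch a change ripple upward through the hashes. In this sketch (scratch repository, placeholder identity, file names borrowed from the earlier website example) only css/style.css is edited between two commits.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir css
echo '<h1>Welcome</h1>' > index.html
echo 'h1 { color: blue; }' > css/style.css
git add .
git commit -q -m "Initial website structure"

old_root=$(git rev-parse 'HEAD^{tree}')    # root tree
old_css=$(git rev-parse HEAD:css)          # css/ subtree
old_index=$(git rev-parse HEAD:index.html) # untouched blob

echo 'h1 { color: red; }' > css/style.css
git add css/style.css
git commit -q -m "Change heading color"

new_root=$(git rev-parse 'HEAD^{tree}')
new_css=$(git rev-parse HEAD:css)
new_index=$(git rev-parse HEAD:index.html)

echo "root tree:  $old_root -> $new_root"
echo "css tree:   $old_css -> $new_css"
echo "index.html: $old_index -> $new_index"
```

The css tree and the root tree both get new hashes because something beneath them changed, while index.html keeps its blob and is not stored again.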
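Reality: Git stores complete snapshots, not diffs.
When you commit, Git records the full content of every file (as blobs), not just what changed. However, Git is smart about storage: a file that hasn't changed simply reuses the blob it already has, objects are compressed (and later packed together), and diffs are computed on demand when you run git diff or git log -p.
Reality: Branches are just 41-byte files containing a commit hash (40 hexadecimal characters plus a newline).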
Creating a branch doesn't copy any files. It just creates a new pointer. This is why branching in Git is instant and uses negligible space.
Reality: The index gives you precise control over commits.
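The staging area lets you commit only some of the files you've changed, stage just part of a file (git add -p), and review exactly what will go into a commit before you make it.
Reality: Git uses timestamps and hashes for lightning-fast comparisons.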
Git first checks file metadata (size, modification time). Only if that changed does it hash the content. And even then, comparing hashes is instant.
Let me give you one final, complete visualization:

    THE .GIT FOLDER (Git's Complete Database)

    .git/
    ├── HEAD ──────────────────▶ ref: refs/heads/main
    │                            "I'm on the main branch"
    ├── index ─────────────────▶ Binary file tracking staged changes
    │                            (The staging area)
    ├── config ────────────────▶ Repository settings
    │                            (user name, remotes, etc.)
    ├── objects/ ──────────────▶ THE HEART: All your data
    │   ├── 1a/
    │   │   └── 2b3c4d... ─────▶ Blob (file content)
    │   ├── 4d/
    │   │   └── 5e6a7b... ─────▶ Tree (directory)
    │   ├── a3/
    │   │   └── f8b2c9... ─────▶ Commit (snapshot)
    │   └── pack/ ─────────────▶ Compressed objects (Git optimizes later)
    ├── refs/ ─────────────────▶ All your branches and tags
    │   ├── heads/
    │   │   ├── main ──────────▶ a3f8b2c9... (commit hash)
    │   │   └── feature-login ─▶ 7g8h9i0j... (commit hash)
    │   └── tags/
    │       └── v1.0 ──────────▶ 5k6l7m8n... (commit hash)
    └── logs/ ─────────────────▶ History of what branches pointed where
        ├── HEAD ──────────────▶ Reflog (your local history)
        └── refs/heads/main ───▶ main branch's history

You might be wondering: "Why do I need to know all this? Can't I just use Git without understanding internals?"
Fair question! Here's why this knowledge is valuable:
When something goes wrong, you'll know where to look. Lost a commit? It is usually still recoverable through git reflog (it's in .git/logs/).
Understanding internals makes advanced commands make sense: git rebase rewrites commits (creates new hashes), git cherry-pick copies commits (creates a new commit that applies the same change, with a new hash), and git gc repacks objects and cleans up loose objects.
Instead of memorizing commands, you understand why things work.
You're not afraid of Git anymore because you know what it's doing. You can experiment, knowing the .git folder has your back.
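A small experiment shows both points at once: rewriting a commit creates a new hash, and the old commit is still recoverable. The sketch uses git commit --amend as the simplest form of history rewriting, in a scratch repository with a placeholder identity.

```bash
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'hello' > file.txt
git add file.txt
git commit -q -m "Fix typo in mesage"     # deliberate typo to amend away
old=$(git rev-parse HEAD)

# "Editing" the commit really creates a brand-new commit object.
git commit -q --amend -m "Fix typo in message"
new=$(git rev-parse HEAD)

echo "before amend: $old"
echo "after amend:  $new"

# The old commit left the branch, but the reflog still remembers it.
git reflog
previous=$(git rev-parse 'HEAD@{1}')
echo "HEAD@{1}:     $previous"
```

To get the old version back you could run git reset --hard with that hash, which is why "lost" commits are rarely lost for good.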
Let's recap the essential concepts:
Everything is an object: Blobs (file content), Trees (directories), Commits (snapshots)
Hashes are everywhere: Git uses SHA-1 hashes as unique identifiers and for integrity
The .git folder is Git: Your entire project history lives in this one folder
Commits are snapshots: Git saves complete project states, not just diffs
Branches are pointers: Lightweight, fast, and use almost no disk space
Content-addressable storage: Same content = same hash = stored once
Three-stage workflow: Working Directory → Index → Repository

    git add:
      1. Hash file content
      2. Create blob in objects/
      3. Update index

    git commit:
      1. Create tree objects from index
      2. Create commit object pointing to tree
      3. Update branch reference
      4. HEAD follows the branch to the new commit

    Commit ────▶ Tree ────▶ Blobs
       │           │
       │           └─────▶ Trees (subdirectories)
       │
       └─ Points to parent commit(s)
          └─ Creates history chain

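Now that you understand Git's internals, you're ready for more advanced topics, such as how git push and git pull work with Git's object model.
Git is beautifully elegant once you understand it. Everything revolves around objects (blobs, trees, and commits), the hashes that name them, and the references (branches, tags, and HEAD) that point to them.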
The .git folder isn't mysterious anymore - it's a well-organized database of objects and references. When you run git add, you're creating blobs. When you run git commit, you're creating trees and commit objects. When you create a branch, you're just writing a hash to a file.
Git's genius is hiding this complexity behind simple commands while giving you the power of a cryptographically secure, distributed version control system.
Next time you run git commit, you'll know exactly what's happening behind the scenes. And that knowledge? That's power.
Now go explore your .git folder with confidence! 🚀
Want to dive deeper? Try these exercises: use git cat-file -p to explore every object in your repository, open the branch files in .git/refs/heads/, and run git reflog to see where HEAD has pointed over time.
Questions or discoveries? I'd love to hear what you found in the comments!
Exercise 1: follow one file from git add to git commit

    # 1. Create a new test repository
    mkdir git-internals-test
    cd git-internals-test
    git init

    # 2. Create a simple file
    echo "Hello Git Internals" > test.txt

    # 3. Add it (creates a blob!)
    git add test.txt

    # 4. Find the blob hash and keep it in a variable
    blob=$(git hash-object test.txt)
    echo "$blob"
    # Output: a 40-character hash

    # 5. Look inside the objects directory
    find .git/objects -type f
    # Output: .git/objects/<first 2 chars>/<remaining 38 chars>

    # 6. Read the blob content
    git cat-file -p "$blob"
    # Output: Hello Git Internals

    # 7. See what type of object it is
    git cat-file -t "$blob"
    # Output: blob

    # 8. Make a commit
    git commit -m "Add test file"
    # Output: [main (root-commit) a3f8b2c] Add test file   (your hash will differ)

    # 9. Find the commit hash
    git log --oneline
    # Output: a3f8b2c Add test file

    # 10. Inspect the commit object
    git cat-file -p HEAD
    # Output:
    # tree 4d5e6a7812...
    # author Your Name <email> <timestamp>
    # committer Your Name <email> <timestamp>
    #
    # Add test file

    # 11. Inspect the tree that the commit points to
    git cat-file -p 'HEAD^{tree}'
    # Output:
    # 100644 blob <blob hash>    test.txt

    # 12. See all objects
    find .git/objects -type f
    # You'll see three objects: the blob, the tree, and the commit

Exercise 2: identical content is stored once

    # 1. Create two files with identical content
    echo "Same content" > file1.txt
    echo "Same content" > file2.txt

    # 2. Add both
    git add file1.txt file2.txt

    # 3. Get hashes
    git hash-object file1.txt
    git hash-object file2.txt
    # Both print the same hash, e.g. 5e40c08... (what matters is that they match)

    # 4. Find objects
    find .git/objects -type f
    # Only ONE new blob for both files!

    # 5. Commit
    git commit -m "Two files, one blob"

    # 6. Inspect tree
    git cat-file -p 'HEAD^{tree}'
    # Output (alongside test.txt from exercise 1):
    # 100644 blob 5e40c08...    file1.txt
    # 100644 blob 5e40c08...    file2.txt
    #             ↑
    #             └─ Same hash on both lines!