Let's Git to Know Version Control
Git is a distributed version-control system for tracking changes in source code during software development - Wikipedia
What is Git
The book definition aside, what really is Git?
It’s a simple tool for taking snapshots of your work, and switching between those snapshots is nearly instant. A snapshot here means the state of your project at a given time. The purpose of using Git is to create many snapshots as you work, either to save the progress you’ve made so far or to mark a checkpoint that you can return to in the future
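To make the idea of a snapshot concrete, here is a toy sketch in Python. It is not how Git is implemented; the SnapshotStore class and its save & restore methods are made-up names used only to illustrate saving a state and getting back to it later:

```python
import copy


class SnapshotStore:
    """Toy illustration only: keeps a full copy of the state for each snapshot."""

    def __init__(self):
        self._snapshots = []  # list of (label, state) pairs, oldest first

    def save(self, label, state):
        # Copy the state so that later edits do not change the saved snapshot.
        self._snapshots.append((label, copy.deepcopy(state)))
        return len(self._snapshots) - 1  # index used to find the snapshot again

    def restore(self, index):
        _label, state = self._snapshots[index]
        return copy.deepcopy(state)


story = {"title": "Draft", "chapters": ["Once upon a time"]}
store = SnapshotStore()

first = store.save("first draft", story)
story["chapters"].append("A dragon appears")
second = store.save("added a dragon", story)

# Changed your mind about the dragon? Go back to the first checkpoint.
story = store.restore(first)
```

Both snapshots stay in the store, so you can still jump forward to the dragon version with store.restore(second).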
Okay, now what’s a distributed version-control system?
Git is a DVCS (Distributed Version Control System), which is a fancy way of saying that every machine with a copy of your work also stores the full history of snapshots. This is in contrast to a Centralized Version Control System, which keeps the snapshot history on a central server (single source of truth) that’s accessed by clients on demand. A DVCS may seem like overkill with data duplicated across every machine, but in practice the duplication costs little given how cheap storage is, and most operations are fast because they run locally. Being able to work offline is arguably the biggest advantage of using a DVCS. A centralized VCS needs a network connection to the server for most operations, as the history is stored only there
Why do I need Git
To explain this, let’s dive into a simple scenario where you plan to write a story. You’d like to share it with your friends to get early feedback before publishing it
- Create a new Word document to roughly jot down a couple of ideas & characters for the story
- Work towards picking good ideas & discarding the rest as you go
- Refine the storyline to see it all come together
- You may find a few characters that don’t fit the story, so you change their roles
- Make some changes to the storyline with modified characters
- This process repeats several times before you arrive at a final version
- You then share it with your friends for their feedback
- They might ask some questions & suggest a few changes
- You go back to make further edits based on the feedback
- And the story is finally complete
Considering the effort spent to reach the final document, ask yourself:
- Can you go back to a version of the document as it was on a specific date?
- Can you justify the reason behind the changes made to a particular character or storyline?
- Would it be easy to share an editable copy with each of your friends & consolidate all their edits into a single one?
If your answer to any of the above questions is NO, then you need Git. If the Word document had been version controlled using Git, you would’ve been able to do all of the above without a second thought
Git gives you the freedom to commit your changes (snapshots) to keep a well-defined history of modifications made to the document. Git enables you to add checkpoints, conduct experiments, take different routes, travel back in time to a specific version, collaborate with ease, work offline & much more. The wide adoption of Git across the software industry is also a major reason to learn it
How to speak Git
A few terms to be aware of when speaking Git
- Repository - Directory containing project files with a hidden .git directory at its root
- Branch - A line of work that diverges from the main project; branches can be used for features, bug fixes & experiments
- Master - Conventional name of the primary (default) branch; many newer repositories call it main instead
- Merge - Bringing over changes from one branch to another
- Working Area - Files in the repository as they currently are on disk, including modifications not yet added to staging
- Staging Area - Area containing files ready to be committed
- Commit - Saved snapshot of the work with a unique hash to identify it
- Remote - A copy of the repository on a shared system that everyone on the team can access to exchange changes
- Clone - Get an entire copy of a repository from a remote for the first time
- Pull - Get changes made by someone else from a remote into the local repository
- Push - Send local changes to the rest of the team by pushing them to a remote
- Blob - Git object that stores the raw contents of a file
- Tree - Git object that represents a directory; it maps file names to blobs & subdirectory names to other trees, forming a parent & child hierarchy
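To see how the working area, staging area, commits & branches from the list above relate to each other, here is a toy model in Python. Everything in it (the ToyRepo class, its method names, the way ids are computed) is a simplification invented for illustration and does not match Git’s real storage format or commands:

```python
import hashlib


class ToyRepo:
    """Toy model of the vocabulary above; not Git's real data model."""

    def __init__(self):
        self.working = {}                 # path -> content, edited freely
        self.staging = {}                 # path -> content, ready to be committed
        self.commits = {}                 # commit id -> commit record
        self.branches = {"master": None}  # branch name -> latest commit id
        self.current = "master"

    def add(self, path):
        # Copy a file from the working area into the staging area.
        self.staging[path] = self.working[path]

    def commit(self, message):
        parent = self.branches[self.current]
        # Start from the parent's files, then apply whatever was staged.
        files = dict(self.commits[parent]["files"]) if parent else {}
        files.update(self.staging)
        # Assumption: the id is just a SHA-1 over a text rendering of the commit.
        payload = repr((message, parent, sorted(files.items()))).encode()
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits[commit_id] = {"message": message, "parent": parent, "files": files}
        self.branches[self.current] = commit_id
        self.staging = {}
        return commit_id

    def create_and_switch_branch(self, name):
        # A new branch starts out pointing at the current commit.
        self.branches[name] = self.branches[self.current]
        self.current = name


repo = ToyRepo()
repo.working["story.txt"] = "Once upon a time"
repo.add("story.txt")
c1 = repo.commit("Start the story")

repo.create_and_switch_branch("experiment")
repo.working["story.txt"] = "Once upon a time, a dragon"
repo.add("story.txt")
c2 = repo.commit("Try a dragon")
```

After this runs, master still points at the first commit while experiment points at the second, whose parent is the first. That is the sense in which a branch diverges from the main project.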
How Git Works
Git is a key-value data store: the content you save is the value & a hash of that content is the key. Everything is saved in a compressed binary format as Git objects, which are stored in a hidden .git directory within the repository. When you stage & commit, Git stores the file contents as blobs, records the directory layout in trees that point to those blobs, and creates a commit object that references the top-level tree
Note: Git doesn’t care about the type of file added to the store, so you can use Git to track text, images, video or any other file
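As a rough sketch of the key-value idea, the Python below computes the id Git gives to a blob and keeps the compressed object in a dictionary. It assumes the default SHA-1 object format (repositories created with SHA-256 produce different ids); the object_store dictionary and the function names are made up for this example, and a real repository keeps these objects as files under .git/objects instead:

```python
import hashlib
import zlib

# Stand-in for .git/objects: key (object id) -> compressed object bytes.
object_store = {}


def git_blob_id(content: bytes) -> str:
    # A blob is hashed as: "blob <size in bytes>", a NUL byte, then the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def store_blob(content: bytes) -> str:
    key = git_blob_id(content)
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    object_store[key] = zlib.compress(header + content)
    return key


def read_blob(key: str) -> bytes:
    raw = zlib.decompress(object_store[key])
    # Split at the first NUL byte: header before it, file content after it.
    _header, _sep, content = raw.partition(b"\0")
    return content


key = store_blob(b"hello world\n")
print(key)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The content is treated as plain bytes, which is why the type of file makes no difference. Identical content always produces the same key, so the same file saved twice is stored only once.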