The Global Information Tracker (GIT) Blog Part-1
9th April 2024
Table of contents
Part-0 Introduction : This part covers my motivation for writing this two-part blog series on Git
Part-1 Basics : This part provides an introduction to Git, gets you acquainted with the basic commands and shows how to undo changes
Part-2 Branches : This part builds an intuition for what branches are, the different types of branches and the various kinds of merges that can take place
Part-0 Introduction
BackStory
During my first year of undergrad I came across a distributed version control system (DVCS) called Git and the developer platform GitHub.
At that time I was looking into it because I had heard about "Hacktoberfest" taking place in my university.
Since Java was one of the many accepted languages and I had a basic understanding of it from my schooling, I went ahead and looked into the event.
Back in 2020 Hacktoberfest was giving free T-shirts to participants who completed the challenge, and I was intrigued by it.
To complete the challenge one was required to contribute to any public repository with the topic Hacktoberfest. To be specific, one had
to make around 4-5 successful pull requests, i.e. their changes had to be merged into the main repository
after registering on the Hacktoberfest website (don’t worry if you don’t understand what I said here).
Given that there was only a week left for October to end, I learnt just the basics of Git, such as how to pull and how to submit pull requests on GitHub.
My knowledge of Git stayed this way until my third year of undergrad, as I was able to do the majority of my work with this basic knowledge.
However, recently I have developed a keen interest in learning Git thoroughly. Why the sudden interest, you wonder? Simply put, I aim to leverage Git more effectively for my future projects.
In this blog series, I document my journey of unraveling the complexities of Git. The blog series is an amalgamation of my understanding from the Git documentation,
the book “Ry’s Git Tutorial” and my own learnings along the way during my undergrad. Another reason for writing this series was to have a place I can come back to from time to time to revise things as and when required.
Part-1 The Basics
Have you ever found yourself working on a project and wanting to experiment with the code, perhaps to improve it or for another purpose, but held back because you were afraid that the code might break?
If the answer to that question is yes, then Git solves exactly this problem: when you are stuck with code that doesn’t work, you can go back to a working version of the project without constantly undoing edits or saving copies of the stable files somewhere.
The purpose of Git is to be a version control system (VCS), i.e. you get to track all the changes you make in your codebase. What's the point of tracking? If you track your changes, you can go back to a working version when things fail, or make branches at multiple places,
experiment and develop in parallel in peace. Think of Git as a time machine for your code, where you can not only “revert” to the past but also “merge” into the future.
Git is a distributed version control system (DVCS) that was developed by Linus Torvalds back in 2005. One of the main reasons for developing it was that with Git everyone could work on their own local copy of the codebase
and leave the merging and sharing of code to a later point, i.e. after they are done building. The advantage of this is that multiple people can work on the codebase in parallel. Now that the context is set,
let's dive into the world of Git and the plethora of Git commands that help us manipulate files.
When you start working on a project you would ideally want the .git directory to be present in it. Think of the .git directory as the watchman that tracks all the changes to the data in your repository (repository is another name for your codebase; it is also called a repo).
To get the .git directory you initialise your repository by typing git init. To check which changes Git has noticed, we run git status, which lists modified, staged and untracked files. While you are coding your project there are certain things that you would want to track and certain things that you don’t want to track.
This could be for security reasons (for example an API key) or because the content is not necessary (for example some executable files; a general rule of thumb is to omit files that can be generated from the tracked files). The practice of segregating files into tracked and untracked allows us to keep the repo lean and clean.
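To make the init and status steps concrete, here is a minimal sketch you can run in a throwaway directory. The file names app.py and .env are made up for illustration, and the .gitignore file is my assumption about how you would usually tell Git which files to leave untracked; the text above does not name a mechanism.

```shell
# Work in a throwaway directory so nothing real is touched
cd "$(mktemp -d)"

git init -q                      # creates the .git directory
ls -d .git                       # confirm that it exists

echo 'print("hello")' > app.py   # hypothetical source file
echo 'API_KEY=secret' > .env     # hypothetical file we never want to track

git status --short               # both files are listed as untracked (??)

# Assumption: list the files Git should ignore in .gitignore
echo '.env' > .gitignore
git status --short               # .env is no longer listed
```

After the last command, .env stays on disk but Git stops offering to track it.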
In case this is your first time using Git, you have to tell it who the author of the code is by running the
git config commands to set the name and email respectively:
git config --global user.name <username> and
git config --global user.email <email>.
One can also configure this identity for a single repository by changing the flag to
--local.
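As a small sketch of the configuration step, the commands below set an identity for one throwaway repository only, so your real global settings are not touched. The name and email are placeholders I made up.

```shell
cd "$(mktemp -d)" && git init -q

# --local writes to this repository's .git/config only;
# use --global instead to set the identity for every repository you own.
git config --local user.name "Ada Example"        # hypothetical name
git config --local user.email "ada@example.com"   # hypothetical address

git config user.name     # prints the name Git will use in this repo
git config user.email    # prints the email Git will use in this repo
```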
Tracking versions of files is a two-stage process. First we stage the files and second we commit them. The advantage of this two-step procedure is that it lets us make meaningful progress by adding and removing files in the staging phase, i.e. grouping related changes into a distinct snapshot/commit.
To stage a file/directory we use git add <file name> or git add <directory name>, and to commit the staged changes we run git commit. You can also stage multiple files by passing them as arguments to the git add command, or stage everything under the current directory with the git add . command.
While committing changes is a good practice, it is important to add meaningful messages to them describing what the commit has achieved. This is done by adding the flag -m "<commit message>" to git commit, which completes the command. Once you reach a point where you have performed multiple commits, one can view the history
of the various stages/versions of the project with the git log command. When you do this, you can see that each commit has a long unique string attached to it. This string helps us identify which commit we are talking about. Another perk is that these strings are hashes (SHA-1 by default) computed from the commit's content, so they act as a checksum: a commit cannot be corrupted without Git knowing about it.
How do I know when to commit? Well, there are two general rules. First, commit a snapshot for each significant addition to your project; it could be a new feature or just an upgrade. Second, don’t commit a snapshot if you can’t come up with a single, specific message for it.
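The stage-then-commit cycle described above can be sketched as follows. The file names, commit messages and identity are invented for the example; the point is that each commit groups one related change.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"        # hypothetical identity for the demo
git config user.email "ada@example.com"

echo "# Demo" > README.md                 # hypothetical files
echo 'print("hello")' > app.py

git add README.md                         # stage only the README
git commit -q -m "Add README"             # first snapshot

git add app.py                            # stage the script separately
git commit -q -m "Add hello-world script" # second snapshot

git log --oneline                         # one line per commit: short ID and message
```

Staging the two files separately is what lets each snapshot carry a single, specific message.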
Now that we have the snapshots in the history, in the event of a code break you can check out a working version. First we need to find which version we want to go back to by running git log --oneline; the --oneline flag improves the readability of the history by printing one commit per line.
Once you have the commit ID, we check out that particular safe working version by running git checkout <commit ID>.
Once you are done looking around, you can hop back onto the main branch with git checkout main. If going through the commit IDs seems cumbersome, one can tag the commit they are currently on with git tag -a <tag name> -m "<tag message>" and later use the tag name in place of the ID.
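Here is a rough sketch of visiting an old snapshot and tagging it. The file notes.txt, the tag name v1.0 and the way I look up the first commit's ID (git rev-list --max-parents=0) are my own choices for the demo; in practice you would copy the ID from git log --oneline.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "v1" > notes.txt && git add notes.txt && git commit -q -m "First version"
git branch -M main                           # make sure the branch is called main
echo "v2" > notes.txt && git commit -q -am "Second version"

first=$(git rev-list --max-parents=0 HEAD)   # ID of the very first commit

git checkout -q "$first"                     # step back to the old snapshot
cat notes.txt                                # prints: v1
git tag -a v1.0 -m "First stable version"    # tag the commit we are on

git checkout -q main                         # return to the latest commit
cat notes.txt                                # prints: v2
```

From now on git checkout v1.0 does the same job as git checkout followed by the long commit ID.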
If you made some bad decisions in a commit and want to undo it, one can do this by getting the ID of the commit to undo and then running git revert <commit ID>.
When this happens an interesting phenomenon takes place: instead of deleting the commit from the history, Git figures out how to undo the changes it contains, then tacks on another commit with the resulting content. This happens as Git was written with the vision to never lose history.
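A small sketch of this behaviour, with made-up file contents: after the revert the history has grown by one commit, while the file is back to its earlier state. I pass --no-edit so that Git accepts its default revert message instead of opening an editor.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "hello" > greeting.txt && git add greeting.txt && git commit -q -m "Add greeting"
echo "oops" >> greeting.txt && git commit -q -am "Add a line we regret"

bad=$(git rev-parse HEAD)        # ID of the commit we want to undo
git revert --no-edit "$bad"      # adds a new commit that undoes it

git log --oneline                # three commits: the bad one is still listed
cat greeting.txt                 # prints: hello
```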
Let's say you are on a particular commit and you write a new module/function, but the code breaks and you want to go back to a functioning version. Then git reset --hard helps to undo the changes that are not committed, i.e. don’t have a commit ID. When you run this, the content of all the tracked files changes back to the most recent commit.
git reset --hard leaves untracked files alone; if you also want to delete all the untracked files, you can run git clean -f. To summarise: if it is uncommitted changes in the working directory you want to discard, you run git reset --hard, but if it is a commit you want to undo, you run git revert <commit ID> with the ID of that commit.
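The difference between the two clean-up commands can be sketched like this. The files app.py and temp.log are hypothetical, and the git clean -n dry run is an extra step I added so you can see what would be deleted first.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "stable" > app.py && git add app.py && git commit -q -m "Stable version"

echo "broken experiment" > app.py   # uncommitted edit to a tracked file
echo "scratch" > temp.log           # brand-new untracked file

git reset -q --hard                 # tracked files go back to the last commit
cat app.py                          # prints: stable
ls                                  # temp.log is still there

git clean -n                        # dry run: only lists what would be removed
git clean -f                        # actually deletes the untracked files
```

Both commands throw work away for good, so the dry run is worth the extra second.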
Part-2 Branches
My guess is that the history in Git is something similar to a linked list, where each new commit is linked to the previous commit. Let’s see if this is true or not in part 3.
Another beautiful use case of Git is the ability to collaborate and work on projects in parallel. Much of the world of open source is built on the foundations of Git.
To understand this further let’s take a simple example: suppose you and two friends went to a hackathon and you want to develop a website, where two of you work on the backend while the third works on the frontend.
If you had a single codebase where both sides are developing, things can get messy real quick. Using Git one can make two separate branches in the repo named backend and frontend. This way everyone uses the same file structure and works in parallel on their respective branch.
Once you have tested the functionality of each side, you can merge both of these branches into the main branch. This example shows us how Git helps in increasing the productivity of a developer. Another great reason to create branches is that a branch provides you with an independent line of work.
This way you don’t have to copy all the files of the existing codebase if you want to experiment on it. If the experiment works out, you can always merge it into the main branch; otherwise just delete the entire branch and forget about it. An additional benefit is that all your experiments live in a single directory as multiple branches that don’t affect the stable working version.
If the above context seems overwhelming don’t worry, we’ll go through the concepts of branching and rebasing step by step. To view the existing branches one can use git branch.
If you have not made any branches till now, you will see that only the main branch is present when you run this command. The "*" next to a branch name marks the branch you are currently on.
Before we start making branches, we need to know that "HEAD" is Git’s way of telling us which snapshot/commit we are currently on. The rule of thumb that people generally follow is
“Create a new branch for each major addition in your project”, i.e. don’t create a branch if you cannot give it a specific name. To create a new branch, first check out the place where you want the branch to start with git checkout <place>, where <place> can be a branch name, a tag or a commit ID.
Then you run git branch <branch name> to create the new branch. You enter this branch with git checkout <the new branch name>. Now if you make some changes and make a new commit, a new node is added on top of the branch you are currently on, and HEAD automatically moves to the new commit.
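The branch-creation steps can be sketched as below. The branch name feature-login and the files are invented for the demo, and I rename the first branch to main with git branch -M so the example behaves the same whatever default name your Git uses.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "home page" > index.txt && git add index.txt && git commit -q -m "Add home page"
git branch -M main                 # make sure the starting branch is called main

git branch                         # lists the branches; * marks the current one
git branch feature-login           # new branch starting at the current commit
git checkout -q feature-login      # HEAD now points at feature-login

echo "login form" > login.txt && git add login.txt && git commit -q -m "Add login form"

git log --oneline main             # one commit: main has not moved
git log --oneline feature-login    # two commits: the new one lives only here
```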
One thing you need to remember is that when you create branches, the log history shown to you depends on the branch HEAD is currently on. For example, if you have two different branches, check out branch B1 and look at the history, and then check out branch B2
and look at the history, there is a high possibility that the two logs won't match, as branches offer us independent lines of development. Think of them as separate project folders. Once you have learnt how to create multiple branches, the next step is to merge two branches.
This is where it can get tricky very fast because of merge conflicts, which you will see later on. In order to merge branches you first check out the branch which you want to update and then run git merge <name of the branch whose changes you want to bring in>. In general the main branch is known to be the stable branch of the project.
Please note, there are different kinds of merges in Git. The most basic merge is known as a fast-forward merge, where Git simply moves the tip of the branch being updated forward to match the tip of the other branch; this is possible when the branch being updated has no new commits of its own, and after this merge both branches have the same history. Once this is done, for hygiene purposes, if you are not going to continue experimenting on a branch after merging it you can delete it with git branch -d <branch name>.
Here the -d flag tells Git to delete the branch. A key point to understand here is that with -d Git refuses to delete branches which have unmerged changes. However, if one is sure that they don’t need the branch, they can run git branch -D <branch name> to forcefully delete it.
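A fast-forward merge followed by the clean-up can be sketched like this, assuming a hypothetical feature-login branch that is one commit ahead of main while main itself has not changed.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "home page" > index.txt && git add index.txt && git commit -q -m "Add home page"
git branch -M main

git checkout -q -b feature-login   # create the branch and switch to it in one step
echo "login form" > login.txt && git add login.txt && git commit -q -m "Add login form"

git checkout -q main               # switch to the branch we want to update
git merge feature-login            # main has no new commits, so Git fast-forwards

git log --oneline                  # main now shows both commits, no merge commit
git branch -d feature-login        # safe delete: allowed because it is fully merged
```

Because main only moved forward, no extra merge commit appears in the log.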
Merging branches is not always this smooth. A merge conflict takes place when one tries to merge two branches that have edited the same content. Git doesn’t know how to combine the two changes, so it stops to ask us what to do. Think of a merge conflict like two threads messing up the shared variables in a critical section. The way to solve a merge conflict is to manually intervene and edit the file so it contains the changes we want. Git uses <<<<<<<, =======, and >>>>>>> as markers to show us the conflict, and these should be deleted once the changes are made, before we stage the file with git add and commit.
When we commit these files we don’t need to use the -m flag, as Git has already prepared a default merge message. For example, if you have the font style set to Arial in one branch and Times New Roman in the other, when you merge you need to choose which style you want.
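The font example can be played out in a throwaway repo. This is only a sketch: the file style.txt and the branch name serif are made up, and I resolve the conflict by overwriting the file rather than editing it by hand, which is what you would normally do in an editor.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Ada Example"; git config user.email "ada@example.com"

echo "font: Helvetica" > style.txt && git add style.txt && git commit -q -m "Add style"
git branch -M main

git checkout -q -b serif
echo "font: Times New Roman" > style.txt && git commit -q -am "Use Times New Roman"

git checkout -q main
echo "font: Arial" > style.txt && git commit -q -am "Use Arial"

# Both branches changed the same line, so the merge stops with a conflict
git merge serif || echo "merge stopped: resolve the conflict by hand"
cat style.txt                      # shows the <<<<<<<, ======= and >>>>>>> markers

echo "font: Arial" > style.txt     # resolve: keep Arial and drop the markers
git add style.txt                  # mark the conflict as resolved
git commit -q --no-edit            # accept the merge message Git prepared

git log --oneline --graph          # the graph shows the two lines joining
```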
To help you understand branches better, the most common types of branches are as follows:
1. Topic Branch: is used to work on a specific topic or task. It is typically created off the main line of development (such as main) and is used to isolate changes related to that topic. Once the work on the topic is completed, the changes in the topic branch can be merged back into the main branch.
2. Feature Branch: is similar to a topic branch but it is specifically used to develop a new feature for the project. It is created off the main branch and is used to work on the new feature in isolation from the main codebase. Once the feature is complete, the changes can be merged back into the main branch. This branch is generally maintained for a longer duration than topic branches.
3. HotFix Branch: is used to quickly address a critical issue in the codebase, such as a bug or security vulnerability. It is created off the main branch or sometimes a release branch and is used to make the necessary changes. Once the fix is complete, the hotfix branch is merged back into the main branch (and possibly other active branches) to apply the fix to the codebase.
To read the second part of the blog click here -->
Git Blog Part-2