Git


Git is probably the most popular version control software today. It helps programmers collaborate and tracks the changes made to the code over time.

Difference between Git and GitHub


Git was originally developed by Linus Torvalds to manage the development of the Linux kernel. GitHub is a web application built around Git that helps developers collaborate on software. Git and GitHub are commonly used even for short CS group projects in college. The source code for Anki Books is on GitHub; this is where you can get it if you want it. If you have Git installed, you can clone the repo with the git clone git@github.com:KyleRego/anki_books.git command (this SSH form of the URL requires an SSH key registered with your GitHub account).

The git command


The Git command line interface is the git command, commonly used inside a Bash shell. A shell is a program that provides an interface to the operating system (usually a Linux distribution) inside of a terminal emulator. I believe on Windows you can use Git Bash to get a version of that shell. To see common git commands, use the git --help command. The --help is a long option, and many command line programs take this option to display helpful documentation about the program, including how to get further help.

The git --help command


The output of the git --help command includes git init, which creates a new Git repository in the current directory. The output also lists the git clone command, which you could use to clone the Anki Books repository on GitHub (see GitHub docs for how to do that). A clone includes every version of the project all the way back to the initial commit.
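
As a minimal sketch of what git init does, the following creates a repository in a throwaway directory and checks that Git recognizes it. The temporary directory and the variable names are only for illustration.

```shell
# Work in a throwaway directory so no existing project is touched.
repo="$(mktemp -d)"
cd "$repo"

# Create a new, empty Git repository in the current directory.
git init -q

# The repository's data lives in the hidden .git directory.
ls .git

# Ask Git whether the current directory is now inside a working tree.
inside="$(git rev-parse --is-inside-work-tree)"
echo "inside a Git working tree: $inside"
```

The ls output should include an objects directory, which is where the object store discussed in the next section is kept.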

Introduction to the Git index and the Git object store


A Git repository has two main things: an index and an object store. When you clone a Git repository, you get a copy of the object store, but not the other repository's index; your clone gets a fresh index of its own. The index is a data structure private to a repository that tracks the changes being prepared, or "staged," for a new commit.
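
To make the two structures a little more concrete, here is a small sketch that stages one file and then looks at both the index and the object store. The file name notes.txt and the variable names are made up for the example.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

echo "hello" > notes.txt

# Before staging, the index has no entries.
before="$(git ls-files --stage | wc -l | tr -d ' ')"

# Staging stores the file's content in the object store
# and records the path in the index.
git add notes.txt

# Each index entry lists: mode, object ID, stage number, path.
git ls-files --stage

# Take the object ID from the index entry and read the object
# back out of the object store.
staged_id="$(git ls-files --stage | awk '{print $2}')"
object_type="$(git cat-file -t "$staged_id")"
content="$(git cat-file -p "$staged_id")"
echo "$object_type: $content"
```

Note that nothing has been committed yet: the staged content is already in the object store, and the index is what remembers that it belongs in the next commit.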

The four atomic data types of Git


Git has four atomic data types from which everything else in a repository is built: blobs, trees, commits, and tags.

The simplest atomic Git type is the blob, which represents the contents of a file (the file's name is stored in the tree that references the blob).

The tree is the atomic Git type that represents a directory. Since a directory (directory is an older way to refer to "folder" by the way) itself contains files and directories, a Git tree references blobs and other trees.

A commit is an atomic Git data type that represents metadata about a change introduced to the repository, which includes the author of the commit. Because the author of each commit is recorded, the git blame command can be used to see who last modified some code. At the beginning of the Git commit history is the root commit (commonly called the initial commit in the first commit message), the only commit that does not have a parent or previous commit. Every commit after the root commit points back to at least one previous commit. A commit can have multiple parents when it merges multiple branches together.

Note that while branches are central to how developers speak about Git, they are not one of the fundamental data types. The last fundamental Git data type is the tag, which might be used to label a commit.
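
The four types can be seen by asking Git for the type of an object, as in this rough sketch. The directory docs, the file readme.txt, the tag name v0.1, and the author identity are all placeholders. I am assuming an annotated tag here (the -a option), since that is the kind of tag that gets stored as its own object.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity so the commit and tag can be created anywhere.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

mkdir docs
echo "hello" > docs/readme.txt
git add .
git commit -q -m "Initial commit"

# An annotated tag (-a) is stored as an object of its own.
git tag -a v0.1 -m "First tag"

# Ask the object store for the type behind each name.
# The form <commit>:<path> names the object at that path in the commit.
blob_type="$(git cat-file -t HEAD:docs/readme.txt)"
tree_type="$(git cat-file -t HEAD:docs)"
commit_type="$(git cat-file -t HEAD)"
tag_type="$(git cat-file -t v0.1)"
echo "$blob_type $tree_type $commit_type $tag_type"

# Print the commit itself: it references a tree and carries author metadata.
git cat-file -p HEAD
```

The echo line should print blob tree commit tag, one type per name queried.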

The Git branch: a directed acyclic graph of Git commits


A branch can be thought of as a series of commits, each pointing backwards to one or more previous commits, until eventually it all reaches the root commit. (Strictly speaking, a branch is just a name that points to the most recent commit in that series.) A graph in general can be defined as a structure of nodes connected by edges. If the edges have a direction, it is a directed graph. Git's commit history is a directed acyclic graph: it is acyclic because there is no way to follow the edges in their direction and arrive back at the commit you started from. Anki Books early on had a domains table with a many-to-many self-referential relationship (which was an example of a graph too), and it was fine until I introduced a cycle, which caused my recursive SQL query to crash my Windows Subsystem for Linux.
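
The parent pointers can be inspected directly. This sketch makes three commits in a throwaway repository and then walks backwards from the newest one. The file name log.txt, the commit messages, and the author identity are placeholders.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity so commits can be created anywhere.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Make three commits, each one building on the previous.
for n in 1 2 3; do
  echo "change $n" >> log.txt
  git add log.txt
  git commit -q -m "Commit $n"
done

# Each line is a commit ID followed by the IDs of its parents.
# The last line has no parent ID: that is the root commit.
git rev-list --parents HEAD

# Count the commits that have zero parents.
root_count="$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')"

# Count every commit reachable by walking back from HEAD.
total="$(git rev-list --count HEAD)"

# HEAD~1 names the first parent of the newest commit.
parent_subject="$(git log -1 --format=%s HEAD~1)"
echo "$root_count root, $total commits, parent of HEAD: $parent_subject"
```

Because the edges only ever point from a commit to its parents, the walk always ends at the root commit and can never loop.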

The Git index


The Git index is an important thing to know about. Let's say you cloned a repo, created a new branch and switched to it using the git checkout -b <new_branch_name> command, made some changes, and wrote some new unit tests covering your changes. It is time to introduce your changes to the branch by making a commit. Before you commit your changes, you stage them in the index with the git add command. A quick way to add all your changes to the index is to use the git add . command. By using the git status command, you can see the state of changes in the working directory and index: what is tracked, untracked, and staged. Files that are ignored should not be visible in the output of the git status command. When you use the git commit command, it takes the changes that are staged in the index and records them in a new commit. The -m short option is usually used with git commit to specify the commit message. Usually after that, you might push your branch up to GitHub and open a "pull request," which proposes merging that change into a different branch, like one called main for example.
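
The workflow just described might look like the following sketch. The branch name add-feature, the file names, and the author identity are invented for the example. It stages only one of two changes to show that a commit takes what is in the index and nothing else.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity so commits can be created anywhere.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "first" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

# Create a new branch and switch to it.
git checkout -q -b add-feature

# Change a tracked file and create a brand new file.
echo "second" >> tracked.txt
echo "new" > untracked.txt

# Short status has two columns: index (staged), then working directory.
# M marks a modified file and ?? marks an untracked one.
git status --short

# Stage only the tracked file, then list what is staged.
git add tracked.txt
staged="$(git diff --cached --name-only)"

# The commit records exactly what was staged.
git commit -q -m "Update tracked file"
committed="$(git diff --name-only HEAD~1 HEAD)"

# The file that was never staged is still untracked.
remaining="$(git status --porcelain)"
echo "staged: $staged | committed: $committed | left over: $remaining"
```

Using git add . instead of git add tracked.txt would have staged both files, which is usually what you want once your ignore rules are set up.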

Merging branches


A merge is when two branches are combined, and in order to preserve the commit history of the branches being merged, it introduces a merge commit. The resulting code does not reflect any branch more than any other branch that was involved in the merge. A linear history can be easier to reason about, so people often use the git rebase command to rebase their commits onto the target branch of the merge first. This rewrites the history of the branch being rebased so that its commits appear linearly after the target branch's head, which refers to the most recent commit on a branch. After this type of rebase, the merge does not need to introduce a merge commit because it is just a degenerate type of merge called a fast forward.
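
Here is a sketch of the rebase and fast forward sequence in a throwaway repository. The branch names main and feature, the file names, and the author identity are placeholders, and the --ff-only option is my addition so that the merge fails instead of creating a merge commit if a fast forward is not possible.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Name the first branch main regardless of the local default.
git symbolic-ref HEAD refs/heads/main

# Placeholder identity so commits can be created anywhere.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "base" > base.txt
git add .
git commit -q -m "Base"

# Commit on a feature branch.
git checkout -q -b feature
echo "feature" > feature.txt
git add .
git commit -q -m "Feature work"

# Meanwhile main moves ahead, so the two histories diverge.
git checkout -q main
echo "main" > main.txt
git add .
git commit -q -m "Main work"

# Replay the feature commit on top of main's head.
# This rewrites the history of the feature branch.
git checkout -q feature
git rebase -q main

# main is now an ancestor of feature, so merging only moves
# the main branch forward to the same commit.
git checkout -q main
git merge -q --ff-only feature

merge_commits="$(git rev-list --merges --count HEAD)"
total="$(git rev-list --count HEAD)"
newest="$(git log -1 --format=%s)"
echo "$total commits, $merge_commits merge commits, newest: $newest"

# Show the resulting linear history.
git log --oneline --graph
```

The history ends up as three commits in a straight line, with no merge commit among them.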

You can merge branches together on the command line with git merge: if you are on branch A and use git merge <branch_b_name>, it will make a merge commit on branch A that is the result of a merge between the two branches (unless a fast forward is possible). The git branch command with the -d short option can then be used to delete branch B safely. If you use the -d short option with git branch and Git refuses with an error saying the branch is not fully merged, that means there is a commit on branch B that would be lost. Even if you force the deletion (with the -D short option) and then realize you need the lost commit, you can turn to the reflog with the git reflog command.
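
The following sketch walks through a merge that does create a merge commit, followed by a safe branch deletion and a refused one. The branch names branch-b and branch-c, the file names, the merge message, and the author identity are all placeholders.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Name the first branch main regardless of the local default.
git symbolic-ref HEAD refs/heads/main

# Placeholder identity so commits can be created anywhere.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "base" > base.txt
git add .
git commit -q -m "Base"

git checkout -q -b branch-b
echo "b" > b.txt
git add .
git commit -q -m "Work on B"

git checkout -q main
echo "a" > a.txt
git add .
git commit -q -m "Work on A"

# Both branches have a commit the other lacks, so a fast forward
# is not possible and Git creates a merge commit.
git merge -q -m "Merge branch-b" branch-b

# A merge commit records one parent line per branch that was merged.
parent_count="$(git cat-file -p HEAD | grep -c '^parent ')"

# -d succeeds: every commit on branch-b is reachable from main.
git branch -d branch-b

# Make a branch with a commit that is not merged anywhere.
git checkout -q -b branch-c
echo "c" > c.txt
git add .
git commit -q -m "Work on C"
git checkout -q main

# -d refuses this time, because deleting would lose "Work on C".
if git branch -d branch-c 2>/dev/null; then refused="no"; else refused="yes"; fi

heads="$(git for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')"
echo "parents: $parent_count, deletion refused: $refused, branches: $heads"

# The reflog lists every commit HEAD has pointed at, newest first.
git reflog
```

The commit IDs in the reflog output are what you would use to get back to a commit after its branch was deleted by force.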

Oftentimes merging branches is done on GitHub, which provides a nice UI for code reviews and discussions. The proposed merge is called a "pull request," and the merge itself happens by clicking a button. The merge can also happen between forks of the GitHub repository, which are different remote copies of the repository on GitHub.

If you can use the git command on the command line, a good way to learn is to ask it for help. The git command takes a long option --help that you can use for this: git --help. The different git commands also take the same long option to show help, e.g., git add --help, git commit --help.

Example: a commit


113 scenarios (113 passed)
1182 steps (1182 passed)
7m36.288s
Coverage report generated for Cucumber Features, RSpec to /path_to/anki_books/coverage. 3340 / 3342 LOC (99.94%) covered.
$ git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   app/controllers/books_controller.rb
        modified:   db/schema.rb
        modified:   spec/factories/books.rb
        modified:   spec/requests/books/show_spec.rb

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        db/migrate/20231026144625_add_public_to_books.rb
        db/migrate/20231026145631_update_books_public_to_false.rb
        db/migrate/20231026150352_add_not_null_to_books_public.rb
        db/migrate/20231026151536_add_default_public_for_books.rb

no changes added to commit (use "git add" and/or "git commit -a")
$ git add .
$ git branch
* main
$ git commit -m "Add public books"
Running pre-commit hooks
Analyze with RuboCop........................................[RuboCop] OK

✓ All pre-commit hooks passed

Running commit-msg hooks
Check subject capitalization.....................[CapitalizedSubject] OK
Check subject line................................[SingleLineSubject] OK
Check for trailing periods in subject................[TrailingPeriod] OK
Check text width..........................................[TextWidth] OK

✓ All commit-msg hooks passed

[main f127946] Add public books
 8 files changed, 73 insertions(+), 3 deletions(-)
 create mode 100644 db/migrate/20231026144625_add_public_to_books.rb
 create mode 100644 db/migrate/20231026145631_update_books_public_to_false.rb
 create mode 100644 db/migrate/20231026150352_add_not_null_to_books_public.rb
 create mode 100644 db/migrate/20231026151536_add_default_public_for_books.rb
$ git status
On branch main
nothing to commit, working tree clean