Git, launched in 2005, is a complimentary, distributed version control system that facilitates collaborative work on
software projects, irrespective of the geographical location of the contributors. It offers the ability for multiple
developers to independently integrate their modifications to the project, log and trace these alterations, and, when
necessary, access previous versions of the project. Being platform-independent, Git can be utilised across diverse
computing environments.
Contrastingly, GitHub enhances the collaboration process in Git. It serves as a platform that hosts code repositories
in a cloud-based environment, thus enabling multiple developers to collaboratively engage with a single project and
monitor each other’s modifications in real time.
Furthermore, GitHub encompasses features for project organization and management. For instance, tasks can be assigned
to specific individuals or groups, and permissions and roles can be established for collaborators. Comment moderation
is also available to keep all participants informed of the latest developments. Additionally, GitHub repositories are
publically accessible, encouraging global interaction among developers. This allows for collective refinement of code,
as developers worldwide can contribute to each other’s work with modifications and enhancements.
For research, the Docs of Github or the
Docs of Git can be used. For the workflow, an initial exposure to Git is extremely
helpful, allowing the work processes to be understood concretely. are understood.
Hint: Currently, this repository refers to working with Git on a local level only. More and working with remote repositories will follow.
Installation instructions for the respective operating system can be readily obtained from Google. One of the first
steps in using Git in conjunction with Gibhub is to configure Git on your own computer.
Inside the terminal:
Furthermore, extremely complex configurations are possible.
The initialisation command of Git is necessary for this:
Git then creates a new repository directory in the current directory. All files required by Git are stored in the
subfolder /.git. In UNIX systems, the dot in front of the folder name indicates that this directory is hidden.
Git stores the corresponding versions of the programme we are working on in .git so that we can subsequently track how
we have changed our programme.
Example: I want to create a repository called program.
I create a directory called program.
I then change to the appropriate directory and create the appropriate repository using Git’s initialisation command:
If you change to the appropriate /.git directory and run the ls command, you will get the following output:
The folder /.git contains various subfolders, which we won’t be doing much with at the moment, as Git automatically
manages all versions.
The following will be about understanding how the different versions are managed within a project and what nutritional
value git offers for one’s own workflow. Many IDEAs facilitate working with Git in connection with Github. However, it
is also necessary here to understand what is actually used how and above all why within the IDEA.
Here, everyone is free to do version management relatively automatically or manually.
In the following, we will first introduce version management with Git. It should be noted that we will work exclusively
locally. This means that all changes and subsequent work will be done exclusively on our own computer and nothing will
be put online.
git add - Bridge between working directory and staging area
With a commit, the state of a project is saved at the corresponding time where the command was listed. Here, a
commit corresponds to a version of the project, so to speak.
It is important to understand here that with a commit, the state of a project or a specific file in our working directory
is transferred from the project to the so-called staging area. A commit is first prepared in the staging area with
the following command:
For single file:
For entire folders:
After we have marked the corresponding files/folders with the command in the staging area, we could execute a commit
directly. First, however, an important note:
Assuming we would continue to write our programme in our working directory after committing to the staging area, the
changes we have made would not be adopted by the version stored in the staging area. For this, we would have to execute
the git add command again.
Here it can be helpful to imagine several working levels:
git commit - Bridge between staging area and repository
Since we have stored the current version of our programme in the staging area, we should switch to storing the current
version in our repository from our staging area. This can be done by using the following command in the terminal:
This command creates a new commit from all the staged changes. When committing, it is necessary to store a commit message.
For a better overview, it is helpful to store in the commit message what was specifically changed. Short and concise is
the appropriate home remedy for the first line. This is followed by an empty line and then a small, more concrete description.
For this purpose, the parameter -m is used in connection with the message in the inverted commas. If a commit was carried
out, a concrete commit ID is attributed to this commit. This is important when it comes to the work history of the
repository.
It is worth noting that one atomic change per commit should be used to maintain the clarity of the repository.
Here are two examples:
One commit per bug
One commit per feature
If the commit command was used without the -m parameter, Git will prompt you to put a message in a text file. If
this doesn’t happen, for example because you don’t leave a message, the commit command will be aborted.
Tip: If a programming project is growing and other programmers from all over the world may be working on your project,
it would be advisable to leave all commit messages in English.
Good practice for commit messages
Short and concise: The first line of the commit should be a brief summary of the changes. It is often recommended
that this line should not exceed 50 characters.
Imperative form: Commit messages should be written in the imperative form, for example “Fix bug” instead of
“Fixed bug” or “Fixes bug”. This convention is followed by Git itself in its automatically generated messages, such
as “Merge pull request #123 from branch”.
Detail in body: If the changes require more context or explanation, these should be written in the body of the commit.
This should follow the summary line and after a blank line. It is often recommended that each line in the body should
contain no more than 72 characters.
What and Why: It can be helpful to describe not only what has been changed, but also the reason for the change.
This can be particularly useful when fixing bugs or making complex changes.
Reference resources: If your change references an issue or bug report, you should reference it in your commit message.
Some teams use certain conventions for this, such as adding “Fixes #123” to the end of the commit message to
close the issue, which is listed as number 123 in their issue tracking system.
The following parameter is useful if you have committed something and later find out that what you committed doesn’t really work.
Then the original commit gets a new commit ID. This way, you don’t have to do another commit.
To change a commit, use the following command in the terminal:
This will ensure that the latest changes from your working directory are included in the staging area.
Important: This rewrites the log from the repository. More precisely, the last commit is overwritten. Ideally, you
should only use this command locally on your computer.
This becomes important when several people are working on a repository. Confusion can arise because not all people
working on the repository have the same version on their computer.
This command allows you to display the individual changes to your repository. An individual commit is clearly identified
by its SHA I hash. These hashes are created on the basis of the corresponding commit so that no two commits can have the
same checksums.
To do this, use the following command in the terminal:
The order of the commit history is such that the oldest change is shown at the bottom and the most recent at the top.
You are shown the individual commits, including the corresponding checksums, author’s name, time and, last but not least,
your commit message. It is important to note here that the most recent commit is referred to as HEAD.
If we look at the history of the repository, we see a history with different hash values. The last commit is referred
to here as HEAD. HEAD here means the most recent commit checked out.
We see here a series of commits with corresponding abbreviations of the hash values and the HEAD commit. If we look at
the commits from top to bottom, the second commit (calculated from HEAD) is correspondingly the second to last commit.
This can be identified not only with its checksum, but also as HEAD~1. This is the parent of the HEAD commit.
The third commit is then HEAD~2, the fourth commit is HEAD~3, etc.
These names can also be used instead of the hash values of the individual commits. This makes it a little more
convenient to work with.
Example:
or
Another important parameter for the command is -p. This displays significantly more information for the commits, such
as which lines within the code specifically added or removed. If the commit needs to be displayed in less detail, the
following parameter can be used:
This command is useful if you are working within your working directory, but you are not sure which changes have already
been applied to your staging area. With the following command, you can see the relative changes between the working
directory and the staging area:
To see the relative changes between the staging area and the repository, use the –cached parameter.
Example:
If you want to see specific differences between two individual commits, you need the commit IDs of the two commits and
the following command:
The variables a and b in front of the individual versions do not refer to concrete folders or directories,
but to the corresponding versions that we put into relation.
The following output:
means that in the first file the first line has been deleted and a total of seven lines are output, therefore
-1.7. Subsequently, with the second file, one is added to the first line and a total of five lines are output,
therefore +1.5
Important: If the command git diff –cached does not produce any output, it is probably because no concrete
changes were made in the staging area.
It is quite possible that the comparison will show you changes that may not be correct. This is probably because git
assumes what changes need to be made to get from hash1 to hash2. Therefore, it is important to store the hash values
from old to young.
Note: Whoever commits first should also be named first.
In the simple case that we look at the changes of commits in relation to a single file, then it becomes quite clear.
However, it is often the case that within a project there are several files that are changed within one commit. Then,
of course, you will see a lot of changes. In order to be able to differentiate this somewhat, use the following important
parameter to view the changes of a specific file:
Example:
Note: The – [dataname] parameter can be used with all the git commands presented, such as git log. It is also possible
to view changes to specific file formats or filenames. A proxy can be used for this.
Here we can remove all changes from the staging-area. The prerequisite for this command is that changes are
stored in the staging area at all.For this we use the following command:
If it is a question of removing specific changes, such as from a specific file, then the following command can be
used in conjunction with the corresponding file parameter in the terminal:
But not only all changes marked for a commit can be undone, but also changes that refer specifically to a commit. The
following command is responsible for this:
Here, all changes in the workind directory remain, but all commits after the commit with the specified commit hash
are deleted. This means that all changes that have already been committed to the repository with the corresponding
commits are returned to the working directory after the above command.
Another important parameter, but one that should be used with caution, is the –hard parameter. Here, all commits
are also deleted after the corresponding commit hash. The difference to the previous command is that all changes
are discarded. To do this, use the following command in the terminal:
Important: When working with the git reset command, you must be careful, because this command can cause significant
data loss. Do not use this command when working with remote repositories, such as Github. These repositories are often
shared with other people and this command automatically changes the git history. If in doubt, always tend to not use
the command.
The parameter –hard can also be used in other ways. Assuming there are correct changes in your repository and you
want to continue working on your programme based on this current repository. If you are working in your working directory
and unfortunately find that your changes have led you astray, you can use the following command to overwrite your current
working directory with the current repository (MASTER):
But here too, make sure you know what data is in your working directory before you execute the command. This can lead
to immense data loss.
The git checkout command restores the state of a given commit. Then the current working directory is in the state
it was in at the specified commit hash. You can view files, compile the project, run tests and even edit files without
fear of losing the current project state. None of your actions and changes are saved in the repository.
Type the following command in the terminal:
Important: Git tries to make local changes here, which can lead to conflicts. Only use this command if your working
directory is empty or has no changes.
With the git checkout command, we can check out a previous commit and return the repository to the status. Checking out
a particular commit puts the repository into a state with the HEAD detached. In the detached state, all newly made
commits are orphaned as soon as you return to an established branch. Orphaned commits are deleted by Git during the next
purge. The purge occurs at configured intervals and removes orphaned commits permanently. To prevent orphaned commits
from falling victim to garbage collection, we need to make sure that we are in a branch.
Once you have finished working on a particular commit, you can jump back to the original version of your project with
the following command:
In case you want to see the status of your project from a few weeks ago, but you still have changes in your current
working directory and/or staging area, you can use the command to display the status from that point in time without
losing the current changes. You can use the following command to cache your current changes:
In short, git stash allows you to cache the current changes in your working directory and staging area.
Workflow example:
You save the current state in the stash.
Look at the past commit.
Switch back to the current state.
Read the old state from the stash.
Output:
Saved working directory and index state
"WIP on master: 016c048 Create index file".
HEAD is now at 016c048 Create index file
(To restore them type "git stash apply")
Following this, you can use the previously introduced git status command to see that your current working directory
is cleaned up:
# On branch master
nothing to commit, working directory clean
The last saved stash always has the smallest room on the list, as if you were always adding another package to a stack.
With the following command you can look at all cached stashes:
Output:
stash@{0}: WIP on master: 016c048 Create file
stash@{1}: WIP on master: a1370d1 Revert
stash@{2}: WIP on master: 1cd6015 Change numbers
With the parameter pop, you can always restore the last saved stash. What is meant is that with the command
stash@{0} is restored first, followed by stash@{1} and then stash@{2}.
You also have the option of using the already introduced command git diff to compare the current stash with what is
in your current working directory. With the stash command and the previously introduced commands, you have creative ways
to jump between versions and changes without having to worry about losing recent changes.
Here I would like to introduce a command that also reverts changes, but in a slightly different way. Here, a new commit
is created that reverts the corresponding changes. You have the option of changing a commit afterwards, as described
above. This was done, for example, with the command git commit –amend, but this only worked as long as the
repository was local. Here Git offers you the option of making a new commit that undoes all the changes made to an
existing one.
Instead of removing the commit from the project history, it determines how to reverse changes caused by the commit. A
new commit is then added with the corresponding reversed content. This prevents Git’s history from being truncated,
which is important for the integrity of your commit history and for reliable collaboration.
When the command is executed, a commit is created as described, which must also contain a corresponding commit message.
In this particular case, this commit message is generated automatically. This makes sense in so far as it may not even
be marked in the repository history that this is a revert.
You should use this undo command when you want to reverse a commit from your project history. This can be useful, for
example, if it turns out that a bug was introduced into the project by a very specific commit. Instead of manually
navigating to the code in question, fixing it and committing a new snapshot, you can use git revert to do the whole
process automatically.
A revert has two important advantages over a reset. First, the project history remains unaffected: The process has no
effect on commits that have already been published in a shared repository. Second, you can apply git revert to individual
commits at any point in the history. git reset, on the other hand, only works retroactively from the current commit.
The advantage of using revert is that a new commit is created to undo certain changes. This way, no commits are deleted
from the commit history or orphaned. git revert is the safer option compared to git reset in terms of losing code.
With this command you can display changes to a file. It shows line by line in which commit it was last changed.
To make working in the terminal more comfortable and to keep an overview, you can also use the following command to
highlight the same commits in colour:
This can be helpful when working with several people and you want to make sure that if there is a bug, for example,
that you write to the right person about it.
For more detailed handling of the git blame command, use the following command in the terminal:
With the so-called gitignore you can have files ignored. The file .gitignore contains a list of paths that are ignored
by git. Here, changes to this data will have no effect on the commands git diff, git add, etc.
Important: The .gitignore file should always be committed so that certain files, folders, etc. are not included in
the shared repository in the first place. This can also be very important when working together in a larger team.
Create a file called .gitignore inside your working directory as follows:
Example of the content of a .gitignore file:
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintainted in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
This is an example from the repository directly from Github. There are some concrete examples for all common programming
languages that can be used freely. For research or direct use, you can find the corresponding .gitignore files
gitignore.
With this licence, you may use, modify and share the work as long as you credit the original author. However, you may
not use it for commercial purposes, i.e. you may not make money from it. And if you make changes and share the new work,
it must be shared under the same conditions.