GIT -- The Fast Version Control System

From Remeis-Wiki
Revision as of 17:24, 20 April 2018 by Kuehnel

What is a GIT Repository? (by Matthias Kühnel)

A repository is a container for project files. A program, in this case 'git', handles the different versions of the files and takes care of merging them. That means many people can work on the same project (or, more precisely, on the same files) simultaneously without having to worry about the modifications made by the other editors. In principle, each editor 'clones' the repository to his or her local computer, makes some modifications, and finally 'pushes' these changes back into the repository.

Existing Repositories: How to clone and commit changes

The following text is copied from an e-mail from Joern concerning the software scripts of the Remeis observatory. More information on these scripts can be found at Software at the Remeis-Observatory.

Get the files

If you want to modify scripts, you will first have to get a full copy of the repository (called a "clone"):

 git clone ssh://account@crux.sternwarte.uni-erlangen.de/data/git/aitlib

(same for intscripts, xmmscripts, xtescripts, xspecscripts, isisscripts, cyclo, fpipe) where account is your account at the Sternwarte.

Please use the above command EVEN IF YOU ARE WORKING locally in Bamberg/Erlangen. Do NOT clone with git's file:// syntax. Cloning via ssh allows you to work on several machines, whereas with a file:// clone the git push command will not work properly. Just forget that file://... is an allowed git URL.

Committing your changes

After doing the clone, edit the scripts and check them. If you edit the isisscripts, do not forget to 'make' them, i.e., to compile your changes into the overall isisscripts.sl, by typing

 make

in the isisscripts/ directory. You should usually do your changes in smallish steps, i.e., applying a few changes, checking them, and then committing them to the repository as follows:

 git commit filename

OR

 git commit -a 

(the latter if you've made changes to many files). This command will ask you to enter information for the change log.

CONTRARY to CVS, a commit does not yet make your changes available to others. This is advantageous, because it allows you to do commits locally while you're developing code, and then go back to an older version once you realize that you've made a mistake. However, let's assume that you've programmed something that is working and you want to make it available to everybody. In this case, commit everything as described above and then do a

 git push

(do NOT forget this last command).
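The difference between a local commit and a push can be checked without touching the real repositories. The following is a minimal sketch, assuming a bare repository in a temporary directory as a stand-in for the central repository on crux; the file name example.sl and the identity "Demo User" are placeholders.

```shell
set -e
# Hypothetical stand-in for the central repository (the real one is
# reached via ssh on crux).
demo=$(mktemp -d)
git init --quiet --bare "$demo/central.git"

# Clone it, as you would clone aitlib or the isisscripts.
git clone --quiet "$demo/central.git" "$demo/work" 2>/dev/null
cd "$demo/work"
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.org"

# Edit a file and commit: the change is recorded in the local clone only.
echo '% example script' > example.sl
git add example.sl
git commit --quiet -m "Add example script"
before_push=$(git --git-dir="$demo/central.git" for-each-ref | wc -l | tr -d ' ')

# Push: only now does the commit reach the central repository.
# The stand-in repository is still empty, so the target is named
# explicitly; on an existing repository a plain "git push" suffices.
git push --quiet origin HEAD
after_push=$(git --git-dir="$demo/central.git" for-each-ref | wc -l | tr -d ' ')

echo "branches in central repository before push: $before_push"
echo "branches in central repository after push:  $after_push"
```

The central repository knows nothing about the commit until the push has been done, which is why forgetting git push leaves your colleagues without your changes.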

After you have committed and pushed your changes, WE STRONGLY RECOMMEND THAT YOU REMOVE YOUR CLONED REPOSITORY (rm -r repo). The reason is that experience shows that people often start working on their "private" versions of the scripts and then either forget to commit their changes (=nobody else gets access to them) OR they forget to do regular updates of their local clones and then run into problems that have already been fixed in the official repository. Access to the current version of the scripts is better obtained using the approach outlined below (and implemented on the Remeis machines, for example).

Other useful commands

  • assuming you've already cloned the repository and you want to update it to the newest version, use
    git pull
  • to obtain a change log of the repository
    git log
  • to obtain a change log for one file
    git log filename
    (which will work pretty much for all files, but for some reason does not work for the intscripts; note that even for the intscripts the full change log is still available and you can still check out older versions of the script should you desire to do so)
  • to tag one of the commits with a tagname, e.g., to mark the submitted version of your paper as submitted, get the commit id from git log and do
    git tag 'tagname' id
    where tagname only needs quotes if it contains, e.g., spaces
  • to push the tag
    git push --tags
  • optimize the local git repository
    git gc --aggressive
    this will optimize the tree of stored local changes, removing intermediate data that are not needed anymore. This makes your local repository dramatically faster and can save significant space. You should run this every now and then (on very active directories probably once a week).
  • find out where the repository originally came from before it was cloned:
    git remote -v
    or for a little more information
    git remote show origin
  • to show the differences between an old and a new version of a file (old file first, so that additions are marked with +)
    diff -u old_file new_file
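Several of the commands above can be tried together in a throwaway setup. This is only a sketch: the bare repository in a temporary directory stands in for a real remote, and the file name paper.tex, the tag name submitted, and the identity are placeholders.

```shell
set -e
demo=$(mktemp -d)
git init --quiet --bare "$demo/central.git"     # hypothetical stand-in remote
git clone --quiet "$demo/central.git" "$demo/paper" 2>/dev/null
cd "$demo/paper"
git config user.name "Demo User"                # placeholder identity
git config user.email "demo@example.org"

# Two commits so that the log has something to show.
echo 'first draft' > paper.tex
git add paper.tex
git commit --quiet -m "First draft"
echo 'submitted version' > paper.tex
git commit --quiet -m "Version submitted to the journal" paper.tex

# Change log for one file, one line per commit.
git log --oneline paper.tex

# Take the id of the latest commit from the log and tag it.
id=$(git log -1 --format=%H)
git tag submitted "$id"

# Publish the branch, then the tag.
git push --quiet origin HEAD
git push --quiet --tags

# Where did this clone come from?
git remote -v

# Housekeeping of the local repository.
git gc --quiet --aggressive
```

Note that git push on its own does not transfer tags, which is why the separate git push --tags step is needed.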

Create your own repository

The following entry is from an email by Matthias Kühnel to Jieun (with additional comments added afterwards) and was intended to explain how to create a repository in order to edit a paper together using GIT.

Now we'll create a git repository for your paper: in your Remeis home directory, create a directory in which to store all your repositories, such as ~/git (at the moment we'll create only one ;-)). Create a subdirectory for your paper in there, ~/git/choi2011a for example. Change into that directory and create an empty repository by

  git init --bare --shared

You should get a message like "Initialized empty Git repository in /home/choi/git/choi2011a/". Add --shared only if everyone in your group should be able to commit changes. All in all, this creates a structure of files and subdirectories in your repository directory. Your files will later be stored inside this structure, but you will not see them there under their own names: a bare repository has no working tree, i.e., no checked-out copies of the files. Instead, git keeps the contents of all file versions as compressed objects in the objects/ subdirectory. Now you have to modify the 'config' file of the repository (note: you should not need the following modifications if you initialized the repository with the --shared option), for example

  /home/choi/git/choi2011a/config

The file should look like

   [core]
        repositoryformatversion = 0
        filemode = true
        bare = true

Your repository is now ready to be used, but empty. So let's add some files into it!
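Instead of opening the config file in an editor, the settings written by git init can also be queried with git config. A minimal sketch, assuming a temporary directory in place of your home directory (the path below is hypothetical and only mirrors the ~/git/choi2011a layout):

```shell
set -e
# Hypothetical location; the text above uses ~/git/choi2011a.
base=$(mktemp -d)
mkdir -p "$base/git/choi2011a"
cd "$base/git/choi2011a"
git init --quiet --bare --shared

# A bare repository has no working tree; these are git's own files
# (HEAD, config, objects/, refs/, ...).
ls

# Query the settings written by git init.
is_bare=$(git config --get core.bare)
shared=$(git config --get core.sharedrepository)
echo "core.bare = $is_bare"
echo "core.sharedrepository = $shared"
```

If core.sharedrepository is set, the repository was created with --shared and group members get write access to it.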

Create an empty directory somewhere that will contain your paper. We now clone your (empty) repository into your source directory:

  git clone crux:/home/choi/git/choi2011a .

Note that we clone from machine crux, even if we are working on that machine. This approach will make your life easier if you are working on multiple machines in the Remeis cluster. Do not forget the trailing dot!

Now your directory is a clone of the repository. Copy all files and subdirectories that you would like to be part of the repository into the directory and then add these files to git:

 git add filenames

where filenames can include (relative) paths to files somewhere in the directory tree of your repository.

A general rule for TeX repositories for papers is to add only the source code and *no* compiled or auxiliary files. That means: just add the .tex file, any styles used (.sty), and images (.pdf, .ps, .eps or whatever). If you also have isis scripts (.sl) which create some plots, you may add them as well. Once you have added all necessary files, you have to 'commit' the changes:

 git commit -a

Now your default editor should open automatically, and you have to enter a comment describing the changes (attention: the default editor should be set first in the ~/.cshrc, see TC shell). This comment will be put into the log. The editor also shows a list of files which will be added/modified/removed. Please note that this list is for your information *only*. Any changes to the list have no effect! After you have entered a comment, save the file (if you use jed: Ctrl-X-S) and exit the editor (Ctrl-X-C). You should see something like:

  [master 5e91497] your_entered_comment
   1_or_more files changed, 3341_or_any_other_number insertions(+), 0 deletions(-)
   create mode 100644 a_file_you_have_added
   ...

To avoid being told that unnecessary files, such as editor backup files (e.g., filenames ending in a tilde or in .bak), are not part of the repository, you can create files called .gitignore in your git directories. These files contain patterns describing files which should not be in the repository. A .gitignore file applies to the directory it is in and to all of its subdirectories, which may contain further .gitignore files. A good .gitignore for a paper would be:

#
# git ignore file for TeX files
#
*~
*.aux
*.log
*.bbl
*.blg
*.bak

Note that after creating the file you will have to add it to the repository! Files that were checked in before the .gitignore existed are not affected by adding the .gitignore, even if the filename is explicitly listed there. In this case, do a git rm [filename] (which also deletes your local copy of the file; use git rm --cached [filename] to keep it) and commit; afterwards the file will not be checked in again.
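The behaviour of .gitignore for new and for already tracked files can be tried in a scratch repository. The following is a sketch under assumptions: the repository lives in a temporary directory, the file names and the identity are placeholders, and git rm --cached is used so that the local copy of the file stays on disk.

```shell
set -e
demo=$(mktemp -d)
git init --quiet "$demo/paper"
cd "$demo/paper"
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.org"

# An auxiliary file that was committed by mistake before .gitignore existed.
echo 'source' > paper.tex
echo 'old aux data' > paper.aux
git add paper.tex paper.aux
git commit --quiet -m "Add paper (and, by mistake, paper.aux)"

# Create the ignore file and add it to the repository.
printf '%s\n' '*~' '*.aux' '*.log' '*.bbl' '*.blg' '*.bak' > .gitignore
git add .gitignore
git commit --quiet -m "Add .gitignore"

# New auxiliary files are ignored ...
echo 'log output' > paper.log
git check-ignore -q paper.log && echo "paper.log is ignored"

# ... but paper.aux is still tracked. Remove it from the repository only;
# --cached keeps the copy in the working directory.
git rm --quiet --cached paper.aux
git commit --quiet -m "Stop tracking paper.aux"

tracked=$(git ls-files)
status=$(git status --porcelain)
echo "tracked files:"
echo "$tracked"
```

After the last commit, git status no longer mentions paper.aux or paper.log, although both files still exist in the directory.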

The output of git commit shown above is a summary of the changes, which will be put into the repository once you have 'push'ed them:

  git push origin master

Please note that the usual command is

  git push

without the 'origin master' arguments, which are needed only once, for the first push into a newly created empty repository! If you modify previously added files later, you don't have to add them again, of course. Instead, skip the add command and 'commit' and 'push' the changes directly.

Everybody knowing the path to your repository can now clone it, edit files and push changes. To update your local copy with the repository (i.e. to get the changes of somebody else) use

  git pull
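The whole cycle (first push into an empty repository, a second person cloning and pushing, and a final pull) can be rehearsed locally. This is a minimal sketch, assuming a bare repository in a temporary directory instead of crux:/home/choi/git/choi2011a; the author names are placeholders, and the branch name is read from git rather than assumed to be master, since newer git installations may use a different default.

```shell
set -e
demo=$(mktemp -d)
# Hypothetical stand-in for /home/choi/git/choi2011a on crux.
git init --quiet --bare --shared "$demo/choi2011a.git"

# First author: clone the empty repository, add a file, commit, first push.
mkdir "$demo/author1"
cd "$demo/author1"
git clone --quiet "$demo/choi2011a.git" . 2>/dev/null    # note the trailing dot
git config user.name "Author One"                        # placeholder
git config user.email "one@example.org"
printf '%s\n' '\documentclass{article}' > paper.tex
git add paper.tex
git commit --quiet -m "Add paper skeleton"
branch=$(git symbolic-ref --short HEAD)    # "master" in the text above
git push --quiet -u origin "$branch"       # explicit form, needed only once

# Second author: clone, edit, commit, and push with the plain command.
git clone --quiet "$demo/choi2011a.git" "$demo/author2"
cd "$demo/author2"
git config user.name "Author Two"                        # placeholder
git config user.email "two@example.org"
echo '% comment from author two' >> paper.tex
git commit --quiet -a -m "Add a comment"
git push --quiet

# First author: get the changes of the second author.
cd "$demo/author1"
git pull --quiet
tail -n 1 paper.tex
```

The -u option of the first push records the remote branch as the upstream of the local one, so that the later plain git pull and git push know where to go.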

Committing only parts of the modifications

This is a further piece of functionality that Manfred finds particularly useful.

It has been mentioned above that one can commit only selected files with

 git add file1 [file2 ...]  &&  git commit

(Do not use git commit -a in this case. If you really want to commit all changes, there is no need to first git add the files for this next commit.)

It is also possible to git add only parts of the modifications, namely with the -i (interactive) option. When I run

 git add -i file1 [file2 ...]

and press p (for patch), 1 for the first file, and then hit [Return], I can decide for every changed block ("hunk") in file 1 whether I want to add ("stage") this change to the next commit (y) or not (n). If I want to commit only part of what git considers a "hunk" in the first place, I can press s in order to split the current hunk. When I'm done with file 1 or when I quit with q, I can start over from the beginning, e.g. patching the next file.

After the desired hunks have been staged to the index, you run

  git commit

as usual, without -a!

GIT Config

Make sure to set your name and e-mail address in the file .gitconfig in your home directory, for example:

  [user]
        email = matthias.kuehnel@sternwarte.uni-erlangen.de
        name = Matthias Kuehnel
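The same entries can be written with git config --global instead of editing the file by hand. A minimal sketch, assuming a throwaway HOME directory so that your real ~/.gitconfig is not touched; the name and address are placeholders:

```shell
set -e
# Throwaway HOME, so that the sketch does not modify your real ~/.gitconfig.
demo_home=$(mktemp -d)
(
  export HOME="$demo_home"
  unset XDG_CONFIG_HOME
  git config --global user.name "Jane Doe"                # placeholder
  git config --global user.email "jane.doe@example.org"   # placeholder
  # Read the values back.
  git config --global --get user.name
  git config --global --get user.email
)
# The resulting file has the [user] layout shown above.
cat "$demo_home/.gitconfig"
```

On your own account you would run only the two git config --global lines, with your own name and address.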