Concepts around collaboration

Objectives

  • Be able to decide whether to divide work at the branch level or at the repository level.

Instructor note

  • 15 min teaching

Motivation

  • Someone has given you access to a repository online and you want to contribute?

  • We will review how to make a copy and send changes back.

  • Then, we make a “pull request” that allows a review.

  • Once we know how code review works, we will be able to:

    • propose changes to repositories of others

    • review changes submitted by external contributors.

Cloning a repository

To make a copy of a repository (a clone), use the git clone command. Cloning is relevant in several situations:

  • When working on your own, cloning is the way to copy a repository to, e.g., a personal computer, a server, or a supercomputer.

  • The original repository could be one that you or a colleague owns. A common use case for cloning is working together in a small team where everyone has read and write access to the same git repository.

  • Alternatively, you can clone a public repository of code that you would like to use. Perhaps you have no intention of working on the code, but would like to keep up with the latest developments, including those between releases.

  • Your work is not visible to others, because it is on your computer.

Figure: Cloning
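A minimal sketch of cloning, assuming git is installed. The "original" repository here is a throwaway local directory that stands in for a repository hosted elsewhere; its path and the commit identity are made up for illustration.

```shell
workdir=$(mktemp -d)

# Stand-in for a repository hosted elsewhere (hypothetical local path, not a real URL).
git init -q "$workdir/original"
git -C "$workdir/original" symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git -C "$workdir/original" config user.name "Example"
git -C "$workdir/original" config user.email "example@example.org"
echo "hello" > "$workdir/original/README.md"
git -C "$workdir/original" add README.md
git -C "$workdir/original" commit -q -m "Add README"

# Clone it: the copy carries the full history and remembers its source as "origin".
git clone -q "$workdir/original" "$workdir/clone"
git -C "$workdir/clone" log --oneline
git -C "$workdir/clone" remote get-url origin
```

With a real project, the first argument to git clone would be the address shown on the repository's web page instead of a local path.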

Forking a repository

Forking a repository on a forge creates a clone that resides under a different account on the same forge (a fork).
It is typically done to work on a git repository that you cannot write to.

  • Your work is visible to others, because it is on the web.

  • Commits in the fork can be made to any branch (including main or master).

  • The commits that are made within the branches of the fork repository can be contributed back to the parent repository by means of pull (or merge) requests.

Figure: Forking
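Forking itself happens on the forge, usually through a button in the web interface, so it cannot be shown as a single git command. The sketch below imitates it locally under stated assumptions: "upstream" is a hypothetical parent repository, "fork.git" is a bare clone that plays the role of the fork, and all paths and identities are made up.

```shell
workdir=$(mktemp -d)

# Parent repository that we pretend we cannot write to (hypothetical local stand-in).
git init -q "$workdir/upstream"
git -C "$workdir/upstream" symbolic-ref HEAD refs/heads/main
git -C "$workdir/upstream" config user.name "Maintainer"
git -C "$workdir/upstream" config user.email "maintainer@example.org"
echo "hello" > "$workdir/upstream/README.md"
git -C "$workdir/upstream" add README.md
git -C "$workdir/upstream" commit -q -m "Add README"

# The forge creates the fork: roughly a server-side clone under your account.
git clone -q --bare "$workdir/upstream" "$workdir/fork.git"

# You clone your fork to your computer; the fork becomes the remote "origin".
git clone -q "$workdir/fork.git" "$workdir/local"

# Register the parent as a second remote, conventionally called "upstream".
git -C "$workdir/local" remote add upstream "$workdir/upstream"
git -C "$workdir/local" remote -v
```

The last command lists two remotes: origin (your fork, which you can push to) and upstream (the parent, which you only read from).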

Exercise

What is the difference between forking and then cloning (your fork, to your computer) vs cloning (to your computer) and then pushing to a brand new repository?

Generating from templates and importing

There are two more ways to create “copies” of repositories into your user space:

  • A repository can be marked as a template, and new repositories can be generated from it as if with a cookie cutter. The newly created repository will start with a new history.

  • You can import a repository from another hosting service or web address. This will preserve the history of the imported project and features like Wikis, issues and the like.

Discussion

  • Visit one of the repositories/projects that you have used recently and try to find out how many forks exist and where they are.

  • In which situations could it be useful to start from a “template” repository by generating?

Synchronizing changes between repositories

  • We need a mechanism to communicate changes between the repositories.

  • We will pull or fetch updates from remote repositories (we will soon discuss the difference between pull and fetch).

  • We will push updates to remote repositories.

  • We will learn how to suggest changes within repositories on a forge and across repositories (pull request).

  • Repositories that are forked or cloned do not automatically synchronize themselves: We will learn how to update forks (by pulling from the “central” repository).

  • A main difference between cloning a repository and forking a repository is that

    • cloning is a general operation for generating copies of a repository to different computers

    • forking is a particular operation implemented on forges (that includes cloning)

Figure: Forking and cloning
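The following sketch illustrates that forks and clones do not synchronize on their own, and one possible way to update them. It assumes git is installed and uses hypothetical local directories in place of repositories on a forge: "upstream" is the central repository, "fork.git" the fork, and "local" your clone of the fork.

```shell
workdir=$(mktemp -d)

# Setup: central repository, a fork of it, and a local clone of the fork
# (all hypothetical local stand-ins for repositories on a forge).
git init -q "$workdir/upstream"
git -C "$workdir/upstream" symbolic-ref HEAD refs/heads/main
git -C "$workdir/upstream" config user.name "Maintainer"
git -C "$workdir/upstream" config user.email "maintainer@example.org"
echo "v1" > "$workdir/upstream/notes.txt"
git -C "$workdir/upstream" add notes.txt
git -C "$workdir/upstream" commit -q -m "First version"
git clone -q --bare "$workdir/upstream" "$workdir/fork.git"
git clone -q "$workdir/fork.git" "$workdir/local"
git -C "$workdir/local" remote add upstream "$workdir/upstream"

# The central repository moves on; neither the fork nor the local clone notices.
echo "v2" >> "$workdir/upstream/notes.txt"
git -C "$workdir/upstream" commit -q -a -m "Second version"

# Fetch downloads the new commits without touching your branch ...
git -C "$workdir/local" fetch -q upstream
# ... merge brings them into your local main (a fast-forward in this case) ...
git -C "$workdir/local" merge -q --ff-only upstream/main
# ... and push updates the fork.
git -C "$workdir/local" push -q origin main
```

After these steps, the central repository, the local clone, and the fork all point at the same commit on main.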

Authentication: connecting to the repository from your computer

There are two main ways to authenticate:

  • SSH keys

  • HTTPS

Please have a look at this guide by CodeRefinery for a general introduction to authentication options.

We suggest setting up and using an SSH key, since this form of authentication is also used on other services (e.g., to access HPC systems). For a step-by-step guide, look at this walkthrough by Software Carpentry.
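Which method git uses is determined by the address of the remote. The sketch below shows how an existing remote could be switched from an HTTPS address to an SSH address. The host forge.example.org and the account and project names are hypothetical, and the SSH address format shown is the one commonly used by forges; check your forge's documentation for the exact form. No network connection is made.

```shell
workdir=$(mktemp -d)
git init -q "$workdir/repo"

# Hypothetical forge, account, and project names; replace with your own.
git -C "$workdir/repo" remote add origin https://forge.example.org/example-user/example-project.git
git -C "$workdir/repo" remote get-url origin

# Switch the same remote to an SSH address, so that your SSH key is used to authenticate.
git -C "$workdir/repo" remote set-url origin git@forge.example.org:example-user/example-project.git
git -C "$workdir/repo" remote get-url origin
```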

Authentication via HTTPS might require less setup if password authentication is allowed.
If it is not, you can use a personal access token as a drop-in replacement for the password; tokens are typically configured in your account settings on the forge.

Problems in collaborative software development

Merging can be a difficult moment in the life cycle of software.

Git will try to do reasonable operations when merging two different lines of work, but:

  • There might be a detectable ambiguity in how two different lines of work can be reconciled (this leads to a conflict).

  • The result is not guaranteed to be working software every time (i.e., you don’t get a conflict, but the result is not correct either; this is scarier).
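The first case, a detectable conflict, can be reproduced with a small throwaway repository. This is a sketch assuming git is installed; the file name, branch name, and contents are made up for illustration.

```shell
workdir=$(mktemp -d)
repo="$workdir/repo"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git -C "$repo" config user.name "Example"
git -C "$repo" config user.email "example@example.org"

# Common starting point.
echo "greeting = hello" > "$repo/settings.txt"
git -C "$repo" add settings.txt
git -C "$repo" commit -q -m "Add settings"

# One line of work changes the value on a branch ...
git -C "$repo" checkout -q -b feature
echo "greeting = good morning" > "$repo/settings.txt"
git -C "$repo" commit -q -a -m "Use a longer greeting"

# ... while main changes the same line differently.
git -C "$repo" checkout -q main
echo "greeting = hi" > "$repo/settings.txt"
git -C "$repo" commit -q -a -m "Use a shorter greeting"

# Git cannot decide which change wins, so the merge stops with a conflict.
git -C "$repo" merge feature || echo "merge stopped: resolve the conflict, then commit"
git -C "$repo" status --short
```

The status output marks settings.txt as unmerged, and the file now contains conflict markers around both versions of the line. The second, scarier case produces no such signal, which is why it needs tests or review to catch.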

One way to make merging less difficult is to contribute to the main branch as often as possible, so that each change stays small.

In the following chapters we will focus on tools that ease the communication aspect of collaborative software development.