Other features

Objectives

  • Get an idea of what can be done with Actions or CI, and how

In the previous episodes we have seen some examples of minimal workflows (testing, documentation).

In this episode we focus on specific features. A number of examples and exercises have been collected in the exercise repository (see link to GitHub version, link to GitLab version), so that you can try different features.

Fork it on the platform you prefer.

General ideas

Basic ideas for GitHub Actions and GitLab CI/CD.

Workflows are defined in YAML files in .github/workflows (one file per workflow)

  • Typically each push triggers the workflows unless skipped

  • Each workflow has

    • a list of event triggers (typically push, but others are possible, for example workflow_dispatch, which is needed to launch the workflow manually)

    • a list of jobs, that run independently, made of steps

      • each job has a list of steps that run one after another

      • steps run on the same “machine” and share data and context

      • each step can use a pre-defined action or a script

      • the first step typically is the checkout action

      • steps are executed one after the other. If a step fails, the subsequent steps are not executed, and the job fails.

    • if a job in a workflow fails, then the workflow fails

  • if a workflow fails, then the build is marked as failed.

Note

Editing workflow and pipeline files

Whenever you want to edit a file that defines a workflow (on GitHub) or a pipeline (on GitLab), consider doing this directly in the web interface: the built-in editors have a linter and autocompletion, which vastly reduce the probability of making trivial mistakes.

Warning

Exercises on GitHub: starting workflows manually

To trigger the workflows manually on GitHub in your fork of the example repository, you might have to switch the default branch to the appropriate one, in “Settings” -> “General” -> “Default Branch”.

Note

Email notifications

Both GitHub and GitLab are eager to send you emails when a workflow or pipeline fails.

While this is very useful in most cases, consider disabling email notifications on the repository you use for the exercises during this session.

A basic pipeline: Compile and run a C program

This example is available in the example repository, on the main branch.

In this case we have a single job, with 3 steps: checkout, build and run.

name: BasicExample
on: 
  - push                                  # the workflow runs when we push
  - workflow_dispatch                     # we can launch the workflow manually

jobs:
  build_and_run:
    runs-on: ubuntu-latest                # This is necessary, 
                                          # specifies the kind of host where to run
    steps:
      - name: "Checkout"
        uses: actions/checkout@v6         # We use a pre-defined *action*
      - name: "build"
        run: gcc -o hello ./src/hello.c
      - name: "run"
        run: ./hello

Note that runs-on takes one or more labels of a runner, and identifies the type of host.

Failures

Proper reporting and propagation of the failure of a command in a pipeline or workflow is paramount.

A CI job not properly reporting a failure is itself a severe bug.

Switch to the branch failures.

The shell script ./scripts/doesnt-fail-but-typically-should.sh shows a typical pitfall.

  • How can the problem be fixed?

  • What is the default behaviour on GitHub or GitLab when the logic is written in a workflow/pipeline file instead of a separate shell script?

On that branch, follow the instructions in the README and have a look at the workflow/pipeline definition file.
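As a minimal sketch of this class of pitfall (not a copy of the script in the example repository): the exit status of a shell script is that of its last command, so an earlier failure can go unnoticed unless the shell is told to stop at the first error. The workflow, job and step names below are made up for illustration. On GitHub-hosted Linux runners, run steps without an explicit shell are documented to be executed with bash -e, which is why the first step has to switch that option off to reproduce the problem inline.

```yaml
name: FailureSketch            # hypothetical workflow, for illustration only
on:
  - workflow_dispatch

jobs:
  exit_status:
    runs-on: ubuntu-latest
    steps:
      - name: "pitfall: the failure is swallowed"
        run: |
          set +e                 # mimic a script that does not stop on errors
          false                  # this command fails ...
          echo "still running"   # ... but the last command succeeds, so the step passes
      - name: "fix: stop at the first failing command"
        run: |
          set -euo pipefail      # exit on errors, unset variables and failures inside pipes
          false                  # the step (and therefore the job) fails here
          echo "never printed"
```

The same set -euo pipefail line at the top of a separate bash script gives that script the stricter behaviour regardless of how the CI system invokes it.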

Secrets and repository-specific behaviour

In some cases we want to change the behaviour of the workflows/pipelines depending on the repository.

In other cases, some information is needed to run workflows or pipelines, but we want to avoid storing it under version control (it might not be general enough, or it might be a secret).

The solution is to use environment variables (also secrets on GitHub).

Check out the branch environment-variables in the example repository, to see how to customize a job by using environment variables that are set at the repository level.
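The idea can be sketched as follows for GitHub, assuming a repository variable GREETING and a repository secret API_TOKEN have been defined in the repository settings. Both names, and the workflow itself, are hypothetical and are not taken from the example repository. The vars and secrets contexts evaluate to an empty string when the corresponding entry is not defined.

```yaml
name: VariablesSketch            # hypothetical workflow, for illustration only
on:
  - workflow_dispatch

jobs:
  show_configuration:
    runs-on: ubuntu-latest
    env:
      GREETING: ${{ vars.GREETING }}        # hypothetical repository variable (not secret)
      API_TOKEN: ${{ secrets.API_TOKEN }}   # hypothetical repository secret
    steps:
      - name: "use the repository variable"
        run: |
          echo "Greeting is ${GREETING:-not set in this repository}"
      - name: "use the secret without printing it"
        run: |
          if [ -n "$API_TOKEN" ]; then
            echo "API_TOKEN is available"
          else
            echo "API_TOKEN is not set in this repository"
          fi
```

Because the values live in the repository settings, a fork runs the same workflow file with its own values, or with none at all.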

Artifacts

Artifacts are the main way to transfer information between jobs in a workflow or pipeline, and from the runner to the web interface.

To look at the examples, switch to the artifacts branch.

  1. Have a look at .github/workflows/artifacts.yaml.

  2. Try to run the pipeline, and check the output.
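For orientation, here is a minimal sketch of passing a file between two jobs on GitHub. It is not the content of the file in the artifacts branch: it reuses the hello.c program from the basic example, and the workflow name and the artifact name hello-binary are made up for illustration.

```yaml
name: ArtifactSketch             # hypothetical workflow, for illustration only
on:
  - workflow_dispatch

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: "build"
        run: gcc -o hello ./src/hello.c
      - name: "upload the binary"
        uses: actions/upload-artifact@v4
        with:
          name: hello-binary     # hypothetical artifact name
          path: hello
  run:
    needs: build                 # wait for the job that produces the artifact
    runs-on: ubuntu-latest
    steps:
      - name: "download the binary"
        uses: actions/download-artifact@v4
        with:
          name: hello-binary
      - name: "run"
        run: |
          chmod +x hello         # file permissions are not preserved in artifacts
          ./hello
```

The uploaded artifact can also be downloaded from the summary page of the workflow run in the web interface.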

Code reuse

When creating a large workflow or pipeline, we might be tempted to copy/paste the job definition. What alternatives do we have?

YAML anchors also work on GitHub Actions (with some limitations compared to GitLab).

In the GitHub context, actions themselves are the main building block, and they are reusable by default.

Parametric tasks are also a way to “reuse” code, or at least to avoid code duplication.
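A minimal sketch of the YAML anchor approach, assuming anchors and aliases are accepted by the GitHub instance you use: the list of steps is written once in the first job and referenced from the second. The workflow and job names are made up, and the sketch assumes that both gcc and clang are installed on the runner image.

```yaml
name: AnchorSketch               # hypothetical workflow, for illustration only
on:
  - workflow_dispatch

jobs:
  build_gcc:
    runs-on: ubuntu-latest
    env:
      CC: gcc
    steps: &build_steps          # anchor: give the list of steps a name
      - uses: actions/checkout@v6
      - name: "build"
        run: $CC -o hello ./src/hello.c
      - name: "run"
        run: ./hello
  build_clang:
    runs-on: ubuntu-latest
    env:
      CC: clang                  # assumption: clang is available on the runner
    steps: *build_steps          # alias: reuse the same list of steps
```

Only the job-level environment differs between the two jobs, so a change to the steps has to be made in one place only.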

Parametric tasks: Matrix

Sometimes you might want to run the same job for many different “cases”, or parameters (e.g., different versions of a library/dependency/compiler).

Different variations of a job in a workflow can be run by using a matrix strategy.
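A minimal sketch of a matrix strategy on GitHub, reusing the hello.c program from the basic example. The workflow name and the list of compilers are illustrative assumptions; the sketch assumes both compilers are installed on the runner image.

```yaml
name: MatrixSketch               # hypothetical workflow, for illustration only
on:
  - workflow_dispatch

jobs:
  build_and_run:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false           # let the other variations finish if one fails
      matrix:
        compiler: [gcc, clang]   # one job is created per value
    steps:
      - uses: actions/checkout@v6
      - name: "build with ${{ matrix.compiler }}"
        run: ${{ matrix.compiler }} -o hello ./src/hello.c
      - name: "run"
        run: ./hello
```

Adding a second key under matrix creates one job for every combination of values.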

Mirroring

A mirror is a copy of a repository that is automatically kept in sync with the original, typically on a different forge, to take advantage of different features or to reach different communities.

There are 2 possible techniques:

  • Pull: the “mirror” repository is configured to automatically pull from the original at regular intervals. This is typically a “premium” feature, as it is generally inefficient (polling).

  • Push: the original repository is configured to automatically push to the “mirror” whenever there are changes. This is typically more efficient.

To set up a push mirror, one can use GitHub Actions (using the action https://github.com/wangchucheng/git-repo-sync).

A small guide is available in the github-to-gitlab-mirror branch of the example repository.

When triggered, the workflow on that branch pushes the current branch of the repository to a mirror on GitLab.com.
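To show the principle of a push mirror, here is a sketch that uses a plain git push instead of the action mentioned above, so it is not the workflow from the example repository and does not show that action's inputs. It assumes a repository secret GITLAB_TOKEN holding a GitLab access token with write access, and a repository variable GITLAB_PROJECT holding the path of the mirror (for example "user/repo"); both names are hypothetical.

```yaml
name: MirrorSketch               # hypothetical workflow, for illustration only
on:
  push:
    branches:
      - "**"                     # branch pushes only, so GITHUB_REF_NAME is a branch name
  workflow_dispatch:             # if launched manually, launch it from a branch

jobs:
  push_to_gitlab:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0         # full history; a shallow clone may be rejected on push
      - name: "push the current branch to the mirror"
        env:
          GITLAB_TOKEN: ${{ secrets.GITLAB_TOKEN }}     # hypothetical secret
          GITLAB_PROJECT: ${{ vars.GITLAB_PROJECT }}    # hypothetical variable
        run: |
          # --force makes the mirror follow the original even after history rewrites
          git push --force \
            "https://oauth2:${GITLAB_TOKEN}@gitlab.com/${GITLAB_PROJECT}.git" \
            "HEAD:refs/heads/${GITHUB_REF_NAME}"
```

Keeping the token in a secret means it never appears in the repository or, unmasked, in the logs.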

Push mirror from GitHub to GitLab

  1. Fork the example repository on GitHub.

  2. Create an empty repository on a GitLab server (tip: disable notifications)

  3. Check out the github-to-gitlab-mirror branch in the example repo

  4. Follow the instructions in the README.md to set up the mirroring

Conditional execution of workflows, jobs and pipelines

On both platforms it is possible to make the execution of steps, jobs or the full workflow depend on some conditions.

Typical cases are:

  • Always: force the execution of jobs/steps no matter what

  • Only in case of a merge

  • Only in case of a tagged commit

  • Only on a particular branch

  • Based on expressions involving environment variables

  • Only when some files are changed
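Some of the cases above can be sketched on GitHub as follows. The workflow name, the tag pattern and the repository variable RUN_BENCHMARKS are hypothetical; the build step reuses the hello.c program from the basic example.

```yaml
name: ConditionalSketch          # hypothetical workflow, for illustration only
on:
  push:
    branches:
      - main                     # pushes to this branch ...
    tags:
      - "v*"                     # ... or tags matching this pattern
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: "build"
        run: gcc -o hello ./src/hello.c
      - name: "only for tagged commits"
        if: startsWith(github.ref, 'refs/tags/')
        run: echo "building tag $GITHUB_REF_NAME"
      - name: "always: runs even if an earlier step failed"
        if: always()
        run: echo "cleaning up"

  benchmark:
    if: vars.RUN_BENCHMARKS == 'true'    # hypothetical repository variable
    runs-on: ubuntu-latest
    steps:
      - name: "report"
        run: echo "benchmarks enabled for this repository"
```

Conditions on changed files are usually expressed at the trigger level instead, with a paths filter under on.push.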

Self-hosting runners

There might be situations where you need to run the workflows/pipelines on resources you own, instead of relying on github.com or any specific GitLab server.

Possible reasons:

  • GitHub Actions and GitLab CI/CD on gitlab.com have usage limits (for private repos) or reliability issues, and your test suites take too long

  • You need to test (or benchmark) your code with specific hardware and software (e.g., on HPC)

Both GitLab and GitHub schedule jobs on runners by comparing the tags/labels of the runners with the tags/labels of the job: for a job to execute on a given runner, the tags/labels of the job must be a subset of the tags/labels of the runner.

One can add a self-hosted runner to a repository.

Warning

It is a security risk to have a self-hosted runner attached to public repositories, because in that case an attacker could open a merge request and run malicious code as a part of a workflow that gets executed by your self-hosted runner.

Properly managing mirrors can alleviate this problem.

Most importantly:

  • A job launched on the self-hosted runner can access the whole machine/VM/container it is running on, and it runs in that context, so one might have to start the runner inside a container (but then, if a job itself needs to run in a container, there might be issues)

  • The workflow files might need to be adjusted: in particular, the labels given in runs-on for every job need to be a subset of the ones that identify the self-hosted runner.
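The adjustment of runs-on can be sketched as follows for GitHub. The labels self-hosted, linux and x64 are the ones a Linux x64 self-hosted runner normally receives on registration, while gpu stands for a custom label that is assumed to have been added to the runner by hand; the workflow and job names are made up.

```yaml
name: SelfHostedSketch           # hypothetical workflow, for illustration only
on:
  - workflow_dispatch            # manual trigger only, in line with the warning above

jobs:
  benchmark:
    # the job is scheduled only on a runner that has ALL of these labels
    runs-on: [self-hosted, linux, x64, gpu]   # "gpu" is a hypothetical custom label
    steps:
      - uses: actions/checkout@v6
      - name: "build"
        run: gcc -o hello ./src/hello.c
      - name: "run"
        run: ./hello
```

A runner labelled only self-hosted, linux and x64 would not pick up this job, because gpu is missing from its label set.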