How to document your research software

Popular tools and solutions

Questions

  • What tools are out there?

  • What are their pros and cons?

Objectives

  • Choose the right tool for the right reason.


Documentation Tools: comparison

Comparison of the tools for documentation we have discussed so far

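| Type            | Convenient | Easy  | Maintainable | Searchable | Readable | LLM-friendly | Notes                          |
|-----------------|------------|-------|--------------|------------|----------|--------------|--------------------------------|
| in-code doc     | ✅✅       | 🟨    | ✅🟨         | 🟨         | ❌       | ✅🟨         | ❌ for users                   |
| README          | ✅         | ✅    | ✅🟨         | 🟨         | ✅       | ✅           | typically enough               |
| HTML Generators | 🟨         | ❌    | ✅🟨         | ✅         | ❌       | ✅✅         | powerful                       |
| Wikis           | 🟨         | ✅    | ❌❌         | ✅         | ✅       | ❌           | ✅ for non-programmers         |
| LaTeX           | 🟨 (?)     | ❌    | ❌🟨         | 🟨         | ✅ (?)   | ❌           | ✅ Physics/Math, ❌ copy/paste |
| Jupyter         | 🟨         | 🟨/❌ | ✅✅         | 🟨 (?)     | ✅       | 🟨           | ✅ validation tooling          |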

What do we mean?

  • Convenience: for programmers who live in code.

  • Easiness: how easy is it to contribute and set up?

  • Maintainability is good for those tools that can be version-controlled along with the code. It is even better if it is easy to check automatically that the information is correct (does the output of a snippet of code match what is shown in the docs?)

  • Searchability: How easy is it to find the information we need?

  • Readability: Can the documentation be rendered in a way that makes it easy to read?

  • LLM-friendliness: how easy is it to feed this documentation to an LLM?
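
The automatic check mentioned under maintainability can be sketched with Python's standard `doctest` module, which runs the examples embedded in a docstring and compares the actual output with what the documentation shows. This is only a minimal illustration: the functions `mean` and `check_docs` are hypothetical and not part of any tool discussed here.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def check_docs():
    """Run the examples in the docstring of ``mean``.

    Returns a tuple (failed, attempted) so that a caller, for example a
    continuous integration job, can decide whether the docs are still correct.
    """
    parser = doctest.DocTestParser()
    # Build a test from the docstring; "mean" must be visible to the examples.
    test = parser.get_doctest(mean.__doc__, {"mean": mean}, "mean", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = check_docs()
print(f"{attempted} examples run, {failed} failed")
```

If the behaviour of `mean` changes without the docstring being updated, the number of failed examples becomes non-zero, which is the kind of signal that keeps documentation and code in sync.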


HTML static site generators

There are many tools that generate documentation that can be viewed locally or hosted on the web.

Here are some HTML static site generators relevant to our communities. These tools offer some or all of these features:

  • API Reference generation: the source code is read and scanned for docstrings, which are then rendered

  • Search: they offer a “whole site” search feature (non-trivial when viewing only one page). (if you can download )

  • Validation: check that the code snippets in the documentation match the real behaviour of the code.

  • Continuous checks: regenerate automatically every time you save, so that you can catch errors early
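
As a rough illustration of API reference generation, the sketch below uses Python's standard `inspect` module to collect the signatures and docstrings of a few functions and render them as plain text. Real generators do far more (cross-references, type information, themes); the names `render_api_reference`, `area` and `perimeter` are hypothetical and chosen only for this example.

```python
import inspect


def area(width, height):
    """Return the area of a rectangle."""
    return width * height


def perimeter(width, height):
    """Return the perimeter of a rectangle."""
    return 2 * (width + height)


def render_api_reference(functions):
    """Build a plain-text API reference from the docstrings of ``functions``."""
    lines = []
    # Sort by name so the reference is stable regardless of input order.
    for func in sorted(functions, key=lambda f: f.__name__):
        signature = inspect.signature(func)
        doc = inspect.getdoc(func) or "(undocumented)"
        lines.append(f"{func.__name__}{signature}")
        lines.append(f"    {doc}")
        lines.append("")
    return "\n".join(lines)


reference = render_api_reference([perimeter, area])
print(reference)
```

The point of the sketch is that the reference is derived from the source, so it cannot drift away from the function names and signatures that actually exist.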

  • Sphinx ← this is how this lesson material is built

    • Generate HTML/PDF/LaTeX from RST and Markdown (MyST)

    • Most Python projects use Sphinx, but Sphinx is not limited to Python.

    • Read the Docs hosts public Sphinx documentation for free!

    • API Reference generation: via autodoc or autoapi

    • Search:

      • limited, keyword-based client-side (JavaScript that runs in the browser)

      • Full-text server-side on Read the Docs

    • Validation: via doctest

  • MkDocs: A Markdown-first static site generator (with a large ecosystem of independently developed plugins).

    • API Reference generation: via mkdocstrings

    • Search: client-side via the search plugin (JavaScript that runs in the browser - lunr.js). As of 2026 the project is no longer maintained [1].

  • Doxygen:

    • API Reference generation: also supports Python

  • pkgdown

    • API Reference generation: via roxygen2 and Rdconv

    • Uses RMarkdown and a LaTeX-like syntax

    • Search:

      • client-side (JavaScript that runs in the browser - fuse.js)

      • also typically available in RStudio

    Long-form documentation for R is typically contained in vignettes.

  • Doxygen

    • API Reference generation out of the box, generates static call graph

    • Focus on Documentation directly in the source code

    • Markdown-like syntax, with its own flavour and special commands

    • Search:

      • limited keyword-based client-side

      • full text search server-side

  • Sphinx can also be used to generate documentation for C++ projects, using the XML output from Doxygen via Breathe

  • Doxygen:

    • API Reference generation out of the box, generates static call graph (but has limited Fortran parsing capabilities)

    • Focus on Documentation directly in the source code

    • Markdown-like syntax, with its own flavour and special commands

    • Search:

      • limited keyword-based client-side

      • full text search server-side

  • FORD

    • Python-based

    • Search: client-side (JavaScript that runs in the browser - lunr.js)

  • Documenter.jl

    • Uses Markdown (JuliaMarkdown flavour)

    • Parses Julia code and in-code documentation/docstrings

    • Search: client-side (but typically the whole site is loaded for search on every page)

    • Validation: runs the code examples in the documentation and checks their output

  • RustDoc

    • Uses Markdown (CommonMark flavour)

    • Search: client-side (JavaScript that runs in the browser - elasticlunr.js)

    • Validation: validates code examples when run with --test
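
Several of the tools above rely on keyword-based client-side search. The idea can be sketched as a small inverted index that is built once when the site is generated and then queried without a server. The sketch is written in Python for readability, whereas the tools listed ship JavaScript libraries such as lunr.js; the page names and texts are hypothetical, and real libraries add stemming, ranking and fuzzy matching.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each keyword to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index, query):
    """Return the sorted names of pages that contain every query keyword."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # A page matches only if it appears in the entry of every keyword.
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)


# Hypothetical pages of a small documentation site.
pages = {
    "install": "How to install the package with pip",
    "usage": "How to use the package from the command line",
    "api": "Reference for every public function",
}
index = build_index(pages)
print(search(index, "install package"))
```

Because the index only knows exact keywords, a query for "installing" would find nothing here, which is one reason the lists above describe this kind of search as limited compared with full-text server-side search.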

These are general-purpose static website generators that match the philosophy of the other tools presented so far, but might be better suited for blogging, reports or other kinds of publications:

  • Hugo

  • Hexo

  • Zola ← this is what we use for our project website and workshop websites

  • Jekyll, default for GitHub Pages

  • Franklin.jl: focuses on technical blogging for the Julia community

  • Quarto converts Markdown to websites, PDFs, ebooks and many other things (dynamic notebook-based documents)


Hosting Documentation on the Web

GitHub, GitLab, and Bitbucket make it possible to serve HTML pages:

  • GitHub Pages

  • Bitbucket Pages

  • GitLab Pages

Read the Docs is also free to use for open source code, and can be connected to common software forges.

Discussion

Do you know an awesome tool or feature that should be in this list? Let us know! (Open a PR)


Plain Text formats: reStructuredText and Markdown

# This is a section in Markdown   This is a section in RST
                                  ========================

## This is a subsection           This is a subsection
                                  --------------------

Nothing special needed for        Nothing special needed for
a normal paragraph.               a normal paragraph.

                                  ::

    This is a code block            This is a code block


**Bold** and *emphasized*.        **Bold** and *emphasized*.

A list:                           A list:
- this is an item                 - this is an item
- another item                    - another item

There is more: images,            There is more: images,
tables, links, ...                tables, links, ...

  • reStructuredText and Markdown are two of the most popular lightweight markup languages.

  • reStructuredText (RST) has more features than Markdown, but the choice is a matter of taste.

  • There are (unfortunately) many flavors of Markdown.

  • Motivation to stick to a standard text-based format: standard formats make it easier to move the documentation to other tools that expect the same format as the project/organization grows.

  • We use MyST-flavored Markdown in the Sphinx and Markdown episode and the Hosting websites/homepages on GitHub Pages example.

  • Nice resource to learn Markdown: Learn Markdown in 60 seconds

  • Pandoc can convert between MD and RST (and many other formats).
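
To make the difference between the two heading styles in the comparison above concrete, here is a toy sketch that rewrites Markdown `#` headings as RST underlined headings. It is not a substitute for Pandoc: it handles only heading levels 1 to 3, does not recognise code blocks, and the choice of underline characters is an assumption of this example, since RST lets authors pick their own. The function name `md_headings_to_rst` is hypothetical.

```python
import re

# Underline characters for heading levels 1-3. RST lets authors choose these,
# so this particular mapping is an assumption of the sketch.
UNDERLINES = {1: "=", 2: "-", 3: "~"}


def md_headings_to_rst(markdown_text):
    """Convert Markdown '#' headings (levels 1-3) to RST underlined headings."""
    output = []
    for line in markdown_text.splitlines():
        match = re.match(r"^(#{1,3})\s+(.*\S)\s*$", line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            output.append(title)
            # RST requires the underline to be at least as long as the title.
            output.append(UNDERLINES[level] * len(title))
        else:
            output.append(line)
    return "\n".join(output)


print(md_headings_to_rst("# This is a section\n\n## This is a subsection"))
```

Everything that is not a heading passes through unchanged, which works for plain paragraphs because both formats treat them the same way.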

Keypoints

  • READMEs are typically a good starting point

  • Some popular solutions make reproducibility and maintenance of multiple code versions difficult.

  • The landscape of tools is very diverse and every community has its own favourite.

  • The basic functionality of all static site generators is very similar, but specific aspects (API reference generation, search, validation) differ.


[1]

After somewhat dramatic events (2026), MkDocs 1.x is now superseded by Zensical, which tries to keep compatibility with MkDocs 1.x. (MkDocs 2.0 is also being developed, but projects and plugins based on 1.x will break.)

© Copyright CodeRefinery contributors.

Built with Sphinx using a theme provided by Read the Docs.