How pre-commit Will Make Your (Coding) Life Easier

If you code in a team, you probably know the problem: everyone has a different coding style. If you code alone, you might find yourself not following any coding style at all. This is where pre-commit comes in. It is a tool that runs linters and code formatters as a pipeline of hooks. It is intended to be used with git hooks, so that the pipeline runs before you commit your code, but I also use it a lot to check my code locally after any changes.

How to set it up

If you are a Python power user, you will probably install pre-commit using pip or conda. After that, create a .pre-commit-config.yaml file in the root directory of your project. This file lists all the tools (hooks) that you want to run in your pipeline. You can find a list of all supported hooks at https://pre-commit.com/hooks.html. In the following sections I will show you my configuration for Python projects.

The complete configuration

This is what my .pre-commit-config.yaml file looks like:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      # list of supported hooks: https://pre-commit.com/hooks.html
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
      - id: check-case-conflict
      - id: debug-statements
      - id: detect-private-key

  # rewrite code to use the syntax of the newest python version specified
  - repo: https://github.com/asottile/pyupgrade
    rev: v3.10.1
    hooks:
      - id: pyupgrade
        args: [--py311-plus, --keep-percent-format]

  # remove unused imports and unused variables
  - repo: https://github.com/PyCQA/autoflake
    rev: v2.2.0
    hooks:
      - id: autoflake
        args:
          - --in-place
          - --remove-all-unused-imports
          - --expand-star-imports
          - --remove-duplicate-keys
          - --remove-unused-variables

  # finds unreferenced (dead) python code
  - repo: https://github.com/jendrikseipp/vulture
    rev: "v2.7" # or any later Vulture version
    hooks:
      - id: vulture

  # python code formatting
  - repo: https://github.com/psf/black
    rev: 23.7.0
    hooks:
      - id: black

  # reformat docstrings
  - repo: https://github.com/PyCQA/docformatter
    rev: v1.7.5
    hooks:
      - id: docformatter
        args:
          [
            --recursive,
            --in-place,
            --wrap-summaries,
            "120",
            --wrap-descriptions,
            "120",
          ]

  # python import sorting
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort
        args: [--settings-path, "pyproject.toml"]

  # yaml formatting
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.0.0
    hooks:
      - id: prettier
        args: [--print-width, "120", --end-of-line, "lf"]

  # python code analysis
  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0
    hooks:
      - id: flake8

I think the hooks in the first section are pretty self-explanatory. They all come from the pre-commit-hooks repository. The other tools deserve a little more detail.

Many tools can be configured either directly in the .pre-commit-config.yaml file or in a separate configuration file. Most tools these days support the pyproject.toml file, where each tool reads its settings from its own [tool.<name>] table, for example:

[tool.vulture]
exclude = ["conf/", "data/", "docs/", "notebooks/", "output*/", "logs/", "tests/"]
make_whitelist = false
min_confidence = 80
paths = ["my_project/"]
sort_by_size = true
verbose = false
ignore_names = ["args", "kwargs"]


[tool.isort]
profile = "black"
src_paths = ["my_project", "tests"]
line_length = 120
force_alphabetical_sort_within_sections = true


[tool.black]
line-length = 120
target-version = ['py39', 'py310']

The only tool that does not (yet) support the pyproject.toml file is flake8. For this I use the setup.cfg file. It contains the following:

[flake8]
max_line_length = 120
show_source = True
format = pylint
# flake8 does not support inline comments, so each comment sits on its own line
ignore =
    # whitespace before ':'
    E203
    # invalid escape sequence
    W605
    # line break before binary operator
    W503
exclude =
    .git
    __pycache__
    data/*
    notebooks/*
    logs/*
    tests/*
    outputs/*

How to use it

After you have set up the configuration file, you can run the pipeline with the following command:

pre-commit run --all-files

This runs all the tools on all files in your repository and leaves you with beautiful and clean code. All the changes that can be made automatically are applied. Anything that cannot be fixed automatically is reported, so that you can fix it manually.