Evolving QA tooling

On the origin of linters

QA support in Gentoo has been a fluid, amorphous goal over the project’s history. Throughout the years, developers have invented their own scripts and extensions to work around the limitations of official tooling. More recently, the relaxed standards have been tightened up a fair amount, but it should be possible to achieve more results with further improvement.

Beginning my tenure as an ebuild maintainer between 2005 and 2010, much of the development process revolved around CVS and repoman, both of which felt slow and antiquated even at the outset. Thankfully, CVS was swapped out for git in 2015, but repoman stuck around for years after that. While work was done on repoman over the years that followed, its overall design flaws were never corrected leading to it being officially retired in 2022 in favor of pkgcheck (and pkgdev).

Comparatively speaking, pkgcheck is much better designed than repoman; however, it still lags in many areas generally due in part to relying on pkgcore[1] and using an amalgamation of caching and hacks to obtain a modicum of performance via parallelization. In short, performance can still be drastically improved, but the work required to achieve such results is not easy.

Pkgcraft support

Similar to how pkgcheck builds on top of pkgcore, pkgcraft provides its core set of QA tooling via pkgcruft[2], an ebuild linter featuring a small subset of pkgcheck’s functionality with several extensions. As the project is quite new, its limited number of checks run the spectrum from bash parsing to dependency scanning.

An API for scanning and reports is also provided, allowing language bindings for pkgcruft or building report (de)serialization into web interfaces and other tools. For example, a client/server combination could be constructed that creates and responds to queries related to reports generated by given packages between certain commit ranges.

Looking towards the future, the current design allows extending its ability past ebuild repos to any viable targets that make sense on Gentoo systems. For example, it could be handy to scan binary repos for outdated packages, flag installed packages removed from the tree, or warn about USE flag settings in config files that aren’t relevant anymore. These types of tasks are often handled in a wide variety of places (or left to user implementation) at varying quality and performance levels.

Install

For those running Gentoo, it can be found in the main tree at dev-util/pkgcruft. Alternatively, it can be installed via cargo using the following commands:

Current release: cargo install pkgcruft

From git: cargo install pkgcruft --git https://github.com/pkgcraft/pkgcraft.git

Pre-built binaries are also provided for releases on supported platforms.

Metadata issues

Before going through usage patterns, it should be noted that pkgcraft currently doesn’t handle metadata generation in threaded contexts so pkgcruft will often crash when run against ebuilds with outdated metadata. Fixing this requires redesigning how pkgcraft interacts with its embedded bash interpreter, probably forcing the use of a process spawning daemon similar to pkgcore’s ebd (ebuild daemon), but handled natively instead of in bash.

A simple workaround involves incrementally generating metadata via running pk pkg metadata from any ebuild repo directory[3]. If that command completes successfully, then pkgcruft can be run from the same directory as well. On failure, the related errors should be fixed and metadata generated before attempting to run pkgcruft. So as a reference, pkgcruft can safely be run on writable repos via a command similar to the following:

  • pk pkg metadata && pkgcruft scan

It might be easiest to add a shell alias allowing for options to be specified for pkgcruft scan until pkgcraft’s metadata generation issue with threads is solved.

Usage

Much of the pkgcruft’s command-line interface mirrors that of pkgcheck as there are only so many ways to construct a linter and it aids mapping existing knowledge to a new tool. See the following commands for example usage:

Scanning

Scan the current directory assuming it’s inside an ebuild repo:

  • pkgcruft scan

Scan an unconfigured, external repo:

  • pkgcruft scan path/to/repo

Scan the configured gentoo repo:

  • pkgcruft scan --repo gentoo
  • pkgcruft scan '*::gentoo'

Scan all dev-python/* ebuilds in the configured gentoo repo:

  • pkgcruft scan --repo gentoo 'dev-python/*'
  • pkgcruft scan 'dev-python/*::gentoo'

See the help output for other scan-related options such as reporter support or report selection. Man pages and online documentation will also be provided in the future.

  • pkgcruft scan --help

Filtering

Native filtering support is included via the -f/--filters option allowing specific package versions matching various conditions to be targeted. Note that filters can be chained and inverted to further specify targets. Finally, only checks that operate on individual package versions can be run when filters are used, all others are automatically disabled.

Restrict to the latest version of all packages:

  • pkgcruft scan -f latest

Restrict to packages with only stable keywords:

  • pkgcruft scan -f stable

Restrict to unmasked packages:

  • pkgcruft scan -f '!masked'

Restrict to the latest, non-live version:

  • pkgcruft scan -f '!live' -f latest

Beyond statically defined filters, much more powerful package restrictions are supported and can be defined using a declarative query format that allows logical composition. More information relating to valid package restrictions will be available once better documentation is written for them and pkgcraft in general. Until that work has been done, see the following commands for example usage and syntax:

Restrict to non-live versions maintained by the python project:

  • pkgcruft scan -f '!live' -f "maintainers any email == 'python@gentoo.org'"

Restrict to packages without maintainers:

  • pkgcruft scan -f "maintainers is none"

Restrict to packages with RDEPEND containing dev-python/* and empty BDEPEND:

  • pkgcruft scan -f "rdepend any 'dev-python/*' && bdepend is none"

Replay

Similar to pkgcheck, replay support is provided as well supporting workflows that cache results and then replay them later, potentially using custom filters. Pkgcruft only supports serializing reports to newline-separated JSON objects at this time which can be done via the following command:

  • pkgcruft scan -R json > reports.json

The serialized reports file can then be passed to the replay subcommand to deserialize the reports.

  • pkgcruft replay reports.json

This functionality can be used to perform custom package filtering, sort the reports, or filter the report variants. See the following commands for some examples:

Replay all dev-python/* related reports, returning the total count:

  • pkgcruft replay -p 'dev-python/*' reports.json -R simple | wc -l

Replay all report variants generated by the Whitespace check:

  • pkgcruft replay -c Whitespace reports.json

Replay all python update reports:

  • pkgcruft replay -r PythonUpdate reports.json

Replay all reports in sorted order:

  • pkgcruft replay --sort reports.json

Benchmarks and performance

Rough benchmarks comparing pkgcruft and pkgcheck targeting a related check run over a semi-recent gentoo repo checkout on a modest laptop with 8 cores/16 threads (AMD Ryzen 7 5700U) using a midline SSD are as follows:

  • pkgcheck: pkgcheck scan -c PythonCompatCheck -j16 – approximately 5s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j16 – approximately .56s

For comparative parallel efficiency, pkgcruft achieves the following with different amounts of jobs:

  • pkgcruft: pkgcruft scan -c PythonUpdate -j8 – approximately .65s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j4 – approximately 1s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j2 – approximately 2s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j1 – approximately 4s

Note that these results are approximated averages for multiple runs without flushing memory caches. Initial runs of the same commands will be slower due to additional I/O latency.

While the python update check isn’t overly complex it does require querying the repo for package matches which is the most significant portion of its runtime. Little to no work has been done on querying performance for pkgcraft yet, so it may be possible to decrease the runtime before resorting to drastic changes such as a more performant metadata cache format.

While it should still be able to improve, pkgcruft already runs faster using a single thread than pkgcheck running on all available cores. Most of this probably comes from the implementation language which is further exhibited when restricting runs to single category and package targets where process startup time dominates. See the following results for the same check run in those contexts:

Targeting dev-python/*:

  • pkgcheck: pkgcheck scan -c PythonCompatCheck -j16 – approximately 1s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j16 – approximately .13s

Targeting dev-python/jupyter-server:

  • pkgcheck: pkgcheck scan -c PythonCompatCheck -j16 – approximately .38s
  • pkgcruft: pkgcruft scan -c PythonUpdate -j16 – approximately .022s

Note that in the case of targeting a single package with multiple versions, pkgcruft currently doesn’t parallelize per version and thus could possibly half its runtime if that work is done.

Finally, in terms of memory usage pkgcruft usually consumes about an order of magnitude less than pkgcheck mostly due to python’s ownership model as rust can more easily use immutable references rather than cloning objects. Also, pkgcheck’s parallel design uses processes instead of threads due to python’s weaker concurrency support again due to historical language design[4] leading to more inefficiency. This difference may increase as more intensive checks or query caching is implemented as pkgcruft should be able to share writable objects between threads via locking or channels more readily than pkgcheck can in a performant manner between processes.

But is the duplicated effort worth it?

Even with some benchmarks showing potential, it may be hard to convince others that reworking QA scanning yet again is a worthwhile endeavor. This is a fair assessment as much work has gone into pkgcheck in order to bring it to its recent state underpinning Gentoo’s QA. When regarding this opinion, it helps to revisit why repoman was supplanted and discuss its relative performance difference compared to pkgcheck.

Disregarding the work done on enabling more extensive checks, it can be argued that pkgcheck’s performance differential allowed it to be more reasonably deployed at scale and is one of the main reasons Gentoo QA has noticeably improved in the last five to ten years. Instead of measuring a full tree scan in hours (or perhaps even days on slower machines) it can run in minutes. This has enabled Gentoo’s CI (continuous integration) setup to flag issues within a shorter time period after being pushed to the tree.

Pkgcheck’s main performance improvement over repoman came in terms of its design enabling much better internal parallelization support which repoman entirely lacked for the majority of its existence. However, single thread performance was much closer for similar use cases.

With that in mind, pkgcruft runs significantly faster than pkgcheck for single threaded comparisons of related checks before taking its more efficient parallelization design (threads vs processes) into account. Similar to the jump from repoman to pkgcheck, using pkgcruft could enable even more CI functionality that has never been seriously considered such as rejecting git pushes server-side due to invalid commits.

Whether this makes the reimplementation effort worthwhile is still debatable, but it’s hard to argue against a design that achieves similar results using an order of magnitude less time and space with little work done towards performance thus far. If nothing else, it exhibits a glimpse of potential gains if Gentoo can ever break free of its pythonic shackles.

Future work

As with all replacement projects, there are many features pkgcruft lacks when comparing it to pkgcheck. Besides the obvious check set differential, the following are a few ideas beyond what pkgcheck supports that could come to fruition if more work is completed.

Viable revdeps cache

Verifying reverse dependencies (revdeps) is related to many dependency-based checks most of which are limited in scope or have to run over the entire repo. For example, when removing packages pkgcheck needs to do a full tree visibility scan in order to verify package dependencies.

Leveraging a revdeps cache, this could be drastically simplified to checking a much smaller set of packages. The major issues with this feature are defining a cache format supporting relatively quick (de)serialization and restriction matching while also supporting incremental updates in a performant fashion.

Git commit hooks

None of the QA tools developed for Gentoo have been fast enough to run server-side per git push, rejecting invalid commits before they hit the tree. In theory, pkgcruft might be able to get there, running in the 50-500ms range depending on the set of checks enabled, amount of target packages, and hardware running them.

Properly supporting this while minding concurrent pushes requires a daemon that the git hook queues tasks on with some type of filtering to ignore commits that cause too many package metadata updates (as it would take too long to responsively update metadata and scan them for most systems). Further down the road, it could make sense to decouple pushing directly to the main branch and instead provide support for a merge queue backed by pkgcruft thus alleviating some of the runtime sensitive pressure allowing to move from sub-second goals to sub-minute especially if some sense of progress and status is provided for feedback.

Native git bisect support

Extending pkgcheck’s git support provided by pkgcheck scan --commits, it should be possible to natively support bisecting ebuild repo commit ranges to find a bad commit generating certain report variants. Gentoo CI supports this in some form for its notification setup, but implements it in a more scripted fashion preventing regular users from leveraging it without recreating a similar environment.

Pkgcruft could internally run the procedure using native git library support and expose it via a command such as pkgcruft bisect a..b. While this may be a workflow only used by more experienced devs, it would be handy to support natively instead of forcing users to roll their own scripts.


  1. Pkgcore ambles over the low bar set by portage’s design but has been showing its age since 2015 or so. It’s overly meta, leaning into python’s “everything is an object” tenet too much while hacking around the downsides of that approach for performance reasons. 

  2. Aiming to fight the neverending torrent of package cruft in ebuild repos. 

  3. Install pkgcraft-tools in order to use the pk command. 

  4. Python’s weaker threading support may be improved due to ongoing work to disable the GIL (global interpreter lock) in CPython 3.13; however, it’s still difficult to see how a language not designed for threading (outside usage such as asynchronous I/O) adapts while supporting both GIL and non-GIL functionality as currently, separate builds (having already gone through a compatibility fiasco during the py2 -> py3 era).