In the previous post on extending bash, using builtins was mentioned as a way to improve extensibility. Rather than writing native bash functions or spawning processes to run external tools, pkgcraft implements all its bash command support using builtins.
For example, the
inherit command used to load eclasses,
die, and all
install-related functionality (e.g.
doins) are all implemented as
builtins. This allows for a more seamless experience compared to pkgcore
which implements all of this natively in bash using a simple daemon that sends
messages via shared fds to communicate between the python and bash sides.
Note that these builtins are not readily available for use in regular bash since most are highly Gentoo specific, often rely on underlying build state, and aren’t built in a fashion that can be externally exposed. However, the design work done to support bash in rust also allows creating builtins compatible with standard bash.
For those interested in bash and rust, the following walkthrough explains how dynamic builtins work, describes some of the rust support required for interoperability, and discusses why they’re useful.
For background, bash includes builtins that many people use daily, e.g. source.
In addition to these, external builtins can be loaded dynamically from shared
object files via enable -f, which uses dlopen to open the shared object and
dlsym to load the symbol for the related builtin, if it exists. The builtin is
then registered internally as dynamically loaded and can be used until it is
either unloaded or the shell exits.
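As a minimal sketch of the loading step, the snippet below wraps bash's enable -f in a small helper. The helper name load_builtin and the path and builtin name in the usage note are hypothetical; only enable -f itself comes from bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: load the builtin named $2 from the shared object $1.
load_builtin() {
    local lib=$1 name=$2

    # Fail early with a clear message instead of relying on dlopen's error.
    if [ ! -f "${lib}" ]; then
        echo "load_builtin: no such shared object: ${lib}" >&2
        return 1
    fi

    # enable -f makes bash dlopen the file and look up the builtin's symbol.
    enable -f "${lib}" "${name}"
}
```

With a real library this would be called as something like load_builtin ./libexample.so example, where both arguments are placeholders.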
To remove a previously loaded builtin, use enable -d followed by its name.
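A minimal sketch of the removal step, assuming a hypothetical helper named unload_builtin. It relies on enable -d failing for anything that was not loaded dynamically, such as bash's own static builtins.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unload a builtin that was loaded with enable -f.
unload_builtin() {
    local name=$1

    # enable -d only succeeds for dynamically loaded builtins.
    if enable -d "${name}" 2>/dev/null; then
        echo "unloaded ${name}"
    else
        echo "unload_builtin: ${name} is not a dynamically loaded builtin" >&2
        return 1
    fi
}
```

Calling it on a static builtin such as cd fails and leaves that builtin untouched.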
Running a builtin is the same as running most other commands: type its name followed by any arguments.
In terms of default execution precedence, similarly named functions come first, then builtins, and finally external binaries. This means if an in-scope function and a loaded builtin have the same name, running that name in the shell will run the function and not the builtin.
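The precedence order can be observed with a stock builtin rather than a dynamically loaded one. The sketch below shadows echo with a function, then removes each layer in turn. The last step assumes an echo binary exists somewhere in PATH.

```shell
#!/usr/bin/env bash
# Shadow the echo builtin with a function of the same name.
echo() { printf 'function: %s\n' "$*"; }

first=$(type -t echo)      # function: functions win over builtins
unset -f echo
second=$(type -t echo)     # builtin: used once the function is gone
enable -n echo             # disable the builtin without unloading anything
third=$(type -t echo)      # file, assuming an echo binary is found in PATH
enable echo                # re-enable the builtin

printf '%s\n' "${first}" "${second}" "${third}"
```

When a function does shadow a builtin, prefixing the call with the builtin keyword (e.g. builtin echo hello) skips the function.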
The version of bash installed by most distros should support dynamic builtins
inherently because bash itself doesn’t provide a disable mechanism; however,
Gentoo manually hacks the configure script to disable support by default. In
order to enable it, make sure to build bash with the
plugins USE flag enabled.
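One way to check whether a given bash build supports dynamic builtins is to attempt a load that cannot succeed and inspect the error. This is only a sketch: the helper name supports_dynamic_builtins and the path are hypothetical, and the error text it matches is an assumption about bash's enable builtin that should be verified against the bash version in use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when this bash build can load dynamic builtins.
supports_dynamic_builtins() {
    local output

    # Try to load from a path that cannot exist; only the error text matters.
    # LC_ALL=C keeps the message untranslated.
    output=$(LC_ALL=C "${BASH:-bash}" -c \
        'enable -f /nonexistent/libexample.so example' 2>&1) || true

    # Assumption: builds without loading support report this message.
    case ${output} in
        *"dynamic loading not available"*) return 1 ;;
        *) return 0 ;;
    esac
}

if supports_dynamic_builtins; then
    echo "dynamic builtins: supported"
else
    echo "dynamic builtins: not supported"
fi
```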
Builtins have access to nearly all of bash’s underlying API; however, they are
mainly limited to running in command form using simple string arguments. In
other words, scoped builtins that form more complex expressions, e.g. bash’s
[[ ]], generally require parser and/or grammar level
changes that aren’t possible to achieve in a basic builtin.
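The difference shows up when comparing [ with [[ in a stock shell. The former is a regular builtin that receives string arguments, while the latter is a keyword handled by the parser. A short illustration:

```shell
#!/usr/bin/env bash
# [ is an ordinary builtin: it receives its operands as plain string arguments.
type -t '['     # prints: builtin

# [[ is a shell keyword: the parser handles it, so no builtin could replace it.
type -t '[['    # prints: keyword

# Consequence: unquoted expansions are word-split for [ but not inside [[ ]].
var='two words'
[[ ${var} == 'two words' ]] && echo "[[ ]] saw a single operand"
[ ${var} == 'two words' ] 2>/dev/null || echo "[ saw too many arguments"
```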
One of the tricky parts of supporting dynamic builtins in rust is that it has no
support for life before main or library initialization similar to C. Therefore,
we must determine some way to provide external symbols for builtin structs that
can’t be initialized globally before init. To do this, rust relies on linker
support for runtime initialization via DT_INIT_ARRAY for ELF objects (and
similar on other platforms). This allows running a specified function during the
library loading process that replaces Option-wrapped, globally defined, static
mutables with their actual builtin structs required by bash.
Beyond building the shared objects, pkgcraft provides support for interacting with bash’s C API in rust via scallop. This enables performing most anything that can be done natively in bash. For example, bash variables can be bound, unbound, and marked as readonly. However, it should be noted that scallop is a young project so it only supports what pkgcraft has needed thus far, has many rough edges, and doesn’t come close to wrapping all of bash’s exported API.
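For reference, these are the native bash operations that correspond to binding, marking readonly, and unbinding a variable. This is plain bash rather than scallop's rust API, and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Bind: create (or overwrite) a global variable.
declare -g PKG_NAME="example"

# Mark as readonly: later assignments fail. The attempt runs in a subshell
# so the failed assignment cannot affect the rest of the script.
readonly PKG_VERSION="1.0"
( PKG_VERSION="2.0" ) 2>/dev/null || echo "PKG_VERSION is readonly"

# Unbind: remove the variable entirely.
unset -v PKG_NAME
echo "PKG_NAME is ${PKG_NAME-unset}"
```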
In addition to scallop, pkgcraft also provides pkgcraft-bash which is mainly an example project to create dynamic builtins. For our purposes, we’ll be exploring scallop and pkgcraft-bash while using them to demonstrate how rust-based builtins work.
First, the required tools for building the code should be installed. This
includes a recent rust compiler, C compiler, and a recent version of bash that
supports loading dynamic builtins from shared objects. I leave it up to the
reader to leverage rustup and/or their distro’s package manager to install the
required build tools (and others such as git that are implied).
Next, the required pkgcraft subprojects must be pulled down. The easiest way to do this is to recursively clone pkgcraft-workspace which should include semi-recent submodule checkouts for all the subprojects:
From this workspace, the pkgcraft-bash project can be built via:
This should create the shared pkgcraft-bash library
target/debug/libpkgcraft_bash.so from which dynamic builtins can be loaded.
In order to aid in bash development with rust, scallop provides a rudimentary profiling builtin. To load and use it, see the following example:
In short, it profiles a user-specified command over a period of time while counting loops completed. This could be extended to run cache warmups and perform more accurate statistical analysis, but its current form works for simple benchmarking.
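To make the idea concrete, here is a rough pure-bash approximation of that behavior. It is not scallop's implementation: the function name profile_loop is hypothetical, and it only has whole-second resolution because it relies on bash's SECONDS counter.

```shell
#!/usr/bin/env bash
# Hypothetical pure-bash stand-in for a profiling builtin: run a command
# repeatedly for roughly $1 seconds and report how many loops completed.
profile_loop() {
    local duration=$1
    shift
    local loops=0
    local end=$(( SECONDS + duration ))

    # SECONDS counts whole seconds since the shell started.
    while (( SECONDS < end )); do
        # Discard output and ignore failures; only completed loops are counted.
        "$@" >/dev/null 2>&1 || true
        (( loops += 1 ))
    done

    echo "${loops}"
}

profile_loop 1 true
```

The loop overhead is itself interpreted bash, so the counts are only useful for comparing commands against each other under the same harness.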
It’s quite fair to say that if you start benchmarking bash code then you probably shouldn’t be using bash; however, most Gentoo package managers include a relatively large amount of bash that should be optimized in cases where it runs often or in tight loops.
Pkgcraft leverages scallop to sidestep this entirely, allowing all native bash code required to support operating with ebuilds to be replaced with rust. Alongside that, this profile builtin helps highlight certain types of runtime regressions in pkgcraft’s builtin support.
Now that you have some experience with the
profile builtin, let’s compare the
performance of an actual rust-based builtin to similar functionality written
natively in bash for atom version comparisons.
First, download a copy of
eapi7-ver.eclass that contains the bash
implementation of the version comparison algorithm used in Gentoo for the
ver_test command in portage and pkgcore.
Next, check its performance using the
profile builtin. Note that if you
started a new bash shell, the
profile builtin will have to be reloaded.
With that baseline established for the native bash implementation, let’s create
a new builtin that wraps pkgcraft support to provide the same functionality.
It’s probably easiest to copy pkgcraft’s
ver_test builtin into pkgcraft-bash
with minor alterations in order to make it dynamically loadable.
Use the following diff that currently applies against the pkgcraft-bash repo to
add ver_test support (or use it as a guide if it has fallen out of date).
Once the diff is applied, rebuild pkgcraft-bash with pkgcraft support enabled from the root of the workspace, which will currently build the profile, atom, and ver_test builtins.
Then profile ver_test again, making sure to use the builtin implementation.
From the result, note that the rust implementation is over 20x faster than the native bash version. Through further work this can potentially be improved with more changes to bash’s builtin support. For example, bash currently does a binary search in its builtins array to find if a matching builtin exists before executing it. This should be quicker to perform as a simple hash table lookup instead.
Overall, I personally find most programming languages to be more maintainable than bash in the long term for any well-written code longer than relatively simple scripts. Add in rust’s ability to be exported via its FFI to any language that has C interoperability and it should become apparent why I prefer implementing such support in rust rather than bash.
If scallop keeps improving its wrapper API around bash, support for writing bash functionality in rust should continue to improve as well. Looking forward, it’s feasible something like bats could be written in rust or scallop’s functionality could be exported to another language, for example allowing python to natively interact with bash.
For the time being, I’ll just continue using it for one of the main reasons I created it: trying to avoid writing extensive code in bash.