What is it about?

The POSIX shell (the command-line interface to developer machines and servers, with bash and zsh as the most popular implementations) is to programming what glue is to carpentry. Shell scripts are critical infrastructure in development, operations, analysis, maintenance, and forensics, and developers and administrators interactively control their computers through the shell. The shell is a “glue” language because it is built to combine and connect existing programs, unlike traditional “general purpose” languages, which tend to build programs out of finer-grained parts.

As programming languages go, the shell is ancient, dating back to the origins of UNIX. But compared to its venerable UNIX contemporary, the C programming language, tooling and support for the shell have lagged far behind, even as the shell is more commonly used than ever. This paper is the latest in our line of work trying to build support for the POSIX shell. In it, we show how to apply out-of-order execution, a well-known technique from computer architecture, in the context of the shell.

How does it work? We take a shell script (say, a short program that specifies the steps to build a piece of software) and look at its commands. A normal shell would just run these commands in order, one at a time. We speculate instead: we start running commands out of order, before they would normally run. Running things early is fraught: what if a later command depends on an earlier command? What if a later command wants to send a message on the network or otherwise affect the outside world?

To safely run shell scripts out of order, we use tracing to identify when a speculated command was run too early, and containment to protect ourselves from the effects of commands that were run too early. Our approach draws on ideas from transactions (we track “dependencies” and “commit” commands when there are no “conflicts” with prior commands), succeeding where other software approaches have failed. Speculating programs in general purpose languages gets bogged down, as programs tend to have complex internal dependencies in memory. But shell scripts compose at a coarser granularity, and dependencies typically happen in the filesystem, not memory.
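
The speculate, trace, contain, and commit cycle described above can be illustrated with a toy model. This is only a sketch under strong assumptions, not the paper's implementation (which traces real processes with Riker and contains them with try): the "filesystem" is a Python dict, each "command" is a Python function, and the names `TracedView` and `run_speculatively` are hypothetical. Each command runs early against a private overlay that records what it read; commands then commit in program order, and any command that read a path written by an earlier command is re-run.

```python
from typing import Callable, Dict, List, Set


class TracedView:
    """Contained, traced view of a file store (hypothetical helper).

    Reads fall through to a snapshot and are recorded; writes stay in a
    private overlay until the command is committed.
    """

    def __init__(self, snapshot: Dict[str, str]):
        self._snapshot = snapshot
        self.overlay: Dict[str, str] = {}
        self.reads: Set[str] = set()

    def read(self, path: str) -> str:
        if path in self.overlay:      # the command's own earlier write
            return self.overlay[path]
        self.reads.add(path)          # tracing: remember the dependency
        return self._snapshot.get(path, "")

    def write(self, path: str, data: str) -> None:
        self.overlay[path] = data     # containment: nothing escapes yet


Command = Callable[[TracedView], None]


def run_speculatively(commands: List[Command], files: Dict[str, str]) -> int:
    """Run all commands early, then commit them in program order.

    Returns how many commands had to be re-run because of a conflict.
    """
    initial = dict(files)
    views = []
    for cmd in commands:              # speculation (sequential in this toy)
        view = TracedView(initial)
        cmd(view)
        views.append(view)

    dirty: Set[str] = set()           # paths written by committed commands
    reruns = 0
    for cmd, view in zip(commands, views):
        if view.reads & dirty:        # conflict: it read stale data
            view = TracedView(dict(files))
            cmd(view)                 # re-run against the committed state
            reruns += 1
        files.update(view.overlay)    # commit the contained writes
        dirty.update(view.overlay)
    return reruns


# Three illustrative "commands"; only `shout` depends on an earlier one.
def build(fs: TracedView) -> None:
    fs.write("a.txt", "hello")


def shout(fs: TracedView) -> None:
    fs.write("b.txt", fs.read("a.txt").upper())


def unrelated(fs: TracedView) -> None:
    fs.write("c.txt", "independent")


if __name__ == "__main__":
    store: Dict[str, str] = {}
    print(run_speculatively([build, shout, unrelated], store), store)
```

In this toy run, `build` and `unrelated` commit their speculative results directly, while `shout` is re-run because it read `a.txt` before `build`'s write was committed. A real system would run the speculative phase in parallel and would also have to contain effects that a dict cannot model, such as network traffic.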

Why is it important?

Developers turn their noses up at the shell: it is an old language with surprising and complex semantics. But the shell is only growing in use! We believe the shell is here to stay, and so it is incumbent on us to support it well. Our prototype shows very promising speedups on data analysis, software build, and developer/operations (DevOps) workloads. Speedups here mean savings in cost, of course, but they also mean more than that: performance improvements allow for changes in process. If a data analysis process that took ten minutes now takes one minute, it’s now possible to iterate interactively; if a pre-commit check on code that took four minutes now takes thirty seconds, developers can stay focused.

Our work also highlights the need for better primitives for managing tracing and containment. We build on Riker (Curtsinger and Barowy, USENIX ATC 2022, https://www.usenix.org/conference/atc22/presentation/curtsinger) for tracing; we built a new tool, try (https://github.com/binpash/try), for managing containment. The try tool has quickly proven popular and useful in its own right, letting users safely try out scripts that might modify their whole system (say, installing or upgrading software). It uses containment primitives to let the user explore the “world” with these changes, safely committing or rolling back as the user desires.

Perspectives

The shell is so widely used and so underserved! Tool building offers disproportionate impact. This is only the latest in our line of work on the shell, and I’m particularly excited to see components and themes start to emerge: the JIT components from PaSh (Kallas et al., OSDI 2022, https://www.usenix.org/conference/osdi22/presentation/kallas); the parsing components from libdash/Smoosh (Greenberg and Blatt, POPL 2020, https://dl.acm.org/doi/10.1145/3371111); analysis components from PaSh and Smoosh; tracing from Riker; the try tool. As we build up a toolkit for working with shell programs, not only will we be able to offer increasingly robust support, but it will be easier for others to build tools using our infrastructure.

Michael Greenberg
Stevens Institute of Technology

Read the Original

This page is a summary of: Executing Shell Scripts in the Wrong Order, Correctly, June 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3593856.3595891.