Plumbing the Deps of the Crate: Caching Rust Docker Builds

I’ve been using Docker for Rust code recently, but one of the problems is that using a Dockerfile like the following:

COPY . /opt/my_build_dir
RUN cargo build

Ends up causing the container to download and recompile the dependency crates. Every. Single. Time. It’s terrible. If you’ve ever had a 15 minute clean build for a Rust project you can see how much it sucks to rebuild from scratch every time just to test out some changes.

How do we deal with this? What’s the solution? Let’s look at how Docker works with the COPY instruction and file system caching and then move to the solution I used to hack around the limitation of not being able to only build the dependencies of a crate.

COPY, Filesystems, and You

Docker caches an image as a stack of layers. The commands FROM, COPY, RUN and CMD each create a different layer. This is good! That means if everything up to line 20 in my Dockerfile is the same, Docker can skip those build steps and reuse the cached layers. Once one layer is invalidated, though, every layer after it is rebuilt too. How does Docker know a COPY layer is “invalid” and that the command needs to run again? It checks whether the files referenced in the COPY command have changed. This is good! If I change a file then I would want the new file in the container so it can build properly.
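Here is a rough shell sketch of that idea. It is a simplified model and not Docker's real algorithm: each layer's key is assumed to be derived from the parent layer's key plus the instruction text, and the COPY step also mixes in a made-up checksum label for the copied files. The layer_keys function, the /app path and the files-v1/files-v2 labels are all hypothetical names for the demo.

```shell
#!/bin/sh
# Simplified model of layer caching (an assumption for illustration only):
# key(layer) = checksum(key(parent) + instruction text).
layer_keys() {
  copied_files_sum=$1   # hypothetical checksum of whatever COPY brings in
  parent=""
  for step in "FROM rust" "WORKDIR /app" "COPY . /app [$copied_files_sum]" "RUN cargo build"; do
    # cksum prints "<crc> <size>"; keep only the crc as the layer key
    parent=$(printf '%s|%s' "$parent" "$step" | cksum | cut -d' ' -f1)
    printf '%s\n' "$parent"
  done
}

keys_before=$(layer_keys "files-v1")
keys_after=$(layer_keys "files-v2")   # same Dockerfile, one source file edited

printf 'before edit:\n%s\n' "$keys_before"
printf 'after edit:\n%s\n' "$keys_after"
```

In this model the first two keys are identical in both runs, while the COPY key and everything after it change, which is the "skip what is unchanged, redo the rest" behaviour described above.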

Now see if you can spot the problem with the first snippet of a Dockerfile I shared above. Think you got it? If you’re not sure then think about this: if I use COPY . I’m saying that, “If anything at all in this directory changes, copy the data in”. You change the source files? A whole new copy of the directory is made as the command has to be rerun. This is also a new OS so even if you copy in a target directory Rust will still rebuild it, as the cached files in your target dir will no longer be valid for compilation.
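To make that concrete, here is a small shell experiment on a throwaway project. It only illustrates the idea: cksum stands in for whatever content checksum Docker actually computes, and the temp directory, file contents and function names are made up for the demo.

```shell
#!/bin/sh
# Illustration only: compare what "COPY ." depends on with what a
# manifest-only COPY would depend on. The project below is a stand-in.
proj=$(mktemp -d)
mkdir "$proj/src"
printf '[package]\nname = "demo"\nversion = "0.1.0"\n' > "$proj/Cargo.toml"
printf '# lock file placeholder\n' > "$proj/Cargo.lock"
printf 'fn main() {}\n' > "$proj/src/main.rs"

# Everything in the directory feeds this checksum, like "COPY .".
whole_dir_sum() { (cd "$proj" && find . -type f | sort | xargs cat | cksum); }
# Only the two manifests feed this one.
manifest_sum() { cat "$proj/Cargo.toml" "$proj/Cargo.lock" | cksum; }

dir_before=$(whole_dir_sum)
manifest_before=$(manifest_sum)

# Edit a source file, as you would while testing out a change.
printf 'fn main() { println!("edited"); }\n' > "$proj/src/main.rs"

dir_after=$(whole_dir_sum)
manifest_after=$(manifest_sum)

[ "$dir_before" != "$dir_after" ] && echo "whole-directory checksum changed"
[ "$manifest_before" = "$manifest_after" ] && echo "manifest checksum unchanged"
rm -rf "$proj"
```

The whole-directory checksum moves as soon as any source file is touched, while the manifest-only checksum stays put until Cargo.toml or Cargo.lock is edited.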

This right here is why we’re downloading everything over and over and over again on changes. We haven’t created a layer that only builds the deps and caches it. What we really want is to only rebuild those deps when Cargo.toml/Cargo.lock change! How do we do it?

She sed, I know what it’s like to build deps

To be clear what we’re going to do is definitely a hack. It’s not pretty. It’s absolutely not elegant, but it works. We’re gonna use a small feature of Cargo.toml files, a bit of sed magic, and some clever uses of COPY to accomplish it. Okay let’s get started. First up you’ll need to open up the Dockerfile in your directory. I’m assuming you have cargo already in the container and are doing something that looks like this:

# The beginning stuff up to this point
COPY . /your_work_dir
RUN cargo build
# The rest of your file

The first thing we want to do is change our file to this instead:

# We'll get to what this file is below!
# (This assumes WORKDIR /your_work_dir was set in the beginning stuff,
# so that sed and cargo run inside that directory.)
COPY dummy.rs /your_work_dir/
# If this changed likely the Cargo.toml changed so lets trigger the
# recopying of it anyways
COPY Cargo.lock /your_work_dir/
COPY Cargo.toml /your_work_dir/
# We'll get to what this substitution is for but replace main.rs with
# lib.rs if this is a library. The paths contain slashes, so we use #
# as the sed delimiter instead of /
RUN sed -i 's#src/main.rs#dummy.rs#' Cargo.toml
# Drop release if you want debug builds. This step caches our deps!
RUN cargo build --release
# Now return the file back to normal
RUN sed -i 's#dummy.rs#src/main.rs#' Cargo.toml
# Copy the rest of the files into the container
COPY . /your_work_dir
# Now this only builds our changes to things like src
RUN cargo build --release

Now in the top level of the directory create a file called dummy.rs. It should look like this:

fn main() {}

That’s it! It’s a stub to get cargo to build the deps, but not build our actual code! If you’re building a library then use an empty file instead. Alright, so the last bit of magic is to modify your Cargo.toml to contain this after the [package] section (it is its own top-level table, not a key inside [package]):

[[bin]]
name = "your_project_name"
path = "src/main.rs"

If you’re making a library then put:

[lib]
name = "your_project_name"
path = "src/lib.rs"

Remember those sed commands? They change the line with path on it. The first one points it at the dummy file, so cargo compiles the stub (which is basically empty, so it takes almost no time) along with all of the deps listed in the Cargo.toml file. The second one changes the path back to the actual entry point so the correct project gets built!
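If you want to check the substitution outside of Docker first, here is a minimal sketch that runs it against a throwaway manifest. The temp files and the version line are made up for the demo, and your_project_name is the same placeholder as above. It uses # as the sed delimiter because the path contains slashes, and writes to a new file rather than using -i so it behaves the same on GNU and BSD sed.

```shell
#!/bin/sh
# Demo on a scratch copy; nothing here touches a real project.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
[package]
name = "your_project_name"
version = "0.1.0"

[[bin]]
name = "your_project_name"
path = "src/main.rs"
EOF

# Point the bin target at the stub, as done before the dependency build.
sed 's#src/main.rs#dummy.rs#' "$manifest" > "$manifest.dummy"
swapped=$(grep '^path' "$manifest.dummy")

# Reverse substitution, as done after the dependency build.
restored=$(sed 's#dummy.rs#src/main.rs#' "$manifest.dummy" | grep '^path')

echo "during deps build: $swapped"
echo "after restoring:   $restored"
rm -f "$manifest" "$manifest.dummy"
```

Only the path line changes between the two versions, so cargo sees the same package name and dependency list in both builds.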

Like I said it’s an absolute hack but hey it works very well! Try it out and see how much you can speed your builds up by.

Closing thoughts

Ideally cargo would be smart enough to just do something like cargo build --dependencies-only (there is an issue open for this), and Docker could then be set up to only rebuild changes to the src directory. For now though, the above is, while fragile, a good enough alternative. It would be great to see Rust’s excellent tooling work in a way that meets production needs. I hope you’ll now spend more time working on code and futzing with deployment configs, rather than worrying about your build times!