Against mixing environment setup with code

Community article, published June 17, 2024

When we develop and deploy software, we deal with code and other assets, but we also deal with configuration, which includes the environment within which the technology operates. Experienced engineers and architects know how important it is to manage the code separately from its environment. Doing so is a discipline, however, and it's easy for discipline to slip. I've noticed for a while that in a lot of example projects in the AI and data science space, it's become common to see code that, for example, automatically loads .env files from the local drive. I think the first time I even heard of the likes of python-dotenv and python-decouple was in association with AI code, usually as a way of establishing API keys and other secrets.

This unfortunate lapse in discipline can cause many boundary problems as the blocks multiply in your AI application architecture, and we know how quickly that happens with AI projects. It's much better to have the program inertly read variables passed in through the environment. One of the healthiest things the twelve-factor app methodology has brought along is this understanding of the utility of the classic UNIX environment. With this article I hope to offer insights into how to do this right: the environment is set up separately in deployment/config management, and the code just reads it. No more leaking across this boundary.

Girl in front of burning house meme: Worked fine in dev. Ops problem now.

To be fair, environment variables are in many ways clunky, and differences across platforms, and even across program launcher and shell variants, can make things tricky. But that's kind of the point: you tailor the environment set-up to the host environment, and all that matters is that the correct values get across to the program. At least with environment variables you are covered on almost any platform.

Note: this guide is very Linux/BSD-biased, and all code snippets have been tested on macOS under the zsh shell.

Common tools/approaches for setting the environment

Start with a tiny Python code sample that reads the environment, envtest.py:

import os
print(os.getenv('HELLO', 'NOT FOUND'))

If you run:

HELLO="Wide world"
python envtest.py

you'll get NOT FOUND, because the first command only sets a shell variable; it is not exported into the environment of the child process that runs python in the second command.

Modify the shell's environment so that it sticks for future commands

You can export a variable, which will then be passed into the environment of child processes for subsequent commands (you can later use unset to remove any variable):

export HELLO="Wide world"
python envtest.py

Output: Wide world

Note: you can always view an env var's value using printenv:

printenv HELLO

You might be tempted to use echo:

echo $HELLO

But echo will surprise you in several situations; for one, it happily prints shell variables that aren't actually exported to the environment. So I prefer printenv for such checking.
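
Here's a quick demonstration of the difference, starting from a clean slate:

unset HELLO
HELLO="Wide world"   # shell variable only; not exported
echo $HELLO          # prints: Wide world (echo sees the shell variable)
printenv HELLO       # prints nothing: HELLO is not in the environment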

You'll want to be familiar with some of the implications of shell substitutions and escaping. For example, if you tried to use an exclamation point in the value:

export HELLO="Wide world!"
python envtest.py

You'd run into problems because, at an interactive prompt, ! triggers shell history expansion. You can use a backslash escape:

export HELLO="Wide world\!"
python envtest.py

Same with $, which introduces shell variable expansion.

export HELLO="Wide world\$"
python envtest.py

You can also use single quotes to avoid such interpolation, but note that they will prevent all variable expansion as well.

export HELLO='Wide world!'
python envtest.py

There are other such characters you'd want to be aware of. Learn more ("Escaping Characters in Bash").
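
For instance, here's a quick contrast of the two quoting styles around variable expansion (WHO is just an illustrative variable):

export WHO="world"
export HELLO="Wide $WHO"   # double quotes: $WHO is expanded
printenv HELLO             # Wide world
export HELLO='Wide $WHO'   # single quotes: nothing is expanded
printenv HELLO             # Wide $WHO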

Good old dot or source

Anyone familiar with classic UNIX shells will probably have seen these. POSIX supports the . command for executing shell commands from a file. The bash shell, and derivatives such as zsh offer a near-synonym: source.

Create a spam1.sh with the following contents:

HELLO="Wide world!"

Use source to load this, after setting export mode. (History expansion doesn't apply within a sourced file, so the ! needs no escape here.)

unset HELLO
set -a     # Set to export all shell variables to the environment
source spam1.sh
set +a
python envtest.py

You can also use the . form, but then the specific path to the shell file (in this case ./) is required:

unset HELLO
set -a     # Set to export all shell variables to the environment
. ./spam1.sh
set +a
python envtest.py
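
If you'd rather not fuss with set -a, you can put the export directly in the file. A minimal variant, as a hypothetical spam2.sh:

# spam2.sh: exports inline, so no set -a is needed
export HELLO="Wide world!"

unset HELLO
. ./spam2.sh
python envtest.py

Output: Wide world!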

Setting directly for a single command

Just as you want to minimize your use of global variables in programming, you probably want to minimize your use of global environment variables. You can set individual variables just for a given command:

HELLO="Wide world\!" python envtest.py

Output: Wide world!

HELLO='Wide world!' python envtest.py

You can set multiple variables inline, as well:

VAR1="val1" VAR2="val2" VAR3="val3" /path/to/command_to_run

This can of course get tedious, error-prone and hard to read, so you'll want to find ways to inject multiple environment variables at once.

Use python-dotenv [say what?!]

Yeah, I started by associating python-dotenv with bad habits, but it's actually fine to use it in command-line mode, to set up the environment externally to your program.

Sample env file code, spam1.env:

HELLO=Wide world!

Note that this env file format, AKA DotEnv format, informally AKA properties file format, is different from the shell file format: no single or double quotes, no spaces around the equals sign (which would break the shell format anyway), and no need to escape !, $, etc.

Here it is in action:

dotenv -f spam1.env run -- python envtest.py

Output: Wide world!
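
As a sanity check, the dotenv CLI can also print what it parsed from a file, via its list subcommand (assuming a reasonably current python-dotenv):

dotenv -f spam1.env list

Output: HELLO=Wide world!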

Check your launcher or other such app

I've noticed, for example, that it doesn't seem to be widely known that the uvicorn web server implementation for Python/ASGI supports an --env-file option, which reads a file and passes the environment variables defined therein to the ASGI application.
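
For example (myapp:app being a hypothetical ASGI module and application object):

uvicorn myapp:app --env-file prod.env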

Notes for Docker users

You can set environment variables in a variety of subtly different ways: ENV directives in Dockerfiles, -e on docker run, or the various mechanisms for setting the environment in docker compose.
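
For instance, a couple of common patterns on the command line (my-image is a hypothetical image name):

# Pass one-off variables directly
docker run -e HELLO='Wide world!' my-image

# Or load a whole file of KEY=value lines, much like the DotEnv format above
docker run --env-file spam1.env my-image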

See also: "Passing Environment Variables to Docker Containers"

Special notes on secrets

You really should be using a secrets manager or vault for your API keys and other secrets. I use 1Password, and I'll use that as an example. You can create an environment file with URI-based references to secrets (e.g. op://app-prod/db/password; you can copy secret reference URIs from the 1Password UI). The op run command then scans environment variables for such references and replaces them in the environment (not in the file) with the corresponding values from the vault, so that a specified command runs in a subprocess with the secrets loaded.
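
For illustration, the reference file secrets-obscured.env might look like this (the vault, item, and field names are hypothetical):

DB_USERNAME=op://app-prod/db/username
DB_PASSWORD=op://app-prod/db/password

With that file in place: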

op run --env-file=secrets-obscured.env -- python some_db_code.py

Note: I've experienced situations where op run ignores non-1Password reference variables in files when the same variable is already exported in the shell. I suspect this is a bug, but just to be sure, I've become used to a belt-and-suspenders approach:

dotenv -f secrets-obscured.env run -- op run --env-file=secrets-obscured.env -- python some_db_code.py

See also: "Secrets as environment variables in docker-compose files"

Associating properties files with the .env file extension in VS Code

I've noticed that VS Code by default understands that files named exactly .env are properties files, but it is not as smart with filenames such as dev.env or prod.env. You can fix this by setting the files.associations preference, e.g.:

    "files.associations": {
        "*.env": "properties"
    }

Or through the settings GUI, under Files: Associations.


Hello wide world!

Put your API keys and other secrets into a secrets manager or vault, and study how best to set up different controlled environments in your deployment setting. You can then use local environment files or shell scripts in your development workflow, and leave the code itself completely oblivious as to how its environment gets established. You'll soon find your software management and evolution run more smoothly, especially in a collaborative project.