A Universal Approach to Writing Dockerfiles

A Guide to Creating Docker Images for Applications in Any Language

Most Docker tutorials focus on teaching how to write a Dockerfile for specific languages like Python, Go, or Java. But the fundamental process of containerizing an application remains the same, regardless of the technology stack.

This post is not about the syntax of a Dockerfile but about how to think when containerizing an application. Once you understand the approach, you can efficiently build Docker images for applications in any language.


1. Understand What Needs to Be in the Image

Before even touching a Dockerfile, ask:

  • What files are required to run the application? (Source code, dependencies, configuration files, compiled binaries)

    • The goal here is to identify only the necessary files and exclude unnecessary ones to keep the Docker image lightweight.
  • Where do dependencies live?

    Are they system-level, project-level, or runtime-only?

    • System-Level Dependencies are OS-level packages installed with a package manager like apt-get, yum, or apk. Some are needed only to compile code, while others (shared system libraries) are also needed at runtime, so not all of them belong in the final image.

    • Project-Level Dependencies are specific to the application and are installed via a tool like pip (Python), npm (Node.js), go (Go), or Maven/Gradle (Java). They are declared in a project file: requirements.txt for Python, package.json for Node.js, go.mod for Go, and pom.xml or build.gradle for Java. They are required for the application to function and are part of the project itself.

    • Runtime-Only Dependencies are the absolute minimum required for execution. Example: A Go application needs only the compiled binary at runtime. A Java application would need just the Java Runtime Environment (JRE) instead of a Java Development Kit (JDK).

  • What is needed to build vs. what is needed to run?

    For example, during the build process:

    • A Go application needs the full golang image to compile the binary.

    • A Java application needs a JDK (openjdk:21) to compile the code.

    • A Python app may need development tools to install certain packages.

However, at runtime the same applications may need far less. For example:

  • A Go application needs only the compiled binary, not the full golang image, so it can run on a minimal base image like scratch or alpine.

  • A Java application can run with just a JRE instead of a JDK.

  • A Python application needs only installed libraries, not the full build toolchain.
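
The build/run split above maps directly onto Dockerfile stages, which section 3 covers in detail. As a rough sketch for the Java case, assuming a hypothetical single-file application named Main.java and the eclipse-temurin images (a substitute chosen for this example, not an image this post prescribes), the JDK appears only in the first stage and the JRE only in the second:

```dockerfile
# Build stage: the JDK provides javac.
FROM eclipse-temurin:21-jdk AS builder
WORKDIR /src
# Main.java is a hypothetical source file with no package declaration.
COPY Main.java .
RUN javac -d /out Main.java

# Runtime stage: the JRE is enough to run the compiled classes.
FROM eclipse-temurin:21-jre
WORKDIR /app
# Copies the contents of /out (the .class files) into /app.
COPY --from=builder /out/ ./
CMD ["java", "Main"]
```

A real project would more likely call Maven or Gradle in the build stage and copy a JAR, but the shape stays the same.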

2. Choose the Right Base Image

Choosing the right base image is crucial for security, performance, and efficiency. This step becomes easy once you have answered the questions in the first step.

Some things to consider here:

Performance & Compatibility

  • Some base images are kept minimal (scratch, alpine), which makes them small and fast to pull.

  • Some provide better compatibility because of their broader package support (debian, ubuntu).
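
To make the trade-off concrete, here is a small sketch that installs the same two packages on a Debian-based and an Alpine-based image. The stage names and the choice of curl are placeholders for this illustration. Alpine uses musl libc and the apk package manager, while Debian uses glibc and apt, which is where most compatibility differences come from.

```dockerfile
# Debian-based stage: glibc, apt, broad package support.
FROM debian:bookworm-slim AS debian-example
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*

# Alpine-based stage: musl libc, apk, smaller footprint.
FROM alpine:3.21 AS alpine-example
RUN apk add --no-cache ca-certificates curl
```

Each stage can be built on its own with docker build --target debian-example . (or alpine-example) to compare the resulting sizes.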

Image size

Use minimal base images whenever possible. For Python, the slim variant can significantly reduce image size: the python:3.13 image is 1.02 GB, while the python:3.13-slim image is only 121 MB. The reduction comes from the slim variant removing development tools, documentation, and system utilities that most Python applications do not need at runtime.

Security

  • Use official images from Docker Hub or trusted sources.

  • Avoid images with unnecessary tools that increase vulnerability.

3. Make the Image Lighter with Multi-Stage Builds

Many Docker images work but are unoptimized and unnecessarily large. The key to optimization is separating the build environment from the runtime environment.

Here’s an example of a Multi-Stage Build for an application written in a compiled language (Go):

  1. Build Stage (Uses a full Go environment to compile the binary)

     FROM golang:1.24 AS builder
     WORKDIR /app
     COPY go.mod go.sum ./
     RUN go mod download
     COPY . .
     # CGO_ENABLED=0 produces a statically linked binary, which scratch requires
     RUN CGO_ENABLED=0 go build -o myapp
    
  2. Runtime Stage (Uses a minimal base image to run the compiled binary)

     FROM scratch AS release
     WORKDIR /app
     COPY --from=builder /app/myapp ./
     ENTRYPOINT [ "./myapp" ]
    

    Using scratch as the final runtime base image shrinks the result from 400 MB (golang:1.24) to roughly the size of the Go binary, because scratch itself is empty. This works because a Go binary built without cgo is statically linked and self-contained, so it needs no runtime environment. Dropping the toolchain also reduces the attack surface, and the smaller image is faster to pull and deploy, which suits cloud-native applications.
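
Because scratch is completely empty, a few things that other base images provide are missing too. The sketch below shows one way to handle two common gaps, assuming the application makes outbound HTTPS calls and should not run as root. The UID 65534 is an arbitrary unprivileged choice for this example.

```dockerfile
FROM golang:1.24 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 asks for a pure-Go, statically linked binary.
RUN CGO_ENABLED=0 go build -o myapp

FROM scratch
# scratch ships no CA bundle; copy one so TLS certificates can be verified.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/myapp /myapp
# scratch has no /etc/passwd, so the user is given as a numeric UID:GID.
USER 65534:65534
ENTRYPOINT ["/myapp"]
```

If the application also needs time zone data or a shell for debugging, alpine or a distroless image may be a more practical final stage than scratch.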

What about an interpreted language such as Python?

  1. Build Stage (Installs dependencies in a larger base image)

     FROM python:3.13 AS builder
     WORKDIR /app
     COPY requirements.txt .
     RUN pip install --no-cache-dir -r requirements.txt
     COPY . .

  2. Runtime Stage (Uses a slim base image for efficiency)

     FROM python:3.13-slim AS release
     WORKDIR /app
     COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
     COPY . .
     CMD ["python", "app.py"]

Unlike Go, where the compiled binary is self-contained, Python requires an interpreter, so the final image still needs a Python runtime. However, by installing dependencies in the build stage and copying only the installed libraries from site-packages, we avoid carrying over compilers, development headers, caches, and temporary files. The slim variant still ships pip, but it leaves out the build toolchain, so using it as the final base image significantly reduces the image size while keeping the necessary runtime and dependencies. One caveat: packages that install command-line entry points put them in /usr/local/bin, which this example does not copy. The result is a smaller, more secure, and faster-to-deploy container.
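
A common variation on the Python example is to install everything into a virtual environment and copy that single directory, which brings along both the libraries and any command-line scripts they install. This is a sketch rather than the only way to do it; /opt/venv is an arbitrary path chosen here, and app.py and requirements.txt are the same file names used above.

```dockerfile
FROM python:3.13 AS builder
WORKDIR /app
# Create the virtual environment and put it first on PATH,
# so the pip below installs into /opt/venv.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.13-slim AS release
WORKDIR /app
# One directory carries site-packages and the installed scripts.
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["python", "app.py"]
```

This relies on both stages using the same Python version, so the interpreter the virtual environment points to exists at the same path in the final image. Packages with compiled extensions may still need their shared system libraries installed in the runtime stage.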

4. Additional Best Practices to Consider

  • Use a Non-Root User for Security
    By default, containers run as the root user, which can be a security risk if an attacker gains access. Instead, create a dedicated non-root user with limited privileges to minimize risk.
    Example:

      RUN useradd --create-home --shell /usr/sbin/nologin appuser
      USER appuser
    
  • Reduce Layers by Chaining Commands
    Each RUN, COPY, and ADD instruction creates a new image layer. Files deleted in a later layer still take up space in the earlier one, so cleanup only shrinks the image when it happens in the same RUN step. To get that benefit, chain related commands in a single RUN statement using && and remove unnecessary files before the step ends.
    Example:

      RUN apt-get update && apt-get install -y \
          some-package \
          another-package \
          && rm -rf /var/lib/apt/lists/*
    
  • Install Dependencies Before Copying Application Code
    Docker caches layers, so placing frequently changing files (like source code) at the end improves cache efficiency. First, install dependencies, then copy the application code to avoid unnecessary re-installation of dependencies when making small code changes.
    Example: Python Dockerfile

      COPY requirements.txt .  
      RUN pip install --no-cache-dir -r requirements.txt
      COPY . .
    
  • Avoid Copying Unnecessary Files
    Use a .dockerignore file to exclude unnecessary files from the build context so they never reach your image. For example, you should not copy your .git directory or any file containing sensitive data, such as certificates or credentials. Keeping these files out of the build also reduces the build time and the image size.
    Example .dockerignore for a Python-based application:

      # Byte-compiled / cached files
      __pycache__/
      *.py[cod]
      *.swp
      *.swo
      *.egg-info/
    
      # Virtual environments
      venv/
      .env/
      *.venv
      Pipfile
      Pipfile.lock
    
      # System files
      .DS_Store
    
      # Git & CI/CD
      .git/
      .gitignore
      .github/
    
      # IDE / Editor configs
      .vscode/
    
  • Avoid Using latest Tags in Production
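    Pin base images to a specific version instead of using latest to prevent unexpected breaking changes. A tag like 3.13 still receives patch updates, so pin a full patch version or an image digest when you need fully reproducible builds.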
    Example:

      # Good: a pinned version keeps the base image predictable.
      FROM python:3.13

      # Bad: when latest moves to a new version, it might break the code.
      FROM python:latest
    
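
To see how these practices fit together, here is a single-stage sketch that combines them for a Python application. It assumes the same app.py and requirements.txt file names used earlier, and curl stands in for whatever system package the application actually needs.

```dockerfile
# Pinned tag instead of latest.
FROM python:3.13-slim

# One RUN layer: install, then clean the apt lists in the same step.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Dedicated unprivileged user (useradd also creates a matching group).
RUN useradd --create-home --shell /usr/sbin/nologin appuser

WORKDIR /app

# Dependencies first, so this layer stays cached when only source code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code last; .dockerignore keeps unwanted files out of this copy.
COPY --chown=appuser:appuser . .

# Drop privileges for everything that runs from here on.
USER appuser
CMD ["python", "app.py"]
```

The USER instruction comes after the install steps on purpose: package installation needs root, while the running application does not.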

Wrapping up

Containerizing an application is about understanding what to build, what to include, and what to strip away—then translating that into a well-optimized Dockerfile.

Did I miss any key best practices that you swear by? How do you optimize your Docker images? Share your thoughts in the comments. I’d love to hear your perspective!