Closed
4 changes: 4 additions & 0 deletions .gitignore
@@ -10,3 +10,7 @@ target
docker-squash.iml
**/image.tar
**/tox.tar

.cursor/*

*.tar
163 changes: 142 additions & 21 deletions README.rst
@@ -25,6 +25,8 @@ Features
- Can squash from a selected layer to the end (not always possible, depends on the image)
- Support for Docker 1.9 or newer (older releases may run perfectly fine too, try it!)
- Squashed image can be loaded back to the Docker daemon or stored as tar archive somewhere
- Automatic detection of input type (Docker image name vs tar file path)
- Works without Docker daemon when processing tar files

Installation
------------
@@ -49,34 +51,34 @@ Usage
::

$ docker-squash -h
usage: cli.py [-h] [-v] [--version] [-d] [-f FROM_LAYER] [-t TAG]
[--tmp-dir TMP_DIR] [--output-path OUTPUT_PATH]
image
usage: docker-squash [-h] [-v] [--version] [-f FROM_LAYER] [-t TAG] [-m MESSAGE] [-c] [--tmp-dir TMP_DIR]
[--output-path OUTPUT_PATH] [--load-image [LOAD_IMAGE]]
image

Docker layer squashing tool

positional arguments:
image Image to be squashed

optional arguments:
-h, --help show this help message and exit
-v, --verbose Verbose output
--version Show version and exit
-f FROM_LAYER, --from-layer FROM_LAYER
Number of layers to squash or ID of the layer (or image ID or image name) to squash from.
In case the provided value is an integer, specified number of layers will be squashed.
Every layer in the image will be squashed if the parameter is not provided.
-t TAG, --tag TAG Specify the tag to be used for the new image. If not specified no tag will be applied
-m MESSAGE, --message MESSAGE
image Image name or tar file path to be squashed. If a .tar file is provided, it will be processed without
requiring Docker daemon.

options:
-h, --help show this help message and exit
-v, --verbose Verbose output
--version Show version and exit
-f FROM_LAYER, --from-layer FROM_LAYER
Number of layers to squash or ID of the layer (or image ID or image name) to squash from. In case the
provided value is an integer, specified number of layers will be squashed. Every layer in the image will
be squashed if the parameter is not provided.
-t TAG, --tag TAG Specify the tag to be used for the squashed image (recommended). Without this, the squashed image will
have no repository tags to avoid overwriting the original image.
-m MESSAGE, --message MESSAGE
Specify a commit message (comment) for the new image.
-c, --cleanup Remove source image from Docker after squashing
--tmp-dir TMP_DIR Temporary directory to be created and used. This will NOT be deleted afterwards for
easier debugging.
--output-path OUTPUT_PATH
-c, --cleanup Remove source image from Docker after squashing
--tmp-dir TMP_DIR Temporary directory to be created and used. This will NOT be deleted afterwards for easier debugging.
--output-path OUTPUT_PATH
Path where the image may be stored after squashing.
--load-image [LOAD_IMAGE]
--load-image [LOAD_IMAGE]
Whether to load the image into Docker daemon after squashing
Default: true

Note that environment variables may be set as documented `here <docs/environment_variables.adoc>`_.

@@ -216,3 +218,122 @@ Let's confirm the image structure now:
6ee235cf4473 3 weeks ago /bin/sh -c #(nop) LABEL name=CentOS Base Imag 0 B
474c2ee77fa3 3 weeks ago /bin/sh -c #(nop) ADD file:72852fc7626d233343 196.6 MB
1544084fad81 6 months ago /bin/sh -c #(nop) MAINTAINER The CentOS Proje 0 B

Working without Docker daemon
-----------------------------

Sometimes you may want to squash an image without direct access to a Docker daemon (e.g., in CI/CD pipelines,
air-gapped environments, or when Docker is not running). In that case, pass a tar file path as the ``image``
parameter: images exported as tar files are processed without a Docker daemon connection.

**Step 1**: Export the image to a tar file using ``docker save``:

::

$ docker save -o source.tar jboss/wildfly:latest

**Step 2**: Squash the image from the tar file. Let's squash the last 10 layers:

Note: The tool automatically detects that ``source.tar`` is a tar file and processes it without Docker daemon.

::

$ docker-squash --tag jboss/wildfly:squashed -f 10 --output-path squashed.tar --load-image false source.tar
2025-08-20 07:58:45,338 tar_image.py:54 INFO Extracting tar image from source.tar
2025-08-20 07:58:45,598 tar_image.py:73 INFO Detected OCI format image
2025-08-20 07:58:45,599 tar_image.py:251 INFO Old image has 22 layers
2025-08-20 07:58:45,599 tar_image.py:284 INFO Checking if squashing is necessary...
2025-08-20 07:58:45,599 tar_image.py:298 INFO Attempting to squash last 10 layers...
2025-08-20 07:58:45,599 tar_image.py:306 INFO Starting squashing process...
2025-08-20 07:58:45,599 image.py:750 INFO Starting squashing for /tmp/docker-squash-7n3ui1ar/new/squashed/layer.tar...
2025-08-20 07:58:47,713 image.py:775 INFO Squashing file '/tmp/docker-squash-7n3ui1ar/old/blobs/sha256/f26d32e28c292aba76defcdd67c267000d31a6ac3ebdab5c850aba90ef834927'...
2025-08-20 07:58:49,041 image.py:923 INFO Squashing finished!
2025-08-20 07:58:49,953 tar_image.py:660 WARNING OCI output format not fully implemented - creating Docker format
2025-08-20 07:58:49,953 tar_image.py:570 INFO Using user-specified tag: jboss/wildfly:squashed
2025-08-20 07:58:50,028 tar_image.py:349 INFO Squashing completed successfully
2025-08-20 07:58:50,028 tar_image.py:359 INFO Original image size: 382.24 MB
2025-08-20 07:58:50,028 tar_image.py:360 INFO Squashed image size: 421.59 MB
2025-08-20 07:58:50,028 tar_image.py:363 INFO If the squashed image is larger than original it means that there were no meaningful files to squash and it just added metadata. Are you sure you specified correct parameters?
2025-08-20 07:58:50,028 cli.py:176 INFO New squashed image ID is sha256:7ebd48ca15f2e8d937a6bf3d77e0b865feddebd3ec8f11532d8a30c0000f2b67
2025-08-20 07:58:50,028 tar_image.py:766 INFO Exporting squashed image to squashed.tar
2025-08-20 07:58:51,257 tar_image.py:776 INFO Export completed successfully
2025-08-20 07:58:51,257 cli.py:191 INFO Done

**Step 3**: Load the squashed image back into Docker:

::

$ docker load -i squashed.tar
Loaded image: jboss/wildfly:squashed

Now you can verify the squashed image structure:

::

$ docker history jboss/wildfly:squashed
IMAGE CREATED CREATED BY SIZE COMMENT
a8c48d9906a7 About a minute ago 270MB Squashed layers
<missing> 4 years ago /bin/sh -c #(nop) USER jboss 0B
<missing> 4 years ago /bin/sh -c yum -y install java-11-openjdk-de… 239MB
<missing> 4 years ago /bin/sh -c #(nop) USER root 0B
<missing> 4 years ago /bin/sh -c #(nop) MAINTAINER Marek Goldmann… 0B
<missing> 4 years ago /bin/sh -c #(nop) USER jboss 0B
<missing> 4 years ago /bin/sh -c #(nop) WORKDIR /opt/jboss 0B
<missing> 4 years ago /bin/sh -c groupadd -r jboss -g 1000 && user… 406kB
<missing> 4 years ago /bin/sh -c yum update -y && yum -y install x… 33.5MB
<missing> 4 years ago /bin/sh -c #(nop) MAINTAINER Marek Goldmann… 0B
<missing> 5 years ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 5 years ago /bin/sh -c #(nop) LABEL org.label-schema.sc… 0B
<missing> 5 years ago /bin/sh -c #(nop) ADD file:61908381d3142ffba… 222MB

**Key advantages of tar mode:**

- No Docker daemon required during squashing
- Works in CI/CD pipelines and restricted environments
- Supports both Docker format and OCI format images
- Maintains complete layer history compatibility
- Can process images on systems where Docker is not installed

Collaborator:
I would imagine that it's helpful when working with Podman as well.

Author:
Absolutely! That's a great point. The --input-tar feature is indeed very helpful for Podman users.

Since Podman uses podman save to export images in the same tar format as docker save, users can now:

# Export image with Podman
podman save myimage:latest -o image.tar

# Squash with docker-squash (no Docker daemon required)
docker-squash --input-tar image.tar --tag myimage:squashed --output-path squashed.tar

# Import back to Podman
podman load -i squashed.tar

This workflow is particularly valuable in environments where:

  • Only Podman is available (no Docker daemon)
  • Running in CI/CD pipelines with Podman
  • Working in rootless containers or restricted environments
  • Processing images offline without any container runtime

Should I add a Podman example to the documentation to highlight this use case?


**Podman compatibility:**

The squashed tar files are fully compatible with Podman. You can load them using:

::

$ podman load -i squashed.tar
Getting image source signatures
Copying blob 8055a1084cfa done |
Copying blob 613be09ab3c0 done |
Copying blob 3fbe1e874b0d done |
Copying blob 869989761eb2 done |
Copying blob 115463be137a done |
Copying config 7ebd48ca15 done |
Writing manifest to image destination
Loaded image: localhost/jboss/wildfly:squashed

$ podman history jboss/wildfly:squashed
ID CREATED CREATED BY SIZE COMMENT
7ebd48ca15f2 5 minutes ago 268MB Squashed layers
<missing> 4 years ago /bin/sh -c #(nop) USER jboss 0B
<missing> 4 years ago /bin/sh -c yum -y install java-11-openjdk-... 237MB
<missing> 4 years ago /bin/sh -c #(nop) USER root 0B
<missing> 4 years ago /bin/sh -c #(nop) MAINTAINER Marek Goldma... 0B
<missing> 4 years ago /bin/sh -c #(nop) USER jboss 0B
<missing> 4 years ago /bin/sh -c #(nop) WORKDIR /opt/jboss 0B
<missing> 4 years ago /bin/sh -c groupadd -r jboss -g 1000 && us... 374kB
<missing> 4 years ago /bin/sh -c yum update -y && yum -y install... 32.8MB
<missing> 4 years ago /bin/sh -c #(nop) MAINTAINER Marek Goldma... 0B
<missing> 5 years ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 5 years ago /bin/sh -c #(nop) LABEL org.label-schema.... 0B
<missing> 5 years ago /bin/sh -c #(nop) ADD file:61908381d3142ff... 211MB
...

This enables docker-squash to work in Podman-only environments, rootless containers, and mixed container runtime scenarios.

**Important notes:**

- Use the ``--tag`` parameter to name the squashed image; without it, the squashed image has no repository tags
- Set ``--load-image false`` if you only want to export the squashed tar file
- Use ``--output-path`` to specify where the squashed tar should be saved
- The tool automatically detects input type (image name vs tar file) and image format (Docker vs OCI)
- Squashed images work seamlessly with both Docker and Podman
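
The notes above can be turned into a small helper for scripted pipelines. This is only a sketch: `build_tar_mode_command` is a hypothetical name, and it uses nothing beyond the flags shown in the help output (`--tag`, `-f`, `--output-path`, `--load-image`). It omits `--load-image` when loading is wanted and relies on the documented default of true.

```python
import shlex
from typing import List, Optional


def build_tar_mode_command(
    source_tar: str,
    output_path: str,
    tag: Optional[str] = None,
    from_layer: Optional[str] = None,
    load_image: bool = False,
) -> List[str]:
    """Assemble a docker-squash argv for tar mode (hypothetical helper)."""
    cmd = ["docker-squash"]
    if tag:
        # Name the squashed image so it is identifiable after loading
        cmd += ["--tag", tag]
    if from_layer is not None:
        cmd += ["-f", str(from_layer)]
    cmd += ["--output-path", output_path]
    if not load_image:
        # Only export the tar; do not contact a Docker daemon afterwards
        cmd += ["--load-image", "false"]
    cmd.append(source_tar)
    return cmd


if __name__ == "__main__":
    argv = build_tar_mode_command(
        "source.tar", "squashed.tar", tag="jboss/wildfly:squashed", from_layer="10"
    )
    # Print a copy-pasteable shell command
    print(shlex.join(argv))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.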
113 changes: 100 additions & 13 deletions docker_squash/cli.py
@@ -70,7 +70,11 @@ def run(self):
"--version", action="version", help="Show version and exit", version=version
)

parser.add_argument("image", help="Image to be squashed")
parser.add_argument(
"image",
help="Image name or tar file path to be squashed. If a .tar file is provided, it will be processed without requiring Docker daemon.",
)

Collaborator:
I think we should investigate using mutually exclusive groups in argparse, as that has built-in support for accepting either the --input-tar or the image option and would avoid the manual checks below.
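
For illustration, the suggestion could look roughly like the sketch below. It assumes the earlier revision's `--input-tar` option alongside the positional `image`; `build_parser` is a hypothetical name and this is not the code that was merged. A positional argument can join a mutually exclusive group only when it is optional, hence `nargs="?"`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="docker-squash")
    # Exactly one input source must be given: an image name or a tar path
    source = parser.add_mutually_exclusive_group(required=True)
    # nargs="?" makes the positional optional so argparse accepts it in the group
    source.add_argument("image", nargs="?", help="Image to be squashed")
    source.add_argument("--input-tar", help="Path to a tar file to squash")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--input-tar", "image.tar"])
    print(args.image, args.input_tar)
```

With this layout argparse itself rejects a command line that passes both inputs or neither, so no manual validation is needed.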

Collaborator:
Also, I think it's valid for output-path to be the same as input-tar (?). Should this be the default in tar mode?

Author:
Great! I have made the code changes.
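
One way to derive a default output path in tar mode is sketched below. The `squashed-<id>.tar` naming follows the pattern used in this PR's `_run_tar_mode`. The helper name `default_output_path` and the stripping of the `sha256:` prefix are assumptions, added because the tool's log output reports image IDs in the `sha256:<hex>` form.

```python
import os


def default_output_path(input_tar: str, image_id: str) -> str:
    """Return a path next to input_tar named after the short image ID."""
    # Drop an algorithm prefix such as "sha256:" before shortening the ID
    short_id = image_id.split(":", 1)[-1][:12]
    # Resolve the directory so a bare file name maps to the current directory
    directory = os.path.dirname(os.path.abspath(input_tar))
    return os.path.join(directory, f"squashed-{short_id}.tar")


if __name__ == "__main__":
    print(
        default_output_path(
            "source.tar",
            "sha256:7ebd48ca15f2e8d937a6bf3d77e0b865feddebd3ec8f11532d8a30c0000f2b67",
        )
    )
```

Writing to a new file rather than over the input keeps the original archive available if squashing produces an unexpected result.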


parser.add_argument(
"-f",
"--from-layer",
@@ -79,7 +83,7 @@ def run(self):
parser.add_argument(
"-t",
"--tag",
help="Specify the tag to be used for the new image. If not specified no tag will be applied",
help="Specify the tag to be used for the squashed image (recommended). Without this, the squashed image will have no repository tags to avoid overwriting the original image.",
)
parser.add_argument(
"-m",
@@ -118,18 +122,16 @@ def run(self):
self.log.setLevel(logging.INFO)

self.log.debug("Running version %s", version)

try:
squash.Squash(
log=self.log,
image=args.image,
from_layer=args.from_layer,
tag=args.tag,
comment=args.message,
output_path=args.output_path,
load_image=args.load_image,
tmp_dir=args.tmp_dir,
cleanup=args.cleanup,
).run()
# Auto-detect if input is tar file or image name
if self._is_tar_file(args.image):
self.log.debug(f"Detected tar file: {args.image}")
self._run_tar_mode(args)
else:
self.log.debug(f"Detected image name: {args.image}")
self._run_image_mode(args)

except KeyboardInterrupt:
self.log.error("Program interrupted by user, exiting...")
sys.exit(1)
@@ -150,6 +152,91 @@ def run(self):

sys.exit(1)

    def _run_tar_mode(self, args):
        from docker_squash.tar_image import TarImage

        # Provide helpful guidance about --tag parameter
        if not args.tag:
            self.log.info(
                "💡 Tip: Consider using --tag to specify a name for your squashed image"
            )
            self.log.info(" Example: --tag myimage:squashed")

Collaborator:
Does a tag make sense for an output tar? It is probably of only relevance if --load-image has been specified?

Author:
I respectfully disagree with this assessment. The --tag parameter is meaningful for output tar files regardless of the --load-image setting. Here's why:

Tag is part of image metadata in tar format:

  • Docker/Podman tar format stores tags in manifest.json under RepoTags field
  • This metadata becomes part of the squashed tar file

Tag is useful in all scenarios:

  1. --load-image true: Image gets loaded with the specified tag
  2. --load-image false + --output-path: The output tar contains tag metadata, so when someone later runs docker load -i squashed.tar, the image will have the proper tag
  3. Distribution: Tagged tar files are more useful when shared with others

Without --tag, the consequences are significant:

# Without tag - image loads but has no name
$ docker load -i squashed.tar
Loaded image ID: sha256:abc123...
$ docker images
REPOSITORY TAG IMAGE ID
<none> <none> sha256:abc123... # Hard to identify!

# With tag - much more usable
$ docker load -i squashed.tar
Loaded image: myapp:squashed
$ docker images
REPOSITORY TAG IMAGE ID
myapp squashed sha256:abc123... # Clear identification

The tip message encourages good practices for tar-based workflows, not just --load-image scenarios. The tag becomes part of the portable tar artifact.
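
To check which tags a tar artifact carries before loading it, something like the following sketch could be used. It assumes the Docker-format archive layout, in which `manifest.json` at the archive root is a list of entries with a `RepoTags` field. `read_repo_tags` and `write_demo_archive` are hypothetical helpers, and the demo archive contains only a manifest, not a real image.

```python
import io
import json
import tarfile
from typing import List, Optional


def read_repo_tags(tar_path: str) -> List[str]:
    """Collect RepoTags from manifest.json of a Docker-format image archive."""
    with tarfile.open(tar_path, "r") as archive:
        member = archive.extractfile("manifest.json")
        if member is None:
            raise ValueError("manifest.json is not a regular file")
        manifest = json.load(member)
    tags: List[str] = []
    for entry in manifest:
        # RepoTags may be null when the image was saved without a tag
        tags.extend(entry.get("RepoTags") or [])
    return tags


def write_demo_archive(tar_path: str, tags: Optional[List[str]]) -> None:
    """Write a tar that holds only a manifest.json, for demonstration."""
    payload = json.dumps(
        [{"Config": "config.json", "RepoTags": tags, "Layers": []}]
    ).encode("utf-8")
    info = tarfile.TarInfo("manifest.json")
    info.size = len(payload)
    with tarfile.open(tar_path, "w") as archive:
        archive.addfile(info, io.BytesIO(payload))


if __name__ == "__main__":
    write_demo_archive("demo-squashed.tar", ["myapp:squashed"])
    print(read_repo_tags("demo-squashed.tar"))
```

An empty result corresponds to the untagged case described above, where the loaded image appears as `<none>`.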


        tar_image = TarImage(
            log=self.log,
            tar_path=args.image,  # changed to args.image here
            from_layer=args.from_layer,
            tmp_dir=args.tmp_dir,
            tag=args.tag,
            comment=args.message,
        )

        try:
            new_image_id = tar_image.squash()
            self.log.info("New squashed image ID is %s" % new_image_id)

            # Default to a file next to the input tar when no output path is given
            output_path = args.output_path
            if not output_path:
                import os

                output_path = os.path.join(
                    os.path.dirname(args.image), f"squashed-{new_image_id[:12]}.tar"
                )

            tar_image.export_tar_archive(output_path)

            if args.load_image:
                tar_image.load_squashed_image()

            self.log.info("Done")

        finally:
            if not args.tmp_dir:
                tar_image.cleanup()

    def _run_image_mode(self, args):
        squash.Squash(
            log=self.log,
            image=args.image,
            from_layer=args.from_layer,
            tag=args.tag,
            comment=args.message,
            output_path=args.output_path,
            load_image=args.load_image,
            tmp_dir=args.tmp_dir,
            cleanup=args.cleanup,
        ).run()

    def _is_tar_file(self, input_path):
        """Detect if input is a tar file or image name"""
        import os
        import tarfile

        # Check if it's a file path that exists
        if os.path.isfile(input_path):
            # Check if it's a valid tar file
            try:
                with tarfile.open(input_path, "r"):
                    return True
            except (tarfile.TarError, OSError):
                return False

        # Check if it ends with .tar extension
        if input_path.endswith((".tar", ".tar.gz", ".tgz")):
            return True

        # Check for obvious file path patterns
        if (
            input_path.startswith(("/"))  # Absolute path
            or input_path.startswith(("./"))  # Current dir
            or input_path.startswith(("../"))  # Parent dir
            or input_path.startswith(("~/"))
        ):  # Home dir
            return True

        # Otherwise assume it's an image name (even if it contains '/')
        return False


def run():
cli = CLI()
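
To see how the detection heuristic in `_is_tar_file` classifies typical inputs, here is a standalone sketch of the same rules. `looks_like_tar_input` is a hypothetical free function, and it uses `tarfile.is_tarfile` in place of the explicit `tarfile.open` try/except, so it is an approximation of the method above rather than a copy of it.

```python
import os
import tarfile

TAR_SUFFIXES = (".tar", ".tar.gz", ".tgz")
PATH_PREFIXES = ("/", "./", "../", "~/")


def looks_like_tar_input(value: str) -> bool:
    """Guess whether a CLI argument is a tar path rather than an image name."""
    if os.path.isfile(value):
        # An existing file counts only if it really is a tar archive
        return tarfile.is_tarfile(value)
    # For paths that do not exist yet, fall back to name-based hints
    return value.endswith(TAR_SUFFIXES) or value.startswith(PATH_PREFIXES)


if __name__ == "__main__":
    samples = [
        "jboss/wildfly:latest",
        "registry.example.com/team/app:1.0",
        "./image.tar",
        "/var/tmp/exported-image",
    ]
    for sample in samples:
        kind = "tar file" if looks_like_tar_input(sample) else "image name"
        print(f"{sample!r}: {kind}")
```

Note that an existing file that is not a valid tar archive is treated as an image name, even if its name ends in `.tar`.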