Q35 of 40 · Git
A QA repo is growing uncontrollably because large binary test fixtures (video recordings, DB dumps) are committed directly. How do you solve this?
Short answer
Replace committed binaries with Git LFS pointers (`git lfs track '*.mp4'`), so the actual blobs live in an external LFS store and the repo stays lean. For large DB dumps, keep them in object storage (S3/GCS) and fetch them in CI with a script. Use `git sparse-checkout` so developers do not check out subtrees they don't need.
Detail
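Git is a poor store for large binaries: by default every clone downloads the full history, so a 500 MB video committed once adds roughly 500 MB to every developer's clone forever, even if the file is deleted in a later commit. Three complementary solutions, plus a cleanup step for history that is already bloated: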
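Git LFS (Large File Storage): installs a filter that replaces large files with small text pointers in the repo. The actual content is stored on an LFS server (GitHub, GitLab, or self-hosted). `git lfs track "*.mp4"` adds a pattern to `.gitattributes`, and any subsequent `git add` of a matching file stores a pointer instead of the content. Files committed before the pattern was added stay in history as ordinary blobs until they are migrated (`git lfs migrate import`) or removed. A clone downloads only the LFS objects needed by the commit it checks out, not every historical version. Major caveat: LFS objects on the server are not garbage-collected automatically and can incur storage costs.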
External object storage: for artefacts that don't need to be versioned alongside code (DB dumps, large baseline snapshots), store them in S3/GCS with versioned keys. CI scripts download them on demand. This fully decouples test data from code history.
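`git sparse-checkout` (Git 2.25+): restricts the working tree to specific directories. On its own it only limits what is checked out; the blobs are still downloaded. Combined with a partial clone (`git clone --filter=blob:none --sparse`), Git also skips downloading blobs outside the selected directories. If large fixtures live in `test/fixtures/large/`, a developer working on unit tests can run `git sparse-checkout set src/ test/unit/` in such a clone and never download the large tree. Useful for monorepo structures.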
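Cleaning up history: if large files are already committed, use `git filter-repo` (the modern replacement for `git filter-branch`, installed separately from Git) to rewrite history and remove them. The rewrite changes the hash of every affected commit, so the new history has to be force-pushed and all team members must re-clone.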
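Before rewriting history, it helps to know which blobs actually account for the bloat. A minimal sketch, assuming a POSIX shell with Git on the PATH; the function name `largest_blobs` is illustrative and not a Git command:

```shell
# List the N largest blobs anywhere in history as "<size-in-bytes> <path>".
# largest_blobs is a hypothetical helper name; run it inside the repository.
largest_blobs() {
  n="${1:-10}"
  # rev-list prints "<sha> <path>" for every object reachable from any ref;
  # cat-file adds the type and uncompressed size, echoing the path via %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort -rn |
    head -n "$n"
}
```

Running `largest_blobs 20` lists the twenty heaviest blobs ever committed, including files that were later deleted. Those paths are the candidates for removal with `git filter-repo` or for moving to LFS or object storage.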
// EXAMPLE
# --- Git LFS setup ---
# Install LFS (once per machine)
git lfs install
# Track patterns (updates .gitattributes)
git lfs track "*.mp4"
git lfs track "*.zip"
git lfs track "test/fixtures/large/*.sql"
# Commit the .gitattributes update
git add .gitattributes
git commit -m "chore: track large binaries with Git LFS"
# Normal add/commit — LFS handles the rest
git add test/fixtures/recordings/checkout-flow.mp4
git commit -m "test: add checkout flow recording fixture"
# Verify LFS is tracking files
git lfs ls-files
# --- Sparse checkout (skip large fixture directories) ---
git clone --filter=blob:none --sparse https://github.com/company/repo.git
cd repo
git sparse-checkout set src/ test/unit/ test/integration/
# Large fixture tree in test/fixtures/large/ is NOT downloaded
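# --- Remove accidental binary commit from history ---
# Run in a fresh clone: filter-repo refuses to rewrite a non-fresh clone unless forced.
# --invert-paths keeps everything EXCEPT the listed path.
git filter-repo --path test/fixtures/large/dump.sql --invert-paths
# Commit hashes change, so the rewritten history must be force-pushed and re-cloned by the team.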
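The external object storage option from the Detail section has no example above, so here is one possible shape for the CI fetch script. It is a sketch under stated assumptions: the bucket name, the manifest format (`<sha256> <object-key> <local-path>` per line), and the function names are all hypothetical. It assumes the AWS CLI and GNU `sha256sum` are available on the CI runner.

```shell
# Sketch: fetch large fixtures from object storage instead of committing them.
# Hypothetical names: FIXTURE_BUCKET, the manifest format, and all functions.
FIXTURE_BUCKET="${FIXTURE_BUCKET:-example-qa-fixtures}"

# Download one object ($1 = object key, $2 = local path) with the AWS CLI.
# For GCS, this is the only function that would need to change.
download_fixture() {
  aws s3 cp "s3://${FIXTURE_BUCKET}/$1" "$2" --only-show-errors
}

# Succeed only when the file exists and its SHA-256 equals the expected digest.
checksum_ok() {
  want="$1"; file="$2"
  [ -f "$file" ] || return 1
  got=$(sha256sum "$file" | cut -d' ' -f1)
  [ "$got" = "$want" ]
}

# Read the manifest and download anything that is missing or stale.
fetch_fixtures() {
  manifest="$1"
  while read -r sha key dest; do
    # Skip blank lines and comments.
    case "$sha" in ''|'#'*) continue ;; esac
    if checksum_ok "$sha" "$dest"; then
      echo "cached $dest"
      continue
    fi
    mkdir -p "$(dirname "$dest")"
    download_fixture "$key" "$dest" || return 1
    checksum_ok "$sha" "$dest" || { echo "checksum mismatch: $dest" >&2; return 1; }
    echo "fetched $dest"
  done < "$manifest"
}
```

A CI step would call `fetch_fixtures test/fixtures/manifest.txt` (path hypothetical) before running the tests. Committing the small manifest rather than the data keeps the repo lean while still pinning, per commit, exactly which versioned objects the tests expect.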