mirror of
https://codeberg.org/Codeberg/Documentation.git
synced 2026-06-16 05:13:54 -07:00
157 lines
6.1 KiB
Markdown
---
eleventyNavigation:
  key: ReducingGitSize
  title: Reducing the size of a repository
  parent: Git
---

The best way to keep your repository at a manageable size is [to use the `.gitignore` feature](/git/git-ignore/) to make
sure large files do not become part of your Git repository in the first place.
But sometimes it is too late, because you have already committed large files in the past.
Or maybe at some point you legitimately needed some larger files, but now they are obsolete.

{% admonition "Tip" %}

The removal of files from the Git history is also useful if you accidentally pushed secrets such as passwords,
API keys or other private information to your repository.
Without rewriting the history, this information would permanently linger in your Git history.

{% endadmonition %}

Either way, you might want to permanently remove files from your Git repository, to shrink its overall size or to remove
individual files that weren't meant to be committed to it.

The first step is to delete the files from the current state of your branch,
using the regular `git rm filename.txt` approach.
Once these files are no longer in the current `HEAD` of your branch, you can rewrite the history of your Git repository.
This ensures that the files do not remain in the overall Git history, where they would continue to take up space
or, in the case of accidentally committed secrets, remain accessible to others.
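
As a minimal sketch of that first step, the following removes a file from the current branch state in a throwaway repository. The scratch directory, the demo identity, and the commit messages are placeholders for illustration; in your own repository only the `git rm` and `git commit` lines apply.

```shell
set -e

# Scratch repository so that the example is self-contained.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Example User"        # hypothetical identity for the demo
git config user.email "user@example.org"

echo "large payload" > filename.txt
git add filename.txt
git commit --quiet -m "Add a file that should not have been committed"

# Remove the file from the working tree and the index, then record the removal.
git rm --quiet filename.txt
git commit --quiet -m "Remove filename.txt"

# The file is no longer part of HEAD, so this lists nothing ...
git ls-files
# ... but the history still contains both commits that touched it.
git log --oneline -- filename.txt
```

The last command still lists the commits that touched the file, which is why the history rewrite described below is needed.
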

{% admonition "warning" %}

**Rewriting your Git history is a destructive process.**
You should make a backup of your repository before attempting to rewrite the history.

Additionally, once you have rewritten the history, everyone who has made a working copy of your
repository will have to ensure that their copy has the correct, new history.

If your repository has open pull requests, this will introduce conflicts for these, too.

{% endadmonition %}

## Getting started

### Installing the necessary tools

Install [git-filter-repo](https://github.com/newren/git-filter-repo), which will be used to actually
perform the history rewrites.
Optionally, you can also install [git-sizer](https://github.com/github/git-sizer/tree/master),
if you want to find large files that are no longer used in your history.
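
Before continuing, it can help to confirm that the tools are on your `PATH`. The snippet below is only a sketch; `check_tools` is a made-up helper name for this example and not part of any of the tools. Because Git runs any `git-<name>` executable found on the `PATH` as a subcommand, `git-filter-repo` can usually also be invoked as `git filter-repo`.

```shell
# Report which of the tools used in this guide can be found on the PATH.
# check_tools is a hypothetical helper name used only in this example.
check_tools() {
    for tool in git git-filter-repo git-sizer; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
        fi
    done
}

check_tools
```
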

### Make a mirror clone of the repository

This step ensures that you get a full copy of your repository, including all your references.

You can use `git clone` with the `--mirror` flag to create such a clone, e.g.:

```shell
git clone --mirror git@codeberg.org:your_user_name/your_repo.git
```

It's important that you clone your repository using the `--mirror` flag to be able to rewrite the history.
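
Since the rewrite is destructive, one way to get the recommended backup is to copy the mirror clone before touching it. The sketch below uses a throwaway local repository in place of Codeberg; the directory names `source`, `your_repo.git` and `your_repo-backup.git` as well as the demo identity are placeholders.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the remote repository; with Codeberg you would clone its URL instead.
git init --quiet source
git -C source -c user.name="Example User" -c user.email="user@example.org" \
    commit --quiet --allow-empty -m "Initial commit"

# A mirror clone is bare and copies every ref, not only the branches.
git clone --quiet --mirror source your_repo.git

# Keep an untouched copy as the pre-rewrite backup.
cp -R your_repo.git your_repo-backup.git

git -C your_repo.git rev-parse --is-bare-repository      # prints "true"
git -C your_repo.git config --get remote.origin.mirror   # prints "true"
```

All rewriting then happens in `your_repo.git`, while `your_repo-backup.git` stays as it was cloned.
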

## Identifying files to remove & removing them

### Optionally: Run git-filter-repo’s analyze command

This optional step can help you identify files that are already deleted from the current state of your branch,
but that are still in the history.
For example, it helps to find large build files or assets that were accidentally committed and have already been deleted,
but still take up space in the history.

To identify such files, you can run:

```shell
git-filter-repo --analyze
head filter-repo/analysis/*-{all,deleted}-sizes.txt

# and/or

git-sizer -v
```

### Run git-filter-repo to rewrite the history and permanently remove the files

This is the actual history rewrite, which takes place on your local mirror clone.
As a reminder: You should only use `git-filter-repo` to remove files from your repository's history that are
no longer in use in your current `HEAD`.

For example, let's assume that at some point you accidentally committed a folder `dist/`.
You already removed it from the current state of your repository using `git rm -r dist/`.
But to minimize the size of your repo, you now also want to remove it from your history.

To achieve this, you can run:

```shell
git-filter-repo --path dist/ --invert-paths
```

You give `dist/` as the path, and using `--invert-paths` you tell `git-filter-repo` that you want to keep all files,
except the ones specified using `--path`.

1. Run git-filter-repo analyze/git-sizer again to check that the repository size has indeed been reduced:

```shell
git-filter-repo --analyze --force
head filter-repo/analysis/*-{all,deleted}-sizes.txt

# and/or

git-sizer -v
```

So far, all of these changes have been applied only to your local mirror copy of your repository.
Once you are happy with the rewritten history (e.g. the size has been reduced or
the secrets have been removed), you can push those changes to Codeberg.
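
One way to double-check the result with plain Git is to ask whether any commit on any ref still touches the path. The sketch below defines a hypothetical helper, `path_in_history`, and tries it on a throwaway repository that has not been rewritten, so `dist/` is still reported. After a successful `git-filter-repo` run in your mirror clone, the same check should report the path as absent.

```shell
set -e

# Hypothetical helper: succeeds if any commit reachable from any ref touches the given path.
path_in_history() {
    [ -n "$(git log --all --format=%H -- "$1")" ]
}

# Throwaway repository in which dist/ was committed and later removed with git rm.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Example User"        # hypothetical identity for the demo
git config user.email "user@example.org"
mkdir dist
echo "build output" > dist/app.js
git add dist
git commit --quiet -m "Accidentally commit dist/"
git rm --quiet -r dist
git commit --quiet -m "Remove dist/"

# dist/ is gone from HEAD but not from the history; never-committed/ was never in it.
for path in dist/ never-committed/; do
    if path_in_history "$path"; then
        echo "$path is still in the history"
    else
        echo "$path is not in the history"
    fi
done
```
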

## Replacing the history on Codeberg

{% admonition "warning" %}

**The following step replaces your repository's history on Codeberg in a destructive operation.**

Ensure that you have a pre-rewrite backup of your repository somewhere, otherwise you will not be able
to undo the change if anything goes wrong.

{% endadmonition %}

1. Turn off the mirror flag and carry out force pushes to your remote:

```shell
git config --unset remote.origin.mirror
git push origin --force 'refs/heads/*'
git push origin --force 'refs/tags/*'
git push origin --force 'refs/replace/*'
```
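
Before the real force pushes, `git push --dry-run` can report what would be updated without changing the remote. The sketch below rehearses this against a throwaway local bare repository standing in for Codeberg; all directory names and the demo identity are placeholders.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Throwaway "remote": a bare repository containing a single commit.
git init --quiet seed
git -C seed -c user.name="Example User" -c user.email="user@example.org" \
    commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare seed remote.git

# Mirror clone, as used for the rewrite earlier in this guide.
git clone --quiet --mirror remote.git mirror.git
cd mirror.git

# Same sequence as for the real push, but --dry-run only reports what would be pushed.
git config --unset remote.origin.mirror
git push --dry-run --force origin 'refs/heads/*'
```

The same flag can be added to the pushes for `refs/tags/*` and `refs/replace/*` before running them for real.
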

{% admonition "info" %}

Remember: Everyone who has a working copy of your repository will now need to move to that new history as well.

For people who don't have any ongoing local work, the easiest way to ensure the correct history is to check out a fresh clone.

For users who have ongoing local work, the following steps should work, unless that work includes now-deleted files:

1. `git fetch` the remote (not `pull`, to avoid errors/warnings)
2. `git checkout` the local branch that has current work
3. `git rebase origin/branch` with `origin` being the remote for the repository and `branch` being
   the branch you are working against.

This will rebase the local work onto the remote's main branch (or the branch you are working against).

{% endadmonition %}

## Further reading

- [The manual of `git-filter-repo`](https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html)
- [Section on rewriting Git history in the book Pro Git 2nd edition](https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History)