Preserve GitHub History Forever
Keep the entire story, not just the code
GitHub accounts get banned, repos go private, and owners rage-delete history. If you care about the full timeline—issues, releases, wiki—Gitea Mirror snapshots everything on a schedule so the story survives in your homelab.
Requirements
- Running Gitea Mirror (follow the backup playbook)
- GitHub PAT with `repo` enabled (add the `read:org` checkbox under `admin:org` when you archive organization repositories; leave write/admin unchecked)
- Destination Gitea with enough disk for cloned repos + attachments
- Optional: object storage or snapshots for long-term archiving of the mirror volume
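A quick way to confirm the token carries only the scopes listed above is to read the `X-OAuth-Scopes` response header that GitHub returns for classic personal access tokens. The sketch below is a minimal check under that assumption (fine-grained tokens may not report this header); the function names are illustrative and not part of Gitea Mirror.

```python
import urllib.request

REQUIRED_SCOPES = {"repo"}      # needed for every archive
ORG_SCOPES = {"read:org"}       # only when archiving organization repositories


def parse_scopes(header_value):
    """Split a comma-separated X-OAuth-Scopes value into a set of scope names."""
    return {s.strip() for s in (header_value or "").split(",") if s.strip()}


def audit_scopes(scopes, archive_orgs=False):
    """Return (missing, excessive) scopes for a read-only archival token."""
    required = REQUIRED_SCOPES | (ORG_SCOPES if archive_orgs else set())
    missing = required - scopes
    # Anything in the write:/admin: families is more than an archive needs.
    excessive = {s for s in scopes if s.startswith(("write:", "admin:"))}
    return missing, excessive


def fetch_scopes(token):
    """Ask the GitHub API which scopes a classic PAT carries (network call)."""
    req = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return parse_scopes(resp.headers.get("X-OAuth-Scopes"))
```

Calling `audit_scopes(fetch_scopes(token), archive_orgs=True)` should return two empty sets for a correctly scoped token. A token that was granted `admin:org` will show up under `excessive`, which is the cue to regenerate it with only `read:org`.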
Step-by-step
1. Set archival-friendly defaults
In Configuration → Connections, open Content & Data:
- Enable Mirror metadata and choose the components you care about (issues, pull requests, labels, milestones, wiki).
- Enable Mirror releases and raise the Latest releases limit if you need a deeper history of release assets.
- Toggle Git LFS (Large File Storage) so binaries follow the repository, assuming LFS is enabled in your Gitea instance.
2. Create an “Archive” organization in Gitea
- In Gitea, create an org like `github-archive` and grant read-only access to everyone who needs the history.
- Back in Gitea Mirror under Configuration → Connections, pick the Preserve structure strategy (or set a destination organization) so repos land in that archive org.
- Tighten permissions in Gitea—disable pushes for regular users so the archive stays immutable while the service updates it via its token.

Keep every GitHub project visible in the repositories dashboard while routing mirrors into a dedicated archive organization.
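If you would rather script the archive organization than click through the UI, Gitea exposes organization creation at `POST /api/v1/orgs`. The sketch below only builds the request; the base URL, the token, and the choice of `private` visibility are assumptions to adapt to your instance.

```python
import json
import urllib.request


def build_create_org_request(base_url, token, org_name="github-archive",
                             visibility="private"):
    """Build (but do not send) a Gitea API request that creates an organization."""
    payload = {
        "username": org_name,  # Gitea uses "username" for the org name
        "description": "Read-only archive of GitHub repositories",
        "visibility": visibility,  # "public", "limited", or "private"
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/orgs",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

Send it with `urllib.request.urlopen(build_create_org_request("https://gitea.example.lan", token))`, where the host name is a placeholder. Team membership and read-only permissions still need to be set afterwards, either in the UI or through further API calls.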
3. Choose retention & cadence
- In Configuration → Automation, enable Automatic syncing and set the interval (`1h` keeps fast-moving repos current; `12h` is usually enough for archives).
- Turn on Handle orphaned repositories automatically and leave the action on Archive so anything deleted upstream is preserved locally but marked read-only.
- Bump the Latest releases limit or run an occasional manual sync from the Repositories table when you need older release assets.
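To get a feel for the cadence trade-off, it can help to turn an interval into a number of sync runs per day, since every run spends GitHub API quota. The sketch below assumes a simple `<number><m|h|d>` shorthand like the `1h` and `12h` values above; the exact grammar Gitea Mirror accepts may differ.

```python
import re
from datetime import timedelta

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_interval(text):
    """Parse shorthand such as '1h' or '12h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip().lower())
    if not match or int(match.group(1)) == 0:
        raise ValueError(f"unsupported interval: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def syncs_per_day(text):
    """Number of sync runs a given interval produces in 24 hours."""
    return timedelta(days=1) / parse_interval(text)
```

With these definitions, `syncs_per_day("1h")` is 24.0 and `syncs_per_day("12h")` is 2.0, a twelvefold difference in API traffic for the same set of repositories.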
4. Record provenance
- Add a README or label inside the archive organization that captures the upstream URL, first mirrored date, and token owner.
- Export a CSV from the Repositories view or hit `/api/events` quarterly so you retain a human-friendly change log.
- Store the configuration export (`/api/export`) alongside your disaster-recovery docs in case you need to rebuild the service.
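The provenance note from the first bullet is easy to generate consistently. The sketch below renders the three fields (upstream URL, first mirrored date, token owner) as plain text; the layout and the example values are illustrative only.

```python
from datetime import date


def provenance_note(upstream_url, first_mirrored, token_owner):
    """Render a short provenance block for a README in the archive org."""
    lines = [
        "Provenance",
        f"Upstream URL: {upstream_url}",
        f"First mirrored: {first_mirrored.isoformat()}",
        f"Token owner: {token_owner}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values; replace with the real repository details.
note = provenance_note(
    "https://github.com/example-org/example-repo",
    date(2024, 1, 15),
    "homelab-admin",
)
```

Writing `note` into each mirrored repository's description or a shared README keeps the origin of every archive traceable even after the upstream disappears.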
5. Back up the backup
- Snapshots: Use ZFS/BTRFS or Proxmox backups on the mirror’s data volume.
- Offsite: `restic`/`rclone` the `data/` directory to a NAS or object store.
- Test: Restore to a test Gitea instance and spot-check history every few months.
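The offsite step can be wrapped in a small script that a scheduler calls. The sketch below only assembles the command lines for `restic backup` and `rclone copy`; the repository location, the remote name, and the tag are placeholders. It uses `rclone copy` rather than `rclone sync` because `sync` deletes destination files that no longer exist at the source, which is rarely what an archive wants.

```python
import shlex


def restic_backup_cmd(repo, data_dir="data/", tag="gitea-mirror"):
    """Command line for a tagged restic snapshot of the mirror's data directory."""
    return ["restic", "--repo", repo, "backup", "--tag", tag, data_dir]


def rclone_copy_cmd(remote, data_dir="data/"):
    """Command line that copies the data directory to an rclone remote."""
    return ["rclone", "copy", data_dir, remote]


# Hypothetical destinations; print the commands for review before running them.
for cmd in (
    restic_backup_cmd("/mnt/nas/restic-repo"),
    rclone_copy_cmd("offsite:gitea-mirror-archive"),
):
    print(shlex.join(cmd))
```

Pass either list to `subprocess.run(cmd, check=True)` once the output looks right. restic additionally needs its repository password, for example through the `RESTIC_PASSWORD_FILE` environment variable.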
Verify the archive
- Delete a draft issue on GitHub.
- Wait for the next sync; open the issue in Gitea—you should still see the original content.
- Compare `git tag -l` in both remotes to ensure releases match.
- Use `git lfs ls-files` to confirm large assets made it across.
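The tag comparison can be automated without cloning either side by swapping `git tag -l` for `git ls-remote --tags`, which lists tags straight from a remote URL. The sketch below is one way to do it; the function names are illustrative and the URLs you pass in are your own.

```python
import subprocess


def parse_ls_remote_tags(output):
    """Extract tag names from `git ls-remote --tags` output."""
    tags = set()
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2 or not parts[1].startswith("refs/tags/"):
            continue
        name = parts[1].strip()[len("refs/tags/"):]
        # Annotated tags appear a second time with a "^{}" (peeled) suffix.
        if name.endswith("^{}"):
            name = name[:-3]
        tags.add(name)
    return tags


def remote_tags(url):
    """List the tags of a remote repository (runs git, needs network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", url],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_remote_tags(result.stdout)


def diff_tags(upstream, mirror):
    """Report tags present on only one side."""
    return {
        "missing_in_mirror": sorted(upstream - mirror),
        "extra_in_mirror": sorted(mirror - upstream),
    }
```

Run `diff_tags(remote_tags(github_url), remote_tags(gitea_url))` after a sync. An empty `missing_in_mirror` list is the result to look for; entries under `extra_in_mirror` usually just mean a tag was deleted upstream and the archive kept it.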
Maintenance checklist
- Rotate tokens annually and document the rotation date in the repo README.
- Monitor disk growth; configure `persistence.size` if you run the Helm chart.
- Log anomalies (failed runs, conflicts) in your homelab journal to track trends.
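Disk monitoring becomes more useful with a rough projection of when the volume fills up. The sketch below fits a straight line through the oldest and newest usage samples; linear growth is an assumption, and the sample format is made up for illustration.

```python
import shutil


def used_bytes(path):
    """Bytes currently used on the filesystem that holds `path`."""
    return shutil.disk_usage(path).used


def days_until_full(samples, capacity_bytes):
    """Estimate days until the volume is full.

    `samples` is a list of (day_number, used_bytes) tuples, oldest first.
    Returns None when usage is flat or shrinking.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    if last_day <= first_day:
        raise ValueError("need samples from at least two different days")
    growth_per_day = (last_used - first_used) / (last_day - first_day)
    if growth_per_day <= 0:
        return None
    return (capacity_bytes - last_used) / growth_per_day
```

Record `used_bytes("data/")` once a day, feed the history to `days_until_full`, and resize the volume well before the estimate reaches zero.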
FAQ
Does this preserve issues, pull requests, and releases?
Yes—enable Mirror metadata and Mirror releases from Configuration → Connections → Content & Data. Pull requests copy as enriched issues, keeping discussion and labels.
What happens if a GitHub repo is deleted or goes private?
Turn on Handle orphaned repositories automatically and use Archive to keep a read-only copy locally. Choosing Delete instead enforces a strict mirror and removes the local copy when the upstream repo disappears.
How much storage will I need long-term?
Plan for repo size plus attachments and LFS. Monitor the mirror’s data/ volume growth and consider ZFS/BTRFS snapshots or object storage for older archives.
