Multi repo folders #22588
(with the caveat that I'm just a user and sysadmin of gitea, not a dev) This is a neat feature, but personally, I would solve this with LVM on Linux or zfs on FreeBSD, or maybe even something like Ceph for a bigger system. Expandable storage is neat, but is already solved reliably at lower levels. Teaching Gitea to do this is going to introduce all kinds of subtle bugs; for example, forking a repo can no longer be done efficiently with hardlinks. |
How does forking use hardlinks? I didn't think you could have hardlinks to a folder under Linux... |
I'm running a local build of Gitea: Internally, this repo looks like:
then I make a fork under a different user: The folders aren't hardlinked, but their contents are:
eg. these are both inode 12231819
This is because per git-clone(1):
|
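The inode observation above can be reproduced outside Gitea. The following is a minimal sketch, not Gitea code: the helper names `sameInode` and `linkDemo` and the file names are made up for illustration. It creates a file and a hardlink to it in a temporary directory, then uses `os.SameFile` to confirm that both names point at the same underlying file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sameInode reports whether two paths refer to the same underlying file.
// (Hypothetical helper for illustration; not part of Gitea.)
func sameInode(a, b string) (bool, error) {
	ia, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	ib, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	return os.SameFile(ia, ib), nil
}

// linkDemo writes a file, hardlinks it under a second name in the same
// temporary directory, and reports whether the two names share an inode.
func linkDemo() (bool, error) {
	dir, err := os.MkdirTemp("", "hardlink-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	orig := filepath.Join(dir, "object")
	link := filepath.Join(dir, "object-in-fork")
	if err := os.WriteFile(orig, []byte("packed data"), 0o644); err != nil {
		return false, err
	}
	// os.Link creates a hardlink; both paths must be on the same filesystem.
	if err := os.Link(orig, link); err != nil {
		return false, err
	}
	return sameInode(orig, link)
}

func main() {
	shared, err := linkDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("same inode:", shared)
}
```

If the two paths were on different mount points, `os.Link` would return an error instead (a cross-device link error on Linux), which is why spreading repos over several filesystems rules out this kind of sharing between a repo and its fork.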
That's pretty neat, I didn't know git did that. I guess because I rarely clone a local repo it's never come up! |
I'm sorry to burst your bubble! This seems like a really cool idea. I've only administered gitea using basic posix filesystems, and since I understand them best I plan to stick with them. But if there are already alternate backends in Gitea anyway, then maybe this would be handy for others. |
But I would be careful: hardlinks aren't the only thing that breaks if you start splitting up onto multiple mount points. Repo size counting might get bonked on the nose. And permissions might get weird, since different filesystems sometimes interpret UIDs and modes differently (I've seen this lots of times with CIFS aka samba). I've also seen git hooks get skipped when a repo is on certain remote file systems -- and gitea relies on git hooks to enforce consistency between git and gitea when stuff gets uploaded. So I'm worried about a situation where some repos end up on a remote filesystem with subtly different rules, and users have no clear reason why their content or LFS files are misbehaving, but only sometimes. |
Hi, sysadmin from Codeberg here. While I appreciate the effort, I can only second that this is not an easy job and prone to all kinds of possible bugs. You'd likely need a system that comes close to GitLab / Gitaly for distributed Git storage. What will happen if a user or repo gets renamed, or transferred? What about forks, like mentioned above? At Codeberg, we are using Ceph. Our experience is that it works well, but Git operations need to be really low-latency. So just adding an NFS mount or something like this for remote storage will not work anyway (I tried this locally once, and even basic operations took minutes from time to time). So I really think that this responsibility should not be part of Gitea, but on a lower level. |
I think #22775 will be helpful for this. |
First attempt at extending the backend storage of Gitea so that it can support multiple folders (i.e. mount points) for repos, enabling multiple filesystems (local or remote) with repo storage locations hashed across them. This enables more horizontal scaling of storage.
Further work is required to allow the number of folders to be changed as a day-2 operational task.
Most of the logic is in cmd/serv.go and models/user/user.go (this could perhaps be merged); the rest enables configuration and installation tasks.
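To make the idea of hashing repo locations across folders concrete, here is a minimal sketch. It is not the code from this PR. The function names `pickRoot` and `repoPath`, the example root paths, the choice of FNV-1a, and the decision to hash on the owner name are all assumptions made for illustration. The `<root>/<owner>/<repo>.git` layout is likewise assumed.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"path/filepath"
	"strings"
)

// pickRoot maps an owner name onto one of the configured repo roots.
// Hypothetical helper: roots must be non-empty. The owner is lowercased
// so that "Alice" and "alice" land in the same folder.
func pickRoot(roots []string, owner string) string {
	h := fnv.New32a()
	h.Write([]byte(strings.ToLower(owner)))
	return roots[int(h.Sum32()%uint32(len(roots)))]
}

// repoPath builds an on-disk path of the assumed form <root>/<owner>/<repo>.git.
func repoPath(roots []string, owner, repo string) string {
	owner = strings.ToLower(owner)
	return filepath.Join(pickRoot(roots, owner), owner, strings.ToLower(repo)+".git")
}

func main() {
	two := []string{"/data/repos-a", "/data/repos-b"}
	three := append(append([]string{}, two...), "/data/repos-c")

	fmt.Println(repoPath(two, "alice", "demo"))

	// Count how many owners would resolve to a different root
	// after a third folder is added.
	owners := []string{"alice", "bob", "carol", "dave"}
	moved := 0
	for _, o := range owners {
		if pickRoot(two, o) != pickRoot(three, o) {
			moved++
		}
	}
	fmt.Printf("%d of %d owners change root when a third folder is added\n", moved, len(owners))
}
```

With a plain modulo scheme like this one, changing the number of roots can remap existing owners to a different folder, so their repos would have to be moved on disk. That is one way to see why varying the folder count is left as further work.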