Why GitHub?

james@lemm.ee to No Stupid Questions@lemmy.world – 92 points

I can't help but notice most (all I've seen anyway) of the federated projects are hosted on GitHub. GitLab is also not federated, but can be self hosted and has at least discussed it.

I am fully aware of my bias for GitLab over GitHub, but I still wonder why that is. Is there a federated source hosting project?

IMHO federation doesn't bring any real benefits to git and introduces a lot of risks.

The git protocol, if you will, already allows developers to back up and move their repositories as needed. And the primary concern with source control is having a stable and secure place to host it. GitHub already provides that, free of charge.
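As a rough illustration of the point about moving repositories, here is a minimal sketch that builds the two git commands for mirroring a repository from one host to another. The URLs and the helper names `mirror_commands` and `run_mirror` are hypothetical; `git clone --mirror` and `git push --mirror` are standard git commands.

```python
import subprocess


def mirror_commands(source_url, target_url, workdir="repo-mirror.git"):
    """Return the git commands that copy every ref from source_url to target_url."""
    return [
        # Bare clone that includes all branches, tags and other refs.
        ["git", "clone", "--mirror", source_url, workdir],
        # Push every ref in the local mirror to the new host.
        ["git", "-C", workdir, "push", "--mirror", target_url],
    ]


def run_mirror(source_url, target_url, workdir="repo-mirror.git"):
    """Execute the commands; raises CalledProcessError if git fails."""
    for cmd in mirror_commands(source_url, target_url, workdir):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for cmd in mirror_commands(
        "https://github.com/example/project.git",
        "https://gitlab.example.org/example/project.git",
    ):
        print(" ".join(cmd))
```

This only covers what lives in the repository itself (commits, branches, tags). Anything a hosting service keeps outside git is not moved by these commands.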

Once you introduce federation, how do you control who can and cannot make changes to your codebase? How do you ensure you maintain access if a server goes down?

So while it's nice that you can self host and federate git with GitLab, what value does that provide over the status quo? And how do those benefits outweigh the risks outlined above?

You bring up some good points. I agree on the risk; even though I'm a fan, I find federated tools harder to get started with.

I agree git is decentralized, but services like GitHub are not. They're more than just hosting code. They're issues, wikis, CI/CD, peer reviews, etc.

how do you control who can and cannot make changes to your codebase?

I'd imagine it's the same as now. Except now you could say @everyone@that-server is cool and can contribute, or @those-guys@over-there shouldn't even be allowed to see this code.
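A minimal sketch of what rules like that could look like, assuming handles of the form `@user@server` and a made-up convention where `@everyone@server` matches any user on that server. None of this is an existing ActivityPub or forge API; every name here is hypothetical.

```python
def parse_handle(handle):
    """Split '@user@server' into (user, server)."""
    user, server = handle.lstrip("@").split("@", 1)
    return user, server


def matches(rule, actor):
    """A rule matches the same handle, or any user if the rule's user is 'everyone'."""
    rule_user, rule_server = parse_handle(rule)
    user, server = parse_handle(actor)
    return rule_server == server and rule_user in ("everyone", user)


def can_contribute(actor, allow, deny):
    """Deny rules win; otherwise the actor needs a matching allow rule."""
    if any(matches(rule, actor) for rule in deny):
        return False
    return any(matches(rule, actor) for rule in allow)


# Example rules, using the handles from the comment above.
allow = ["@everyone@that-server"]
deny = ["@those-guys@over-there"]
```

With these rules, `can_contribute("@alice@that-server", allow, deny)` is allowed, while `@those-guys@over-there` and anyone from an unlisted server are refused. Letting deny rules take precedence is a design choice of this sketch, not something the thread specifies.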

How do you ensure you maintain access if a server goes down?

How do you do this on GitHub?

what value does that provide over the status quo?

I feel like this is the root of fediverse problems. It's easy to send your first tweet, but that first toot takes some effort (I just learned they're called toots).

Btw I appreciate the fediverse and decentralization as much as the next guy, heck I'm even writing software for the fediverse. But I feel like there's a handful of people out there that want to try and apply the fediverse concept to everything. Similar to what happened with Blockchain. Everyone and everything had to be implemented via Blockchain even if it didn't make sense in the end.

IMO though, GitHub is just one "instance" in an already decentralized system. Sure it may be the largest but it's already incredibly simple for me to move and host my code anywhere else. GitHub's instance just happens to provide the best set of tools and features available to me.

But back to my original concerns. Let's assume you have an ActivityPub based git hosting system. For the sake of argument let's assume that there are two instances in this federation today. Let's just call them Hub and Lab...

Say I create an account on Hub and upload my repository there. I then clone it and start working... It gets federated to Lab... But the admin on Lab just decides to push a commit to it directly because reasons... Hub can now do a few things:

  1. They could just de-federate but who knows what will happen to that repo now.
  2. Hub could reject the commit, but now we're in a similar boat: effectively the repo has been forked and you can't really reconcile the histories between the two. No one on Lab can use that repo anymore.
  3. Accept the change. But now I'm stuck with a repo with unauthorized edits.

Similarly, suppose Hub were to go down for whatever reason. Let's assume we have a system in place that effectively prevents the above scenario from happening... If I didn't create an account on Lab prior to Hub going down, I now no longer have the authorization to make changes to that repository. I'm now forced to fork my own repository and continue my work from the fork. But all of my users may still be looking for updates to the original repository. Telling everyone about the new location becomes a headache.

There's also the issue of how you handle private repositories. This is something that the fediverse can't solve. So all repos in the fediverse would HAVE to be public.

And yes, if GitHub went down today, I'd have similar issues, but that's why you have backups. And git already has a solution for that outside the fediverse. Long story short, the solutions that the fediverse provides aren't problems that exist for git and it raises additional problems that now have to be solved. Trying to apply the fediverse to git is akin to "a solution in search of a problem", IMHO.

I would like to add that git is a pretty good example of data people have backups for. I don't care if GitHub blows up tomorrow because I have repos on my disk. Even if my disk also dies, my friends have my repos cloned, so I wouldn't lose much.

Git is already decentralized - every contributor has a copy of the repo on their own machine.

At that point, it's just about using what's most popular. I have a slight preference toward GitLab myself, but the prevalence of GitHub means I still push most of my projects there, just because I'm already visiting the website so often.

But it is still 1 centralized server that has the code and serves it to you. It's like saying "The internet is federated as I have copied some memes onto my PC".

No, that's not quite how git works. Everyone who's cloned the repo has a complete copy of the code, at least as of the time they cloned or last fetched it. If GitHub, GitLab, Bitbucket or whatever goes away, you can keep working without it, provided that people know how to use a remote from another machine. Git really is decentralized even if people tend to use it in a centralized fashion.

Edit: Spelling.

I agree with both of you (not sure why the one got so many downvotes).

Git is not centralized. GitHub, GitLab, Bitbucket and Gitea are each a centralized service.

These services are more than just git repositories. They're issue tracking, merge/pull requests, wikis, CI/CD, etc. If the service is lost, the source is still out there but it could be quite the pain to get going again.

(Unless they shallow cloned)

You'd still have a complete copy of the current HEAD, you'd just be missing a bunch of history depending on the depth at which you cloned.
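To make the shallow-clone point concrete, here is a toy model (not git's actual data structures): history is a list of commit ids from oldest to newest, and a clone at a given depth keeps only the most recent entries. The commit names and the `shallow_clone` helper are made up for illustration.

```python
def shallow_clone(history, depth=None):
    """Return the commits a clone would contain; depth=None means a full clone."""
    if depth is None:
        return list(history)
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return history[-depth:]


history = ["c1", "c2", "c3", "c4", "c5"]  # oldest first, c5 is HEAD

full = shallow_clone(history)
shallow = shallow_clone(history, depth=2)

# Both clones end at the same HEAD; the shallow one lacks the older commits.
missing = [c for c in history if c not in shallow]
```

In real git this corresponds to something like `git clone --depth 2 <url>`. The missing history can be fetched later with `git fetch --unshallow`, but only while a server that has it is still reachable.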

That supports my point, btw. The internet is federated then. For example: "My copy of a meme is a complete copy too. If Lemmy goes away, it keeps working without it. And I can share it all along :)" If git(hub) is federated, then so is everything on the internet.

It's really not. In your example, the meme would be decentralized — not "the internet". Also, I think you're confusing "git" with services offering "git and more" such as GitHub, GitLab, etc.

Yup. That's my point. GIT IS DECENTRALIZED NOT FEDERATED.

:D You said it yourself

Literally no one but you has used the word "federated" in this thread of comments... You responded to the original comment about git being decentralized by saying "it's still 1 centralized server that has the code". I corrected you, because that's not how git works, and now I'm not sure what the fuck you're on about...

Edit: Screenshot, in case you forget.

Sry, I was wrong XD lmao. I thought you wanted to prove that git is federated.

Once you move off the free tier, GitLab is quite expensive in comparison. That might be part of it.

Open source projects enjoy free GitLab Ultimate or whatever it's called.

I've never moved off the free tier, even when self hosting. Although I did get to use a paid version at a previous job and I admit some of those features were nice.

Are they paying on GitHub?

From the view of a small team that actually paid for GitLab Bronze: Their pricing is a mess and they keep changing things. We went with GitLab at first, Bronze tier, everything was great.

Then they removed Bronze tier (which was $4 per user per month) and only offered a premium tier from then on, $20 per user per month. Which is insane if you look at GitHub pricing.

So instead of paying that much we went with the free tier afterwards. Then GitLab limited free tier repos to 5 users max. Which was yet again annoying and we had to act on that.

In the end the company moved to GitHub. All we wanted was a stable solution we pay for, and to be left in peace. GitLab kept messing with things and wasting developer hours (damn meetings with management). GitHub still has a $4 per user per month tier. GitLab... wtf... just raised the price again to $29 per user per month. Are they insane?

As a user, I have to say GitLab isn't exactly top tier, either.

We use a self-hosted instance at my company and it a) eats resources like a badly configured Bitcoin miner and b) not only keeps changing its UI, but also puts 8000 different things in it.

Yeah, I have no clue how they make software that's so damn inefficient.

Don't even get me started. For example, I bought a personal license for Jira (Atlassian) to run on my Linux server. Tiny university project, 5 users (with no one using it most of the time), and the thing ate up all my memory and used half my CPU cores just by idling. That server also hosted Minecraft, which used fewer resources than that.

Jira suffers from the problem that you can configure almost everything. Pouring that into an efficient data structure (in memory and in SQL) is almost impossible, so it always has to do tons of overhead.

I feel like, ironically, this freedom to adapt it to your needs makes it not only slow, but also unusable, because every bullshit business requirement gets some artefact in Jira. Even though 95% of the workflows look exactly the same in practice.

Yeah, I have no clue how they make software that’s so damn inefficient.

Software is a gas.

I haven't run into the resource issue (running in Docker), but yeah, I wish I could turn off some UI features. We never need to upload designs, so why do I have to look at that option on every issue?

You can turn off features by default! Check your gitlab.rb and you will find even more stuff they have bundled in there that is off, like a Mattermost server 😀

When I looked, some could be disabled and some couldn't 🤷
I'm not a DevOps engineer, I only play one when no one else is willing.

I'm forced to agree, GitLab's pricing could be easier to understand and more competitive.

I haven't run into the 5 user limit; I suspect that's not a limit of the self-hosted version. I will say it's a pain to get a clear understanding of what is available and what's not on the free edition when self hosting... also there are now 2 free editions (community and unlicensed enterprise), which adds to the confusion.

I use GitHub for public projects. Easily discoverable and it works extremely well.

It's also free for FOSS.

For private/internal, I use Gitea. Very small, lightweight, but works perfectly.

UI and pricing aside (I don't have much direct experience of either on GitLab), GitHub is, AFAIK, by far the most popular and therefore it's easier to get your project discovered and get other developers to contribute.

I do kind of think that by centralising so much stuff on a website owned by Microsoft we are running the risk of another Reddit-like situation where GitHub turns sharply anti-user in an attempt to monetise in the future. But for the moment, the network effects are real and significant.

I don't think I've ever discovered projects by perusing GitHub. It's always the "fork this" link on a project page or a link from an article.

I've learned I don't use most of the internet the way everyone else does, so my anecdotal evidence is nothing to go by. 🤣

Drive-by contributions are just as frequent on GitLab as on GitHub. I've been involved in moving multiple large free software projects from GitHub to GitLab and saw zero difference in discoverability. In fact, contributions grew on GitLab, although there is no data to suggest it was because of GitLab; it could have been because the projects aged and grew.

As a large repo maintainer I use GitLab because they are much friendlier to open source devs.

Legacy I would say. Github used to be the first and the best.

Now they are literally selling out the work of open source developers with Copilot, but their service is honestly good.

Legacy I would say. Github used to be the first and the best.

I know this is the answer, but I'm sad when the answer is "because we've always done it that way".

GitHub is (mostly) free and central, as well as being the default for the majority of developers. GitLab handles some things differently, scales less well (from what I can tell) and is honestly just different (change is bad /s).

I think the fact it isn’t federated is a point in favor of GitHub. If something goes wrong in a federation protocol then there’s no impact to the code bases.

Wait what? I'm running three different self-hosted GitLab instances, and they aren't resource hogs at all. One has thousands of active users. Maybe if you are trying to run on an RPi or something you are going to have issues... But come on, Bitcoin miner? That is hyperbole.

Let's not forget the EULA difference. GitHub will own your code to a greater degree than the alternatives, in order to feed their AI. I don't think you can opt out?