For my part, I haven't migrated to Git because I don't need the extra hoops like the staging area or syncing whole repos over the network. I always have a server at my side and can work with checkout/checkin. This does impose a hard requirement on interface definitions, since people can't just alter them. Seeing people struggle with all the problems introduced by the Git way of working, I feel there is still a big market for not-Git. People get introduced to Git and stop asking whether there isn't something better for their workflow. (Excuse my English please, I'm a non-native speaker.)
At least jj tries to fix this partly, and I feel like it's good enough for what I do. Most of my projects nowadays are solo, so I can keep it simple.
So many tools are tightly coupled to git or just assume that everybody uses git. I think that it's hard for any new VCS to gain traction without good git compatibility.
For instance Pijul might very well be a lot better than git / jj. I wouldn't know, I haven't bothered trying it because all the projects I need to work in use git. But since jj has great git compatibility, I actually have been able to adopt it because of its git backend.
A new VCS that doesn't have git compatibility at its core is going to have a really hard time overcoming network effects.
You're talking about a new VCS, but I don't know how even Git with SHA-256 will gain any traction, because I don't see how you can support both...
The great thing about jj is that it's backend-agnostic; that makes it the best gateway to newer version control systems.
> The unsafe versions of these things literally throw out history and replace it with a fiction that whoever did the final operation wrote everything, or that the original author wrote something possibly very divergent from what they actually wrote.
When I'm rebasing my own work and editing history, that's exactly what I'm looking to accomplish, though.
The team I work for tends to use GitHub's "Squash and merge" button a lot. I find it to be the best of both worlds: the `develop` branch gets a single commit per PR, with a summary of what was done (I always edit the summary commit message and reduce it from a copy of 20 commit messages down to just the most important 4-5, deleting the entirety of messages like "fixup" or "address review comments"). But the PR on GitHub also preserves the history of the commits, so anyone who wants to look at the messy commit history can follow the PR link and see the actual commits.
I'm sure there are other Git forges that would support a similar workflow, with a "Squash and Merge" button or equivalent, but my team hasn't felt any need to migrate away from GitHub so I've never yet investigated that in detail.
The only downside I've found to this workflow is that it would make it harder to migrate to a different Git forge in the future: unless you're very careful with the migration, the PR numbers are likely to be different (perhaps resetting at 1, even), and the other forge won't end up with the commits that are on GitHub's copy of the repo but no longer on any active branch (we also use the "auto-delete branches when you hit the merge button" option). But it would still be possible for a migration tool to handle this correctly: look at all PRs on GitHub, grab the commits from them, and migrate them to Merge Requests on the new forge.
It boggles my mind that, instead of this being a UI projection, Git ingrains a process where developers habitually destroy their history (along with their bisection options and merge-conflict resolutions), thereby loading an additional footgun that goes off every now and again when it turns out a now-squashed branch was the basis of (or was merged into) some other branch.
It’s important to note that not all history is worth keeping: a dozen commits titled “fix”, fixing build/CI errors in the original changes, are a lot worse for bisecting than squashing them all into just one.
I very much prefer keeping histories by default (both my personal workflows and the tools I build default to that) but squash is a valuable tool.
Yes, a good Git log viewer that would auto-squash branches down to a summary, and allow "expanding" them, would be useful. But the way branching and merging creates confusing train-track graphs is IMHO one of the reasons why many teams end up using the squash-and-merge workflow. There's definitely room for improvement there...
For sure. It just bugs me that we're stuck between two bad options.
Now let's also talk about renames...
Git doesn't do that. People needlessly destroying history do that.
Git will happily let you merge branches and preserve the history there. GP seems to prefer that history living only in PRs on GitHub instead. I don't get why; that just seems worse to me.
I would assume most people who would enable an "auto squash" option also aren't carefully creating and curating commits. Bisect is useless if half your commits are broken. People regularly make commits that don't even build, much less pass QA and deliver a valid version of the software. These are works in progress, broken versions and should be deleted.
If you actually do take care to deliver well-curated commits, it's frustrating to work with people who don't. In that case I would suggest making the squash optional, but you could also try selling your team on smaller commits. In my experience you either "get it" or you don't, though. I've never successfully gotten someone to understand small commits.
I've read much of the HN discussion on the previous post and skimmed the rest, but I didn't see a couple of things addressed:
First, how could you make this deal with copies and renames? It seems to me like the pure version of this would require a weave of your whole repository.
Second, how different is this from something like jujutsu? As in, of course it's different, your primary data structure is a weave. But jj keeps all of the old commits around for any change (and prevents git from garbage collecting them by maintaining refs from the op log). So in theory, you could replay the entire history of a file at a particular commit by tracing back through the evolog. That, plus the exact diff algorithm, seems like enough to recreate the weave at any point in time (well, at any commit), and so you could think of this whole thing as a caching layer on top of what jj already provides. I'm not saying you would want to implement it like that, but conceptually I don't see the difference and so there might be useful "in between" implementations to consider.
In fact, you could even specify different diff algorithms for different commits if you really wanted to. Which would be a bit of a mess, because you'd have to store that and a weave would be a function of those diff algorithms and when they were used, but it would at least be possible. (Cohen's system could do this too, at the cost of tracking lots of stuff that currently it doesn't need or want to track.) I'm skeptical that this would be useful except in a very limited sense (eg you could switch diff algorithms and have all new commits use the new one, without needing to rebuild your entire repository). It breaks distributed scenarios -- everyone has to agree on which diff to use for each commit. It's just something that falls out of having the complete history available.
I'm cheating with jj a bit here, since normally you won't be pushing the evolog to remotes so in practice you probably don't have the complete history. In practice, when pushing to a remote you might want to materialize a weave or a weave-like "compiled history" and push that too/instead, just like in Cohen's model, if you really wanted to do this. And that would come with limitations on the diff used for history-less usage, since the weave has to assume a specific deterministic diff.
You can generically represent copies and renames (aka "moves") in a CRDT, OT, or CTM using a Portal: https://braid.org/meeting-62/portals
Discussion on the previous post in this series: https://news.ycombinator.com/item?id=47478401
> Oddly they don’t seem to have figured out the generation counting trick, which is something I did come up with over twenty years ago. Combining the two ideas is what allows for there to be no reference to commit ids in the history and have the entire algorithm be structural.
Can you say more about this? What exactly is this trick you’re talking about? What are the benefits?
https://github.com/bramcohen/manyana?tab=readme-ov-file#why-...
But someone may need to explain it to me.
Not the OP, but probably this: https://tonyg.github.io/revctrl.org/GenerationCounting.html
(That seems to be an archive of the old revctrl.org pages from a while back; most likely Bram Cohen has a blog somewhere explaining it in his own words - probably about 2003, at a guess)
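For concreteness, here's my understanding of the trick as a sketch (the graph and names are hypothetical): each commit stores a generation number, 1 for roots and otherwise one more than the maximum of its parents' numbers. Since an ancestor's generation is always strictly smaller than a descendant's, a walk searching for common ancestors can prune purely structurally, with no commit ids or timestamps needed:

```python
def generation(commit, parents, memo=None):
    """Generation number of `commit`: 1 for a root, else
    1 + max over parents. `parents` maps commit -> parent list."""
    if memo is None:
        memo = {}
    if commit in memo:
        return memo[commit]
    ps = parents.get(commit, [])
    g = 1 if not ps else 1 + max(generation(p, parents, memo) for p in ps)
    memo[commit] = g
    return g


# A small DAG: a <- b <- d and a <- c <- d (d merges b and c).
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
assert generation("a", parents) == 1
assert generation("b", parents) == 2
assert generation("d", parents) == 3
```

Git's commit-graph file stores the same kind of generation numbers to cut off ancestry walks early, so the benefit in practice is faster merge-base and reachability queries without consulting (possibly skewed) commit timestamps.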
Someone make a TLA+ model for this bad boy