Re: Proposed merge rules Monty Zukowski (mzukowski@bco.com)
Tue, 18 Apr 2000 09:35:19 -0700

> I understand the theory. The question is: what history needs
> to be tracked
> in order to get this right, and how is the integration
> algorithm to go about
> constructing the integration diff?
>
> Actually, the case you raise isn't that bad. The really tough
> nut comes if
> you take Monty's changes 3,4, and 6 of 7. If you do this kind
> of thing,
> merely applying the missing diffs will often break things.
>
> Also, suppose you do this and check it in first. What happens
> when *Monty*
> then does a join? This would seem to require recognizing
> which changes have
> already been incorporated.

Well maybe my comments about the selective-undo algorithm were appropriate. It was actually designed for groupware environments where multiple people would work on the same file locally and synchronize occasionally. You can treat all changes as inserts or deletes applied to a particular version of a file in a particular order. Given any sequence of changes to a file you can transpose operations and apply inverses to eliminate the changes you don't want but preserve the integrity of the changes you do want. For instance in the above example, say Monty's change 5 deleted a method that JCR really wanted to keep. As you said, merely applying the diffs will break things, because that delete shifted the meaning of the changes in 6 and 7. If you were to use the selective undo algorithm, you could "undo" change 5, which would modify the text-indices of operations 6 and 7 to be semantically correct.

Technically what you need to do to "undo" 5 is to insert 5's inverse (5') into the history right after 5 and then transpose 5' to the current state of the document. The transposition modifies 6 and 7 to be semantically correct. Along the way, of course, you are also checking for conflicts because you can know that it makes no sense to insert text into a block that has been deleted later. The very cool thing about this algorithm is that it is independent of the problem space--as long as you define transpose() and conflict() for your operations, you can take advantage of the selective undo algorithm. Changes could be represented at a higher semantic level--such as "rename field a to b in class c". That's outside the scope of DCMS (to represent changes as anything but text), but in a way isn't it really how you want to work with code?

But even sticking with just insert and delete you can properly apply any set of changes by transposing the operations through the history list, which is actually a history tree, right? Back an operation up to the parent and then down to another child branch, etc. The algorithm lets you know whether the change is "possible" or not.

It's a wacky algorithm to get your head around, it took me several readings of Prakash and Knister's paper to comprehend it. Hopefully my paper explains it in a way that is easier to understand.

If you want to be able to manage changes this way you will probably need a little bit more than a command line interface.

What do you think? Is this relevant?

What does it mean to have a branch which selectively takes changes out of various revisions and then labels them as a particular version of a branch? I think that's what you are talking about doing.

Monty