merge(third_party/git): Merge squashed git subtree at v2.23.0

Merge commit '1b593e1ea4' as 'third_party/git'
Vincent Ambo 2020-01-11 23:36:56 +00:00
commit 7ef0d62730
3629 changed files with 1139935 additions and 0 deletions


@ -0,0 +1,216 @@
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 07 May 2014 13:15:39 -0700
Subject: Beginner question on "Pull is mostly evil"
Abstract: This how-to explains a method for keeping a
project's history correct when using git pull.
Content-type: text/asciidoc
Keep authoritative canonical history correct with git pull
==========================================================
Sometimes a new project integrator will end up with project history
that appears to be "backwards" from what other project developers
expect. This howto presents a suggested integration workflow for
maintaining a central repository.
Suppose that the central repository has this history:
------------
---o---o---A
------------
which ends at commit `A` (time flows from left to right and each node
in the graph is a commit, lines between them indicating parent-child
relationship).
Then you clone it and work on your own commits, which leads you to
have this history in *your* repository:
------------
---o---o---A---B---C
------------
Imagine your coworker did the same and built on top of `A` in *his*
repository in the meantime, and then pushed it to the
central repository:
------------
---o---o---A---X---Y---Z
------------
Now, if you `git push` at this point, because your history that leads
to `C` lacks `X`, `Y` and `Z`, it will fail. You need to somehow make
the tip of your history a descendant of `Z`.
One suggested way to solve the problem is "fetch and then merge", aka
`git pull`. When you fetch, your repository will have a history like
this:
------------
    ---o---o---A---B---C
                \
                 X---Y---Z
------------
Once you run merge after that, while still on *your* branch, i.e. `C`,
you will create a merge `M` and make the history look like this:
------------
    ---o---o---A---B---C---M
                \         /
                 X---Y---Z
------------
`M` is a descendant of `Z`, so you can push to update the central
repository. Such a merge `M` does not lose any commit from either
history, so in that sense it may not be wrong, but when people want
to talk about "the authoritative canonical history that is shared
among the project participants", i.e. "the trunk", they often view
it as "commits you see by following the first-parent chain", and use
this command to view it:
------------
$ git log --first-parent
------------
For all other people who observed the central repository after your
coworker pushed `Z` but before you pushed `M`, the commits on the trunk
used to be `o-o-A-X-Y-Z`. But because you made `M` while you were on
`C`, `M`'s first parent is `C`, so by pushing `M` to advance the
central repository, you made `X-Y-Z` a side branch, not on the trunk.
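The effect is easy to reproduce in a scratch repository. The following is only a sketch under assumptions that are not part of this howto: a local branch named `coworker` stands in for what was pushed to the central repository, and `commit` is a hypothetical helper that records one commit per call.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Throwaway identity so the demo does not depend on your git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Name the first branch 'master' regardless of local defaults.
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git branch coworker            # stands in for the central history at A
commit B; commit C             # your work:       A---B---C
git checkout -q coworker
commit X; commit Y; commit Z   # coworker's work: A---X---Y---Z

# Merge while on *your* branch, as "git pull" would: the first parent is C.
git checkout -q master
git merge -q -m M coworker

trunk=$(git log --first-parent --format=%s | paste -sd' ' -)
echo "first-parent chain: $trunk"
```

Here the chain printed is `M C B A`; `X`, `Y` and `Z` are reachable only through the second parent of `M`.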
You would rather want to have a history of this shape:
------------
    ---o---o---A---X---Y---Z---M'
                \             /
                 B-----------C
------------
so that in the first-parent chain, it is clear that the project first
did `X` and then `Y` and then `Z` and merged a change that consists of
two commits `B` and `C` that achieves a single goal. You may have
worked on fixing the bug #12345 with these two patches, and the merge
`M'` with swapped parents can say in its log message "Merge
fix-bug-12345". Having a way to tell `git pull` to create a merge
but record the parents in reverse order may be a way to do so.
Note that I said "achieves a single goal" above, because this is
important. "Swapping the merge order" only covers a special case
where the project does not care too much about having unrelated
things done on a single merge but cares a lot about first-parent
chain.
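With the commands that exist today, one way to get the shape of `M'` is to make the merge while standing on the updated upstream tip rather than on `C`. A minimal sketch, assuming a scratch repository in which `master` plays the updated upstream and `fix-bug-12345` holds `B` and `C` (the `commit` helper is hypothetical, and this is not the reversed-parent `git pull` option imagined above):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git branch fix-bug-12345
commit X; commit Y; commit Z     # 'master' plays the updated upstream
git checkout -q fix-bug-12345
commit B; commit C               # the two patches for one goal

# Merge the side branch *into* the upstream tip, so Z is the first parent.
git checkout -q master
git merge -q --no-ff -m "Merge fix-bug-12345" fix-bug-12345

trunk=$(git log --first-parent --format=%s | paste -sd, -)
echo "$trunk"
```

The first-parent chain now reads `Merge fix-bug-12345,Z,Y,X,A`, with `B` and `C` on the side branch.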
There are multiple schools of thought about the "trunk" management.
1. Some projects want to keep a completely linear history without any
merges. Obviously, swapping the merge order would not match their
taste. You would need to flatten your history on top of the
updated upstream to result in a history of this shape instead:
+
------------
---o---o---A---X---Y---Z---B---C
------------
+
with `git pull --rebase` or something.
2. Some projects tolerate merges in their history, but do not worry
too much about the first-parent order, and allow fast-forward
merges. To them, swapping the merge order does not hurt, but
it is unnecessary.
3. Some projects want each commit on the "trunk" to do one single
thing. The output of `git log --first-parent` in such a project
would show either a merge of a side branch that completes a single
theme, or a single commit that completes a single theme by itself.
If your two commits `B` and `C` (or they may even be two groups of
commits) were solving two independent issues, then the merge `M'`
we made in the earlier example by swapping the merge order is
still not up to the project standard. It merges two unrelated
efforts `B` and `C` at the same time.
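For a project of the first kind, the flattening can be tried out locally. The sketch below is an illustration under assumptions: it runs `git rebase` against a local branch named `upstream` instead of `git pull --rebase` against a real remote, and `commit` is a hypothetical helper.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git branch upstream
commit B; commit C               # your work on 'master'
git checkout -q upstream
commit X; commit Y; commit Z     # what the others pushed in the meantime

# Replay B and C on top of Z instead of merging.
git checkout -q master
git rebase -q upstream

history=$(git log --format=%s | paste -sd' ' -)
merges=$(git rev-list --merges --count HEAD)
echo "$history (merge commits: $merges)"
```

The history comes out as `C B Z Y X A` with no merge commit in it.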
For projects in the last category (Git itself is one of them),
individual developers would want to prepare a history more like
this:
------------
                 C0--C1--C2     topic-c
                /
    ---o---o---A                master
                \
                 B0--B1--B2     topic-b
------------
That is, keeping separate topics on separate branches, perhaps like
so:
------------
$ git clone $URL work && cd work
$ git checkout -b topic-b master
$ ... work to create B0, B1 and B2 to complete one theme
$ git checkout -b topic-c master
$ ... same for the theme of topic-c
------------
And then
------------
$ git checkout master
$ git pull --ff-only
------------
would grab `X`, `Y` and `Z` from the upstream and advance your master
branch:
------------
                 C0--C1--C2     topic-c
                /
    ---o---o---A---X---Y---Z    master
                \
                 B0--B1--B2     topic-b
------------
And then you would merge these two branches separately:
------------
$ git merge topic-b
$ git merge topic-c
------------
to result in
------------
                 C0--C1---------C2
                /                 \
    ---o---o---A---X---Y---Z---M---N
                \             /
                 B0--B1-----B2
------------
and push it back to the central repository.
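The whole sequence can be rehearsed against a scratch "central" repository. This is a sketch, not part of the workflow description itself: the directories `seed`, `central.git` and `coworker` and the `commit` helper are made up for the demonstration, and the merges are given the subjects `M` and `N` only to match the picture.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

# A central repository whose history ends at A, and two clones of it.
git -c init.defaultBranch=master init -q seed
(cd seed && commit A)
git clone -q --bare seed central.git
git clone -q central.git work
git clone -q central.git coworker

# The coworker pushes X, Y and Z first.
(cd coworker && commit X && commit Y && commit Z && git push -q origin master)

cd work
git checkout -q -b topic-b master; commit B0; commit B1; commit B2
git checkout -q -b topic-c master; commit C0; commit C1; commit C2

git checkout -q master
git pull -q --ff-only            # master advances from A to Z
git merge -q -m M topic-b
git merge -q -m N topic-c
git push -q origin master

trunk=$(git log --first-parent --format=%s | paste -sd, -)
echo "$trunk"
```

The first-parent chain of the pushed `master` is `N,M,Z,Y,X,A`: the coworker's commits stay on the trunk and each topic appears as one merge.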
It is very much possible that while you are merging topic-b and
topic-c, somebody again advanced the history in the central repository
to put `W` on top of `Z`, and make your `git push` fail.
In such a case, you would rewind to discard `M` and `N`, update the
tip of your 'master' again and redo the two merges:
------------
$ git reset --hard origin/master
$ git pull --ff-only
$ git merge topic-b
$ git merge topic-c
------------
The procedure will result in a history that looks like this:
------------
                 C0--C1--------------C2
                /                      \
    ---o---o---A---X---Y---Z---W---M'--N'
                \                 /
                 B0--B1---------B2
------------
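The recovery can be rehearsed the same way. Again this is only a sketch under assumptions: `seed`, `central.git`, `coworker` and the `commit` helper are invented for the demo, and the coworker's extra push of `W` is simulated locally.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git -c init.defaultBranch=master init -q seed
(cd seed && commit A)
git clone -q --bare seed central.git
git clone -q central.git work
git clone -q central.git coworker
(cd coworker && commit X && commit Y && commit Z && git push -q origin master)

cd work
git checkout -q -b topic-b master; commit B0; commit B1; commit B2
git checkout -q -b topic-c master; commit C0; commit C1; commit C2
git checkout -q master
git pull -q --ff-only
git merge -q -m M topic-b
git merge -q -m N topic-c

# Somebody puts W on top of Z before you manage to push.
(cd ../coworker && commit W && git push -q origin master)
if git push -q origin master 2>/dev/null; then
    echo "push unexpectedly succeeded" >&2; exit 1
fi

# Rewind to discard M and N, update master, and redo the two merges.
git reset -q --hard origin/master
git pull -q --ff-only
git merge -q -m "M'" topic-b
git merge -q -m "N'" topic-c
git push -q origin master

trunk=$(git log --first-parent --format=%s | paste -sd, -)
echo "$trunk"
```

After the redo, the first-parent chain is `N',M',W,Z,Y,X,A`.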
See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html


@ -0,0 +1,449 @@
From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 21 Nov 2007 16:32:55 -0800
Subject: Addendum to "MaintNotes"
Abstract: Imagine that Git development is racing along as usual, when our friendly
neighborhood maintainer is struck down by a wayward bus. Out of the
hordes of suckers (loyal developers), you have been tricked (chosen) to
step up as the new maintainer. This howto will show you "how to" do it.
Content-type: text/asciidoc
How to maintain Git
===================
Activities
----------
The maintainer's Git time is spent on three activities.
- Communication (45%)
Mailing list discussions on general design, fielding user
questions, diagnosing bug reports; reviewing, commenting on,
suggesting alternatives to, and rejecting patches.
- Integration (50%)
Applying new patches from the contributors while spotting and
correcting minor mistakes, shuffling the integration and
testing branches, pushing the results out, cutting the
releases, and making announcements.
- Own development (5%)
Scratching my own itch and sending proposed patch series out.
The Policy
----------
The policy on Integration is informally mentioned in "A Note
from the maintainer" message, which is periodically posted to
this mailing list after each feature release is made.
- Feature releases are numbered as vX.Y.0 and are meant to
contain bugfixes and enhancements in any area, including
functionality, performance and usability, without regression.
- One release cycle for a feature release is expected to last for
eight to ten weeks.
- Maintenance releases are numbered as vX.Y.Z and are meant
to contain only bugfixes for the corresponding vX.Y.0 feature
release and earlier maintenance releases vX.Y.W (W < Z).
- 'master' branch is used to prepare for the next feature
release. In other words, at some point, the tip of 'master'
branch is tagged with vX.Y.0.
- 'maint' branch is used to prepare for the next maintenance
release. After the feature release vX.Y.0 is made, the tip
of 'maint' branch is set to that release, and bugfixes will
accumulate on the branch, and at some point, the tip of the
branch is tagged with vX.Y.1, vX.Y.2, and so on.
- 'next' branch is used to publish changes (both enhancements
and fixes) that (1) have a worthwhile goal, (2) are in a fairly
good shape suitable for everyday use, (3) but have not yet been
demonstrated to be regression free. New changes are tested
in 'next' before being merged to 'master'.
- 'pu' branch is used to publish other proposed changes that do
not yet pass the criteria set for 'next'.
- The tips of 'master' and 'maint' branches will not be rewound to
allow people to build their own customization on top of them.
Early in a new development cycle, 'next' is rewound to the tip of
'master' once, but otherwise it will not be rewound until the end
of the cycle.
- Usually 'master' contains all of 'maint' and 'next' contains all
of 'master'. 'pu' contains all the topics merged to 'next', but
is rebuilt directly on 'master'.
- The tip of 'master' is meant to be more stable than any
tagged releases, and the users are encouraged to follow it.
- The 'next' branch is where new action takes place, and the
users are encouraged to test it so that regressions and bugs
are found before new topics are merged to 'master'.
Note that before v1.9.0 release, the version numbers used to be
structured slightly differently. vX.Y.Z were feature releases while
vX.Y.Z.W were maintenance releases for vX.Y.Z.
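The containment rule in the policy ('master' contains all of 'maint', 'next' contains all of 'master') can be checked mechanically with `git merge-base --is-ancestor`. The sketch below builds a toy repository to show the check; the commit names, the `commit` helper and the `contains` helper are hypothetical and not part of the maintainer's tooling.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git branch maint; git branch next; git branch ai/topic
commit feature                                   # work on 'master'
git checkout -q maint;    commit bugfix
git checkout -q master;   git merge -q -m "Merge maint" maint
git checkout -q ai/topic; commit topic
git checkout -q next
git merge -q -m "Merge branch 'ai/topic' to next" ai/topic
git merge -q -m "Merge master to next" master

# Hypothetical helper: succeeds when branch $1 is fully contained in $2.
contains() { git merge-base --is-ancestor "$1" "$2"; }

if contains maint master && contains master next && ! contains next master
then policy=holds
else policy=violated
fi
echo "containment policy: $policy"
```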
A Typical Git Day
-----------------
A typical Git day for the maintainer implements the above policy
by doing the following:
- Scan mailing list. Respond with review comments, suggestions
etc. Kibitz. Collect potentially usable patches from the
mailing list. Patches about a single topic go to one mailbox (I
read my mail in Gnus, and type \C-o to save/append messages in
files in mbox format).
- Write his own patches to address issues raised on the list that
nobody has stepped up to solve. Send them out just like other
contributors do, and pick them up just like patches from other
contributors (see above).
- Review the patches in the saved mailboxes. Edit proposed log
message for typofixes and clarifications, and add Acks
collected from the list. Edit patch to incorporate "Oops,
that should have been like this" fixes from the discussion.
- Classify the collected patches and handle 'master' and
'maint' updates:
- Obviously correct fixes that pertain to the tip of 'maint'
are directly applied to 'maint'.
- Obviously correct fixes that pertain to the tip of 'master'
are directly applied to 'master'.
- Other topics are not handled in this step.
This step is done with "git am".
$ git checkout master ;# or "git checkout maint"
$ git am -sc3 mailbox
$ make test
In practice, almost no patch directly goes to 'master' or
'maint'.
- Review the last issue of "What's cooking" message, review the
topics ready for merging (topic->master and topic->maint). Use
"Meta/cook -w" script (where Meta/ contains a checkout of the
'todo' branch) to aid this step.
And perform the merge. Use "Meta/Reintegrate -e" script (see
later) to aid this step.
$ Meta/cook -w last-issue-of-whats-cooking.mbox
$ git checkout master ;# or "git checkout maint"
$ echo ai/topic | Meta/Reintegrate -e ;# "git merge ai/topic"
$ git log -p ORIG_HEAD.. ;# final review
$ git diff ORIG_HEAD.. ;# final review
$ make test ;# final review
- Handle the remaining patches:
- Anything unobvious that is applicable to 'master' (in other
words, does not depend on anything that is still in 'next'
and not in 'master') is applied to a new topic branch that
is forked from the tip of 'master'. This includes both
enhancements and unobvious fixes to 'master'. A topic
branch is named as ai/topic where "ai" is a two-letter string
named after the author's initials and "topic" is a descriptive name
of the topic (in other words, "what the series is about").
- An unobvious fix meant for 'maint' is applied to a new
topic branch that is forked from the tip of 'maint'. The
topic is named as ai/maint-topic.
- Changes that pertain to an existing topic are applied to
the branch, but:
- obviously correct ones are applied first;
- questionable ones are discarded or applied to near the tip;
- Replacement patches to an existing topic are accepted only
for commits not in 'next'.
The above except the "replacement" are all done with:
$ git checkout ai/topic ;# or "git checkout -b ai/topic master"
$ git am -sc3 mailbox
while patch replacement is often done by:
$ git format-patch ai/topic~$n..ai/topic ;# export existing
then replace some parts with the new patch, and reapplying:
$ git checkout ai/topic
$ git reset --hard ai/topic~$n
$ git am -sc3 -s 000*.txt
The full test suite is always run for 'maint' and 'master'
after patch application; for topic branches the tests are run
as time permits.
- Merge maint to master as needed:
$ git checkout master
$ git merge maint
$ make test
- Merge master to next as needed:
$ git checkout next
$ git merge master
$ make test
- Review the last issue of "What's cooking" again and see if topics
that are ready to be merged to 'next' are still in good shape
(e.g. has any new issue been identified on the list with the
series?)
- Prepare 'jch' branch, which is used to represent somewhere
between 'master' and 'pu' and often is slightly ahead of 'next'.
$ Meta/Reintegrate master..pu >Meta/redo-jch.sh
The result is a script that lists topics to be merged in order to
rebuild 'pu' as the input to Meta/Reintegrate script. Remove
later topics that should not be in 'jch' yet. Add a line that
consists of '### match next' before the name of the first topic
in the output that should be in 'jch' but not in 'next' yet.
- Now we are ready to start merging topics to 'next'. For each
branch whose tip is not merged to 'next', one of three things can
happen:
- The commits are all next-worthy; merge the topic to next;
- The new parts are of mixed quality, but earlier ones are
next-worthy; merge the early parts to next;
- Nothing is next-worthy; do not do anything.
This step is aided with Meta/redo-jch.sh script created earlier.
If a topic that was already in 'next' gained a patch, the script
would list it as "ai/topic~1". To include the new patch to the
updated 'next', drop the "~1" part; to keep it excluded, do not
touch the line. If a topic that was not in 'next' should be
merged to 'next', add it at the end of the list. Then:
$ git checkout -B jch master
$ Meta/redo-jch.sh -c1
to rebuild the 'jch' branch from scratch. "-c1" tells the script
to stop merging at the first line that begins with '###'
(i.e. the "### match next" line you added earlier).
At this point, build-test the result. It may reveal semantic
conflicts (e.g. a topic renamed a variable, another added a new
reference to the variable under its old name), in which case
prepare an appropriate merge-fix first (see appendix), and
rebuild the 'jch' branch from scratch, starting at the tip of
'master'.
Then do the same to 'next'
$ git checkout next
$ sh Meta/redo-jch.sh -c1 -e
The "-e" option allows the merge message that comes from the
history of the topic and the comments in the "What's cooking" to
be edited. The resulting tree should match 'jch' as the same set
of topics are merged on 'master'; otherwise there is a mismerge.
Investigate why and do not proceed until the mismerge is found
and rectified.
$ git diff jch next
When all is well, clean up the redo-jch.sh script with
$ sh Meta/redo-jch.sh -u
This removes topics listed in the script that have already been
merged to 'master'. This may lose '### match next' marker;
add it again to the appropriate place when it happens.
- Rebuild 'pu'.
$ Meta/Reintegrate master..pu >Meta/redo-pu.sh
Edit the result by adding new topics that are not yet in 'pu'
in the script. Then
$ git checkout -B pu jch
$ sh Meta/redo-pu.sh
When all is well, clean up the redo-pu.sh script with
$ sh Meta/redo-pu.sh -u
Double check by running
$ git branch --no-merged pu
to see that there are no unexpected leftover topics.
At this point, build-test the result for semantic conflicts, and
if there are, prepare an appropriate merge-fix first (see
appendix), and rebuild the 'pu' branch from scratch, starting at
the tip of 'jch'.
- Update "What's cooking" message to review the updates to
existing topics, newly added topics and graduated topics.
This step is helped with Meta/cook script.
$ Meta/cook
This script inspects the history between master..pu, finds tips
of topic branches, compares what it found with the current
contents in Meta/whats-cooking.txt, and updates that file.
Topics not listed in the file but are found in master..pu are
added to the "New topics" section, topics listed in the file that
are no longer found in master..pu are moved to the "Graduated to
master" section, and topics whose commits changed their states
(e.g. used to be only in 'pu', now merged to 'next') are updated
with change markers "<<" and ">>".
Look for lines enclosed in "<<" and ">>"; they hold contents from
old file that are replaced by this integration round. After
verifying them, remove the old part. Review the description for
each topic and update its doneness and plan as needed. To review
the updated plan, run
$ Meta/cook -w
which will pick up comments given to the topics, such as "Will
merge to 'next'", etc. (see Meta/cook script to learn what kind
of phrases are supported).
- Compile, test and install all four (five) integration branches;
Meta/Dothem script may aid this step.
- Format documentation if the 'master' branch was updated;
Meta/dodoc.sh script may aid this step.
- Push the integration branches out to public places; Meta/pushall
script may aid this step.
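The patch replacement step in the list above (export the existing commits, rewind the topic, re-apply) can be rehearsed in a scratch repository. This is a sketch under assumptions: the `commit` helper is hypothetical, a `sed` edit stands in for "replace some parts with the new patch", and the files are matched as `000*.patch` because that is the suffix current `git format-patch` writes by default.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q repo && cd repo

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b ai/topic master
commit one; commit two; commit three

n=2
# Export the last $n commits of the topic.
git format-patch -q -o "$tmp/patches" "ai/topic~$n..ai/topic"

# Stand-in for a replacement patch: change what the first exported patch adds.
p=$(ls "$tmp"/patches/0001-*.patch)
sed 's/^+two$/+two, revised/' "$p" >"$p.new" && mv "$p.new" "$p"

# Rewind the topic and re-apply the (partly replaced) series.
git checkout -q ai/topic
git reset -q --hard "ai/topic~$n"
git am -q -3 "$tmp"/patches/000*.patch

series=$(git log --format=%s | paste -sd' ' -)
echo "$series / two.txt: $(cat two.txt)"
```

The topic keeps the same four subjects, but the rewritten commit now carries the revised content.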
Observations
------------
Some observations to be made.
* Each topic is tested individually, and also together with other
topics cooking first in 'pu', then in 'jch' and then in 'next'.
Until it matures, no part of it is merged to 'master'.
* A topic already in 'next' can get fixes while still in
'next'. Such a topic will have many merges to 'next' (in
other words, "git log --first-parent next" will show many
"Merge branch 'ai/topic' to next" for the same topic).
* An unobvious fix for 'maint' is cooked in 'next' and then
merged to 'master' to make extra sure it is Ok and then
merged to 'maint'.
* Even when 'next' becomes empty (in other words, all topics
prove stable and are merged to 'master' and "git diff master
next" shows empty), it has tons of merge commits that will
never be in 'master'.
* In principle, "git log --first-parent master..next" should
show nothing but merges (in practice, there are fixup commits
and reverts that are not merges).
* Commits near the tip of a topic branch that are not in 'next'
are fair game to be discarded, replaced or rewritten.
Commits already merged to 'next' will not be.
* Being in the 'next' branch is not a guarantee for a topic to
be included in the next feature release. Being in the
'master' branch typically is.
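The observation that 'next' keeps merge commits that never reach 'master', even when the two trees are identical, can be seen in a small experiment. The repository below is a toy built for illustration only; the `commit` helper is hypothetical.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git branch next
git checkout -q -b ai/topic master
commit feature

# The topic cooks in 'next' first, then graduates to 'master'.
git checkout -q next
git merge -q --no-ff -m "Merge branch 'ai/topic' to next" ai/topic
git checkout -q master
git merge -q --no-ff -m "Merge branch 'ai/topic'" ai/topic

# Same content on both branches, yet 'next' has a merge 'master' never gets.
if git diff --quiet master next; then same_tree=yes; else same_tree=no; fi
only_in_next=$(git log --first-parent --format=%s master..next)
echo "same tree: $same_tree; only in next: $only_in_next"
```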
Appendix
--------
Preparing a "merge-fix"
~~~~~~~~~~~~~~~~~~~~~~~
A merge of two topics may not textually conflict but still have
conflict at the semantic level. A classic example is for one topic
to rename a variable and all its uses, while another topic adds a
new use of the variable under its old name. When these two topics
are merged together, the reference to the variable newly added by
the latter topic will still use the old name in the result.
The Meta/Reintegrate script that is used by redo-jch and redo-pu
scripts implements a crude but usable way to work around this issue.
When the script merges branch $X, it checks if "refs/merge-fix/$X"
exists, and if so, the effect of it is squashed into the result of
the mechanical merge. In other words,
$ echo $X | Meta/Reintegrate
is roughly equivalent to this sequence:
$ git merge --rerere-autoupdate $X
$ git commit
$ git cherry-pick -n refs/merge-fix/$X
$ git commit --amend
The goal of this "prepare a merge-fix" step is to come up with a
commit that can be squashed into a result of mechanical merge to
correct semantic conflicts.
After finding that the result of merging branch "ai/topic" to an
integration branch had such a semantic conflict, say pu~4, check the
problematic merge out on a detached HEAD, edit the working tree to
fix the semantic conflict, and make a separate commit to record the
fix-up:
$ git checkout pu~4
$ git show -s --pretty=%s ;# double check
Merge branch 'ai/topic' to pu
$ edit
$ git commit -m 'merge-fix/ai/topic' -a
Then make a reference "refs/merge-fix/ai/topic" to point at this
result:
$ git update-ref refs/merge-fix/ai/topic HEAD
Then double check the result by asking Meta/Reintegrate to redo the
merge:
$ git checkout pu~5 ;# the parent of the problem merge
$ echo ai/topic | Meta/Reintegrate
$ git diff pu~4
This time, because you prepared refs/merge-fix/ai/topic, the
resulting merge should have been tweaked to include the fix for the
semantic conflict.
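The merge-fix cycle can be tried end to end without the Meta/ scripts by spelling out the "roughly equivalent" sequence by hand. This is a sketch under assumptions: the branch `ai/rename`, the files `def.txt` and `use.txt`, and their contents are invented to produce a semantic conflict, and plain `git merge` followed by `git cherry-pick -n` and `git commit --amend` stands in for Meta/Reintegrate.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

echo 'old_name' >def.txt; git add def.txt; git commit -q -m base

# One topic renames the "variable"; another adds a use of the old name.
git checkout -q -b ai/rename master
echo 'new_name' >def.txt; git commit -q -am "rename old_name to new_name"
git checkout -q -b ai/topic master
echo 'use old_name' >use.txt; git add use.txt
git commit -q -m "add a use of old_name"

# Both merge cleanly, but use.txt still refers to old_name.
git checkout -q -b pu master
git merge -q --no-ff -m "Merge branch 'ai/rename' to pu" ai/rename
git merge -q --no-ff -m "Merge branch 'ai/topic' to pu" ai/topic

# Record the fix-up on a detached HEAD at the problematic merge.
git checkout -q --detach pu
echo 'use new_name' >use.txt
git commit -q -am 'merge-fix/ai/topic'
git update-ref refs/merge-fix/ai/topic HEAD

# Redo the merge from its first parent and squash the merge-fix into it.
git checkout -q --detach pu~1
git merge -q --no-ff -m "Merge branch 'ai/topic' to pu" ai/topic
git cherry-pick -n refs/merge-fix/ai/topic
git commit -q --amend --no-edit

fixed=$(cat use.txt)
echo "use.txt after the redone merge: $fixed"
```

The redone merge still has two parents, and its tree matches the commit that `refs/merge-fix/ai/topic` points at.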
Note that this assumes that the order in which conflicting branches
are merged does not change. If the reason why merging ai/topic
branch needs this merge-fix is because another branch merged earlier
to the integration branch changed the underlying assumption ai/topic
branch made (e.g. ai/topic branch added a site to refer to a
variable, while the other branch renamed that variable and adjusted
existing use sites), and if you changed redo-jch (or redo-pu) script
to merge ai/topic branch before the other branch, then the above
merge-fix should not be applied while merging ai/topic, but should
instead be applied while merging the other branch. You would need
to move the fix to apply to the other branch, perhaps like this:
$ mf=refs/merge-fix
$ git update-ref $mf/$the_other_branch $mf/ai/topic
$ git update-ref -d $mf/ai/topic


@ -0,0 +1,106 @@
From: Eric S. Raymond <esr@thyrsus.com>
Abstract: This is how-to documentation for people who want to add extension
commands to Git. It should be read alongside api-builtin.txt.
Content-type: text/asciidoc
How to integrate new subcommands
================================
This is how-to documentation for people who want to add extension
commands to Git. It should be read alongside api-builtin.txt.
Runtime environment
-------------------
Git subcommands are standalone executables that live in the Git exec
path, normally /usr/lib/git-core. The git executable itself is a
thin wrapper that knows where the subcommands live, and runs them by
passing command-line arguments to them.
(If "git foo" is not found in the Git exec path, the wrapper
will look in the rest of your $PATH for it. Thus, it's possible
to write local Git extensions that don't live in system space.)
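The $PATH fallback is easy to see with a throwaway script. In this sketch the command name `git-demo-hello` and its output are made up for illustration; nothing of that name ships with Git.

```shell
set -e
tmp=$(mktemp -d)

# A local extension: any executable named git-<name> found on $PATH.
cat >"$tmp/git-demo-hello" <<'EOF'
#!/bin/sh
echo "hello from an extension, args: $*"
EOF
chmod +x "$tmp/git-demo-hello"

# Not in the Git exec path, so the wrapper finds it through $PATH.
PATH="$tmp:$PATH"
out=$(git demo-hello one two)
echo "$out"
```

The wrapper passes the remaining command-line arguments through, so this prints `hello from an extension, args: one two`.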
Implementation languages
------------------------
Most subcommands are written in C or shell. A few are written in
Perl.
While we strongly encourage coding in portable C,
these specific scripting languages are also acceptable. We won't
accept more without a very strong technical case, as we don't want
to broaden the Git suite's required dependencies. Import utilities,
surgical tools, remote helpers and other code at the edges of the
Git suite are more lenient and we allow Python (and even Tcl/tk),
but they should not be used for core functions.
This may change in the future. Python in particular is not allowed in
core because we need better Python integration in the Git Windows
installer before we can be confident people in that environment
won't experience an unacceptably large loss of capability.
C commands are normally written as single modules, named after the
command, that link a collection of functions called libgit. Thus,
your command 'git-foo' would normally be implemented as a single
"git-foo.c" (or "builtin/foo.c" if it is to be linked to the main
binary); this organization makes it easy for people reading the code
to find things.
See the CodingGuidelines document for other guidance on what we consider
good practice in C and shell, and api-builtin.txt for the support
functions available to built-in commands written in C.
What every extension command needs
----------------------------------
You must have a man page, written in asciidoc (this is what Git help
followed by your subcommand name will display). Be aware that there is
a local asciidoc configuration and macros which you should use. It's
often helpful to start by cloning an existing page and replacing the
text content.
You must have a test, written to report in TAP (Test Anything Protocol).
Tests are executables (usually shell scripts) that live in the 't'
subdirectory of the tree. Each test name begins with 't' and a sequence
number that controls where in the test sequence it will be executed;
conventionally the rest of the name stem is that of the command
being tested.
Read the file t/README to learn more about the conventions to be used
in writing tests, and the test support library.
Integrating a command
---------------------
Here are the things you need to do when you want to merge a new
subcommand into the Git tree.
1. Don't forget to sign off your patch!
2. Append your command name to one of the variables BUILTIN_OBJS,
EXTRA_PROGRAMS, SCRIPT_SH, SCRIPT_PERL or SCRIPT_PYTHON.
3. Drop its test in the t directory.
4. If your command is implemented in an interpreted language with a
p-code intermediate form, make sure .gitignore in the main directory
includes a pattern entry that ignores such files. Python .pyc and
.pyo files will already be covered.
5. If your command has any dependency on a particular version of
your language, document it in the INSTALL file.
6. There is a file command-list.txt in the distribution main directory
that categorizes commands by type, so they can be listed in appropriate
subsections in the documentation's summary command list. Add an entry
for yours. To understand the categories, look at command-list.txt
in the main directory. If the new command is part of the typical Git
workflow and you believe it common enough to be mentioned in 'git help',
map this command to a common group in the column [common].
7. Give the maintainer one paragraph to include in the RelNotes file
to describe the new feature; a good place to do so is in the cover
letter [PATCH 0/n].
That's all there is to it.


@ -0,0 +1,164 @@
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Petr Baudis <pasky@suse.cz>, Linus Torvalds <torvalds@osdl.org>
Subject: Re: sending changesets from the middle of a git tree
Date: Sun, 14 Aug 2005 18:37:39 -0700
Abstract: In this article, JC talks about how he rebases the
public "pu" branch using the core Git tools when he updates
the "master" branch, and how "rebase" works. Also discussed
is how this applies to individual developers who send patches
upstream.
Content-type: text/asciidoc
How to rebase from an internal branch
=====================================
--------------------------------------
Petr Baudis <pasky@suse.cz> writes:
> Dear diary, on Sun, Aug 14, 2005 at 09:57:13AM CEST, I got a letter
> where Junio C Hamano <junkio@cox.net> told me that...
>> Linus Torvalds <torvalds@osdl.org> writes:
>>
>> > Junio, maybe you want to talk about how you move patches from your "pu"
>> > branch to the real branches.
>>
> Actually, wouldn't this be also precisely for what StGIT is intended to?
--------------------------------------
Exactly my feeling. I was sort of waiting for Catalin to speak
up. With its philosophical ancestry in quilt, this is
the kind of task StGIT is designed to do.
I just have done a simpler one, this time using only the core
Git tools.
I had a handful of commits that were ahead of master in pu, and I
wanted to add some documentation bypassing my usual habit of
placing new things in pu first. At the beginning, the commit
ancestry graph looked like this:
                             *"pu" head
    master --> #1 --> #2 --> #3
So I started from master, made a bunch of edits, and committed:
$ git checkout master
$ cd Documentation; ed git.txt ...
$ cd ..; git add Documentation/*.txt
$ git commit -s
After the commit, the ancestry graph would look like this:
                              *"pu" head
    master^ --> #1 --> #2 --> #3
          \
            \---> master
The old master is now master^ (the first parent of the master).
The new master commit holds my documentation updates.
Now I have to deal with "pu" branch.
This is the kind of situation I used to have all the time when
Linus was the maintainer and I was a contributor, when you look
at "master" branch being the "maintainer" branch, and "pu"
branch being the "contributor" branch. Your work started at the
tip of the "maintainer" branch some time ago, you made a lot of
progress in the meantime, and now the maintainer branch has some
other commits you do not have yet. And "git rebase" was written
with the explicit purpose of helping to maintain branches like
"pu". You _could_ merge master to pu and keep going, but if you
eventually want to cherrypick and merge some but not necessarily
all changes back to the master branch, it often makes later
operations for _you_ easier if you rebase (i.e. carry forward
your changes) "pu" rather than merge. So I ran "git rebase":
$ git checkout pu
$ git rebase master pu
What this does is to pick all the commits since the current
branch (note that I now am on "pu" branch) forked from the
master branch, and forward port these changes.
    master^ --> #1 --> #2 --> #3
          \                                  *"pu" head
            \---> master --> #1' --> #2' --> #3'
The diff between master^ and #1 is applied to master and
committed to create #1' commit with the commit information (log,
author and date) taken from commit #1. On top of that #2' and #3'
commits are made similarly out of #2 and #3 commits.
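The forward port can be replayed in a scratch repository. This is a sketch with made-up content: the commit subjects `one`, `two`, `three` and `docs` stand for #1, #2, #3 and the documentation update, and `commit` is a hypothetical helper.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q .

# Hypothetical helper: one commit that adds a file named after its subject.
commit() { echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b pu master
commit one; commit two; commit three     # #1, #2 and #3 on "pu"
git checkout -q master
commit docs                              # the new master commit

old_tip=$(git rev-parse pu)
git rebase -q master pu                  # checks out "pu" and replays it

ported=$(git log --format=%s master..pu | paste -sd' ' -)
echo "on top of master: $ported"
```

After the rebase, `master..pu` lists `three two one`, and the tip of "pu" is a new commit object rather than the old #3.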
Old #3 is not recorded in any of the .git/refs/heads/ files
anymore, so after doing this you will have dangling commits if
you run fsck-cache, which is normal. After testing "pu", you
can run "git prune" to get rid of those original three commits.
While I am talking about "git rebase", I should talk about how
to do cherrypicking using only the core Git tools.
Let's go back to the earlier picture, with different labels.
You, as an individual developer, cloned upstream repository and
made a couple of commits on top of it.
                               *your "master" head
    upstream --> #1 --> #2 --> #3
You would want changes #2 and #3 incorporated in the upstream,
while you feel that #1 may need further improvements. So you
prepare #2 and #3 for e-mail submission.
$ git format-patch master^^ master
This creates two files, 0001-XXXX.patch and 0002-XXXX.patch. Send
them out "To: " your project maintainer and "Cc: " your mailing
list. You could use contributed script git-send-email if
your host has necessary perl modules for this, but your usual
MUA would do as long as it does not corrupt whitespace in the
patch.
Then you would wait, and you find out that the upstream picked
up your changes, along with other changes.
     where                     *your "master" head
    upstream --> #1 --> #2 --> #3
      used   \
     to be     \--> #A --> #2' --> #3' --> #B --> #C
                                                  *upstream head
The two commits #2' and #3' in the above picture record the same
changes your e-mail submission for #2 and #3 contained, but
probably with the new sign-off line added by the upstream
maintainer and definitely with different committer and ancestry
information, they are different objects from #2 and #3 commits.
You fetch from upstream, but do not merge.
$ git fetch upstream
This leaves the updated upstream head in .git/FETCH_HEAD but
does not touch your .git/HEAD or .git/refs/heads/master.
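A small experiment shows the same effect. It is a sketch under assumptions: the repositories `upstream` and `mine` are scratch directories, and the fetch names the other repository by path rather than through an "upstream" shorthand.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo base >upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m base

git clone -q upstream mine

# Upstream moves ahead after you cloned
echo more >>upstream/file.txt
git -C upstream commit -q -a -m "upstream change"

cd mine
before=$(git rev-parse master)
git fetch -q ../upstream master
after=$(git rev-parse master)
fetched=$(git rev-parse FETCH_HEAD)

echo "master before fetch: $before"
echo "master after fetch:  $after"
echo "FETCH_HEAD:          $fetched"
```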
You run "git rebase" now.
$ git rebase FETCH_HEAD master
Earlier, I said that rebase applies all the commits from your
branch on top of the upstream head. Well, I lied. "git rebase"
is a bit smarter than that and notices that #2 and #3 need not
be applied, so it only applies #1. The commit ancestry graph
becomes something like this:
   where                     *your old "master" head
  upstream --> #1 --> #2 --> #3
    used  \                      your new "master" head*
   to be   \--> #A --> #2' --> #3' --> #B --> #C --> #1'
                                              *upstream
                                              head
Again, "git prune" would discard the disused commits #1-#3 and
you continue on starting from the new "master" head, which is
the #1' commit.
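The skipping behaviour can be reproduced in a scratch repository. In this sketch a local branch named `upstream` stands in for FETCH_HEAD, and the `commit_file` helper and file names are invented for the example; each commit touches its own file so that nothing conflicts.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit that adds one file
commit_file () {
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
}

commit_file base
git branch upstream          # where upstream used to be
commit_file one              # your #1
commit_file two              # your #2
commit_file three            # your #3

# Upstream adds #A, applies your #2 and #3, then adds #B
git checkout -q upstream
commit_file A
git cherry-pick master^ master >/dev/null
commit_file B

# Rebase your master onto the updated upstream head
git rebase -q upstream master
replayed=$(git rev-list --count upstream..master)
echo "commits replayed on top of upstream: $replayed"
```

Only #1 is replayed, because the changes in #2 and #3 are recognized as already present upstream.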
-jc

View file

@ -0,0 +1,90 @@
Subject: [HOWTO] Using post-update hook
Message-ID: <7vy86o6usx.fsf@assigned-by-dhcp.cox.net>
From: Junio C Hamano <gitster@pobox.com>
Date: Fri, 26 Aug 2005 18:19:10 -0700
Abstract: In this how-to article, JC talks about how he
uses the post-update hook to automate Git documentation page
shown at https://www.kernel.org/pub/software/scm/git/docs/.
Content-type: text/asciidoc
How to rebuild from update hook
===============================
The pages under https://www.kernel.org/pub/software/scm/git/docs/
are built from Documentation/ directory of the git.git project
and needed to be kept up-to-date. The www.kernel.org/ servers
are mirrored and I was told that the origin of the mirror is on
the machine $some.kernel.org, on which I was given an account
when I took over Git maintainership from Linus.
The directories relevant to this how-to are these two:
/pub/scm/git/git.git/          The public Git repository.
/pub/software/scm/git/docs/    The HTML documentation page.
So I made a repository to generate the documentation under my
home directory over there.
$ cd
$ mkdir doc-git && cd doc-git
$ git clone /pub/scm/git/git.git/ docgen
What needs to happen is to update the $HOME/doc-git/docgen/
working tree, build HTML docs there and install the result in
/pub/software/scm/git/docs/ directory. So I wrote a little
script:
$ cat >dododoc.sh <<\EOF
#!/bin/sh
cd $HOME/doc-git/docgen || exit
unset GIT_DIR
git pull /pub/scm/git/git.git/ master &&
cd Documentation &&
make install-webdoc
EOF
Initially I ran this by hand whenever I pushed into the
public Git repository. Then I set up a cron job that ran twice a
day. The current round uses the post-update hook mechanism,
like this:
$ cat >/pub/scm/git/git.git/hooks/post-update <<\EOF
#!/bin/sh
#
# An example hook script to prepare a packed repository for use over
# dumb transports.
#
# To enable this hook, make this file executable by "chmod +x post-update".
case " $* " in
*' refs/heads/master '*)
echo $HOME/doc-git/dododoc.sh | at now
;;
esac
exec git update-server-info
EOF
$ chmod +x /pub/scm/git/git.git/hooks/post-update
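The `case` test in the hook relies on padding the argument list with spaces so that a whole ref name must match. A minimal sketch, with a made-up helper name `touches_master`, shows which argument lists trigger the rebuild:

```shell
# Hypothetical helper mirroring the hook's "case" test: succeeds only
# when refs/heads/master is among the updated refs passed as arguments.
touches_master () {
	case " $* " in
	*' refs/heads/master '*)
		return 0
		;;
	esac
	return 1
}

touches_master refs/heads/pu refs/heads/master && echo "rebuild docs"
touches_master refs/heads/master-next || echo "skip: only master-next changed"
```

Without the surrounding spaces in the pattern, a ref such as refs/heads/master-next would also match.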
There are four things worth mentioning:
- The post-update hook is run after the repository accepts a "git
push", under my user privilege. It is given the full names
of refs that have been updated as arguments. My post-update
runs the dododoc.sh script only when the master head is
updated.
- When the post-update hook is run, GIT_DIR is set to '.' by the calling
receive-pack. This is inherited by the dododoc.sh run via
the "at" command, and needs to be unset; otherwise, the "git
pull" it runs in the $HOME/doc-git/docgen/ repository would not
work correctly.
- The stdout of the post-update hook script is not connected to git
push; I run the heavy part of the command inside "at", to
receive the execution report via e-mail.
- This is still crude and does not protect against simultaneous
make invocations stomping on each other. I would need to add
some locking mechanism for this.
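One possible shape for such a lock, offered as a sketch and not as what the script actually grew into: use `mkdir`, which either creates the directory or fails, as the mutual-exclusion step. The `run_locked` helper, the `DODODOC_LOCK` variable and the lock path are all assumptions made for this example.

```shell
# Hypothetical lock location; override with DODODOC_LOCK if desired.
lockdir=${DODODOC_LOCK:-${TMPDIR:-/tmp}/dododoc.lock}

# Run the given command only if nobody else holds the lock.
run_locked () {
	# mkdir is atomic: only one caller can create the directory
	if ! mkdir "$lockdir" 2>/dev/null
	then
		echo "another build is running; skipping" >&2
		return 1
	fi
	"$@"
	status=$?
	rmdir "$lockdir"
	return $status
}

run_locked echo "building docs"
```

In dododoc.sh the command passed to `run_locked` would be the pull-and-make sequence. A build that is skipped this way is not retried, so a push that arrives during a build would need a later push (or a retry loop) to be picked up.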

View file

@ -0,0 +1,144 @@
Date: Fri, 9 Nov 2007 08:28:38 -0800 (PST)
From: Linus Torvalds <torvalds@linux-foundation.org>
Subject: corrupt object on git-gc
Abstract: Some tricks to reconstruct blob objects in order to fix
a corrupted repository.
Content-type: text/asciidoc
How to recover a corrupted blob object
======================================
-----------------------------------------------------------
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
>
> Did not help still the repository look for this object?
> Any one know how can I track this object and understand which file is it
-----------------------------------------------------------
So exactly *because* the SHA-1 hash is cryptographically secure, the hash
itself doesn't actually tell you anything. In order to fix a corrupt
object you basically have to find the "original source" for it.
The easiest way to do that is almost always to have backups, and find the
same object somewhere else. Backups really are a good idea, and Git makes
it pretty easy (if nothing else, just clone the repository somewhere else,
and make sure that you do *not* use a hard-linked clone, and preferably
not the same disk/machine).
But since you don't seem to have backups right now, the good news is that
especially with a single blob being corrupt, these things *are* somewhat
debuggable.
First off, move the corrupt object away, and *save* it. The most common
cause of corruption so far has been memory corruption, but even so, there
are people who would be interested in seeing the corruption - but it's
basically impossible to judge the corruption until we can also see the
original object, so right now the corrupt object is useless, but it's very
interesting for the future, in the hope that you can re-create a
non-corrupt version.
-----------------------------------------------------------
So:
> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
-----------------------------------------------------------
This is the right thing to do, although it's usually best to save it under
its full SHA-1 name (you just dropped the "4b" from the result ;).
Let's see what that tells us:
-----------------------------------------------------------
> ib]$ git-fsck --full
> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
> to blob 4b9458b3786228369c63936db65827de3cc06200
> missing blob 4b9458b3786228369c63936db65827de3cc06200
-----------------------------------------------------------
Ok, I removed the "dangling commit" messages, because they are just
messages about the fact that you probably have rebased etc, so they're not
at all interesting. But what remains is still very useful. In particular,
we now know which tree points to it!
Now you can do
git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
which will show something like
100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS
040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation
100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild
100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS
...
and you should now have a line that looks like
100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
in the output. This already tells you a *lot*: it tells you what file the
corrupt blob came from!
Now, it doesn't tell you quite enough, though: it doesn't tell what
*version* of the file didn't get correctly written! You might be really
lucky, and it may be the version that you already have checked out in your
working tree, in which case fixing this problem is really simple, just do
git hash-object -w my-magic-file
again, and if it outputs the missing SHA-1 (4b945..) you're now all done!
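The lucky case is easy to rehearse in a scratch repository. This is a sketch with invented contents: the damage is simulated by moving the loose object away (under its full name), and the working tree copy is then used to write it back.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo && cd repo
echo "precious contents" >my-magic-file
git add my-magic-file
git commit -q -m "add my-magic-file"

# Simulate the damage: move the loose blob away, keeping its full name
blob=$(git rev-parse HEAD:my-magic-file)
dir=$(echo "$blob" | cut -c1-2)
rest=$(echo "$blob" | cut -c3-)
mv ".git/objects/$dir/$rest" "../$blob"

# fsck now complains about the missing blob
git fsck --full 2>&1 | grep "missing blob" || true

# The working tree still holds that version, so write it back
restored=$(git hash-object -w my-magic-file)
echo "restored $restored"
git fsck --full
```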
But that's the really lucky case, so let's assume that it was some older
version that was broken. How do you tell which version it was?
The easiest way to do it is to do
git log --raw --all --full-history -- somedirectory/my-magic-file
and that will show you the whole log for that file (please realize that
the tree you had may not be the top-level tree, so you need to figure out
which subdirectory it was in on your own), and because you're asking for
raw output, you'll now get something like
commit abc
Author:
Date:
..
:100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
commit xyz
Author:
Date:
..
:100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
and this actually tells you what the *previous* and *subsequent* versions
of that file were! So now you can look at those ("oldsha" and "newsha"
respectively), and hopefully you have done commits often, and can
re-create the missing my-magic-file version by looking at those older and
newer versions!
If you can do that, you can now recreate the missing object with
git hash-object -w <recreated-file>
and your repository is good again!
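To see what the raw log search looks like in practice, here is a sketch on a scratch repository with three invented versions of the file. It adds `--no-abbrev` so that full object names are printed and can be matched exactly; the middle version plays the role of the missing blob.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo && cd repo

for v in one two three
do
	echo "version $v" >my-magic-file
	git add my-magic-file
	git commit -q -m "version $v"
done

# Pretend the middle version is the blob that went missing
missing=$(git rev-parse HEAD^:my-magic-file)

# Raw lines start with ":"; the blob shows up once as the new side
# (the commit that introduced it) and once as the old side (the
# commit that replaced it).
git log --raw --no-abbrev --all --full-history -- my-magic-file |
	grep "^:" | grep "$missing"
hits=$(git log --raw --no-abbrev --all --full-history -- my-magic-file |
	grep "^:" | grep -c "$missing")
echo "raw lines mentioning the blob: $hits"
```

The other object name on each of those two lines is the neighbouring version to look at.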
(Btw, you could have ignored the fsck, and started with doing a
git log --raw --all
and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - Git does *have* a lot of information, it is
just missing one particular blob version.)
Trying to recreate trees and especially commits is *much* harder. So you
were lucky that it's a blob. It's quite possible that you can recreate the
thing.
Linus

View file

@ -0,0 +1,479 @@
Date: Wed, 16 Oct 2013 04:34:01 -0400
From: Jeff King <peff@peff.net>
Subject: pack corruption post-mortem
Abstract: Recovering a corrupted object when no good copy is available.
Content-type: text/asciidoc
How to recover an object from scratch
=====================================
I was recently presented with a repository with a corrupted packfile,
and was asked if the data was recoverable. This post-mortem describes
the steps I took to investigate and fix the problem. I thought others
might find the process interesting, and it might help somebody in the
same situation.
********************************
Note: In this case, no good copy of the repository was available. For
the much easier case where you can get the corrupted object from
elsewhere, see link:recover-corrupted-blob-object.html[this howto].
********************************
I started with an fsck, which found a problem with exactly one object
(I've used $pack and $obj below to keep the output readable, and also
because I'll refer to them later):
-----------
$ git fsck
error: $pack SHA1 checksum mismatch
error: index CRC mismatch for object $obj from $pack at offset 51653873
error: inflate: data stream error (incorrect data check)
error: cannot unpack $obj from $pack at offset 51653873
-----------
The pack checksum failing means a byte is munged somewhere, and it is
presumably in the object mentioned (since both the index checksum and
zlib were failing).
Reading the zlib source code, I found that "incorrect data check" means
that the adler-32 checksum at the end of the zlib data did not match the
inflated data. So stepping the data through zlib would not help, as it
did not fail until the very end, when we realize the checksum does not match.
The problematic bytes could be anywhere in the object data.
The first thing I did was pull the broken data out of the packfile. I
needed to know how big the object was, which I found out with:
------------
$ git show-index <$idx | cut -d' ' -f1 | sort -n | grep -A1 51653873
51653873
51664736
------------
Show-index gives us the list of objects and their offsets. We throw away
everything but the offsets, and then sort them so that our interesting
offset (which we got from the fsck output above) is followed immediately
by the offset of the next object. Now we know that the object data is
10863 bytes long, and we can grab it with:
------------
dd if=$pack of=object bs=1 skip=51653873 count=10863
------------
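The byte count is just the difference between the two offsets. As a sketch, with variable names chosen for this example:

```shell
start=51653873   # offset of the broken object, from the fsck output
next=51664736    # offset of the following object, from show-index
size=$((next - start))
echo "object occupies $size bytes"
echo "dd if=\$pack of=object bs=1 skip=$start count=$size"
```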
I inspected a hexdump of the data, looking for any obvious bogosity
(e.g., a 4K run of zeroes would be a good sign of filesystem
corruption). But everything looked pretty reasonable.
Note that the "object" file isn't fit for feeding straight to zlib; it
has the git packed object header, which is variable-length. We want to
strip that off so we can start playing with the zlib data directly. You
can either work your way through it manually (the format is described in
link:../technical/pack-format.html[Documentation/technical/pack-format.txt]),
or you can walk through it in a debugger. I did the latter, creating a
valid pack like:
------------
# pack magic and version
printf 'PACK\0\0\0\2' >tmp.pack
# pack has one object
printf '\0\0\0\1' >>tmp.pack
# now add our object data
cat object >>tmp.pack
# and then append the pack trailer
/path/to/git.git/t/helper/test-tool sha1 -b <tmp.pack >trailer
cat trailer >>tmp.pack
------------
and then running "git index-pack tmp.pack" in the debugger (stop at
unpack_raw_entry). Doing this, I found that there were 3 bytes of header
(and the header itself had a sane type and size). So I stripped those
off with:
------------
dd if=object of=zlib bs=1 skip=3
------------
I ran the result through zlib's inflate using a custom C program. And
while it did report the error, I did get the right number of output
bytes (i.e., it matched git's size header that we decoded above). But
feeding the result back to "git hash-object" didn't produce the same
sha1. So there were some wrong bytes, but I didn't know which. The file
happened to be C source code, so I hoped I could notice something
obviously wrong with it, but I didn't. I even got it to compile!
I also tried comparing it to other versions of the same path in the
repository, hoping that there would be some part of the diff that didn't
make sense. Unfortunately, this happened to be the only revision of this
particular file in the repository, so I had nothing to compare against.
So I took a different approach. Working under the guess that the
corruption was limited to a single byte, I wrote a program to munge each
byte individually, and try inflating the result. Since the object was
only 10K compressed, that worked out to about 2.5M attempts, which took
a few minutes.
The program I used is here:
----------------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
#include <zlib.h>

static int try_zlib(unsigned char *buf, int len)
{
	/* make this absurdly large so we don't have to loop */
	static unsigned char out[1024*1024];
	z_stream z;
	int ret;

	memset(&z, 0, sizeof(z));
	inflateInit(&z);

	z.next_in = buf;
	z.avail_in = len;
	z.next_out = out;
	z.avail_out = sizeof(out);

	ret = inflate(&z, 0);
	inflateEnd(&z);
	return ret >= 0;
}

/* eye candy */
static int counter = 0;
static void progress(int sig)
{
	fprintf(stderr, "\r%d", counter);
	alarm(1);
}

int main(void)
{
	/* oversized so we can read the whole buffer in */
	unsigned char buf[1024*1024];
	int len;
	unsigned i, j;

	signal(SIGALRM, progress);
	alarm(1);

	len = read(0, buf, sizeof(buf));
	for (i = 0; i < len; i++) {
		unsigned char c = buf[i];
		for (j = 0; j <= 0xff; j++) {
			buf[i] = j;

			counter++;
			if (try_zlib(buf, len))
				printf("i=%d, j=%x\n", i, j);
		}
		buf[i] = c;
	}

	alarm(0);
	fprintf(stderr, "\n");
	return 0;
}
----------------------------------------------
I compiled and ran with:
-------
gcc -Wall -Werror -O3 munge.c -o munge -lz
./munge <zlib
-------
There were a few false positives early on (if you write "no data" in the
zlib header, zlib thinks it's just fine :) ). But I got a hit about
halfway through:
-------
i=5642, j=c7
-------
I let it run to completion, and got a few more hits at the end (where it
was munging the trailing checksum to match our broken data). So there was a good
chance this middle hit was the source of the problem.
I confirmed by tweaking the byte in a hex editor, zlib inflating the
result (no errors!), and then piping the output into "git hash-object",
which reported the sha1 of the broken object. Success!
I fixed the packfile itself with:
-------
chmod +w $pack
printf '\xc7' | dd of=$pack bs=1 seek=51659518 conv=notrunc
chmod -w $pack
-------
The `\xc7` comes from the replacement byte our "munge" program found.
The offset 51659518 is derived by taking the original object offset
(51653873), adding the replacement offset found by "munge" (5642), and
then adding back in the 3 bytes of git header we stripped.
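The offset arithmetic and the in-place overwrite can both be checked safely before touching a real pack. This sketch does the sum with invented variable names and then rehearses the same `dd` technique on a scratch file. It writes the byte as the octal escape `\307` (the same value as 0xc7), on the assumption that not every shell's `printf` understands `\x` escapes.

```shell
object_offset=51653873   # where the object starts in the pack
munge_offset=5642        # "i" reported by munge, relative to the zlib data
header_len=3             # packed object header bytes stripped earlier
pack_offset=$((object_offset + munge_offset + header_len))
echo "byte to fix is at pack offset $pack_offset"

# Rehearsal on a scratch file: overwrite only the byte at offset 2
scratch=$(mktemp)
printf 'abcdef' >"$scratch"
printf '\307' | dd of="$scratch" bs=1 seek=2 conv=notrunc 2>/dev/null
od -An -tx1 "$scratch"
```

The `conv=notrunc` part matters: without it, `dd` would truncate the file right after the byte it wrote.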
After that, "git fsck" ran clean.
As for the corruption itself, I was lucky that it was indeed a single
byte. In fact, it turned out to be a single bit. The byte 0xc7 was
corrupted to 0xc5. So presumably it was caused by faulty hardware, or a
cosmic ray.
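That the two values differ in exactly one bit can be confirmed with shell arithmetic. A quick sketch:

```shell
# XOR leaves a 1 in every bit position where the two bytes differ
diff_bits=$((0xc7 ^ 0xc5))
printf 'differing bits: 0x%02x\n' "$diff_bits"

# A value with exactly one bit set has no bits in common with itself minus one
single_bit=$((diff_bits & (diff_bits - 1)))
[ "$single_bit" -eq 0 ] && echo "exactly one bit was flipped"
```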
And the aborted attempt to look at the inflated output to see what was
wrong? I could have looked forever and never found it. Here's the diff
between what the corrupted data inflates to, versus the real data:
--------------
- cp = strtok (arg, "+");
+ cp = strtok (arg, ".");
--------------
It tweaked one byte and still ended up as valid, readable C that just
happened to do something totally different! One takeaway is that on a
less unlucky day, looking at the zlib output might have actually been
helpful, as most random changes would actually break the C code.
But more importantly, git's hashing and checksumming noticed a problem
that easily could have gone undetected in another system. The result
still compiled, but would have caused an interesting bug (that would
have been blamed on some random commit).
The adventure continues...
--------------------------
I ended up doing this again! Same entity, new hardware. The assumption
at this point is that the old disk corrupted the packfile, and then the
corruption was migrated to the new hardware (because it was done by
rsync or similar, and no fsck was done at the time of migration).
This time, the affected blob was over 20 megabytes, which was far too
large to do a brute-force on. I followed the instructions above to
create the `zlib` file. I then used the `inflate` program below to pull
the corrupted data from that. Examining that output gave me a hint about
where in the file the corruption was. But now I was working with the
file itself, not the zlib contents. So knowing the sha1 of the object
and the approximate area of the corruption, I used the `sha1-munge`
program below to brute-force the correct byte.
Here's the inflate program (it's essentially `gunzip` but without the
`.gz` header processing):
--------------------------
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <zlib.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	/*
	 * oversized so we can read the whole buffer in;
	 * this could actually be switched to streaming
	 * to avoid any memory limitations
	 */
	static unsigned char buf[25 * 1024 * 1024];
	static unsigned char out[25 * 1024 * 1024];
	int len;
	z_stream z;
	int ret;

	/* read() is declared in <unistd.h> */
	len = read(0, buf, sizeof(buf));
	memset(&z, 0, sizeof(z));
	inflateInit(&z);

	z.next_in = buf;
	z.avail_in = len;
	z.next_out = out;
	z.avail_out = sizeof(out);

	ret = inflate(&z, 0);
	if (ret != Z_OK && ret != Z_STREAM_END)
		fprintf(stderr, "initial inflate failed (%d)\n", ret);

	fprintf(stderr, "outputting %lu bytes\n", z.total_out);
	fwrite(out, 1, z.total_out, stdout);
	return 0;
}
--------------------------
And here is the `sha1-munge` program:
--------------------------
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
#include <openssl/sha.h>
#include <stdlib.h>
/* eye candy */
static int counter = 0;
static void progress(int sig)
{
	fprintf(stderr, "\r%d", counter);
	alarm(1);
}
static const signed char hexval_table[256] = {
-1, -1, -1, -1, -1, -1, -1, -1, /* 00-07 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 08-0f */
-1, -1, -1, -1, -1, -1, -1, -1, /* 10-17 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 18-1f */
-1, -1, -1, -1, -1, -1, -1, -1, /* 20-27 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 28-2f */
0, 1, 2, 3, 4, 5, 6, 7, /* 30-37 */
8, 9, -1, -1, -1, -1, -1, -1, /* 38-3f */
-1, 10, 11, 12, 13, 14, 15, -1, /* 40-47 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 48-4f */
-1, -1, -1, -1, -1, -1, -1, -1, /* 50-57 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 58-5f */
-1, 10, 11, 12, 13, 14, 15, -1, /* 60-67 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 68-67 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 70-77 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 78-7f */
-1, -1, -1, -1, -1, -1, -1, -1, /* 80-87 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 88-8f */
-1, -1, -1, -1, -1, -1, -1, -1, /* 90-97 */
-1, -1, -1, -1, -1, -1, -1, -1, /* 98-9f */
-1, -1, -1, -1, -1, -1, -1, -1, /* a0-a7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* a8-af */
-1, -1, -1, -1, -1, -1, -1, -1, /* b0-b7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* b8-bf */
-1, -1, -1, -1, -1, -1, -1, -1, /* c0-c7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* c8-cf */
-1, -1, -1, -1, -1, -1, -1, -1, /* d0-d7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* d8-df */
-1, -1, -1, -1, -1, -1, -1, -1, /* e0-e7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* e8-ef */
-1, -1, -1, -1, -1, -1, -1, -1, /* f0-f7 */
-1, -1, -1, -1, -1, -1, -1, -1, /* f8-ff */
};
static inline unsigned int hexval(unsigned char c)
{
	return hexval_table[c];
}

static int get_sha1_hex(const char *hex, unsigned char *sha1)
{
	int i;
	for (i = 0; i < 20; i++) {
		unsigned int val;
		/*
		 * hex[1]=='\0' is caught when val is checked below,
		 * but if hex[0] is NUL we have to avoid reading
		 * past the end of the string:
		 */
		if (!hex[0])
			return -1;
		val = (hexval(hex[0]) << 4) | hexval(hex[1]);
		if (val & ~0xff)
			return -1;
		*sha1++ = val;
		hex += 2;
	}
	return 0;
}
int main(int argc, char **argv)
{
	/* oversized so we can read the whole buffer in */
	static unsigned char buf[25 * 1024 * 1024];
	char header[32];
	int header_len;
	unsigned char have[20], want[20];
	int start, len;
	SHA_CTX orig;
	unsigned i, j;

	if (!argv[1] || get_sha1_hex(argv[1], want)) {
		fprintf(stderr, "usage: sha1-munge <sha1> [start] <file.in\n");
		return 1;
	}

	if (argv[2])
		start = atoi(argv[2]);
	else
		start = 0;

	len = read(0, buf, sizeof(buf));
	header_len = sprintf(header, "blob %d", len) + 1;
	fprintf(stderr, "using header: %s\n", header);

	/*
	 * We keep a running sha1 so that if you are munging
	 * near the end of the file, we do not have to re-sha1
	 * the unchanged earlier bytes
	 */
	SHA1_Init(&orig);
	SHA1_Update(&orig, header, header_len);
	if (start)
		SHA1_Update(&orig, buf, start);

	signal(SIGALRM, progress);
	alarm(1);

	for (i = start; i < len; i++) {
		unsigned char c;
		SHA_CTX x;

#if 0
		/*
		 * deletion -- this would not actually work in practice,
		 * I think, because we've already committed to a
		 * particular size in the header. Ditto for addition
		 * below. In those cases, you'd have to do the whole
		 * sha1 from scratch, or possibly keep three running
		 * "orig" sha1 computations going.
		 */
		memcpy(&x, &orig, sizeof(x));
		SHA1_Update(&x, buf + i + 1, len - i - 1);
		SHA1_Final(have, &x);
		if (!memcmp(have, want, 20))
			printf("i=%d, deletion\n", i);
#endif

		/*
		 * replacement -- note that this tries each of the 256
		 * possible bytes. If you suspect a single-bit flip,
		 * it would be much shorter to just try the 8
		 * bit-flipped variants.
		 */
		c = buf[i];
		for (j = 0; j <= 0xff; j++) {
			buf[i] = j;

			memcpy(&x, &orig, sizeof(x));
			SHA1_Update(&x, buf + i, len - i);
			SHA1_Final(have, &x);
			if (!memcmp(have, want, 20))
				printf("i=%d, j=%02x\n", i, j);
		}
		buf[i] = c;

#if 0
		/* addition */
		for (j = 0; j <= 0xff; j++) {
			unsigned char extra = j;
			memcpy(&x, &orig, sizeof(x));
			SHA1_Update(&x, &extra, 1);
			SHA1_Update(&x, buf + i, len - i);
			SHA1_Final(have, &x);
			if (!memcmp(have, want, 20))
				printf("i=%d, addition=%02x\n", i, j);
		}
#endif

		SHA1_Update(&orig, buf + i, 1);
		counter++;
	}

	alarm(0);
	fprintf(stderr, "\r%d\n", counter);
	return 0;
}
--------------------------
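The `sha1-munge` program hashes a "blob <len>" header, a NUL byte and then the file contents, because that is what a blob's object name is computed over. A quick way to convince yourself, sketched here under the assumption that a `sha1sum` utility is installed and that Git is using its default SHA-1 object format:

```shell
tmp=$(mktemp)
printf 'hello, world\n' >"$tmp"
len=$(wc -c <"$tmp" | tr -d ' ')

# Hash "blob <len>" + NUL + contents by hand ...
by_hand=$( { printf 'blob %s\0' "$len"; cat "$tmp"; } | sha1sum | cut -d' ' -f1)
# ... and compare with what Git computes for the same bytes
by_git=$(git hash-object "$tmp")

echo "by hand: $by_hand"
echo "by git:  $by_git"
```

Because the length is part of the hashed header, a candidate fix that adds or removes a byte changes the header as well, which is why the deletion and addition variants in the program are disabled.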

View file

@ -0,0 +1,273 @@
Date: Fri, 19 Dec 2008 00:45:19 -0800
From: Linus Torvalds <torvalds@linux-foundation.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: Odd merge behaviour involving reverts
Abstract: Sometimes a branch that was already merged to the mainline
is later found to be faulty. Linus and Junio give guidance on
recovering from such a premature merge and continuing development
after the offending branch is fixed.
Message-ID: <7vocz8a6zk.fsf@gitster.siamese.dyndns.org>
References: <alpine.LFD.2.00.0812181949450.14014@localhost.localdomain>
Content-type: text/asciidoc
How to revert a faulty merge
============================
Alan <alan@clueserver.org> said:
I have a master branch. We have a branch off of that that some
developers are doing work on. They claim it is ready. We merge it
into the master branch. It breaks something so we revert the merge.
They make changes to the code. they get it to a point where they say
it is ok and we merge again.
When examined, we find that code changes made before the revert are
not in the master branch, but code changes after are in the master
branch.
and asked for help recovering from this situation.
The history immediately after the "revert of the merge" would look like
this:
 ---o---o---o---M---x---x---W
               /
       ---A---B
where A and B are on the side development that was not so good, M is the
merge that brings these premature changes into the mainline, x are changes
unrelated to what the side branch did and already made on the mainline,
and W is the "revert of the merge M" (doesn't W look like M upside down?).
IOW, `"diff W^..W"` is similar to `"diff -R M^..M"`.
Such a "revert" of a merge can be made with:
$ git revert -m 1 M
After the developers of the side branch fix their mistakes, the history
may look like this:
 ---o---o---o---M---x---x---W---x
               /
       ---A---B-------------------C---D
where C and D are to fix what was broken in A and B, and you may already
have some other changes on the mainline after W.
If you merge the updated side branch (with D at its tip), none of the
changes made in A or B will be in the result, because they were reverted
by W. That is what Alan saw.
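What Alan saw can be reproduced in a scratch repository. This sketch uses one file per commit, named after the commits in the picture; the `commit_file` helper and the tag `M` are conveniences invented for the example.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit that adds one file
commit_file () {
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
}

commit_file o
git checkout -q -b side
commit_file A
commit_file B

git checkout -q master
git merge -q --no-ff -m "M" side
git tag M
commit_file x1
git revert --no-edit -m 1 M >/dev/null    # W: revert of the merge M

git checkout -q side
commit_file C
commit_file D

# Merging the updated side branch brings in C and D, but not A and B
git checkout -q master
git merge -q -m "merge side again" side
ls
```

The listing shows C.txt and D.txt but neither A.txt nor B.txt.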
Linus explains the situation:
Reverting a regular commit just effectively undoes what that commit
did, and is fairly straightforward. But reverting a merge commit also
undoes the _data_ that the commit changed, but it does absolutely
nothing to the effects on _history_ that the merge had.
So the merge will still exist, and it will still be seen as joining
the two branches together, and future merges will see that merge as
the last shared state - and the revert that reverted the merge brought
in will not affect that at all.
So a "revert" undoes the data changes, but it's very much _not_ an
"undo" in the sense that it doesn't undo the effects of a commit on
the repository history.
So if you think of "revert" as "undo", then you're going to always
miss this part of reverts. Yes, it undoes the data, but no, it doesn't
undo history.
In such a situation, you would want to first revert the previous revert,
which would make the history look like this:
 ---o---o---o---M---x---x---W---x---Y
               /
       ---A---B-------------------C---D
where Y is the revert of W. Such a "revert of the revert" can be done
with:
$ git revert W
This history would (ignoring possible conflicts between what W and W..Y
changed) be equivalent to not having W or Y at all in the history:
 ---o---o---o---M---x---x-------x----
               /
       ---A---B-------------------C---D
and merging the side branch again will not have conflicts arising from an
earlier revert and revert of the revert.
 ---o---o---o---M---x---x-------x-------*
               /                       /
       ---A---B-------------------C---D
Of course the changes made in C and D still can conflict with what was
done by any of the x, but that is just a normal merge conflict.
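The "revert the revert, then merge" sequence can be rehearsed in a scratch repository. In this sketch the `commit_file` helper and the tags `M` and `W` are invented for the example, and every commit adds its own file so that no conflicts arise.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit that adds one file
commit_file () {
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
}

commit_file o
git checkout -q -b side
commit_file A
commit_file B

git checkout -q master
git merge -q --no-ff -m "M" side
git tag M
git revert --no-edit -m 1 M >/dev/null    # W: revert of the merge M
git tag W
commit_file x

git checkout -q side
commit_file C
commit_file D

# Y: revert of W, then merge the fixed-up side branch
git checkout -q master
git revert --no-edit W >/dev/null
git merge -q -m "merge fixed side branch" side
ls
```

This time the changes from A, B, C and D are all present in the result.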
On the other hand, if the developers of the side branch discarded their
faulty A and B, and redid the changes on top of the updated mainline
after the revert, the history would have looked like this:
 ---o---o---o---M---x---x---W---x---x
               /                 \
       ---A---B                   A'--B'--C'
If you reverted the revert in such a case as in the previous example:
 ---o---o---o---M---x---x---W---x---x---Y---*
               /                 \         /
       ---A---B                   A'--B'--C'
where Y is the revert of W, A' and B' are rerolled A and B, and there may
also be a further fix-up C' on the side branch. `"diff Y^..Y"` is similar
to `"diff -R W^..W"` (which in turn means it is similar to `"diff M^..M"`),
and `"diff A'^..C'"` by definition would be similar but different from that,
because it is a rerolled series of the earlier change. There will be a
lot of overlapping changes that result in conflicts. So do not do "revert
of revert" blindly without thinking.
 ---o---o---o---M---x---x---W---x---x
               /                 \
       ---A---B                   A'--B'--C'
In the history with rebased side branch, W (and M) are behind the merge
base of the updated branch and the tip of the mainline, and they should
merge without the past faulty merge and its revert getting in the way.
To recap, these are two very different scenarios, and they want two very
different resolution strategies:
- If the faulty side branch was fixed by adding corrections on top, then
doing a revert of the previous revert would be the right thing to do.
- If the faulty side branch whose effects were discarded by an earlier
revert of a merge was rebuilt from scratch (i.e. rebasing and fixing,
as you seem to have interpreted), then re-merging the result without
doing anything else fancy would be the right thing to do.
(See the ADDENDUM below for how to rebuild a branch from scratch
without changing its original branching-off point.)
However, there are things to keep in mind when reverting a merge (and
reverting such a revert).
For example, think about what reverting a merge (and then reverting the
revert) does to bisectability. Ignore the fact that the revert of a revert
is undoing it - just think of it as a "single commit that does a lot".
Because that is what it does.
When you have a problem you are chasing down, and you hit a "revert this
merge", what you're hitting is essentially a single commit that contains
all the changes (but obviously in reverse) of all the commits that got
merged. So it's debugging hell, because now you don't have lots of small
changes that you can try to pinpoint which _part_ of it changes.
But does it all work? Sure it does. You can revert a merge, and from a
purely technical angle, Git did it very naturally and had no real
troubles. It just considered it a change from "state before merge" to
"state after merge", and that was it. Nothing complicated, nothing odd,
nothing really dangerous. Git will do it without even thinking about it.
So from a technical angle, there's nothing wrong with reverting a merge,
but from a workflow angle it's something that you generally should try to
avoid.
If at all possible, for example, if you find a problem that got merged
into the main tree, rather than revert the merge, try _really_ hard to
bisect the problem down into the branch you merged, and just fix it, or
try to revert the individual commit that caused it.
Yes, it's more complex, and no, it's not always going to work (sometimes
the answer is: "oops, I really shouldn't have merged it, because it wasn't
ready yet, and I really need to undo _all_ of the merge"). So then you
really should revert the merge, but when you want to re-do the merge, you
now need to do it by reverting the revert.
ADDENDUM
Sometimes you have to rewrite one of a topic branch's commits *and* you can't
change the topic's branching-off point. Consider the following situation:
 P---o---o---M---x---x---W---x
  \         /
   A---B---C
where commit W reverted commit M because it turned out that commit B was wrong
and needs to be rewritten, but you need the rewritten topic to still branch
from commit P (perhaps P is a branching-off point for yet another branch, and
you want to be able to merge the topic into both branches).
The natural thing to do in this case is to checkout the A-B-C branch and use
"rebase -i P" to change commit B. However this does not rewrite commit A,
because "rebase -i" by default fast-forwards over any initial commits selected
with the "pick" command. So you end up with this:
 P---o---o---M---x---x---W---x
  \         /
   A---B---C   <-- old branch
    \
     B'---C'   <-- naively rewritten branch
To merge A-B'-C' into the mainline branch you would still have to first revert
commit W in order to pick up the changes in A, but then it's likely that the
changes in B' will conflict with the original B changes re-introduced by the
reversion of W.
However, you can avoid these problems if you recreate the entire branch,
including commit A:
   A'---B'---C'  <-- completely rewritten branch
  /
 P---o---o---M---x---x---W---x
  \         /
   A---B---C
You can merge A'-B'-C' into the mainline branch without worrying about first
reverting W. Mainline's history would look like this:
   A'---B'---C'------------------
  /                              \
 P---o---o---M---x---x---W---x---M2
  \         /
   A---B---C
But if you don't actually need to change commit A, then you need some way to
recreate it as a new commit with the same changes in it. The rebase command's
--no-ff option provides a way to do this:
$ git rebase [-i] --no-ff P
The --no-ff option creates a new branch A'-B'-C' with all-new commits (all the
SHA IDs will be different) even if in the interactive case you only actually
modify commit B. You can then merge this new branch directly into the mainline
branch and be sure you'll get all of the branch's changes.
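As a rough illustration, the sketch below builds a three-commit topic on top of P in a scratch repository and compares the object name of commit A before and after rebasing. The repository layout, branch names and the `old_commit` helper are assumptions made for the demonstration. The helper pins an old committer date on the original commits so that the recreated ones, stamped with the current time, cannot happen to be byte-for-byte identical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Developer"
git config user.email "dev@example.invalid"

# Hypothetical helper: commit with a fixed, old committer date.
old_commit() {
    GIT_COMMITTER_DATE="2005-04-07 22:13:13 +0000" git commit -q "$@"
}

echo base >base.txt
git add base.txt
old_commit -m "P"

git checkout -q -b topic
for name in A B C
do
    echo "$name" >"$name.txt"
    git add "$name.txt"
    old_commit -m "$name"
done
old_a=$(git rev-parse topic~2)
old_tip=$(git rev-parse topic)

# A plain rebase onto P has nothing to do: topic already sits on P.
git rebase -q mainline
same_a=$(git rev-parse topic~2)

# With --no-ff every commit is replayed, so A, B and C get new names.
git rebase -q --no-ff mainline
new_a=$(git rev-parse topic~2)

echo "A before:            $old_a"
echo "A after plain rebase: $same_a"
echo "A after --no-ff:      $new_a"
```

The contents are untouched: `git diff "$old_tip" topic` shows nothing, only the commit objects differ.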
You can also use --no-ff in cases where you just add extra commits to the topic
to fix it up. Let's revisit the situation discussed at the start of this howto:
 P---o---o---M---x---x---W---x
  \         /
   A---B---C----------------D---E   <-- fixed-up topic branch
At this point, you can use --no-ff to recreate the topic branch:
$ git checkout E
$ git rebase --no-ff P
yielding
   A'---B'---C'------------D'---E'  <-- recreated topic branch
  /
 P---o---o---M---x---x---W---x
  \         /
   A---B---C----------------D---E
You can merge the recreated branch into the mainline without reverting commit W,
and mainline's history will look like this:
   A'---B'---C'------------D'---E'
  /                               \
 P---o---o---M---x---x---W---x---M2
  \         /
   A---B---C


@ -0,0 +1,187 @@
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [HOWTO] Reverting an existing commit
Abstract: In this article, JC gives a small real-life example of using
'git revert' command, and using a temporary branch and tag for safety
and easier sanity checking.
Date: Mon, 29 Aug 2005 21:39:02 -0700
Content-type: text/asciidoc
Message-ID: <7voe7g3uop.fsf@assigned-by-dhcp.cox.net>
How to revert an existing commit
================================
One of the changes I pulled into the 'master' branch turns out to
break building Git with GCC 2.95. While they were well-intentioned
portability fixes, keeping things working with gcc-2.95 was also
important. Here is what I did to revert the change in the 'master'
branch and to adjust the 'pu' branch, using core Git tools and
barebone Porcelain.
First, prepare a throw-away branch in case I screw things up.
------------------------------------------------
$ git checkout -b revert-c99 master
------------------------------------------------
Now I am on the 'revert-c99' branch. Let's figure out which commit to
revert. I happen to know that the top of the 'master' branch is a
merge, and its second parent (i.e. foreign commit I merged from) has
the change I would want to undo. Further I happen to know that that
merge introduced 5 commits or so:
------------------------------------------------
$ git show-branch --more=4 master master^2 | head
* [master] Merge refs/heads/portable from http://www.cs.berkeley....
 ! [master^2] Replace C99 array initializers with code.
--
-  [master] Merge refs/heads/portable from http://www.cs.berkeley....
*+ [master^2] Replace C99 array initializers with code.
*+ [master^2~1] Replace unsetenv() and setenv() with older putenv().
*+ [master^2~2] Include sys/time.h in daemon.c.
*+ [master^2~3] Fix ?: statements.
*+ [master^2~4] Replace zero-length array decls with [].
*  [master~1] tutorial note about git branch
------------------------------------------------
The '--more=4' above means "after we reach the merge base of refs,
show until we display four more common commits". That last commit
would have been where the "portable" branch was forked from the main
git.git repository, so this would show everything on both branches
since then. I just limited the output to the first handful using
'head'.
Now I know 'master^2~4' (pronounce it as "find the second parent of
the 'master', and then go four generations back following the first
parent") is the one I would want to revert. Since I also want to say
why I am reverting it, the '-n' flag is given to 'git revert'. This
prevents it from actually making a commit, and instead 'git revert'
leaves the commit log message it wanted to use in the '.msg' file:
------------------------------------------------
$ git revert -n master^2~4
$ cat .msg
Revert "Replace zero-length array decls with []."
This reverts 6c5f9baa3bc0d63e141e0afc23110205379905a4 commit.
$ git diff HEAD ;# to make sure what we are reverting makes sense.
$ make CC=gcc-2.95 clean test ;# make sure it fixed the breakage.
$ make clean test ;# make sure it did not cause other breakage.
------------------------------------------------
The reverted change makes sense (from reading the 'diff' output), does
fix the problem (from 'make CC=gcc-2.95' test), and does not cause new
breakage (from the last 'make test'). I'm ready to commit:
------------------------------------------------
$ git commit -a -s ;# read .msg into the log,
# and explain why I am reverting.
------------------------------------------------
I could have screwed up in any of the above steps, but in the worst
case I could just have done 'git checkout master' to start over.
Fortunately I did not have to; what I have in the current branch
'revert-c99' is what I want. So merge that back into 'master':
------------------------------------------------
$ git checkout master
$ git merge revert-c99 ;# this should be a fast-forward
Updating from 10d781b9caa4f71495c7b34963bef137216f86a8 to e3a693c...
cache.h | 8 ++++----
commit.c | 2 +-
ls-files.c | 2 +-
receive-pack.c | 2 +-
server-info.c | 2 +-
5 files changed, 8 insertions(+), 8 deletions(-)
------------------------------------------------
There is no need to redo the test at this point. We fast-forwarded
and we know 'master' matches 'revert-c99' exactly. In fact:
------------------------------------------------
$ git diff master..revert-c99
------------------------------------------------
says nothing.
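The same pattern (revert on a throw-away branch, inspect, commit, then fast-forward 'master') can be sketched in a scratch repository. Everything here is illustrative: the file names, commit messages and the `revert-demo` branch are invented, and the log message is passed with `-m` instead of being read from a prepared message file.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.invalid"

echo "int a[];" >decl.h
git add decl.h
git commit -q -m "Initial declaration"

echo "int a[0];" >decl.h
git commit -q -a -m "Change the array declaration"
bad=$(git rev-parse HEAD)

echo notes >notes.txt
git add notes.txt
git commit -q -m "Unrelated later work"

# Work on a throw-away branch so that 'master' stays untouched.
git checkout -q -b revert-demo master
git revert -n "$bad"        # apply the reversal, do not commit yet
git diff HEAD               # check that the reversal makes sense
git commit -q -s -m "Revert \"Change the array declaration\"" \
    -m "It broke the build with an older compiler."

# Bring the result back; --ff-only refuses anything but a fast-forward.
git checkout -q master
git merge -q --ff-only revert-demo
git diff --exit-code master..revert-demo
git branch -q -d revert-demo
```

Had anything gone wrong before the final merge, `git checkout master` followed by `git branch -D revert-demo` would have discarded the attempt.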
Then we rebase the 'pu' branch as usual.
------------------------------------------------
$ git checkout pu
$ git tag pu-anchor pu
$ git rebase master
* Applying: Redo "revert" using three-way merge machinery.
First trying simple merge strategy to cherry-pick.
* Applying: Remove git-apply-patch-script.
First trying simple merge strategy to cherry-pick.
Simple cherry-pick fails; trying Automatic cherry-pick.
Removing Documentation/git-apply-patch-script.txt
Removing git-apply-patch-script
* Applying: Document "git cherry-pick" and "git revert"
First trying simple merge strategy to cherry-pick.
* Applying: mailinfo and applymbox updates
First trying simple merge strategy to cherry-pick.
* Applying: Show commits in topo order and name all commits.
First trying simple merge strategy to cherry-pick.
* Applying: More documentation updates.
First trying simple merge strategy to cherry-pick.
------------------------------------------------
The temporary tag 'pu-anchor' is me just being careful, in case 'git
rebase' screws up. After this, I can run these as a sanity check:
------------------------------------------------
$ git diff pu-anchor..pu ;# make sure we got the master fix.
$ make CC=gcc-2.95 clean test ;# make sure it fixed the breakage.
$ make clean test ;# make sure it did not cause other breakage.
------------------------------------------------
Everything is in good order. I do not need the temporary branch
or tag anymore, so remove them:
------------------------------------------------
$ rm -f .git/refs/tags/pu-anchor
$ git branch -d revert-c99
------------------------------------------------
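For readers on a current Git, here is a small sketch of the same "anchor tag around a rebase" habit, using `git tag -d` to drop the tag rather than removing the file under `.git/refs/tags` by hand (the porcelain command also works when refs have been packed). The repository contents and file names are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.invalid"

echo base >base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b pu
echo wip >wip.txt
git add wip.txt
git commit -q -m "work in progress"

git checkout -q master
echo fix >fix.txt
git add fix.txt
git commit -q -m "emergency fix"

# Remember where 'pu' was, then rebase it onto the fixed 'master'.
git checkout -q pu
git tag pu-anchor pu
git rebase -q master

# The only difference from the anchor should be the fix from 'master'.
changed=$(git diff --name-only pu-anchor..pu)
echo "changed since anchor: $changed"

# The anchor has served its purpose.
git tag -d pu-anchor
```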
It was an emergency fix, so we might as well merge it into the
'release candidate' branch, although I expect the next release would
be some days off:
------------------------------------------------
$ git checkout rc
$ git pull . master
Packing 0 objects
Unpacking 0 objects
* commit-ish: e3a693c... refs/heads/master from .
Trying to merge e3a693c... into 8c1f5f0... using 10d781b...
Committed merge 7fb9b7262a1d1e0a47bbfdcbbcf50ce0635d3f8f
cache.h | 8 ++++----
commit.c | 2 +-
ls-files.c | 2 +-
receive-pack.c | 2 +-
server-info.c | 2 +-
5 files changed, 8 insertions(+), 8 deletions(-)
------------------------------------------------
And the final repository status looks like this:
------------------------------------------------
$ git show-branch --more=1 master pu rc
! [master] Revert "Replace zero-length array decls with []."
 ! [pu] git-repack: Add option to repack all objects.
  * [rc] Merge refs/heads/master from .
---
 +  [pu] git-repack: Add option to repack all objects.
 +  [pu~1] More documentation updates.
 +  [pu~2] Show commits in topo order and name all commits.
 +  [pu~3] mailinfo and applymbox updates
 +  [pu~4] Document "git cherry-pick" and "git revert"
 +  [pu~5] Remove git-apply-patch-script.
 +  [pu~6] Redo "revert" using three-way merge machinery.
  - [rc] Merge refs/heads/master from .
++* [master] Revert "Replace zero-length array decls with []."
  - [rc~1] Merge refs/heads/master from .
... [master~1] Merge refs/heads/portable from http://www.cs.berkeley....
------------------------------------------------


@ -0,0 +1,94 @@
From: Junio C Hamano <gitster@pobox.com>
Subject: Separating topic branches
Abstract: In this article, JC describes how to separate topic branches.
Content-type: text/asciidoc
How to separate topic branches
==============================
This text was originally a footnote to a discussion about the
behaviour of the git diff commands.
Often I find myself doing that [running diff against something other
than HEAD] while rewriting messy development history. For example, I
start doing some work without knowing exactly where it leads, and end
up with a history like this:
                "master"
        o---o
             \                    "topic"
              o---o---o---o---o---o
At this point, "topic" contains something I know I want, but it
contains two concepts that turned out to be completely independent.
And often, one topic component is larger than the other. It may
contain more than two topics.
In order to rewrite this mess to be more manageable, I would first do
"diff master..topic", to extract the changes into a single patch, start
picking pieces from it to get logically self-contained units, and
start building on top of "master":
$ git diff master..topic >P.diff
$ git checkout -b topicA master
... pick and apply pieces from P.diff to build
... commits on topicA branch.
              o---o---o
             /        "topicA"
        o---o"master"
             \                    "topic"
              o---o---o---o---o---o
Before doing each commit on "topicA" HEAD, I run "diff HEAD"
before running update-index on the affected paths, or "diff --cached HEAD"
after. Also I would run "diff --cached master" to make sure
that the changes are only the ones related to "topicA". Usually
I do this for smaller topics first.
After that, I'd do the remainder of the original "topic", but
for that, I do not start from the patchfile I extracted
initially by comparing "master" and "topic". Still on
"topicA", I extract the reverse of "diff topic" (that is, what
"topic" has that "topicA" still lacks), and use it to rebuild the
other topic:
$ git diff -R topic >P.diff ;# --cached also would work fine
$ git checkout -b topicB master
... pick and apply pieces from P.diff to build
... commits on topicB branch.
                                "topicB"
               o---o---o---o---o
              /
             /o---o---o
            |/        "topicA"
        o---o"master"
             \                    "topic"
              o---o---o---o---o---o
After I am done, I'd try a pretend-merge between "topicA" and
"topicB" in order to make sure I have not missed anything:
$ git pull . topicA ;# merge it into current "topicB"
$ git diff topic
                                "topicB"
               o---o---o---o---o---* (pretend merge)
              /                   /
             /o---o---o----------'
            |/        "topicA"
        o---o"master"
             \                    "topic"
              o---o---o---o---o---o
The last diff had better not show anything other than cleanups
of cruft. Then I can finally clean things up:
$ git branch -D topic
$ git reset --hard HEAD^ ;# nuke pretend merge
                                "topicB"
               o---o---o---o---o
              /
             /o---o---o
            |/        "topicA"
        o---o"master"
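To make the whole procedure concrete, here is a small end-to-end sketch in a scratch repository. It is only one way to do the "pick and apply pieces" step: it assumes the two concepts happen to live in separate files (`a.txt` and `b.txt`, both invented), so that `git apply --include` can do the picking, and it uses `git merge` where the text above uses `git pull .` for the pretend merge.

```shell
repo=$(mktemp -d)
patch=$(mktemp)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.invalid"

printf 'one\n' >a.txt
printf 'one\n' >b.txt
git add a.txt b.txt
git commit -q -m "base"

# A messy "topic" that mixes two independent concepts.
git checkout -q -b topic
printf 'two\n' >>a.txt
git commit -q -a -m "mixed work 1"
printf 'two\n' >>b.txt
printf 'three\n' >>a.txt
git commit -q -a -m "mixed work 2"

# First concept: take only the a.txt pieces of the combined patch.
git diff master..topic >"$patch"
git checkout -q -b topicA master
git apply --index --include=a.txt "$patch"
git diff --cached --stat master
git commit -q -m "Concept A"

# Second concept: whatever "topic" has that "topicA" still lacks.
git diff -R topic >"$patch"
git checkout -q -b topicB master
git apply --index "$patch"
git commit -q -m "Concept B"

# Pretend merge: the combined result should match the old "topic".
git merge -q --no-edit topicA
git diff --quiet topic && leftover=none
echo "difference from original topic: ${leftover:-some}"

# Clean up: drop the pretend merge and the old branch.
git reset -q --hard HEAD^
git branch -q -D topic
```

When the concepts are interleaved within a single file, the picking has to be done by hand (for instance by editing the patch), but the bookkeeping around it stays the same.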


@ -0,0 +1,285 @@
From: Rutger Nijlunsing <rutger@nospam.com>
Subject: Setting up a Git repository which can be pushed into and pulled from over HTTP(S).
Date: Thu, 10 Aug 2006 22:00:26 +0200
Content-type: text/asciidoc
How to setup Git server over http
=================================
NOTE: This document is from 2006. A lot has happened since then, and this
document is now relevant mainly if your web host is not CGI capable.
Almost everyone else should instead look at linkgit:git-http-backend[1].
Since Apache is one of those packages people like to compile
themselves while others prefer the bureaucrat's dream Debian, it is
impossible to give guidelines which will work for everyone. Just send
some feedback to the mailing list at git@vger.kernel.org to get this
document tailored to your favorite distro.
What's needed:
- Have an Apache web-server
On Debian:
$ apt-get install apache2
To have apache2 started by default,
edit /etc/default/apache2 and set NO_START=0
- can edit the configuration of it.
This could be found under /etc/httpd, or refer to your Apache documentation.
On Debian: this means being able to edit files under /etc/apache2
- can restart it.
'apachectl --graceful' might do. If it doesn't, just stop and
restart apache. Be warned that active connections to your server
might be aborted by this.
On Debian:
$ /etc/init.d/apache2 restart
or
$ /etc/init.d/apache2 force-reload
(which seems to do the same)
This adds symlinks from the /etc/apache2/mods-enabled to
/etc/apache2/mods-available.
- have permissions to chown a directory
- have Git installed on the client, and
- either have Git installed on the server or have a webdav client on
the client.
In effect, this means you're going to be root, or that you're using a
preconfigured WebDAV server.
Step 1: setup a bare Git repository
-----------------------------------
At the time of writing, git-http-push cannot remotely create a Git
repository. So we have to do that at the server side with Git. Another
option is to generate an empty bare repository at the client and copy
it to the server with a WebDAV client (which is the only option if Git
is not installed on the server).
Create the directory under the DocumentRoot of the directories served
by Apache. As an example we take /usr/local/apache2, but try "grep
DocumentRoot /where/ever/httpd.conf" to find your root:
$ cd /usr/local/apache2/htdocs
$ mkdir my-new-repo.git
On Debian:
$ cd /var/www
$ mkdir my-new-repo.git
Initialize a bare repository
$ cd my-new-repo.git
$ git --bare init
Change the ownership to your web-server's credentials. Use `"grep ^User
httpd.conf"` and `"grep ^Group httpd.conf"` to find out:
$ chown -R www.www .
On Debian:
$ chown -R www-data.www-data .
If you do not know which user Apache runs as, you can alternatively do
a "chmod -R a+w .", inspect the files which are created later on, and
set the permissions appropriately.
Restart apache2, and check whether http://server/my-new-repo.git gives
a directory listing. If not, check whether apache started up
successfully.
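A quick way to rehearse Step 1 without touching the real web root is to run it in a temporary directory. The sketch below does that; `docroot` is only a stand-in for your DocumentRoot, and the `chown` step is left out because it needs root and a real web-server account. `git init --bare` is the same operation as the `git --bare init` spelling used above.

```shell
# Stand-in for the real DocumentRoot (an assumption for this sketch).
docroot=$(mktemp -d)
cd "$docroot"
mkdir my-new-repo.git
cd my-new-repo.git
git init -q --bare

# A bare repository has HEAD, objects/ and refs/ at its top level.
ls
git rev-parse --is-bare-repository    # prints "true"
```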
Step 2: enable DAV on this repository
-------------------------------------
First make sure the dav_module is loaded. For this, insert in httpd.conf:
LoadModule dav_module libexec/httpd/libdav.so
AddModule mod_dav.c
Also make sure that the following line exists; it names the file used for
locking DAV operations:
DAVLockDB "/usr/local/apache2/temp/DAV.lock"
On Debian these steps can be performed with:
Enable the dav and dav_fs modules of apache:
$ a2enmod dav_fs
(just to be sure. dav_fs might be unneeded, I don't know)
$ a2enmod dav
The DAV lock is located in /etc/apache2/mods-available/dav_fs.conf:
DAVLockDB /var/lock/apache2/DAVLock
Of course, it can point somewhere else, but the string is actually just a
prefix in some Apache configurations, and therefore the _directory_ has to
be writable by the user Apache runs as.
Then, add something like this to your httpd.conf
<Location /my-new-repo.git>
DAV on
AuthType Basic
AuthName "Git"
AuthUserFile /usr/local/apache2/conf/passwd.git
Require valid-user
</Location>
On Debian:
Create (or add to) /etc/apache2/conf.d/git.conf :
<Location /my-new-repo.git>
DAV on
AuthType Basic
AuthName "Git"
AuthUserFile /etc/apache2/passwd.git
Require valid-user
</Location>
Debian automatically reads all files under /etc/apache2/conf.d.
The password file can be somewhere else, but it has to be readable by
Apache and preferably not readable by the world.
Create this file by
$ htpasswd -c /usr/local/apache2/conf/passwd.git <user>
On Debian:
$ htpasswd -c /etc/apache2/passwd.git <user>
You will be asked for a password, and the file is created. Subsequent calls
to htpasswd should omit the '-c' option, since you want to append to the
existing file.
You need to restart Apache.
Now go to http://<username>@<servername>/my-new-repo.git in your
browser to check whether it asks for a password and accepts the right
password.
On Debian:
To test the WebDAV part, do:
$ apt-get install litmus
$ litmus http://<servername>/my-new-repo.git <username> <password>
Most tests should pass.
A command-line tool to test WebDAV is cadaver. If you prefer GUIs, for
example, konqueror can open WebDAV URLs as "webdav://..." or
"webdavs://...".
If you're into Windows, from XP onwards Internet Explorer supports
WebDAV. For this, do Internet Explorer -> Open Location ->
http://<servername>/my-new-repo.git [x] Open as webfolder -> login .
Step 3: setup the client
------------------------
Make sure that you have HTTP support, i.e. your Git was built with
libcurl (version more recent than 7.10). The command 'git http-push' with
no argument should display a usage message.
Then, add the following to your $HOME/.netrc (you can do without, but will be
asked to input your password a _lot_ of times):
machine <servername>
login <username>
password <password>
...and set permissions:
chmod 600 ~/.netrc
If you want to access the web-server by its IP, you have to type that in,
instead of the server name.
To check whether all is OK, do:
curl --netrc --location -v http://<username>@<servername>/my-new-repo.git/HEAD
...this should give something like 'ref: refs/heads/master', which is
the content of the file HEAD on the server.
Now, add the remote in your existing repository which contains the project
you want to export:
$ git config remote.upload.url \
http://<username>@<servername>/my-new-repo.git/
It is important to put the last '/'; without it, the server will send
a redirect which git-http-push does not (yet) understand, and git-http-push
will repeat the request infinitely.
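Since the trailing slash is easy to forget, a small guard like the one below can add it before the URL is stored. This is only a sketch: the user and host in `url` are placeholders, and the scratch repository stands in for your existing one.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical URL; substitute your own user and server names.
url='http://alice@git.example.invalid/my-new-repo.git'

# Append the trailing slash if it is missing.
case "$url" in
*/) ;;
*)  url="$url/" ;;
esac

git config remote.upload.url "$url"
git config --get remote.upload.url
```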
Step 4: make the initial push
-----------------------------
From your client repository, do
$ git push upload master
This pushes branch 'master' (which is assumed to be the branch you
want to export) to repository called 'upload', which we previously
defined with git-config.
Using a proxy:
--------------
If you have to access the WebDAV server from behind an HTTP(S) proxy,
set the variable 'all_proxy' to `http://proxy-host.com:port`, or
`http://login-on-proxy:passwd-on-proxy@proxy-host.com:port`. See 'man
curl' for details.
Troubleshooting:
----------------
If git-http-push says
Error: no DAV locking support on remote repo http://...
then it means the web-server did not accept your authentication. Make sure
that the user name and password match in httpd.conf, .netrc and the URL
you are uploading to.
If git-http-push shows you an error (22/502) when trying to MOVE a blob,
it means that your web-server somehow does not recognize its name in the
request; this can happen when you start Apache, but then disable the
network interface. A simple restart of Apache helps.
Errors like (22/502) are of format (curl error code/http error
code). So (22/404) means something like 'not found' at the server.
Reading /usr/local/apache2/logs/error_log is often helpful.
On Debian: Read /var/log/apache2/error.log instead.
If you access HTTPS locations, Git may fail verifying the SSL
certificate (this is return code 60). Setting http.sslVerify=false can
help diagnose the problem, but removes security checks.
Debian References: http://www.debian-administration.org/articles/285
Authors
Johannes Schindelin <Johannes.Schindelin@gmx.de>
Rutger Nijlunsing <git@wingding.demon.nl>
Matthieu Moy <Matthieu.Moy@imag.fr>


@ -0,0 +1,192 @@
From: Junio C Hamano <gitster@pobox.com> and Carl Baldwin <cnb@fc.hp.com>
Subject: control access to branches.
Date: Thu, 17 Nov 2005 23:55:32 -0800
Message-ID: <7vfypumlu3.fsf@assigned-by-dhcp.cox.net>
Abstract: An example hooks/update script is presented to
implement repository maintenance policies, such as who can push
into which branch and who can make a tag.
Content-type: text/asciidoc
How to use the update hook
==========================
When your developer runs git-push into the repository,
git-receive-pack is run (either locally or over ssh) as that
developer, and so is the hooks/update script. Quoting from the relevant
section of the documentation:
Before each ref is updated, if $GIT_DIR/hooks/update file exists
and executable, it is called with three parameters:
$GIT_DIR/hooks/update refname sha1-old sha1-new
The refname parameter is relative to $GIT_DIR; e.g. for the
master head this is "refs/heads/master". Two sha1 are the
object names for the refname before and after the update. Note
that the hook is called before the refname is updated, so either
sha1-old is 0{40} (meaning there is no such ref yet), or it
should match what is recorded in refname.
So if your policy is (1) always require fast-forward push
(i.e. never allow "git-push repo +branch:branch"), (2) you
have a list of users allowed to update each branch, and (3) you
do not let tags be overwritten, then you can use something
like this as your hooks/update script.
[jc: editorial note. This is a much improved version by Carl
since I posted the original outline]
----------------------------------------------------
#!/bin/bash

umask 002

# If you are having trouble with this access control hook script
# you can try setting this to true. It will tell you exactly
# why a user is being allowed/denied access.
verbose=false

# Default shell globbing messes things up downstream
GLOBIGNORE=*

function grant {
  $verbose && echo >&2 "-Grant- $1"
  echo grant
  exit 0
}

function deny {
  $verbose && echo >&2 "-Deny- $1"
  echo deny
  exit 1
}

function info {
  $verbose && echo >&2 "-Info- $1"
}

# Implement generic branch and tag policies.
# - Tags should not be updated once created.
# - Branches should only be fast-forwarded unless their pattern starts with '+'
case "$1" in
  refs/tags/*)
    git rev-parse --verify -q "$1" &&
    deny >/dev/null "You can't overwrite an existing tag"
    ;;
  refs/heads/*)
    # No rebasing or rewinding
    if expr "$2" : '0*$' >/dev/null; then
      info "The branch '$1' is new..."
    else
      # updating -- make sure it is a fast-forward
      mb=$(git merge-base "$2" "$3")
      case "$mb,$2" in
        "$2,$mb") info "Update is fast-forward" ;;
        *) noff=y; info "This is not a fast-forward update.";;
      esac
    fi
    ;;
  *)
    deny >/dev/null \
    "Branch is not under refs/heads or refs/tags. What are you trying to do?"
    ;;
esac

# Implement per-branch controls based on username
allowed_users_file=$GIT_DIR/info/allowed-users
username=$(id -u -n)
info "The user is: '$username'"

if test -f "$allowed_users_file"
then
  rc=$(cat $allowed_users_file | grep -v '^#' | grep -v '^$' |
    while read heads user_patterns
    do
      # does this rule apply to us?
      head_pattern=${heads#+}
      matchlen=$(expr "$1" : "${head_pattern#+}")
      test "$matchlen" = ${#1} || continue

      # if non-ff, $heads must be with the '+' prefix
      test -n "$noff" &&
      test "$head_pattern" = "$heads" && continue

      info "Found matching head pattern: '$head_pattern'"
      for user_pattern in $user_patterns; do
        info "Checking user: '$username' against pattern: '$user_pattern'"
        matchlen=$(expr "$username" : "$user_pattern")
        if test "$matchlen" = "${#username}"
        then
          grant "Allowing user: '$username' with pattern: '$user_pattern'"
        fi
      done
      deny "The user is not in the access list for this branch"
    done
  )
  case "$rc" in
    grant) grant >/dev/null "Granting access based on $allowed_users_file" ;;
    deny)  deny  >/dev/null "Denying access based on $allowed_users_file" ;;
    *) ;;
  esac
fi

allowed_groups_file=$GIT_DIR/info/allowed-groups
groups=$(id -G -n)
info "The user belongs to the following groups:"
info "'$groups'"

if test -f "$allowed_groups_file"
then
  rc=$(cat $allowed_groups_file | grep -v '^#' | grep -v '^$' |
    while read heads group_patterns
    do
      # does this rule apply to us?
      head_pattern=${heads#+}
      matchlen=$(expr "$1" : "${head_pattern#+}")
      test "$matchlen" = ${#1} || continue

      # if non-ff, $heads must be with the '+' prefix
      test -n "$noff" &&
      test "$head_pattern" = "$heads" && continue

      info "Found matching head pattern: '$head_pattern'"
      for group_pattern in $group_patterns; do
        for groupname in $groups; do
          info "Checking group: '$groupname' against pattern: '$group_pattern'"
          matchlen=$(expr "$groupname" : "$group_pattern")
          if test "$matchlen" = "${#groupname}"
          then
            grant "Allowing group: '$groupname' with pattern: '$group_pattern'"
          fi
        done
      done
      deny "None of the user's groups are in the access list for this branch"
    done
  )
  case "$rc" in
    grant) grant >/dev/null "Granting access based on $allowed_groups_file" ;;
    deny)  deny  >/dev/null "Denying access based on $allowed_groups_file" ;;
    *) ;;
  esac
fi

deny >/dev/null "There are no more rules to check. Denying access"
----------------------------------------------------
This uses two files, $GIT_DIR/info/allowed-users and
allowed-groups, to describe which heads can be pushed into by
whom. The format of each file would look like this:
    refs/heads/master     junio
    +refs/heads/pu        junio
    refs/heads/cogito$    pasky
    refs/heads/bw/.*      linus
    refs/heads/tmp/.*     .*
    refs/tags/v[0-9].*    junio
With this, Linus can push or create "bw/penguin" or "bw/zebra"
or "bw/panda" branches, Pasky can do only "cogito", and JC can
do master and pu branches and make versioned tags. And anybody
can do tmp/blah branches. The '+' sign at the pu record means
that JC can make non-fast-forward pushes on it.
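The head and user patterns are matched with `expr STRING : REGEX`, which anchors the pattern at the start of the string and prints how many characters matched; the hook accepts a rule only when that count equals the length of the whole string. The sketch below isolates that test. The function name `matches_fully` is made up for the illustration and does not appear in the hook.

```shell
# Succeeds only if the pattern $2 matches the whole of string $1.
# expr anchors the pattern at the start and prints the match length.
matches_fully() {
    matchlen=$(expr "$1" : "$2")
    test "$matchlen" = "${#1}"
}

for ref in refs/heads/bw/penguin refs/heads/cogito refs/heads/cogito-ng
do
    for pattern in 'refs/heads/bw/.*' 'refs/heads/cogito$'
    do
        if matches_fully "$ref" "$pattern"
        then
            echo "$ref matches $pattern"
        fi
    done
done
```

Because the matched length is compared with the full length of the ref, a pattern such as `refs/heads/master` does not accidentally cover `refs/heads/master-old`, even without a trailing `$`.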


@ -0,0 +1,54 @@
Content-type: text/asciidoc
How to use git-daemon
=====================
Git can be run in inetd mode and in stand-alone mode. But all you want is
to let a coworker pull from you, and therefore you need to set up a Git
server real quick, right?
Note that git-daemon is not really chatty at the moment, especially when
things do not go according to plan (e.g. a socket could not be bound).
Another word of warning: if you run
$ git ls-remote git://127.0.0.1/rule-the-world.git
and you see a message like
fatal: The remote end hung up unexpectedly
it only means that _something_ went wrong. To find out _what_ went wrong,
you have to ask the server. (Git refuses to be more precise for your
security only. Take off your shoes now. You have any coins in your pockets?
Sorry, not allowed -- who knows what you planned to do with them?)
With these two caveats, let's see an example:
$ git daemon --reuseaddr --verbose --base-path=/home/gitte/git \
--export-all -- /home/gitte/git/rule-the-world.git
(Of course, unless your user name is `gitte` _and_ your repository is in
~/rule-the-world.git, you have to adjust the paths. If your repository is
not bare, be aware that you have to type the path to the .git directory!)
This invocation tries to reuse the address if it is already taken
(this can save you some debugging, because otherwise killing and restarting
git-daemon could just silently fail to bind to a socket).
Also, it is (relatively) verbose when somebody actually connects to it.
It also sets the base path, which means that all the projects which can be
accessed using this daemon have to reside in or under that path.
The option `--export-all` just means that you _don't_ have to create a
file named `git-daemon-export-ok` in each exported repository. (Otherwise,
git-daemon would complain loudly, and refuse to cooperate.)
Last of all, the repository which should be exported is specified. It is
a good practice to put the paths after a "--" separator.
Now, test your daemon with
$ git ls-remote git://127.0.0.1/rule-the-world.git
If this does not work, find out why, and submit a patch to this document.
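If you would rather not export everything under the base path, you can leave out `--export-all` and mark individual repositories instead. A minimal sketch, with a temporary directory standing in for `/home/gitte/git` and the daemon invocation shown only as a comment since it would keep running:

```shell
# Stand-in for the real base path (an assumption for this sketch).
base=$(mktemp -d)
git init -q --bare "$base/rule-the-world.git"

# Mark this one repository as exportable.
touch "$base/rule-the-world.git/git-daemon-export-ok"

# The daemon would then be started without --export-all, e.g.:
#   git daemon --reuseaddr --verbose --base-path="$base" -- \
#       "$base/rule-the-world.git"
ls "$base/rule-the-world.git"
```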


@ -0,0 +1,75 @@
Date: Sat, 5 Jan 2008 20:17:40 -0500
From: Sean <seanlkml@sympatico.ca>
To: Miklos Vajna <vmiklos@frugalware.org>
Cc: git@vger.kernel.org
Subject: how to use git merge -s subtree?
Abstract: In this article, Sean demonstrates how one can use the subtree merge
strategy.
Content-type: text/asciidoc
Message-ID: <BAYC1-PASMTP12374B54BA370A1E1C6E78AE4E0@CEZ.ICE>
How to use the subtree merge strategy
=====================================
There are situations where you want to include contents in your project
from an independently developed project. You can just pull from the
other project as long as there are no conflicting paths.
The problematic case is when there are conflicting files. Potential
candidates are Makefiles and other standard filenames. You could merge
these files but probably you do not want to. A better solution for this
problem can be to merge the project as its own subdirectory. This is not
supported by the 'recursive' merge strategy, so just pulling won't work.
What you want is the 'subtree' merge strategy, which helps you in such a
situation.
In this example, let's say you have the repository at `/path/to/B` (but
it can be a URL as well, if you want). You want to merge the 'master'
branch of that repository to the `dir-B` subdirectory in your current
branch.
Here is the command sequence you need:
----------------
$ git remote add -f Bproject /path/to/B <1>
$ git merge -s ours --no-commit --allow-unrelated-histories Bproject/master <2>
$ git read-tree --prefix=dir-B/ -u Bproject/master <3>
$ git commit -m "Merge B project as our subdirectory" <4>
$ git pull -s subtree Bproject master <5>
----------------
<1> name the other project "Bproject", and fetch.
<2> prepare for the later step to record the result as a merge.
<3> read "master" branch of Bproject to the subdirectory "dir-B".
<4> record the merge result.
<5> maintain the result with subsequent merges using "subtree"
The first four commands are used for the initial merge, while the last
one is to merge updates from 'B project'.
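To try the sequence without risking a real project, the sketch below sets up two toy repositories under a temporary directory and runs the five commands against them. All file names and contents are invented. Two options that are not part of the sequence above are added to the final pull: `--no-rebase`, so that newer Git versions do not stop to ask how to reconcile the divergent histories, and `--no-edit`, so that no editor is launched.

```shell
work=$(mktemp -d)

# "B": the independently developed project.
git init -q "$work/B"
cd "$work/B"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.invalid"
echo 'all: libb' >Makefile
echo 'int b(void) { return 1; }' >lib.c
git add Makefile lib.c
git commit -q -m "B: initial"

# Our own project, which has a conflicting Makefile.
git init -q "$work/A"
cd "$work/A"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"
git config user.email "dev@example.invalid"
echo 'all: app' >Makefile
echo 'int main(void) { return 0; }' >main.c
git add Makefile main.c
git commit -q -m "A: initial"

# The initial merge (commands <1> to <4>).
git remote add -f Bproject "$work/B"
git merge -s ours --no-commit --allow-unrelated-histories Bproject/master
git read-tree --prefix=dir-B/ -u Bproject/master
git commit -q -m "Merge B project as our subdirectory"

# B moves on ...
cd "$work/B"
echo 'int b2(void) { return 2; }' >>lib.c
git commit -q -a -m "B: add b2"

# ... and we pick up the update (command <5>).
cd "$work/A"
git pull -q --no-rebase --no-edit -s subtree Bproject master
ls dir-B
```

After the pull, B's update shows up under `dir-B/` while our own top-level `Makefile` is left alone.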
Comparing 'subtree' merge with submodules
-----------------------------------------
- The benefit of using subtree merge is that it places less
administrative burden on the users of your repository. It works with
older (before Git v1.5.2) clients and you have the code right after
clone.
- However, if you use submodules then you can choose not to transfer the
submodule objects. With the subtree merge you do not have that choice,
which may be a problem.
- Also, in case you make changes to the other project, it is easier to
submit changes if you just use submodules.
Additional tips
---------------
- If you made changes to the other project in your repository, they may
want to merge from your project. This is possible using subtree -- it
can shift up the paths in your tree and then they can merge only the
relevant parts of your tree.
- Please note that if the other project merges from you, then it will
connect its history to yours, which may be something they do not
want.


@ -0,0 +1,217 @@
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 17 Jan 2011 13:00:00 -0800
Subject: Using signed tag in pull requests
Abstract: Beginning with v1.7.9, a contributor can push a signed tag to her
publishing repository and ask her integrator to pull it. This assures the
integrator that the pulled history is authentic and allows others to
later validate it.
Content-type: text/asciidoc
How to use a signed tag in pull requests
========================================
A typical distributed workflow using Git is for a contributor to fork a
project, build on it, publish the result to her public repository, and ask
the "upstream" person (often the owner of the project where she forked
from) to pull from her public repository. Requesting such a "pull" is made
easy by the `git request-pull` command.
Earlier, a typical pull request may have started like this:
------------
The following changes since commit 406da78032179...:
Froboz 3.2 (2011-09-30 14:20:57 -0700)
are available in the Git repository at:
example.com:/git/froboz.git for-xyzzy
------------
followed by a shortlog of the changes and a diffstat.
The request was for a branch name (e.g. `for-xyzzy`) in the public
repository of the contributor, and even though it stated where the
contributor forked her work from, the message did not say anything about
the commit to expect at the tip of the `for-xyzzy` branch. If the site that
hosts the public repository of the contributor could not be fully trusted, it
was unnecessarily hard to make sure that what the integrator pulled was
genuinely what the contributor had produced for the project. Also, there
was no easy way for third-party auditors to later verify the resulting
history.
Starting from Git release v1.7.9, a contributor can add a signed tag to
the commit at the tip of the history and ask the integrator to pull that
signed tag. When the integrator runs `git pull`, the signed tag is
automatically verified to assure that the history has not been tampered with.
In addition, the resulting merge commit records the content of the signed
tag, so that other people can verify that the branch merged by the
integrator was signed by the contributor, without fetching the signed tag
used to validate the pull request separately and keeping it in the refs
namespace.
This document describes the workflow between the contributor and the
integrator, using Git v1.7.9 or later.
A contributor or a lieutenant
-----------------------------
After preparing her work to be pulled, the contributor uses `git tag -s`
to create a signed tag:
------------
$ git checkout work
$ ... "git pull" from sublieutenants, "git commit" your own work ...
$ git tag -s -m "Completed frotz feature" frotz-for-xyzzy work
------------
Note that this example uses the `-m` option to create a signed tag with
just a one-liner message, but this is for illustration purposes only. It
is advisable to compose a well-written explanation of what the topic does
to justify why it is worthwhile for the integrator to pull it, as this
message will eventually become part of the final history after the
integrator responds to the pull request (as we will see later).
Then she pushes the tag out to her public repository:
------------
$ git push example.com:/git/froboz.git/ +frotz-for-xyzzy
------------
There is no need to push the `work` branch or anything else.
Note that the above command line uses a plus sign at the beginning of
`+frotz-for-xyzzy` to force the update of the tag, as the same
contributor may want to reuse a signed tag with the same name after the
previous pull request has been responded to.
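The effect of the plus sign can be tried out in a throwaway setup. The following is a minimal sketch, not part of the workflow itself: the temporary directories stand in for the contributor's working and public repositories, the commits are empty, and `git tag -a` (an unsigned annotated tag) is used in place of `git tag -s` so that no GPG key is required. All of these are assumptions made for illustration.

```shell
set -eu
# Throwaway identity and directories; every name here is illustrative.
export GIT_AUTHOR_NAME="Con Tributor" GIT_AUTHOR_EMAIL="nitfol@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)
git init -q --bare "$tmp/public.git"   # stands in for example.com:/git/froboz.git
git init -q "$tmp/work"
cd "$tmp/work"

git commit -q --allow-empty -m "first round of frotz"
# -a instead of -s: an unsigned annotated tag, so no GPG key is needed here.
git tag -a -m "Completed frotz feature" frotz-for-xyzzy
git push -q "$tmp/public.git" +frotz-for-xyzzy

# A later pull request reuses the same tag name on a newer commit.
git commit -q --allow-empty -m "second round of frotz"
git tag -f -a -m "Completed frotz feature, round two" frotz-for-xyzzy

# Without the plus sign, recent Git versions refuse to replace the existing tag.
if git push -q "$tmp/public.git" frotz-for-xyzzy 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# With the plus sign, the update is forced.
git push -q "$tmp/public.git" +frotz-for-xyzzy
published=$(git --git-dir="$tmp/public.git" rev-parse 'frotz-for-xyzzy^{commit}')
echo "plain push: $plain_push; published tag now points at $published"
```

Only the tag is pushed in this sketch; the public repository never receives a branch, which matches the remark that there is no need to push `work`.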
The contributor then prepares a message to request a "pull":
------------
$ git request-pull v3.2 example.com:/git/froboz.git/ frotz-for-xyzzy >msg.txt
------------
The arguments are:
. the version of the integrator's commit the contributor based her work on;
. the URL of the repository to which the contributor has pushed what she
wants to get pulled; and
. the name of the tag the contributor wants to get pulled (earlier, she could
write only a branch name here).
The resulting msg.txt file begins like so:
------------
The following changes since commit 406da78032179...:
Froboz 3.2 (2011-09-30 14:20:57 -0700)
are available in the Git repository at:
example.com:/git/froboz.git tags/frotz-for-xyzzy
for you to fetch changes up to 703f05ad5835c...:
Add tests and documentation for frotz (2011-12-02 10:02:52 -0800)
-----------------------------------------------
Completed frotz feature
-----------------------------------------------
------------
followed by a shortlog of the changes and a diffstat. Comparing this with
the earlier illustration of the output from the traditional `git request-pull`
command, the reader should notice that:
. The tip commit to expect is shown to the integrator; and
. The signed tag message is shown prominently between the dashed lines
before the shortlog.
The latter is why the contributor would want to justify why pulling her
work is worthwhile when creating the signed tag. The contributor then
opens her favorite MUA, reads msg.txt, edits and sends it to her upstream
integrator.
Integrator
----------
After receiving such a pull request message, the integrator fetches and
integrates the tag named in the request, with:
------------
$ git pull example.com:/git/froboz.git/ tags/frotz-for-xyzzy
------------
This operation will always open an editor to allow the integrator to
fine-tune the commit log message when merging a signed tag. Also, pulling a
signed tag will always create a merge commit, even when the integrator has
no new commits since the contributor's work forked (i.e. when a 'fast
forward' would otherwise be possible), so that the integrator can properly
explain what the merge is about and why it was made.
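The merge-commit behaviour can be observed with two local repositories. This is a sketch under assumptions: the directories are temporary, the commits are empty, `--no-edit` skips the editor so the script runs unattended, and an unsigned annotated tag (`-a`) stands in for the signed one, so the signature verification itself is not exercised.

```shell
set -eu
# Throwaway identity and directories; every name here is illustrative.
export GIT_AUTHOR_NAME="Inte Grator" GIT_AUTHOR_EMAIL="xyzzy@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/integrator"
cd "$tmp/integrator"
git commit -q --allow-empty -m "Froboz 3.2"

# The contributor forks and adds one commit; the integrator adds none,
# so a fast-forward would be possible.
git clone -q "$tmp/integrator" "$tmp/contributor"
cd "$tmp/contributor"
git commit -q --allow-empty -m "Add tests and documentation for frotz"
# -a instead of -s: an unsigned annotated tag, so no GPG key is needed here.
git tag -a -m "Completed frotz feature" frotz-for-xyzzy

cd "$tmp/integrator"
git pull -q --no-edit "$tmp/contributor" tags/frotz-for-xyzzy

# rev-list prints the commit followed by its parents; count the parents.
set -- $(git rev-list --parents -n 1 HEAD)
parent_count=$(($# - 1))
subject=$(git log -1 --format=%s)
local_tags=$(git tag -l)
echo "parents: $parent_count"
echo "subject: $subject"
echo "tags in the integrator's repository: ${local_tags:-none}"
```

The tip of the integrator's branch ends up with two parents even though no history diverged, and the pulled tag is not stored in the integrator's own `refs/tags/` namespace.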
In the editor, the integrator will see something like this:
------------
Merge tag 'frotz-for-xyzzy' of example.com:/git/froboz.git/
Completed frotz feature
# gpg: Signature made Fri 02 Dec 2011 10:03:01 AM PST using RSA key ID 96AFE6CB
# gpg: Good signature from "Con Tributor <nitfol@example.com>"
------------
Notice that the message recorded in the signed tag, "Completed frotz
feature", appears here; again, that is why it is important for the
contributor to explain her work well when creating the signed tag.
As usual, the lines commented with `#` are stripped out. The resulting
commit records the signed tag used for this validation in a hidden field
so that it can later be used by others to audit the history. There is no
need for the integrator to keep a separate copy of the tag in his
repository (i.e. `git tag -l` won't list the `frotz-for-xyzzy` tag in the
above example), and there is no need to publish the tag to his public
repository, either.
After the integrator responds to the pull request and the contributor's work
becomes part of the permanent history, the contributor can remove the tag from
her public repository, if she chooses, in order to keep the tag namespace
of her public repository clean, with:
------------
$ git push example.com:/git/froboz.git :frotz-for-xyzzy
------------
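The deletion can likewise be checked in a throwaway setup. This is a sketch only: the temporary bare repository stands in for the public one, the commit is empty, and the tag is an unsigned annotated tag (`-a`) rather than a signed one, all assumed for illustration.

```shell
set -eu
# Throwaway identity and directories; every name here is illustrative.
export GIT_AUTHOR_NAME="Con Tributor" GIT_AUTHOR_EMAIL="nitfol@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)
git init -q --bare "$tmp/public.git"   # stands in for the public repository
git init -q "$tmp/work"
cd "$tmp/work"

git commit -q --allow-empty -m "Add tests and documentation for frotz"
# -a instead of -s: an unsigned annotated tag, so no GPG key is needed here.
git tag -a -m "Completed frotz feature" frotz-for-xyzzy
git push -q "$tmp/public.git" +frotz-for-xyzzy
before=$(git ls-remote --tags "$tmp/public.git")

# An empty source before the colon asks the remote to delete the ref.
git push -q "$tmp/public.git" :frotz-for-xyzzy
after=$(git ls-remote --tags "$tmp/public.git")
local_tag=$(git tag -l frotz-for-xyzzy)

echo "remote tags before: ${before:-none}"
echo "remote tags after:  ${after:-none}"
echo "local tag kept:     $local_tag"
```

Only the copy in the public repository is removed; the tag in the contributor's own working repository is left alone.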
Auditors
--------
The `--show-signature` option can be given to `git log` or `git show` to
show the verification status of the signed tag embedded in merge commits
that were created when the integrator responded to a pull request of a signed tag.
A typical output from `git show --show-signature` may look like this:
------------
$ git show --show-signature
commit 02306ef6a3498a39118aef9df7975bdb50091585
merged tag 'frotz-for-xyzzy'
gpg: Signature made Fri 06 Jan 2012 12:41:49 PM PST using RSA key ID 96AFE6CB
gpg: Good signature from "Con Tributor <nitfol@example.com>"
Merge: 406da78 703f05a
Author: Inte Grator <xyzzy@example.com>
Date: Tue Jan 17 13:49:41 2012 -0800
Merge tag 'frotz-for-xyzzy' of example.com:/git/froboz.git/
Completed frotz feature
* tag 'frotz-for-xyzzy' (100 commits)
Add tests and documentation for frotz
...
------------
There is no need for the auditor to explicitly fetch the contributor's
signature, or to even be aware of what tag(s) the contributor and integrator
used to communicate the signature. All the required information is recorded
as part of the merge commit.