view tests/test-convert-tagsbranch-topology.t @ 31210:e1d035905b2e
similar: compare between actual file contents for exact identity
Before this patch, the similarity detection logic (for addremove and
automv) depended entirely on SHA-1 digests. This causes incorrect
rename detection if:
- the removal of file A and the addition of file B occur in the same commit, and
- the SHA-1 hash values of files A and B are the same
For example, this may prevent security experts from managing sample
files for the SHAttered issue in a Mercurial repository.
https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
https://shattered.it/
A hash collision by itself is not so serious for the core repository
functionality of Mercurial, though, as described by mpm below.
https://www.mercurial-scm.org/wiki/mpm/SHA1
After the hash comparison, this patch also compares the actual file
contents to confirm exact identity.
Even after this patch, SHA-1 is still used, because it is good enough
for quickly detecting the existence of an "(almost) same" file:
- replacing SHA-1 would decrease performance, and
- what it should be replaced with is still ambiguous
Getting the content of the removed file (= rfctx.data()) at each exact
comparison should be cheap enough, even though getting the content of
the added one is costly.
======= ============== =====================
file    fctx           data() reads from
======= ============== =====================
removed filectx        in-memory revlog data
added   workingfilectx storage
======= ============== =====================
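
The hash-then-compare idea described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `find_exact_renames`, its `digest` parameter, and the plain dicts mapping file names to bytes are all hypothetical stand-ins for Mercurial's file contexts and their `data()` method.

```python
import hashlib


def sha1_digest(data):
    """Default digest: SHA-1 over the raw file bytes."""
    return hashlib.sha1(data).digest()


def find_exact_renames(removed, added, digest=sha1_digest):
    """Pair added files with removed files that have identical content.

    `removed` and `added` are hypothetical dicts mapping a file name to
    its content as bytes. `digest` is injectable only so that a hash
    collision can be simulated without real colliding files.
    """
    # First pass: index the removed files by digest. This is the quick
    # filter that finds candidates without comparing every pair.
    by_digest = {}
    for name, data in removed.items():
        by_digest.setdefault(digest(data), []).append(name)

    renames = []
    for new_name, new_data in added.items():
        for old_name in by_digest.get(digest(new_data), []):
            # Equal digests only nominate a candidate. Comparing the
            # actual bytes confirms exact identity, so two different
            # files that share a digest are not reported as a rename.
            if removed[old_name] == new_data:
                renames.append((old_name, new_name))
                break
    return renames


if __name__ == "__main__":
    removed = {"a.txt": b"hello\n", "b.txt": b"world\n"}
    added = {"c.txt": b"hello\n", "d.txt": b"other\n"}
    print(find_exact_renames(removed, added))
```

With the sample data, only `a.txt` and `c.txt` are paired, because they share both digest and bytes. Passing a deliberately weak `digest` (for instance one that returns a constant) makes every file a candidate, and the byte comparison then rejects the pairs whose content differs.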
author | FUJIWARA Katsunori <foozy@lares.dti.ne.jp> |
---|---|
date | Fri, 03 Mar 2017 02:57:06 +0900 |
parents | 86fe3c404c1e |
children |
#require git

  $ echo "[core]" >> $HOME/.gitconfig
  $ echo "autocrlf = false" >> $HOME/.gitconfig
  $ echo "[core]" >> $HOME/.gitconfig
  $ echo "autocrlf = false" >> $HOME/.gitconfig
  $ cat <<EOF >> $HGRCPATH
  > [extensions]
  > convert =
  > [convert]
  > hg.usebranchnames = True
  > hg.tagsbranch = tags-update
  > EOF
  $ GIT_AUTHOR_NAME='test'; export GIT_AUTHOR_NAME
  $ GIT_AUTHOR_EMAIL='test@example.org'; export GIT_AUTHOR_EMAIL
  $ GIT_AUTHOR_DATE="2007-01-01 00:00:00 +0000"; export GIT_AUTHOR_DATE
  $ GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"; export GIT_COMMITTER_NAME
  $ GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"; export GIT_COMMITTER_EMAIL
  $ GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"; export GIT_COMMITTER_DATE
  $ count=10
  $ action()
  > {
  >     GIT_AUTHOR_DATE="2007-01-01 00:00:$count +0000"
  >     GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
  >     git "$@" >/dev/null 2>/dev/null || echo "git command error"
  >     count=`expr $count + 1`
  > }
  $ glog()
  > {
  >     hg log -G --template '{rev} "{desc|firstline}" files: {files}\n' "$@"
  > }
  $ convertrepo()
  > {
  >     hg convert --datesort git-repo hg-repo
  > }

Build a GIT repo with at least 1 tag

  $ mkdir git-repo
  $ cd git-repo
  $ git init >/dev/null 2>&1
  $ echo a > a
  $ git add a
  $ action commit -m "rev1"
  $ action tag -m "tag1" tag1
  $ cd ..

Convert without tags

  $ hg convert git-repo hg-repo --config convert.skiptags=True
  initializing destination hg-repo repository
  scanning source...
  sorting...
  converting...
  0 rev1
  updating bookmarks
  $ hg -R hg-repo tags
  tip                                0:d98c8ad3a4cf
  $ rm -rf hg-repo

Do a first conversion

  $ convertrepo
  initializing destination hg-repo repository
  scanning source...
  sorting...
  converting...
  0 rev1
  updating tags
  updating bookmarks

Simulate upstream updates after first conversion

  $ cd git-repo
  $ echo b > a
  $ git add a
  $ action commit -m "rev2"
  $ action tag -m "tag2" tag2
  $ cd ..

Perform an incremental conversion

  $ convertrepo
  scanning source...
  sorting...
  converting...
  0 rev2
  updating tags
  updating bookmarks

Print the log

  $ cd hg-repo
  $ glog
  o  3 "update tags" files: .hgtags
  |
  | o  2 "rev2" files: a
  | |
  o |  1 "update tags" files: .hgtags
   /
  o  0 "rev1" files: a
  
  $ cd ..