Mercurial > hg
view tests/test-issue2137.t @ 31210:e1d035905b2e
similar: compare between actual file contents for exact identity
Before this patch, similarity detection logic (for addremove and
automv) depends entirely on SHA-1 digesting. But this causes incorrect
rename detection, if:
- removing file A and adding file B occur at same committing, and
- SHA-1 hash values of file A and B are same
This may prevent security experts from managing sample files for
SHAttered issue in Mercurial repository, for example.
https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
https://shattered.it/
Hash collision itself isn't so serious for core repository
functionality of Mercurial, described by mpm as below, though.
https://www.mercurial-scm.org/wiki/mpm/SHA1
This patch compares between actual file contents after hash comparison
for exact identity.
Even after this patch, SHA-1 is still used, because it is reasonable
enough to quickly detect existence of "(almost) same" file.
- replacing SHA-1 causes decreasing performance, and
- replacement of it has ambiguity, yet
Getting content of removed file (= rfctx.data()) at each exact
comparison should be cheap enough, even though getting content of
added one costs much.
======= ============== =====================
file fctx data() reads from
======= ============== =====================
removed filectx in-memory revlog data
added workingfilectx storage
======= ============== =====================
author | FUJIWARA Katsunori <foozy@lares.dti.ne.jp> |
---|---|
date | Fri, 03 Mar 2017 02:57:06 +0900 |
parents | 2fc86d92c4a9 |
children | c4ccc73f9d49 |
line wrap: on
line source
https://bz.mercurial-scm.org/2137 Setup: create a little extension that has 3 side-effects: 1) ensure changelog data is not inlined 2) make revlog to use lazyparser 3) test that repo.lookup() works 1 and 2 are preconditions for the bug; 3 is the bug. $ cat > commitwrapper.py <<EOF > from mercurial import extensions, node, revlog > > def reposetup(ui, repo): > class wraprepo(repo.__class__): > def commit(self, *args, **kwargs): > result = super(wraprepo, self).commit(*args, **kwargs) > tip1 = node.short(repo.changelog.tip()) > tip2 = node.short(repo.lookup(tip1)) > assert tip1 == tip2 > ui.write('new tip: %s\n' % tip1) > return result > repo.__class__ = wraprepo > > def extsetup(ui): > revlog._maxinline = 8 # split out 00changelog.d early > revlog._prereadsize = 8 # use revlog.lazyparser > EOF $ cat >> $HGRCPATH <<EOF > [extensions] > commitwrapper = `pwd`/commitwrapper.py > EOF $ hg init repo1 $ cd repo1 $ echo a > a $ hg commit -A -m'add a with a long commit message to make the changelog a bit bigger' adding a new tip: 553596fad57b Test that new changesets are visible to repo.lookup(): $ echo a >> a $ hg commit -m'one more commit to demonstrate the bug' new tip: 799ae3599e0e $ hg tip changeset: 1:799ae3599e0e tag: tip user: test date: Thu Jan 01 00:00:00 1970 +0000 summary: one more commit to demonstrate the bug $ cd ..