view tests/test-convert-hg-source.t @ 23785:cb99bacb9b4e
branchcache: introduce revbranchcache for caching of revision branch names
It is expensive to retrieve the branch name of a revision. Very expensive when
creating a changectx and calling .branch() every time - slightly less when
using changelog.branchinfo().
Now, to speed things up, provide a way to cache the results on disk in an
efficient format. Each branchname is assigned a number, and for each revision
we store the number of the corresponding branch name. The branch names are
stored in a dedicated file which is strictly append only.
Branch names are usually reused across several revisions, and the total list of
branch names will thus be so small that it is feasible to read the whole set of
names before using the cache. It does, however, mean that it might be more
efficient to use the changelog for retrieving the branch info for a single
revision.
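
The name table described above can be sketched roughly as follows. This is an illustration only, not the code from the changeset: the `BranchNames` class, its method names, the file name, and the NUL separator between names are all assumptions made for the example.

```python
import os
import tempfile


class BranchNames(object):
    """Append-only table that maps branch names to small integers."""

    def __init__(self, path):
        self.path = path
        self.names = []   # number -> name
        self.index = {}   # name -> number
        if os.path.exists(path):
            # The whole file is read up front; the list of names is small.
            with open(path, 'rb') as fp:
                data = fp.read()
            if data:
                for raw in data.split(b'\0'):
                    self._remember(raw.decode('utf-8'))

    def _remember(self, name):
        self.index[name] = len(self.names)
        self.names.append(name)

    def number(self, name):
        """Return the number for name, appending it to the file if it is new."""
        if name not in self.index:
            with open(self.path, 'ab') as fp:
                # Assumed separator for this sketch: NUL between names.
                if self.names:
                    fp.write(b'\0')
                fp.write(name.encode('utf-8'))
            self._remember(name)
        return self.index[name]


# Small demonstration in a scratch directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'branch-names')
table = BranchNames(path)
first = table.number('default')
second = table.number('stable')
again = table.number('default')      # reused, nothing is appended
reloaded = BranchNames(path).names   # a fresh reader sees the same numbering
```

Because names are only ever appended, a number handed out once stays valid for as long as the file exists, which is what lets the per-revision entries store just the number.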
The revision entries are stored in another file. This file is usually append
only, but if the repository has been modified, the file will be truncated and
the relevant parts rewritten on demand.
The entries for each revision are 8 bytes each, and the whole revision file
will thus be 1/8 the size of 00changelog.i.
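
The 1/8 figure can be checked with a little arithmetic. The sketch below assumes a 64-byte changelog index entry (the revlog v1 index layout as recalled here, not something stated in this message) and an 8-byte cache record; both format strings are assumptions for illustration. If the changelog stores its data inline in 00changelog.i, that file is larger and the ratio is smaller still.

```python
import struct

# Assumed layouts for this sketch:
# a revlog v1 index entry, and an 8-byte cache record.
INDEX_ENTRY = struct.calcsize('>Qiiiiii20s12x')
CACHE_RECORD = struct.calcsize('>4sI')


def cache_size(revisions):
    """Size in bytes of the revision cache file for this many revisions."""
    return revisions * CACHE_RECORD


def index_size(revisions):
    """Size in bytes of a non-inline changelog index for this many revisions."""
    return revisions * INDEX_ENTRY
```

For a repository with 100000 revisions this gives 800000 bytes of cache against 6400000 bytes of index.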
Each revision entry contains the first 4 bytes of the corresponding node hash.
This is used as a checksum that is always verified before the entry is used.
That check is relatively expensive but it makes sure history modification is
detected and handled correctly. It will also detect and handle most revision
file corruptions.
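
A rough sketch of how such a record could be written and verified is shown below. The exact layout (4-byte node prefix followed by a 4-byte big-endian branch number) and the helper names `write_entry` and `read_entry` are assumptions; the text above only says that an entry is 8 bytes and contains the first 4 bytes of the node hash. The cache is held in a `bytearray` here to keep the example self-contained.

```python
import struct

RECORD_FORMAT = '>4sI'   # assumed: node prefix + branch number
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def write_entry(cache, rev, node, branchnum):
    """Store the entry for rev, growing the bytearray if needed."""
    start = rev * RECORD_SIZE
    end = start + RECORD_SIZE
    if len(cache) < end:
        cache.extend(b'\0' * (end - len(cache)))
    cache[start:end] = struct.pack(RECORD_FORMAT, node[:4], branchnum)


def read_entry(cache, rev, node):
    """Return the cached branch number, or None if missing or stale."""
    start = rev * RECORD_SIZE
    end = start + RECORD_SIZE
    if len(cache) < end:
        return None
    prefix, branchnum = struct.unpack(RECORD_FORMAT, bytes(cache[start:end]))
    if prefix != node[:4]:
        # The revision number now refers to a different node: history was
        # modified or the file is damaged. Truncate from here so the rest
        # is rebuilt on demand.
        del cache[start:]
        return None
    return branchnum
```

A caller that gets `None` back would fall back to the changelog for the branch name and then call `write_entry` to refill the cache.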
This is just a cache. A new format can always be introduced if other
requirements or ideas make that seem like a good idea. Rebuilding the cache is
not really more expensive than it was to run for example 'hg log -b branchname'
before this cache was introduced.
This new method is still unused but promises to make some operations several
times faster once it actually is used.
Abandoning Python 2.4 would make it possible to implement this more efficiently
by using struct classes and pack_into. The Python code could probably also be
micro-optimized, or it could be implemented very efficiently in C, where it would
be easy to control the data access.
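
For illustration, this is roughly what the `struct.Struct` and `pack_into` approach looks like on a Python version that has them. The record layout and the helper names `build_cache` and `lookup` are assumptions for the sketch, not code from this changeset.

```python
import struct

# Assumed 8-byte layout: 4-byte node prefix + 4-byte branch number.
record = struct.Struct('>4sI')


def build_cache(entries):
    """Pack (node, branchnum) pairs, given in revision order, into one buffer."""
    buf = bytearray(len(entries) * record.size)
    for rev, (node, branchnum) in enumerate(entries):
        # pack_into writes in place, so no temporary string is created
        # for each record.
        record.pack_into(buf, rev * record.size, node[:4], branchnum)
    return buf


def lookup(buf, rev):
    """Return the (node prefix, branch number) pair stored for rev."""
    return record.unpack_from(buf, rev * record.size)
```

A precompiled `Struct` parses the format string once, and `pack_into` and `unpack_from` work directly on a preallocated buffer, which is where the efficiency gain mentioned above would come from.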
author | Mads Kiilerich <madski@unity3d.com> |
---|---|
date | Thu, 08 Jan 2015 00:01:03 +0100 |
parents | a3c2d9211294 |
children | 884ef09cf658 |

  $ cat >> $HGRCPATH <<EOF
  > [extensions]
  > convert=
  > [convert]
  > hg.saverev=False
  > EOF
  $ hg init orig
  $ cd orig
  $ echo foo > foo
  $ echo bar > bar
  $ hg ci -qAm 'add foo bar' -d '0 0'
  $ echo >> foo
  $ hg ci -m 'change foo' -d '1 0'
  $ hg up -qC 0
  $ hg copy --after --force foo bar
  $ hg copy foo baz
  $ hg ci -m 'make bar and baz copies of foo' -d '2 0'
  created new head

Test that template can print all file copies (issue4362)

  $ hg log -r . --template "{file_copies % ' File: {file_copy}\n'}"
   File: bar (foo)
   File: baz (foo)

  $ hg bookmark premerge1
  $ hg merge -r 1
  merging baz and foo to baz
  1 files updated, 1 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  $ hg ci -m 'merge local copy' -d '3 0'
  $ hg up -C 1
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  (leaving bookmark premerge1)
  $ hg bookmark premerge2
  $ hg merge 2
  merging foo and baz to baz
  1 files updated, 1 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  $ hg ci -m 'merge remote copy' -d '4 0'
  created new head
#if execbit
  $ chmod +x baz
#else
  $ echo some other change to make sure we get a rev 5 > baz
#endif
  $ hg ci -m 'mark baz executable' -d '5 0'
  $ cd ..
  $ hg convert --datesort orig new 2>&1 | grep -v 'subversion python bindings could not be loaded'
  initializing destination new repository
  scanning source...
  sorting...
  converting...
  5 add foo bar
  4 change foo
  3 make bar and baz copies of foo
  2 merge local copy
  1 merge remote copy
  0 mark baz executable
  updating bookmarks
  $ cd new
  $ hg out ../orig
  comparing with ../orig
  searching for changes
  no changes found
  [1]
#if execbit
  $ hg bookmarks
     premerge1                 3:973ef48a98a4
     premerge2                 5:13d9b87cf8f8
#else
Different hash because no x bit
  $ hg bookmarks
     premerge1                 3:973ef48a98a4
     premerge2                 5:df0779bcf33c
#endif
  $ cd ..

check shamap LF and CRLF handling

  $ cat > rewrite.py <<EOF
  > import sys
  > # Interlace LF and CRLF
  > lines = [(l.rstrip() + ((i % 2) and '\n' or '\r\n'))
  >          for i, l in enumerate(file(sys.argv[1]))]
  > file(sys.argv[1], 'wb').write(''.join(lines))
  > EOF
  $ python rewrite.py new/.hg/shamap
  $ cd orig
  $ hg up -qC 1
  $ echo foo >> foo
  $ hg ci -qm 'change foo again'
  $ hg up -qC 2
  $ echo foo >> foo
  $ hg ci -qm 'change foo again again'
  $ cd ..
  $ hg convert --datesort orig new 2>&1 | grep -v 'subversion python bindings could not be loaded'
  scanning source...
  sorting...
  converting...
  1 change foo again again
  0 change foo again
  updating bookmarks

init broken repository

  $ hg init broken
  $ cd broken
  $ echo a >> a
  $ echo b >> b
  $ hg ci -qAm init
  $ echo a >> a
  $ echo b >> b
  $ hg copy b c
  $ hg ci -qAm changeall
  $ hg up -qC 0
  $ echo bc >> b
  $ hg ci -m changebagain
  created new head
  $ HGMERGE=internal:local hg -q merge
  $ hg ci -m merge
  $ hg mv b d
  $ hg ci -m moveb

break it

  $ rm .hg/store/data/b.*
  $ cd ..
  $ hg --config convert.hg.ignoreerrors=True convert broken fixed
  initializing destination fixed repository
  scanning source...
  sorting...
  converting...
  4 init
  ignoring: data/b.i@1e88685f5dde: no match found
  3 changeall
  2 changebagain
  1 merge
  0 moveb
  $ hg -R fixed verify
  checking changesets
  checking manifests
  crosschecking files in changesets and manifests
  checking files
  3 files, 5 changesets, 5 total revisions

manifest -r 0

  $ hg -R fixed manifest -r 0
  a

manifest -r tip

  $ hg -R fixed manifest -r tip
  a
  c
  d