annotate tests/test-largefiles-small-disk.t @ 17616:9535a0dc41f2

store: implement fncache basic path encoding in C (This is not yet enabled; it will be turned on in a followup patch.) The path encoding performed by fncache is complex and (perhaps surprisingly) slow enough to negatively affect the overall performance of Mercurial. For a short path (< 120 bytes), the Python code can be reduced to a fairly tractable state machine that either determines that nothing needs to be done in a single pass, or performs the encoding in a second pass. For longer paths, we avoid the more complicated hashed encoding scheme for now, and fall back to Python. Raw performance: I measured in a repo containing 150,000 files in its tip manifest, with a median path name length of 57 bytes, and 95th percentile of 96 bytes. In this repo, the Python code takes 3.1 seconds to encode all path names, while the hybrid C-and-Python code (called from Python) takes 0.21 seconds, for a speedup of about 14. Across several other large repositories, I've measured the speedup from the C code at between 26x and 40x. For path names above 120 bytes where we must fall back to Python for hashed encoding, the speedup is about 1.7x. Thus absolute performance will depend strongly on the characteristics of a particular repository.
author Bryan O'Sullivan <bryano@fb.com>
date Tue, 18 Sep 2012 15:42:19 -0700
parents 0d91211dd12f
children c9db897d5a43
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
15571
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
1 Test how largefiles abort in case the disk runs full
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
2
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
3 $ cat > criple.py <<EOF
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
4 > import os, errno, shutil
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
5 > from mercurial import util
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
6 > #
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
7 > # this makes the original largefiles code abort:
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
8 > def copyfileobj(fsrc, fdst, length=16*1024):
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
9 > fdst.write(fsrc.read(4))
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
10 > raise IOError(errno.ENOSPC, os.strerror(errno.ENOSPC))
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
11 > shutil.copyfileobj = copyfileobj
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
12 > #
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
13 > # this makes the rewritten code abort:
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
14 > def filechunkiter(f, size=65536, limit=None):
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
15 > yield f.read(4)
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
16 > raise IOError(errno.ENOSPC, os.strerror(errno.ENOSPC))
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
17 > util.filechunkiter = filechunkiter
15572
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
18 > #
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
19 > def oslink(src, dest):
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
20 > raise OSError("no hardlinks, try copying instead")
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
21 > util.oslink = oslink
15571
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
22 > EOF
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
23
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
24 $ echo "[extensions]" >> $HGRCPATH
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
25 $ echo "largefiles =" >> $HGRCPATH
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
26
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
27 $ hg init alice
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
28 $ cd alice
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
29 $ echo "this is a very big file" > big
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
30 $ hg add --large big
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
31 $ hg commit --config extensions.criple=$TESTTMP/criple.py -m big
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
32 abort: No space left on device
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
33 [255]
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
34
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
35 The largefile is not created in .hg/largefiles:
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
36
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
37 $ ls .hg/largefiles
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
38 dirstate
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
39
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
40 The user cache is not even created:
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
41
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
42 >>> import os; os.path.exists("$HOME/.cache/largefiles/")
809788118aa2 largefiles: write .hg/largefiles/ files atomically
Martin Geisler <mg@aragost.com>
parents:
diff changeset
43 False
15572
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
44
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
45 Make the commit with space on the device:
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
46
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
47 $ hg commit -m big
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
48
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
49 Now make a clone with a full disk, and make sure lfutil.link function
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
50 makes copies instead of hardlinks:
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
51
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
52 $ cd ..
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
53 $ hg --config extensions.criple=$TESTTMP/criple.py clone --pull alice bob
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
54 requesting all changes
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
55 adding changesets
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
56 adding manifests
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
57 adding file changes
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
58 added 1 changesets with 1 changes to 1 files
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
59 updating to branch default
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
60 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
61 getting changed largefiles
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
62 abort: No space left on device
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
63 [255]
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
64
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
65 The largefile is not created in .hg/largefiles:
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
66
926bc23d0b6a largefiles: copy files into .hg/largefiles atomically
Martin Geisler <mg@aragost.com>
parents: 15571
diff changeset
67 $ ls bob/.hg/largefiles