copies-rust: tokenize all paths into integers
Copy information for each changeset tends to affect only a small number of new paths.
However, each of these paths might be handled a large number of times. Handling an
HgPathBuf (aka `Vec<u8>`) is expensive; handling an integer is cheap.
With this patch we:
- turn any input path into an integer "token" early,
- do all the internal logic using these "tokens",
- turn the "tokens" back into paths right before returning a result.
This gives us quite a significant performance boost in our slower cases.
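The core of the change is plain string interning. The following Rust sketch only illustrates the idea; the `PathInterner` type and its `intern`/`resolve` methods are invented for this example and are not the actual hg-core API:

  use std::collections::HashMap;

  /// Illustrative two-way mapping between paths and integer tokens.
  /// This is only a sketch of the idea; the real code in hg-core uses
  /// its own types (e.g. HgPathBuf) and a different API.
  struct PathInterner {
      token_of: HashMap<Vec<u8>, usize>,
      path_of: Vec<Vec<u8>>,
  }

  impl PathInterner {
      fn new() -> Self {
          PathInterner {
              token_of: HashMap::new(),
              path_of: Vec::new(),
          }
      }

      /// Turn a path into a small integer token, allocating a new token
      /// the first time a path is seen.
      fn intern(&mut self, path: &[u8]) -> usize {
          if let Some(&token) = self.token_of.get(path) {
              return token;
          }
          let token = self.path_of.len();
          self.token_of.insert(path.to_vec(), token);
          self.path_of.push(path.to_vec());
          token
      }

      /// Turn a token back into the path it stands for (only needed when
      /// building the final result).
      fn resolve(&self, token: usize) -> &[u8] {
          &self.path_of[token]
      }
  }

  fn main() {
      let mut paths = PathInterner::new();
      let src = paths.intern(b"dir/old-name.txt");
      let dst = paths.intern(b"dir/new-name.txt");
      // The internal copy-tracing logic stores and compares `src` and
      // `dst` (plain integers) instead of full path buffers.
      assert_ne!(src, dst);
      assert_eq!(paths.resolve(src), b"dir/old-name.txt");
  }

With such a scheme the hot loops only compare and hash machine-sized integers instead of whole path buffers.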
Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 6780 revs, 0.092828 s, 0.081225 s, -0.011603 s, × 0.8750, 11 µs/rev
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 43645 revs, 0.711975 s, 0.586011 s, -0.125964 s, × 0.8231, 13 µs/rev
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 11316 revs, 0.124505 s, 0.114173 s, -0.010332 s, × 0.9170, 10 µs/rev
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 5750 revs, 0.072072 s, 0.061004 s, -0.011068 s, × 0.8464, 10 µs/rev
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 66949 revs, 0.682732 s, 0.535874 s, -0.146858 s, × 0.7849, 8 µs/rev
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 153721 revs, 1.935918 s, 1.781383 s, -0.154535 s, × 0.9202, 11 µs/rev
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 204976 revs, 2.827320 s, 2.603867 s, -0.223453 s, × 0.9210, 12 µs/rev
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 97052 revs, 3.243010 s, 1.529120 s, -1.713890 s, × 0.4715, 15 µs/rev
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 363753 revs, 5.693818 s, 4.842699 s, -0.851119 s, × 0.8505, 13 µs/rev
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 362229 revs, 5.677655 s, 4.761732 s, -0.915923 s, × 0.8387, 13 µs/rev
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 359344 revs, 5.563370 s, 4.733912 s, -0.829458 s, × 0.8509, 13 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 192665 revs, 2.864099 s, 2.593410 s, -0.270689 s, × 0.9055, 13 µs/rev
mozilla-try x00000_revs_x00000_added_x0000_copies 1b661134e2ca 1ae03d022d6d : 228985 revs, 113.297287 s, 41.041198 s, -72.256089 s, × 0.3622, 179 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 382065 revs, 59.498652 s, 27.915689 s, -31.582963 s, × 0.4692, 73 µs/rev
Full timing comparison between this revision and the previous one:
Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies ad6b123de1c7 39cfcef4f463 : 1 revs, 0.000042 s, 0.000042 s, +0.000000 s, × 1.0000, 42 µs/rev
mercurial x_revs_x_added_x_copies 2b1c78674230 0c1d10351869 : 6 revs, 0.000104 s, 0.000110 s, +0.000006 s, × 1.0577, 18 µs/rev
mercurial x000_revs_x000_added_x_copies 81f8ff2a9bf2 dd3267698d84 : 1032 revs, 0.004913 s, 0.004918 s, +0.000005 s, × 1.0010, 4 µs/rev
pypy x_revs_x_added_0_copies aed021ee8ae8 099ed31b181b : 9 revs, 0.000191 s, 0.000195 s, +0.000004 s, × 1.0209, 21 µs/rev
pypy x_revs_x000_added_0_copies 4aa4e1f8e19a 359343b9ac0e : 1 revs, 0.000050 s, 0.000049 s, -0.000001 s, × 0.9800, 49 µs/rev
pypy x_revs_x_added_x_copies ac52eb7bbbb0 72e022663155 : 7 revs, 0.000112 s, 0.000112 s, +0.000000 s, × 1.0000, 16 µs/rev
pypy x_revs_x00_added_x_copies c3b14617fbd7 ace7255d9a26 : 1 revs, 0.000288 s, 0.000324 s, +0.000036 s, × 1.1250, 324 µs/rev
pypy x_revs_x000_added_x000_copies df6f7a526b60 a83dc6a2d56f : 6 revs, 0.010411 s, 0.010611 s, +0.000200 s, × 1.0192, 1768 µs/rev
pypy x000_revs_xx00_added_0_copies 89a76aede314 2f22446ff07e : 4785 revs, 0.052852 s, 0.050835 s, -0.002017 s, × 0.9618, 10 µs/rev
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 6780 revs, 0.092828 s, 0.081225 s, -0.011603 s, × 0.8750, 11 µs/rev
pypy x000_revs_x000_added_x000_copies 89a76aede314 7b3dda341c84 : 5441 revs, 0.063269 s, 0.061291 s, -0.001978 s, × 0.9687, 11 µs/rev
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 43645 revs, 0.711975 s, 0.586011 s, -0.125964 s, × 0.8231, 13 µs/rev
pypy x0000_revs_xx000_added_0_copies bf2c629d0071 4ffed77c095c : 2 revs, 0.012771 s, 0.012824 s, +0.000053 s, × 1.0042, 6412 µs/rev
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 11316 revs, 0.124505 s, 0.114173 s, -0.010332 s, × 0.9170, 10 µs/rev
netbeans x_revs_x_added_0_copies fb0955ffcbcd a01e9239f9e7 : 2 revs, 0.000082 s, 0.000085 s, +0.000003 s, × 1.0366, 42 µs/rev
netbeans x_revs_x000_added_0_copies 6f360122949f 20eb231cc7d0 : 2 revs, 0.000111 s, 0.000108 s, -0.000003 s, × 0.9730, 54 µs/rev
netbeans x_revs_x_added_x_copies 1ada3faf6fb6 5a39d12eecf4 : 3 revs, 0.000171 s, 0.000175 s, +0.000004 s, × 1.0234, 58 µs/rev
netbeans x_revs_x00_added_x_copies 35be93ba1e2c 9eec5e90c05f : 9 revs, 0.000708 s, 0.000719 s, +0.000011 s, × 1.0155, 79 µs/rev
netbeans x000_revs_xx00_added_0_copies eac3045b4fdd 51d4ae7f1290 : 1421 revs, 0.010608 s, 0.010175 s, -0.000433 s, × 0.9592, 7 µs/rev
netbeans x000_revs_x000_added_x_copies e2063d266acd 6081d72689dc : 1533 revs, 0.015635 s, 0.015569 s, -0.000066 s, × 0.9958, 10 µs/rev
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 5750 revs, 0.072072 s, 0.061004 s, -0.011068 s, × 0.8464, 10 µs/rev
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 66949 revs, 0.682732 s, 0.535874 s, -0.146858 s, × 0.7849, 8 µs/rev
mozilla-central x_revs_x_added_0_copies 3697f962bb7b 7015fcdd43a2 : 2 revs, 0.000090 s, 0.000090 s, +0.000000 s, × 1.0000, 45 µs/rev
mozilla-central x_revs_x000_added_0_copies dd390860c6c9 40d0c5bed75d : 8 revs, 0.000210 s, 0.000281 s, +0.000071 s, × 1.3381, 35 µs/rev
mozilla-central x_revs_x_added_x_copies 8d198483ae3b 14207ffc2b2f : 9 revs, 0.000182 s, 0.000187 s, +0.000005 s, × 1.0275, 20 µs/rev
mozilla-central x_revs_x00_added_x_copies 98cbc58cc6bc 446a150332c3 : 7 revs, 0.000594 s, 0.000660 s, +0.000066 s, × 1.1111, 94 µs/rev
mozilla-central x_revs_x000_added_x000_copies 3c684b4b8f68 0a5e72d1b479 : 3 revs, 0.003102 s, 0.003385 s, +0.000283 s, × 1.0912, 1128 µs/rev
mozilla-central x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 0.060234 s, 0.069812 s, +0.009578 s, × 1.1590, 11635 µs/rev
mozilla-central x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.006300 s, 0.006503 s, +0.000203 s, × 1.0322, 4 µs/rev
mozilla-central x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.004817 s, 0.004988 s, +0.000171 s, × 1.0355, 121 µs/rev
mozilla-central x000_revs_x000_added_x000_copies 7c97034feb78 4407bd0c6330 : 7839 revs, 0.065451 s, 0.063963 s, -0.001488 s, × 0.9773, 8 µs/rev
mozilla-central x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 0.026282 s, 0.026225 s, -0.000057 s, × 0.9978, 42 µs/rev
mozilla-central x0000_revs_xx000_added_x000_copies f78c615a656c 96a38b690156 : 30263 revs, 0.206873 s, 0.201377 s, -0.005496 s, × 0.9734, 6 µs/rev
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 153721 revs, 1.935918 s, 1.781383 s, -0.154535 s, × 0.9202, 11 µs/rev
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 204976 revs, 2.827320 s, 2.603867 s, -0.223453 s, × 0.9210, 12 µs/rev
mozilla-try x_revs_x_added_0_copies aaf6dde0deb8 9790f499805a : 2 revs, 0.000842 s, 0.000845 s, +0.000003 s, × 1.0036, 422 µs/rev
mozilla-try x_revs_x000_added_0_copies d8d0222927b4 5bb8ce8c7450 : 2 revs, 0.000870 s, 0.000862 s, -0.000008 s, × 0.9908, 431 µs/rev
mozilla-try x_revs_x_added_x_copies 092fcca11bdb 936255a0384a : 4 revs, 0.000165 s, 0.000161 s, -0.000004 s, × 0.9758, 40 µs/rev
mozilla-try x_revs_x00_added_x_copies b53d2fadbdb5 017afae788ec : 2 revs, 0.001145 s, 0.001163 s, +0.000018 s, × 1.0157, 581 µs/rev
mozilla-try x_revs_x000_added_x000_copies 20408ad61ce5 6f0ee96e21ad : 1 revs, 0.026500 s, 0.032414 s, +0.005914 s, × 1.2232, 32414 µs/rev
mozilla-try x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 0.059407 s, 0.070149 s, +0.010742 s, × 1.1808, 11691 µs/rev
mozilla-try x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.006325 s, 0.006526 s, +0.000201 s, × 1.0318, 4 µs/rev
mozilla-try x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.005171 s, 0.005187 s, +0.000016 s, × 1.0031, 126 µs/rev
mozilla-try x000_revs_x000_added_x000_copies 1346fd0130e4 4c65cbdabc1f : 6657 revs, 0.066837 s, 0.065047 s, -0.001790 s, × 0.9732, 9 µs/rev
mozilla-try x0000_revs_x_added_0_copies 63519bfd42ee a36a2a865d92 : 40314 revs, 0.314252 s, 0.301129 s, -0.013123 s, × 0.9582, 7 µs/rev
mozilla-try x0000_revs_x_added_x_copies 9fe69ff0762d bcabf2a78927 : 38690 revs, 0.304160 s, 0.280683 s, -0.023477 s, × 0.9228, 7 µs/rev
mozilla-try x0000_revs_xx000_added_x_copies 156f6e2674f2 4d0f2c178e66 : 8598 revs, 0.089223 s, 0.084897 s, -0.004326 s, × 0.9515, 9 µs/rev
mozilla-try x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 0.026711 s, 0.026620 s, -0.000091 s, × 0.9966, 43 µs/rev
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 97052 revs, 3.243010 s, 1.529120 s, -1.713890 s, × 0.4715, 15 µs/rev
mozilla-try x0000_revs_x0000_added_x0000_copies e928c65095ed e951f4ad123a : 52031 revs, 0.756500 s, 0.738709 s, -0.017791 s, × 0.9765, 14 µs/rev
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 363753 revs, 5.693818 s, 4.842699 s, -0.851119 s, × 0.8505, 13 µs/rev
mozilla-try x00000_revs_x00000_added_0_copies dc8a3ca7010e d16fde900c9c : 34414 revs, 0.590904 s, 0.596946 s, +0.006042 s, × 1.0102, 17 µs/rev
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 362229 revs, 5.677655 s, 4.761732 s, -0.915923 s, × 0.8387, 13 µs/rev
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 359344 revs, 5.563370 s, 4.733912 s, -0.829458 s, × 0.8509, 13 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 192665 revs, 2.864099 s, 2.593410 s, -0.270689 s, × 0.9055, 13 µs/rev
mozilla-try x00000_revs_x00000_added_x0000_copies 1b661134e2ca 1ae03d022d6d : 228985 revs, 113.297287 s, 41.041198 s, -72.256089 s, × 0.3622, 179 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 382065 revs, 59.498652 s, 27.915689 s, -31.582963 s, × 0.4692, 73 µs/rev
Full timing comparison between this revision and the filelog copy tracing.
Repo Case Source-Rev Dest-Rev # of revisions filelog sidedata Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies ad6b123de1c7 39cfcef4f463 : 1 revs, 0.000903 s, 0.000042 s, -0.000861 s, × 0.0465, 41 µs/rev
mercurial x_revs_x_added_x_copies 2b1c78674230 0c1d10351869 : 6 revs, 0.001861 s, 0.000110 s, -0.001751 s, × 0.0591, 18 µs/rev
mercurial x000_revs_x000_added_x_copies 81f8ff2a9bf2 dd3267698d84 : 1032 revs, 0.018577 s, 0.004918 s, -0.013659 s, × 0.2647, 4 µs/rev
pypy x_revs_x_added_0_copies aed021ee8ae8 099ed31b181b : 9 revs, 0.001519 s, 0.000195 s, -0.001324 s, × 0.1283, 21 µs/rev
pypy x_revs_x000_added_0_copies 4aa4e1f8e19a 359343b9ac0e : 1 revs, 0.213855 s, 0.000049 s, -0.213806 s, × 0.0002, 48 µs/rev
pypy x_revs_x_added_x_copies ac52eb7bbbb0 72e022663155 : 7 revs, 0.017022 s, 0.000112 s, -0.016910 s, × 0.0065, 15 µs/rev
pypy x_revs_x00_added_x_copies c3b14617fbd7 ace7255d9a26 : 1 revs, 0.019398 s, 0.000324 s, -0.019074 s, × 0.0167, 323 µs/rev
pypy x_revs_x000_added_x000_copies df6f7a526b60 a83dc6a2d56f : 6 revs, 0.769467 s, 0.010611 s, -0.758856 s, × 0.0137, 1768 µs/rev
pypy x000_revs_xx00_added_0_copies 89a76aede314 2f22446ff07e : 4785 revs, 1.221952 s, 0.050835 s, -1.171117 s, × 0.0416, 10 µs/rev
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 6780 revs, 1.304007 s, 0.081225 s, -1.222782 s, × 0.0622, 11 µs/rev
pypy x000_revs_x000_added_x000_copies 89a76aede314 7b3dda341c84 : 5441 revs, 1.686610 s, 0.061291 s, -1.625319 s, × 0.0363, 11 µs/rev
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 43645 revs, 0.001107 s, 0.586011 s, +0.584904 s, × 529.36, 13 µs/rev
pypy x0000_revs_xx000_added_0_copies bf2c629d0071 4ffed77c095c : 2 revs, 1.100760 s, 0.012824 s, -1.087936 s, × 0.0116, 6408 µs/rev
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 11316 revs, 1.350547 s, 0.114173 s, -1.236374 s, × 0.0845, 10 µs/rev
netbeans x_revs_x_added_0_copies fb0955ffcbcd a01e9239f9e7 : 2 revs, 0.027864 s, 0.000085 s, -0.027779 s, × 0.0030, 42 µs/rev
netbeans x_revs_x000_added_0_copies 6f360122949f 20eb231cc7d0 : 2 revs, 0.132479 s, 0.000108 s, -0.132371 s, × 0.0008, 53 µs/rev
netbeans x_revs_x_added_x_copies 1ada3faf6fb6 5a39d12eecf4 : 3 revs, 0.025405 s, 0.000175 s, -0.025230 s, × 0.0068, 58 µs/rev
netbeans x_revs_x00_added_x_copies 35be93ba1e2c 9eec5e90c05f : 9 revs, 0.053244 s, 0.000719 s, -0.052525 s, × 0.0135, 79 µs/rev
netbeans x000_revs_xx00_added_0_copies eac3045b4fdd 51d4ae7f1290 : 1421 revs, 0.038017 s, 0.010175 s, -0.027842 s, × 0.2676, 7 µs/rev
netbeans x000_revs_x000_added_x_copies e2063d266acd 6081d72689dc : 1533 revs, 0.198308 s, 0.015569 s, -0.182739 s, × 0.0785, 10 µs/rev
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 5750 revs, 0.949749 s, 0.061004 s, -0.888745 s, × 0.0642, 10 µs/rev
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 66949 revs, 3.932262 s, 0.535874 s, -3.396388 s, × 0.1362, 8 µs/rev
mozilla-central x_revs_x_added_0_copies 3697f962bb7b 7015fcdd43a2 : 2 revs, 0.024490 s, 0.000090 s, -0.024400 s, × 0.0036, 44 µs/rev
mozilla-central x_revs_x000_added_0_copies dd390860c6c9 40d0c5bed75d : 8 revs, 0.143885 s, 0.000281 s, -0.143604 s, × 0.0019, 35 µs/rev
mozilla-central x_revs_x_added_x_copies 8d198483ae3b 14207ffc2b2f : 9 revs, 0.025471 s, 0.000187 s, -0.025284 s, × 0.0073, 20 µs/rev
mozilla-central x_revs_x00_added_x_copies 98cbc58cc6bc 446a150332c3 : 7 revs, 0.086013 s, 0.000660 s, -0.085353 s, × 0.0076, 94 µs/rev
mozilla-central x_revs_x000_added_x000_copies 3c684b4b8f68 0a5e72d1b479 : 3 revs, 0.200726 s, 0.003385 s, -0.197341 s, × 0.0168, 1127 µs/rev
mozilla-central x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 2.224171 s, 0.069812 s, -2.154359 s, × 0.0313, 11633 µs/rev
mozilla-central x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.090780 s, 0.006503 s, -0.084277 s, × 0.0716, 4 µs/rev
mozilla-central x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.764805 s, 0.004988 s, -0.759817 s, × 0.0065, 121 µs/rev
mozilla-central x000_revs_x000_added_x000_copies 7c97034feb78 4407bd0c6330 : 7839 revs, 1.161405 s, 0.063963 s, -1.097442 s, × 0.0550, 8 µs/rev
mozilla-central x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 6.816186 s, 0.026225 s, -6.789961 s, × 0.0038, 42 µs/rev
mozilla-central x0000_revs_xx000_added_x000_copies f78c615a656c 96a38b690156 : 30263 revs, 3.374819 s, 0.201377 s, -3.173442 s, × 0.0596, 6 µs/rev
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 153721 revs, 16.285469 s, 1.781383 s, -14.504086 s, × 0.1093, 11 µs/rev
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 204976 revs, 21.207733 s, 2.603867 s, -18.603866 s, × 0.1227, 12 µs/rev
mozilla-try x_revs_x_added_0_copies aaf6dde0deb8 9790f499805a : 2 revs, 0.080843 s, 0.000845 s, -0.079998 s, × 0.0104, 422 µs/rev
mozilla-try x_revs_x000_added_0_copies d8d0222927b4 5bb8ce8c7450 : 2 revs, 0.511068 s, 0.000862 s, -0.510206 s, × 0.0016, 430 µs/rev
mozilla-try x_revs_x_added_x_copies 092fcca11bdb 936255a0384a : 4 revs, 0.021573 s, 0.000161 s, -0.021412 s, × 0.0074, 40 µs/rev
mozilla-try x_revs_x00_added_x_copies b53d2fadbdb5 017afae788ec : 2 revs, 0.227726 s, 0.001163 s, -0.226563 s, × 0.0051, 581 µs/rev
mozilla-try x_revs_x000_added_x000_copies 20408ad61ce5 6f0ee96e21ad : 1 revs, 1.120448 s, 0.032414 s, -1.088034 s, × 0.0289, 32381 µs/rev
mozilla-try x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 2.241713 s, 0.070149 s, -2.171564 s, × 0.0312, 11689 µs/rev
mozilla-try x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.090633 s, 0.006526 s, -0.084107 s, × 0.0720, 4 µs/rev
mozilla-try x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.770403 s, 0.005187 s, -0.765216 s, × 0.0067, 126 µs/rev
mozilla-try x000_revs_x000_added_x000_copies 1346fd0130e4 4c65cbdabc1f : 6657 revs, 1.184557 s, 0.065047 s, -1.119510 s, × 0.0549, 9 µs/rev
mozilla-try x0000_revs_x_added_0_copies 63519bfd42ee a36a2a865d92 : 40314 revs, 0.085790 s, 0.301129 s, +0.215339 s, × 3.5100, 7 µs/rev
mozilla-try x0000_revs_x_added_x_copies 9fe69ff0762d bcabf2a78927 : 38690 revs, 0.080616 s, 0.280683 s, +0.200067 s, × 3.4817, 7 µs/rev
mozilla-try x0000_revs_xx000_added_x_copies 156f6e2674f2 4d0f2c178e66 : 8598 revs, 7.712554 s, 0.084897 s, -7.627657 s, × 0.0110, 9 µs/rev
mozilla-try x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 6.937294 s, 0.026620 s, -6.910674 s, × 0.0038, 43 µs/rev
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 97052 revs, 7.712313 s, 1.529120 s, -6.183193 s, × 0.1982, 15 µs/rev
mozilla-try x0000_revs_x0000_added_x0000_copies e928c65095ed e951f4ad123a : 52031 revs, 9.966910 s, 0.738709 s, -9.228201 s, × 0.0741, 14 µs/rev
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 363753 revs, 0.090397 s, 4.842699 s, +4.752302 s, × 53.571, 13 µs/rev
mozilla-try x00000_revs_x00000_added_0_copies dc8a3ca7010e d16fde900c9c : 34414 revs, 27.817167 s, 0.596946 s, -27.220221 s, × 0.0214, 17 µs/rev
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 362229 revs, 0.091305 s, 4.761732 s, +4.670427 s, × 52.151, 13 µs/rev
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 359344 revs, 0.231183 s, 4.733912 s, +4.502729 s, × 20.476, 13 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 192665 revs, 19.830617 s, 2.593410 s, -17.237207 s, × 0.1307, 13 µs/rev
mozilla-try x00000_revs_x00000_added_x0000_copies 1b661134e2ca 1ae03d022d6d : 228985 revs, 21.743873 s, 41.041198 s, +19.297325 s, × 1.8874, 179 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 382065 revs, 25.935037 s, 27.915689 s, +1.980652 s, × 1.0763, 73 µs/rev
Differential Revision: https://phab.mercurial-scm.org/D9493
# branchmap.py - logic to compute, maintain and store branchmap for local repo
#
# Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import
import struct
from .node import (
bin,
hex,
nullid,
nullrev,
)
from . import (
encoding,
error,
pycompat,
scmutil,
util,
)
from .utils import (
repoviewutil,
stringutil,
)
if pycompat.TYPE_CHECKING:
from typing import (
Any,
Callable,
Dict,
Iterable,
List,
Optional,
Set,
Tuple,
Union,
)
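    # reference the imported names so that import checkers such as pyflakes
    # consider them used; they only appear in type comments otherwise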
assert any(
(
Any,
Callable,
Dict,
Iterable,
List,
Optional,
Set,
Tuple,
Union,
)
)
subsettable = repoviewutil.subsettable
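# short module-level aliases for the struct helpers used by the revision
# branch cache code below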
calcsize = struct.calcsize
pack_into = struct.pack_into
unpack_from = struct.unpack_from
class BranchMapCache(object):
"""mapping of filtered views of repo with their branchcache"""
def __init__(self):
self._per_filter = {}
def __getitem__(self, repo):
self.updatecache(repo)
return self._per_filter[repo.filtername]
def updatecache(self, repo):
"""Update the cache for the given filtered view on a repository"""
# This can trigger updates for the caches for subsets of the filtered
# view, e.g. when there is no cache for this filtered view or the cache
# is stale.
cl = repo.changelog
filtername = repo.filtername
bcache = self._per_filter.get(filtername)
if bcache is None or not bcache.validfor(repo):
# cache object missing or cache object stale? Read from disk
bcache = branchcache.fromfile(repo)
revs = []
if bcache is None:
# no (fresh) cache available anymore, perhaps we can re-use
# the cache for a subset, then extend that to add info on missing
# revisions.
subsetname = subsettable.get(filtername)
if subsetname is not None:
subset = repo.filtered(subsetname)
bcache = self[subset].copy()
extrarevs = subset.changelog.filteredrevs - cl.filteredrevs
revs.extend(r for r in extrarevs if r <= bcache.tiprev)
else:
# nothing to fall back on, start empty.
bcache = branchcache()
revs.extend(cl.revs(start=bcache.tiprev + 1))
if revs:
bcache.update(repo, revs)
assert bcache.validfor(repo), filtername
self._per_filter[repo.filtername] = bcache
def replace(self, repo, remotebranchmap):
"""Replace the branchmap cache for a repo with a branch mapping.
This is likely only called during clone with a branch map from a
remote.
"""
cl = repo.changelog
clrev = cl.rev
clbranchinfo = cl.branchinfo
rbheads = []
closed = set()
for bheads in pycompat.itervalues(remotebranchmap):
rbheads += bheads
for h in bheads:
r = clrev(h)
b, c = clbranchinfo(r)
if c:
closed.add(h)
if rbheads:
rtiprev = max((int(clrev(node)) for node in rbheads))
cache = branchcache(
remotebranchmap,
repo[rtiprev].node(),
rtiprev,
closednodes=closed,
)
# Try to stick it as low as possible
            # filters above served are unlikely to be fetched from a clone
for candidate in (b'base', b'immutable', b'served'):
rview = repo.filtered(candidate)
if cache.validfor(rview):
self._per_filter[candidate] = cache
cache.write(rview)
return
def clear(self):
self._per_filter.clear()
def _unknownnode(node):
"""raises ValueError when branchcache found a node which does not exists"""
raise ValueError('node %s does not exist' % pycompat.sysstr(hex(node)))
def _branchcachedesc(repo):
if repo.filtername is not None:
return b'branch cache (%s)' % repo.filtername
else:
return b'branch cache'
class branchcache(object):
"""A dict like object that hold branches heads cache.
This cache is used to avoid costly computations to determine all the
branch heads of a repo.
The cache is serialized on disk in the following format:
<tip hex node> <tip rev number> [optional filtered repo hex hash]
<branch head hex node> <open/closed state> <branch name>
<branch head hex node> <open/closed state> <branch name>
...
The first line is used to check if the cache is still valid. If the
branch cache is for a filtered repo view, an optional third hash is
included that hashes the hashes of all filtered revisions.
The open/closed state is represented by a single letter 'o' or 'c'.
This field can be used to avoid changelog reads when determining if a
branch head closes a branch or not.
"""
def __init__(
self,
entries=(),
tipnode=nullid,
tiprev=nullrev,
filteredhash=None,
closednodes=None,
hasnode=None,
):
# type: (Union[Dict[bytes, List[bytes]], Iterable[Tuple[bytes, List[bytes]]]], bytes, int, Optional[bytes], Optional[Set[bytes]], Optional[Callable[[bytes], bool]]) -> None
"""hasnode is a function which can be used to verify whether changelog
has a given node or not. If it's not provided, we assume that every node
we have exists in changelog"""
self.tipnode = tipnode
self.tiprev = tiprev
self.filteredhash = filteredhash
# closednodes is a set of nodes that close their branch. If the branch
# cache has been updated, it may contain nodes that are no longer
# heads.
if closednodes is None:
self._closednodes = set()
else:
self._closednodes = closednodes
self._entries = dict(entries)
# whether closed nodes are verified or not
self._closedverified = False
# branches for which nodes are verified
self._verifiedbranches = set()
self._hasnode = hasnode
if self._hasnode is None:
self._hasnode = lambda x: True
def _verifyclosed(self):
""" verify the closed nodes we have """
if self._closedverified:
return
for node in self._closednodes:
if not self._hasnode(node):
_unknownnode(node)
self._closedverified = True
def _verifybranch(self, branch):
""" verify head nodes for the given branch. """
if branch not in self._entries or branch in self._verifiedbranches:
return
for n in self._entries[branch]:
if not self._hasnode(n):
_unknownnode(n)
self._verifiedbranches.add(branch)
def _verifyall(self):
""" verifies nodes of all the branches """
needverification = set(self._entries.keys()) - self._verifiedbranches
for b in needverification:
self._verifybranch(b)
def __iter__(self):
return iter(self._entries)
def __setitem__(self, key, value):
self._entries[key] = value
def __getitem__(self, key):
self._verifybranch(key)
return self._entries[key]
def __contains__(self, key):
self._verifybranch(key)
return key in self._entries
def iteritems(self):
for k, v in pycompat.iteritems(self._entries):
self._verifybranch(k)
yield k, v
items = iteritems
def hasbranch(self, label):
""" checks whether a branch of this name exists or not """
self._verifybranch(label)
return label in self._entries
@classmethod
def fromfile(cls, repo):
f = None
try:
f = repo.cachevfs(cls._filename(repo))
lineiter = iter(f)
cachekey = next(lineiter).rstrip(b'\n').split(b" ", 2)
last, lrev = cachekey[:2]
last, lrev = bin(last), int(lrev)
filteredhash = None
hasnode = repo.changelog.hasnode
if len(cachekey) > 2:
filteredhash = bin(cachekey[2])
bcache = cls(
tipnode=last,
tiprev=lrev,
filteredhash=filteredhash,
hasnode=hasnode,
)
if not bcache.validfor(repo):
# invalidate the cache
raise ValueError('tip differs')
bcache.load(repo, lineiter)
except (IOError, OSError):
return None
except Exception as inst:
if repo.ui.debugflag:
msg = b'invalid %s: %s\n'
repo.ui.debug(
msg
% (
_branchcachedesc(repo),
pycompat.bytestr(
inst
), # pytype: disable=wrong-arg-types
)
)
bcache = None
finally:
if f:
f.close()
return bcache
def load(self, repo, lineiter):
"""fully loads the branchcache by reading from the file using the line
iterator passed"""
for line in lineiter:
line = line.rstrip(b'\n')
if not line:
continue
node, state, label = line.split(b" ", 2)
if state not in b'oc':
raise ValueError('invalid branch state')
label = encoding.tolocal(label.strip())
node = bin(node)
self._entries.setdefault(label, []).append(node)
if state == b'c':
self._closednodes.add(node)
@staticmethod
def _filename(repo):
"""name of a branchcache file for a given repo or repoview"""
filename = b"branch2"
if repo.filtername:
filename = b'%s-%s' % (filename, repo.filtername)
return filename
def validfor(self, repo):
"""Is the cache content valid regarding a repo
- False when cached tipnode is unknown or if we detect a strip.
- True when cache is up to date or a subset of current repo."""
try:
return (self.tipnode == repo.changelog.node(self.tiprev)) and (
self.filteredhash == scmutil.filteredhash(repo, self.tiprev)
)
except IndexError:
return False
def _branchtip(self, heads):
"""Return tuple with last open head in heads and false,
otherwise return last closed head and true."""
tip = heads[-1]
closed = True
for h in reversed(heads):
if h not in self._closednodes:
tip = h
closed = False
break
return tip, closed
def branchtip(self, branch):
"""Return the tipmost open head on branch head, otherwise return the
tipmost closed head on branch.
Raise KeyError for unknown branch."""
return self._branchtip(self[branch])[0]
def iteropen(self, nodes):
return (n for n in nodes if n not in self._closednodes)
def branchheads(self, branch, closed=False):
self._verifybranch(branch)
heads = self._entries[branch]
if not closed:
heads = list(self.iteropen(heads))
return heads
def iterbranches(self):
for bn, heads in pycompat.iteritems(self):
yield (bn, heads) + self._branchtip(heads)
def iterheads(self):
""" returns all the heads """
self._verifyall()
return pycompat.itervalues(self._entries)
def copy(self):
"""return an deep copy of the branchcache object"""
return type(self)(
self._entries,
self.tipnode,
self.tiprev,
self.filteredhash,
self._closednodes,
)
def write(self, repo):
try:
f = repo.cachevfs(self._filename(repo), b"w", atomictemp=True)
cachekey = [hex(self.tipnode), b'%d' % self.tiprev]
if self.filteredhash is not None:
cachekey.append(hex(self.filteredhash))
f.write(b" ".join(cachekey) + b'\n')
nodecount = 0
for label, nodes in sorted(pycompat.iteritems(self._entries)):
label = encoding.fromlocal(label)
for node in nodes:
nodecount += 1
if node in self._closednodes:
state = b'c'
else:
state = b'o'
f.write(b"%s %s %s\n" % (hex(node), state, label))
f.close()
repo.ui.log(
b'branchcache',
b'wrote %s with %d labels and %d nodes\n',
_branchcachedesc(repo),
len(self._entries),
nodecount,
)
except (IOError, OSError, error.Abort) as inst:
# Abort may be raised by read only opener, so log and continue
repo.ui.debug(
b"couldn't write branch cache: %s\n"
% stringutil.forcebytestr(inst)
)
def update(self, repo, revgen):
"""Given a branchhead cache, self, that may have extra nodes or be
missing heads, and a generator of nodes that are strictly a superset of
heads missing, this function updates self to be correct.
"""
starttime = util.timer()
cl = repo.changelog
# collect new branch entries
newbranches = {}
getbranchinfo = repo.revbranchcache().branchinfo
for r in revgen:
branch, closesbranch = getbranchinfo(r)
newbranches.setdefault(branch, []).append(r)
if closesbranch:
self._closednodes.add(cl.node(r))
# fetch current topological heads to speed up filtering
topoheads = set(cl.headrevs())
# new tip revision which we found after iterating items from new
# branches
ntiprev = self.tiprev
# if older branchheads are reachable from new ones, they aren't
# really branchheads. Note checking parents is insufficient:
# 1 (branch a) -> 2 (branch b) -> 3 (branch a)
for branch, newheadrevs in pycompat.iteritems(newbranches):
bheads = self._entries.setdefault(branch, [])
bheadset = {cl.rev(node) for node in bheads}
            # This has been tested True on all internal usages of this function.
# run it again in case of doubt
# assert not (set(bheadrevs) & set(newheadrevs))
bheadset.update(newheadrevs)
# This prunes out two kinds of heads - heads that are superseded by
# a head in newheadrevs, and newheadrevs that are not heads because
# an existing head is their descendant.
uncertain = bheadset - topoheads
if uncertain:
floorrev = min(uncertain)
ancestors = set(cl.ancestors(newheadrevs, floorrev))
bheadset -= ancestors
bheadrevs = sorted(bheadset)
self[branch] = [cl.node(rev) for rev in bheadrevs]
tiprev = bheadrevs[-1]
if tiprev > ntiprev:
ntiprev = tiprev
if ntiprev > self.tiprev:
self.tiprev = ntiprev
self.tipnode = cl.node(ntiprev)
if not self.validfor(repo):
            # the cache key is not valid anymore
self.tipnode = nullid
self.tiprev = nullrev
for heads in self.iterheads():
tiprev = max(cl.rev(node) for node in heads)
if tiprev > self.tiprev:
self.tipnode = cl.node(tiprev)
self.tiprev = tiprev
self.filteredhash = scmutil.filteredhash(repo, self.tiprev)
duration = util.timer() - starttime
repo.ui.log(
b'branchcache',
b'updated %s in %.4f seconds\n',
_branchcachedesc(repo),
duration,
)
self.write(repo)
class remotebranchcache(branchcache):
"""Branchmap info for a remote connection, should not write locally"""
def write(self, repo):
pass
# Revision branch info cache
_rbcversion = b'-v1'
_rbcnames = b'rbc-names' + _rbcversion
_rbcrevs = b'rbc-revs' + _rbcversion
# [4 byte hash prefix][4 byte branch name number with sign bit indicating open]
_rbcrecfmt = b'>4sI'
_rbcrecsize = calcsize(_rbcrecfmt)
_rbcnodelen = 4
_rbcbranchidxmask = 0x7FFFFFFF
_rbccloseflag = 0x80000000
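# A full record is thus 8 bytes: the first _rbcnodelen bytes of the node hash,
# followed by a big-endian 32 bit integer whose low 31 bits
# (_rbcbranchidxmask) index into rbc-names and whose high bit (_rbccloseflag)
# marks a branch-closing changeset.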
class revbranchcache(object):
"""Persistent cache, mapping from revision number to branch name and close.
This is a low level cache, independent of filtering.
Branch names are stored in rbc-names in internal encoding separated by 0.
rbc-names is append-only, and each branch name is only stored once and will
thus have a unique index.
The branch info for each revision is stored in rbc-revs as constant size
records. The whole file is read into memory, but it is only 'parsed' on
demand. The file is usually append-only but will be truncated if repo
modification is detected.
The record for each revision contains the first 4 bytes of the
corresponding node hash, and the record is only used if it still matches.
    Even a completely trashed rbc-revs file will thus still give the right result
while converging towards full recovery ... assuming no incorrectly matching
node hashes.
    The record also contains 4 bytes where 31 bits contain the index of the
    branch and the last bit indicates that it is a branch close commit.
The usage pattern for rbc-revs is thus somewhat similar to 00changelog.i
and will grow with it but be 1/8th of its size.
"""
def __init__(self, repo, readonly=True):
assert repo.filtername is None
self._repo = repo
self._names = [] # branch names in local encoding with static index
self._rbcrevs = bytearray()
self._rbcsnameslen = 0 # length of names read at _rbcsnameslen
try:
bndata = repo.cachevfs.read(_rbcnames)
self._rbcsnameslen = len(bndata) # for verification before writing
if bndata:
self._names = [
encoding.tolocal(bn) for bn in bndata.split(b'\0')
]
except (IOError, OSError):
if readonly:
# don't try to use cache - fall back to the slow path
self.branchinfo = self._branchinfo
if self._names:
try:
data = repo.cachevfs.read(_rbcrevs)
self._rbcrevs[:] = data
except (IOError, OSError) as inst:
repo.ui.debug(
b"couldn't read revision branch cache: %s\n"
% stringutil.forcebytestr(inst)
)
# remember number of good records on disk
self._rbcrevslen = min(
len(self._rbcrevs) // _rbcrecsize, len(repo.changelog)
)
if self._rbcrevslen == 0:
self._names = []
self._rbcnamescount = len(self._names) # number of names read at
# _rbcsnameslen
def _clear(self):
self._rbcsnameslen = 0
del self._names[:]
self._rbcnamescount = 0
self._rbcrevslen = len(self._repo.changelog)
self._rbcrevs = bytearray(self._rbcrevslen * _rbcrecsize)
util.clearcachedproperty(self, b'_namesreverse')
@util.propertycache
def _namesreverse(self):
return {b: r for r, b in enumerate(self._names)}
def branchinfo(self, rev):
"""Return branch name and close flag for rev, using and updating
persistent cache."""
changelog = self._repo.changelog
rbcrevidx = rev * _rbcrecsize
# avoid negative index, changelog.read(nullrev) is fast without cache
if rev == nullrev:
return changelog.branchinfo(rev)
# if requested rev isn't allocated, grow and cache the rev info
if len(self._rbcrevs) < rbcrevidx + _rbcrecsize:
return self._branchinfo(rev)
# fast path: extract data from cache, use it if node is matching
reponode = changelog.node(rev)[:_rbcnodelen]
cachenode, branchidx = unpack_from(
_rbcrecfmt, util.buffer(self._rbcrevs), rbcrevidx
)
close = bool(branchidx & _rbccloseflag)
if close:
branchidx &= _rbcbranchidxmask
if cachenode == b'\0\0\0\0':
pass
elif cachenode == reponode:
try:
return self._names[branchidx], close
except IndexError:
# recover from invalid reference to unknown branch
self._repo.ui.debug(
b"referenced branch names not found"
b" - rebuilding revision branch cache from scratch\n"
)
self._clear()
else:
# rev/node map has changed, invalidate the cache from here up
self._repo.ui.debug(
b"history modification detected - truncating "
b"revision branch cache to revision %d\n" % rev
)
truncate = rbcrevidx + _rbcrecsize
del self._rbcrevs[truncate:]
self._rbcrevslen = min(self._rbcrevslen, truncate)
# fall back to slow path and make sure it will be written to disk
return self._branchinfo(rev)
def _branchinfo(self, rev):
"""Retrieve branch info from changelog and update _rbcrevs"""
changelog = self._repo.changelog
b, close = changelog.branchinfo(rev)
if b in self._namesreverse:
branchidx = self._namesreverse[b]
else:
branchidx = len(self._names)
self._names.append(b)
self._namesreverse[b] = branchidx
reponode = changelog.node(rev)
if close:
branchidx |= _rbccloseflag
self._setcachedata(rev, reponode, branchidx)
return b, close
def setdata(self, branch, rev, node, close):
"""add new data information to the cache"""
if branch in self._namesreverse:
branchidx = self._namesreverse[branch]
else:
branchidx = len(self._names)
self._names.append(branch)
self._namesreverse[branch] = branchidx
if close:
branchidx |= _rbccloseflag
self._setcachedata(rev, node, branchidx)
        # If no cache data were readable (none exists, bad permission, etc.)
# the cache was bypassing itself by setting:
#
# self.branchinfo = self._branchinfo
#
# Since we now have data in the cache, we need to drop this bypassing.
if 'branchinfo' in vars(self):
del self.branchinfo
def _setcachedata(self, rev, node, branchidx):
"""Writes the node's branch data to the in-memory cache data."""
if rev == nullrev:
return
rbcrevidx = rev * _rbcrecsize
if len(self._rbcrevs) < rbcrevidx + _rbcrecsize:
self._rbcrevs.extend(
b'\0'
* (len(self._repo.changelog) * _rbcrecsize - len(self._rbcrevs))
)
pack_into(_rbcrecfmt, self._rbcrevs, rbcrevidx, node, branchidx)
self._rbcrevslen = min(self._rbcrevslen, rev)
tr = self._repo.currenttransaction()
if tr:
tr.addfinalize(b'write-revbranchcache', self.write)
def write(self, tr=None):
"""Save branch cache if it is dirty."""
repo = self._repo
wlock = None
step = b''
try:
# write the new names
if self._rbcnamescount < len(self._names):
wlock = repo.wlock(wait=False)
step = b' names'
self._writenames(repo)
# write the new revs
start = self._rbcrevslen * _rbcrecsize
if start != len(self._rbcrevs):
step = b''
if wlock is None:
wlock = repo.wlock(wait=False)
self._writerevs(repo, start)
except (IOError, OSError, error.Abort, error.LockError) as inst:
repo.ui.debug(
b"couldn't write revision branch cache%s: %s\n"
% (step, stringutil.forcebytestr(inst))
)
finally:
if wlock is not None:
wlock.release()
def _writenames(self, repo):
""" write the new branch names to revbranchcache """
if self._rbcnamescount != 0:
f = repo.cachevfs.open(_rbcnames, b'ab')
if f.tell() == self._rbcsnameslen:
f.write(b'\0')
else:
f.close()
repo.ui.debug(b"%s changed - rewriting it\n" % _rbcnames)
self._rbcnamescount = 0
self._rbcrevslen = 0
if self._rbcnamescount == 0:
# before rewriting names, make sure references are removed
repo.cachevfs.unlinkpath(_rbcrevs, ignoremissing=True)
f = repo.cachevfs.open(_rbcnames, b'wb')
f.write(
b'\0'.join(
encoding.fromlocal(b)
for b in self._names[self._rbcnamescount :]
)
)
self._rbcsnameslen = f.tell()
f.close()
self._rbcnamescount = len(self._names)
def _writerevs(self, repo, start):
""" write the new revs to revbranchcache """
revs = min(len(repo.changelog), len(self._rbcrevs) // _rbcrecsize)
with repo.cachevfs.open(_rbcrevs, b'ab') as f:
if f.tell() != start:
repo.ui.debug(
b"truncating cache/%s to %d\n" % (_rbcrevs, start)
)
f.seek(start)
if f.tell() != start:
start = 0
f.seek(start)
f.truncate()
end = revs * _rbcrecsize
f.write(self._rbcrevs[start:end])
self._rbcrevslen = revs