view tests/test-tools.t @ 38732:be4984261611

merge: mark file gets as not thread safe (issue5933)

In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory.

My measurements have shown that with revlog-based repositories,
Mercurial spends a lot of CPU time in revlog code resolving file data.
This ends up incurring a lot of context switching across threads and
slows down `hg update` operations when going from an empty working
directory to the tip of the repo.

On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

before:  487s wall
after:   360s wall (equivalent to worker.enabled=false)
cpus=2:  379s wall

Even with only 2 threads, the thread pool is still slower.

The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:

1) Removal of files from the working directory is I/O - not CPU - bound
   and should benefit from a thread pool (unless I/O is insanely fast
   and the GIL release is near instantaneous). So tests like
   `hg sparse --enable-profile` may exercise deletion throughput and
   aren't good benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
   likely measured against a repository using remotefilelog. And I
   believe that revision retrieval during working directory updates
   with remotefilelog will often use a remote store, thus being I/O and
   not CPU bound. This probably resulted in an overstated performance
   gain.

Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to
formalize a boolean flag for this option and because this attribute is
best captured against the store implementation. But we don't have a
proper store API for this yet. I'd rather cross this bridge later.

It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more proper algorithm for whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.

Differential Revision: https://phab.mercurial-scm.org/D3963
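
The CPU-bound versus I/O-bound reasoning in the commit message can be tried out with a small standalone sketch. It is not Mercurial's worker code: `cpu_bound_task`, `run_serial`, `run_threaded` and `timed` are hypothetical names invented for illustration, and the sketch assumes a standard CPython build with the GIL. It runs the same pure-Python workload serially and through a thread pool so the two wall times can be compared on a given machine; no particular timing is claimed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound_task(seed, rounds=200_000):
    """Hypothetical stand-in for CPU-heavy work (not Mercurial code)."""
    total = seed
    for i in range(rounds):
        # Pure-Python arithmetic only runs while holding the GIL,
        # so threads executing this loop have to take turns.
        total = (total * 31 + i) % 1_000_003
    return total


def run_serial(seeds):
    """Process every item in the calling thread."""
    return [cpu_bound_task(s) for s in seeds]


def run_threaded(seeds, workers=4):
    """Process the same items through a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results are comparable.
        return list(pool.map(cpu_bound_task, seeds))


def timed(func, *args):
    """Return (result, elapsed wall-clock seconds) for one call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seeds = list(range(16))
    serial_result, serial_secs = timed(run_serial, seeds)
    threaded_result, threaded_secs = timed(run_threaded, seeds)
    assert serial_result == threaded_result
    print("serial:   %.3fs" % serial_secs)
    print("threaded: %.3fs" % threaded_secs)
```

Replacing the loop body with blocking I/O (for example file deletion) would be the natural way to probe the other half of the argument, since blocking calls release the GIL while they wait.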
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 18 Jul 2018 09:49:34 -0700
parents c1f7037c2ded
children 5abc47d4ca6b

Tests of the file helper tool

  $ f -h
  ?sage: f [options] [filenames] (glob)
  
  ?ptions: (glob)
    -h, --help            show this help message and exit
    -t, --type            show file type (file or directory)
    -m, --mode            show file mode
    -l, --links           show number of links
    -s, --size            show size of file
    -n NEWER, --newer=NEWER
                          check if file is newer (or same)
    -r, --recurse         recurse into directories
    -S, --sha1            show sha1 hash of the content
    --sha256              show sha256 hash of the content
    -M, --md5             show md5 hash of the content
    -D, --dump            dump file content
    -H, --hexdump         hexdump file content
    -B BYTES, --bytes=BYTES
                          number of characters to dump
    -L LINES, --lines=LINES
                          number of lines to dump
    -q, --quiet           no default output

  $ mkdir dir
  $ cd dir

  $ f --size
  size=0

  $ echo hello | f --md5 --size
  size=6, md5=b1946ac92492d2347c6235b4d2611184

  $ f foo
  foo: file not found

  $ echo foo > foo
  $ f foo
  foo:

  $ f --sha1 foo
  foo: sha1=f1d2d2f924e986ac86fdf7b36c94bcdf32beec15

  $ f --sha256 foo
  foo: sha256=b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c

#if symlink
  $ f foo --mode
  foo: mode=644
#endif

#if no-windows
  $ $PYTHON $TESTDIR/seq.py 10 > bar
#else
Convert CRLF -> LF for consistency
  $ $PYTHON $TESTDIR/seq.py 10 | sed "s/$//" > bar
#endif

#if unix-permissions symlink
  $ chmod +x bar
  $ f bar --newer foo --mode --type --size --dump --links --bytes 7
  bar: file, size=21, mode=755, links=1, newer than foo
  >>>
  1
  2
  3
  4
  <<< no trailing newline
#endif

#if unix-permissions
  $ ln bar baz
  $ f bar -n baz -l --hexdump -t --sha1 --lines=9 -B 20
  bar: file, links=2, newer than baz, sha1=612ca68d0305c821750a
  0000: 31 0a 32 0a 33 0a 34 0a 35 0a 36 0a 37 0a 38 0a |1.2.3.4.5.6.7.8.|
  0010: 39 0a                                           |9.|
  $ rm baz
#endif

#if unix-permissions symlink
  $ ln -s yadda l
  $ f . --recurse -MStmsB4
  .: directory with 3 files, mode=755
  ./bar: file, size=21, mode=755, md5=3b03, sha1=612c
  ./foo: file, size=4, mode=644, md5=d3b0, sha1=f1d2
  ./l: link, size=5, md5=2faa, sha1=af93
#endif

  $ f --quiet bar -DL 3
  1
  2
  3

  $ cd ..

Yadda is a symlink
  $ f -qr dir -HB 17
  dir: directory with 3 files (symlink !)
  dir: directory with 2 files (no-symlink !)
  dir/bar:
  0000: 31 0a 32 0a 33 0a 34 0a 35 0a 36 0a 37 0a 38 0a |1.2.3.4.5.6.7.8.|
  0010: 39                                              |9|
  dir/foo:
  0000: 66 6f 6f 0a                                     |foo.|
  dir/l: (symlink !)
  0000: 79 61 64 64 61                                  |yadda| (symlink !)