match: make visitdir() deal with non-recursive entries
Primarily as an optimization to avoid recursing into directories that will
never have a match inside, this classifies each matcher pattern's root as
recursive or non-recursive (erring on the side of keeping it recursive,
which may lead to wasteful directory or manifest walks that yield no matches).
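As a rough illustration of the classification described above, here is a
minimal sketch. It is not the code in this patch: the helper names
(parents, classify_roots), the (kind, root) pair representation, and the
plain True/False return value are all assumptions made for the example,
and the repository root directory is not handled.

```python
def parents(path):
    """Yield every proper ancestor of a slash-separated path, nearest first."""
    while "/" in path:
        path = path.rsplit("/", 1)[0]
        yield path


def classify_roots(patterns):
    """Split hypothetical (kind, root) pairs into two sets of roots."""
    recursive, nonrecursive = set(), set()
    for kind, root in patterns:
        if kind == "rootfilesin":
            # Only files directly inside root can match.
            nonrecursive.add(root)
        else:
            # When unsure, keep the root recursive (the safe default).
            recursive.add(root)
    return recursive, nonrecursive


def visitdir(directory, recursive, nonrecursive):
    """Return True if a walk should descend into directory."""
    roots = recursive | nonrecursive
    # The directory is itself a root, so files directly in it may match.
    if directory in roots:
        return True
    # The directory lies somewhere below a recursive root.
    if any(p in recursive for p in parents(directory)):
        return True
    # The directory is an ancestor of a root, so the walk must pass through it.
    return any(directory in parents(root) for root in roots)
```

With a single "rootfilesin:browser" pattern, this sketch visits "browser"
but skips "browser/base", which is where the savings come from.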
I measured the performance of "rootfilesin" in two repos:
- The Firefox repo with tree manifests, with
"hg files -r . -I rootfilesin:browser".
The browser directory contains about 3K files across 249 subdirectories.
- A specific Google-internal directory which contains 75K files across 19K
subdirectories, with "hg files -r . -I rootfilesin:REDACTED".
I tested with both cold and warm disk caches. Cold cache was produced by
running "sync; echo 3 > /proc/sys/vm/drop_caches". Warm cache was produced
by re-running the same command a few times.
These were the results:
               Cold cache           Warm cache
               Before    After      Before    After
firefox        0m5.1s    0m2.18s    0m0.22s   0m0.14s
google3 dir    2m3.9s    0m1.57s    0m8.12s   0m0.16s
Certain extensions, notably narrowhg, can depend on this for correctness:
they must not try to recurse into directories for which they have no
information.
DOCUMENT_ROOT="/var/www/hg"; export DOCUMENT_ROOT
GATEWAY_INTERFACE="CGI/1.1"; export GATEWAY_INTERFACE
HTTP_ACCEPT="text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"; export HTTP_ACCEPT
HTTP_ACCEPT_CHARSET="ISO-8859-1,utf-8;q=0.7,*;q=0.7"; export HTTP_ACCEPT_CHARSET
HTTP_ACCEPT_ENCODING="gzip,deflate"; export HTTP_ACCEPT_ENCODING
HTTP_ACCEPT_LANGUAGE="en-us,en;q=0.5"; export HTTP_ACCEPT_LANGUAGE
HTTP_CACHE_CONTROL="max-age=0"; export HTTP_CACHE_CONTROL
HTTP_CONNECTION="keep-alive"; export HTTP_CONNECTION
HTTP_HOST="hg.omnifarious.org"; export HTTP_HOST
HTTP_KEEP_ALIVE="300"; export HTTP_KEEP_ALIVE
HTTP_USER_AGENT="Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.4) Gecko/20060608 Ubuntu/dapper-security Firefox/1.5.0.4"; export HTTP_USER_AGENT
PATH_INFO="/"; export PATH_INFO
PATH_TRANSLATED="/var/www/hg/index.html"; export PATH_TRANSLATED
QUERY_STRING=""; export QUERY_STRING
REMOTE_ADDR="127.0.0.2"; export REMOTE_ADDR
REMOTE_PORT="44703"; export REMOTE_PORT
REQUEST_METHOD="GET"; export REQUEST_METHOD
REQUEST_URI="/test/"; export REQUEST_URI
SCRIPT_FILENAME="/home/hopper/hg_public/test.cgi"; export SCRIPT_FILENAME
SCRIPT_NAME="/test"; export SCRIPT_NAME
SCRIPT_URI="http://hg.omnifarious.org/test/"; export SCRIPT_URI
SCRIPT_URL="/test/"; export SCRIPT_URL
SERVER_ADDR="127.0.0.1"; export SERVER_ADDR
SERVER_ADMIN="eric@localhost"; export SERVER_ADMIN
SERVER_NAME="hg.omnifarious.org"; export SERVER_NAME
SERVER_PORT="80"; export SERVER_PORT
SERVER_PROTOCOL="HTTP/1.1"; export SERVER_PROTOCOL
SERVER_SIGNATURE="<address>Apache/2.0.53 (Fedora) Server at hg.omnifarious.org Port 80</address>"; export SERVER_SIGNATURE
SERVER_SOFTWARE="Apache/2.0.53 (Fedora)"; export SERVER_SOFTWARE