util.fspath: use a dict rather than a linear scan for lookups
Previously, we'd scan through the entire directory listing looking for a
normalized match. This is O(N) in the number of files in the directory. If we
decide to call util.fspath on each file in it, the overall complexity works out
to O(N^2). This becomes a problem with directories containing a few thousand
files or more.
Switch to using a dictionary instead. There is a slightly higher upfront cost
to pay, but for cases like the above each lookup is amortized O(1). There is
also a lower constant factor because generator expressions are faster than for
loops, so overall it works out to be a very small loss in performance for 1
file, and a huge gain when there are more.
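As a rough illustration of the idea (not the actual util.fspath code), the
sketch below contrasts a per-lookup linear scan with a per-directory dict that
is built once and reused. The names normcase, build_case_map, find_scan and
find_cached are hypothetical, and plain lowercasing is assumed as a stand-in
for the file system's real normalization.

```python
import os


def normcase(name):
    # Assumption: lowercasing approximates the file system's case folding.
    return name.lower()


def find_scan(directory, name):
    """Linear scan: walks the whole listing on every call, O(N) per lookup."""
    wanted = normcase(name)
    for entry in os.listdir(directory):
        if normcase(entry) == wanted:
            return entry
    return None


def build_case_map(directory):
    """One O(N) pass mapping each normalized name to its on-disk spelling."""
    return {normcase(entry): entry for entry in os.listdir(directory)}


def find_cached(cache, directory, name):
    """Dict lookup: pays the O(N) build once per directory, then O(1) each."""
    if directory not in cache:
        cache[directory] = build_case_map(directory)
    return cache[directory].get(normcase(name))
```

With a shared cache dict, looking up every file in a directory costs one
listing plus N dict lookups, rather than N full scans.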
In a large repo with around 200k files on a case-insensitive file system, the
following command was tested in a directory with over 30,000 files:
ls | shuf -n $COUNT | xargs hg status
This command leads to util.fspath being called on $COUNT files in the
directory.
COUNT   before   after
1       0.77s    0.78s
100     1.42s    0.80s
1000    6.3s     0.96s
I also tested with COUNT=10000, but the "before" run took too long, so I gave up.
#!/bin/sh
#
# tcsh_completion_build.sh - script to generate tcsh completion
#
#
# Copyright (C) 2005 TK Soh.
#
# This is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
#
# Description
# -----------
# This script generates a tcsh source file to support completion
# of Mercurial commands and options.
#
# Instructions:
# ------------
# Run this script to generate the tcsh source file, and source
# the file to add command completion support for Mercurial.
#
# tcsh% tcsh_completion_build.sh FILE
# tcsh% source FILE
#
# If FILE is not specified, a file named tcsh_completion is generated.
#
# Bugs:
# ----
# 1. command-specific options are not supported
# 2. hg commands must be specified immediately after 'hg'.
#
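# Output file; defaults to tcsh_completion when no argument is given.
tcsh_file=${1-tcsh_completion}
# Collect command names and aliases from 'hg --debug help', five per line,
# with line continuations appended for the generated tcsh file.
hg_commands=`hg --debug help | \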
sed -e '1,/^list of commands:/d' \
-e '/^enabled extensions:/,$d' \
-e '/^additional help topics:/,$d' \
-e '/^ [^ ]/!d; s/[,:]//g;' | \
xargs -n5 | \
sed -e '$!s/$/ \\\\/g; 2,$s/^ */ /g'`
# Collect the global option flags from 'hg -v help', dropping descriptions.
hg_global_options=`hg -v help | \
sed -e '1,/global/d;/^ *-/!d; s/ [^- ].*//' | \
sed -e 's/ *$//; $!s/$/ \\\\/g; 2,$s/^ */ /g'`
hg_version=`hg version | sed -e '1q'`
script_name=`basename "$0"`
cat > "$tcsh_file" <<END
#
# tcsh completion for Mercurial
#
# This file has been auto-generated by $script_name for
# $hg_version
#
# Copyright (C) 2005 TK Soh.
#
# This is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
complete hg \\
'n/--cwd/d/' 'n/-R/d/' 'n/--repository/d/' \\
'C/-/($hg_global_options)/' \\
'p/1/($hg_commands)/'
END