diff: do not concatenate immutable bytes while building a/b bodies (issue6445)
Use bytearray instead. I don't know what's changed since Python 2, but bytes
concatenation is 100x slow on Python 3.
% python2.7 -m timeit -s "s = b''" "for i in range(10000): s += b'line'"
1000 loops, best of 3: 321 usec per loop
% python3.9 -m timeit -s "s = b''" "for i in range(10000): s += b'line'"
5 loops, best of 5: 39.2 msec per loop
Benchmark using tailwind.css (measuring the fast path, a is empty):
% HGRCPATH=/dev/null python2.7 ./hg log -R /tmp/issue6445 -p --time \
--color=always --config diff.word-diff=true >/dev/null
(prev) time: real 1.580 secs (user 1.560+0.000 sys 0.020+0.000)
(this) time: real 1.610 secs (user 1.570+0.000 sys 0.030+0.000)
% HGRCPATH=/dev/null python3.9 ./hg log -R /tmp/issue6445 -p --time \
--color=always --config diff.word-diff=true >/dev/null
(prev) time: real 114.500 secs (user 114.460+0.000 sys 0.030+0.000)
(this) time: real 2.180 secs (user 2.140+0.000 sys 0.040+0.000)
Benchmark using random tabular text data (not the fast path):
% dd if=/dev/urandom bs=1k count=1000 | hexdump -v -e '16/1 "%3u," "\n"' > ttf
% hg ci -ma
% dd if=/dev/urandom bs=1k count=1000 | hexdump -v -e '16/1 "%3u," "\n"' > ttf
% hg ci -mb
% HGRCPATH=/dev/null python2.7 ./hg log -R /tmp/issue6445 -p --time \
--color=always --config diff.word-diff=true >/dev/null
(prev) time: real 3.240 secs (user 3.040+0.000 sys 0.200+0.000
(this) time: real 3.230 secs (user 3.070+0.000 sys 0.160+0.000)
% HGRCPATH=/dev/null python3.9 ./hg log -R /tmp/issue6445 -p --time \
--color=always --config diff.word-diff=true >/dev/null
(prev) time: real 44.130 secs (user 43.850+0.000 sys 0.270+0.000)
(this) time: real 4.170 secs (user 3.850+0.000 sys 0.310+0.000)
#!/bin/sh
#
# This is an example of using HGEDITOR to create of diff to review the
# changes while committing.
# If you want to pass your favourite editor some other parameters
# only for Mercurial, modify this:
case "${EDITOR}" in
"")
EDITOR="vi"
;;
emacs)
EDITOR="$EDITOR -nw"
;;
gvim|vim)
EDITOR="$EDITOR -f -o"
;;
esac
HGTMP=""
cleanup_exit() {
rm -rf "$HGTMP"
}
# Remove temporary files even if we get interrupted
trap "cleanup_exit" 0 # normal exit
trap "exit 255" HUP INT QUIT ABRT TERM
HGTMP=$(mktemp -d ${TMPDIR-/tmp}/hgeditor.XXXXXX)
[ x$HGTMP != x -a -d $HGTMP ] || {
echo "Could not create temporary directory! Exiting." 1>&2
exit 1
}
(
grep '^HG: changed' "$1" | cut -b 13- | while read changed; do
"$HG" diff "$changed" >> "$HGTMP/diff"
done
)
cat "$1" > "$HGTMP/msg"
MD5=$(which md5sum 2>/dev/null) || \
MD5=$(which md5 2>/dev/null)
[ -x "${MD5}" ] && CHECKSUM=`${MD5} "$HGTMP/msg"`
if [ -s "$HGTMP/diff" ]; then
$EDITOR "$HGTMP/msg" "$HGTMP/diff" || exit $?
else
$EDITOR "$HGTMP/msg" || exit $?
fi
[ -x "${MD5}" ] && (echo "$CHECKSUM" | ${MD5} -c >/dev/null 2>&1 && exit 13)
mv "$HGTMP/msg" "$1"
exit $?