fsmonitor: match watchman and filesystem encoding

Watchman's path encoding can differ from the filesystem encoding. For example,
on Windows, it is always UTF-8.

Before this patch, on Windows, this mismatch broke path comparison between
fsmonitor state and osutil.statfiles, which yielded a clean status for
added/modified files.

In addition to status reporting wrong results, this leads to files being
discarded from changesets during history editing operations such as rebase.
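
To illustrate the mismatch, here is a minimal sketch, not the code from this
patch. It assumes a filesystem encoding of cp1252 (a common Windows code page,
chosen only for illustration) and uses a hypothetical helper named
reencode_path. The same file name yields different byte strings under the two
encodings, so a byte-wise comparison fails until the path is re-encoded.

```python
# Sketch only: reencode_path and the cp1252 default are assumptions made for
# illustration; they are not names or values taken from the patch.

def reencode_path(path, src_encoding="utf-8", dst_encoding="cp1252"):
    """Decode a byte path from src_encoding and re-encode it in dst_encoding."""
    return path.decode(src_encoding).encode(dst_encoding)


name = u"caf\u00e9.txt"

# The path as a UTF-8 source would report it.
utf8_path = name.encode("utf-8")

# The same path as it would look in the assumed filesystem encoding.
fs_path = name.encode("cp1252")

# Byte-wise comparison fails although both refer to the same file name.
mismatch = utf8_path != fs_path

# After re-encoding, the two byte strings compare equal.
fixed = reencode_path(utf8_path) == fs_path
```

Pure ASCII names encode identically in both encodings, so only names with
non-ASCII characters are affected in this sketch.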

Benchmark:

There is a small overhead at module import:
python -m timeit "import hgext.fsmonitor"
Windows before patch: 1000000 loops, best of 3: 0.563 usec per loop
Windows after patch: 1000000 loops, best of 3: 0.583 usec per loop
Linux before patch: 1000000 loops, best of 3: 0.579 usec per loop
Linux after patch: 1000000 loops, best of 3: 0.588 usec per loop

10000 calls to _watchmantofsencoding:
python -m timeit -s "from hgext.fsmonitor import _watchmantofsencoding, _fixencoding" "fname = '/path/to/file'" "for i in range(10000):" " if _fixencoding: fname = _watchmantofsencoding(fname)"
Windows (_fixencoding is True): 100 loops, best of 3: 19.5 msec per loop
Linux (_fixencoding is False): 100 loops, best of 3: 3.08 msec per loop
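
The Linux figure is lower because the conversion is skipped when the flag is
False. A minimal sketch of that guard pattern follows, assuming the flag is
computed once by comparing normalized codec names. The helpers
encodings_differ and make_converter are hypothetical and are not the
implementation in hgext.fsmonitor.

```python
import codecs

# Sketch only: encodings_differ and make_converter are hypothetical names
# used for illustration.


def encodings_differ(first, second):
    """Return True if the two encoding names refer to different codecs."""
    # codecs.lookup() normalizes aliases, e.g. 'UTF8' and 'utf-8'.
    return codecs.lookup(first).name != codecs.lookup(second).name


def make_converter(source_encoding, fs_encoding):
    """Build a function that converts byte paths to the filesystem encoding."""
    if not encodings_differ(source_encoding, fs_encoding):
        # Encodings match: return paths unchanged and pay no conversion cost.
        return lambda path: path

    def convert(path):
        # Strict error handling: an undecodable or unencodable path raises
        # UnicodeError instead of being silently altered.
        return path.decode(source_encoding).encode(fs_encoding)

    return convert


same = make_converter("utf-8", "UTF8")
different = make_converter("utf-8", "cp1252")
```

Computing the decision once keeps the per-path cost to a single check when no
conversion is needed.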
#!/usr/bin/env python
#
# Based on python's Tools/scripts/md5sum.py
#
# This software may be used and distributed according to the terms
# of the PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2, which is
# GPL-compatible.
from __future__ import absolute_import
import os
import sys
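
try:
    import hashlib
    md5 = hashlib.md5
except ImportError:
    # Fall back to the legacy md5 module on Pythons without hashlib.
    import md5
    md5 = md5.md5

try:
    # On Windows, put stdout and stderr in binary mode so that '\n' is not
    # translated to '\r\n'.
    import msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    msvcrt.setmode(sys.stderr.fileno(), os.O_BINARY)
except ImportError:
    pass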

for filename in sys.argv[1:]:
    try:
        fp = open(filename, 'rb')
    except IOError as msg:
        sys.stderr.write('%s: Can\'t open: %s\n' % (filename, msg))
        sys.exit(1)

    m = md5()
    try:
        # Read in 8 KiB chunks. The sentinel must be bytes because the file
        # is opened in binary mode (b'' equals '' on Python 2).
        for data in iter(lambda: fp.read(8192), b''):
            m.update(data)
    except IOError as msg:
        sys.stderr.write('%s: I/O error: %s\n' % (filename, msg))
        sys.exit(1)
    finally:
        fp.close()
    sys.stdout.write('%s %s\n' % (m.hexdigest(), filename))

sys.exit(0)