tests/filtercr.py
author Bryan O'Sullivan <bryano@fb.com>
Fri, 01 Jun 2012 15:26:20 -0700
changeset 16943 8d08a28aa63e
parent 13141 6cfe17c19ba2
permissions -rwxr-xr-x
matcher: use re2 bindings if available There are two sets of Python re2 bindings available on the internet; this code works with both. Using re2 can greatly improve "hg status" performance when a .hgignore file becomes even modestly complex. Example: "hg status" on a clean tree with 134K files, where "hg debugignore" reports a regexp 4256 bytes in size. no .hgignore: 1.76 sec Python re: 2.79 re2: 1.82 The overhead of regexp matching drops from 1.03 seconds with stock re to 0.06 with re2. (For comparison, a git repo with the same contents and .gitignore file runs "git status -s" in 1.71 seconds, i.e. only slightly faster than hg with re2.)

#!/usr/bin/env python

# Filter output by the progress extension to make it readable in tests

import sys, re

for line in sys.stdin:
    line = re.sub(r'\r+[^\n]', lambda m: '\n' + m.group()[-1:], line)
    sys.stdout.write(line)
print