view tests/filterpyflakes.py @ 17970:0b03454abae7

ancestor: faster algorithm for difference of ancestor sets

One of the major reasons rebase is slow in large repositories is the
computation of the detach set: the set of ancestors of the changesets to
rebase that are not ancestors of the destination parent. This is currently
done via a revset that does two walks all the way to the root of the DAG.
Instead of doing that, to find the ancestors of a set <revs> that are not
ancestors of another set <common>, we walk up the tree in reverse revision
number order, maintaining sets of nodes visited from <revs>, from <common>,
or from both.

For the common case where the two sets are close both topologically and in
revision number (relative to repository size), this has been found to speed
up rebase by around 15-20%. When the nodes are farther apart and the DAG is
highly branching, it is harder to say which method would win.

Here's how long computing the detach set takes in a linear repository with
over 400000 changesets, rebasing near tip:

Rebasing across 4 changesets
  Revset method: 2.2s
  New algorithm: 0.00015s

Rebasing across 250 changesets
  Revset method: 2.2s
  New algorithm: 0.00069s

Rebasing across 10000 changesets
  Revset method: 2.4s
  New algorithm: 0.019s
author Siddharth Agarwal <sid0@fb.com>
date Mon, 26 Nov 2012 11:46:51 -0800
parents 08d84bdce1a5
children 77440de177f7
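The description above gives the walk only in prose. Below is a minimal,
self-contained sketch of the same idea, purely for illustration; it is not
Mercurial's actual implementation, and the names missing_ancestors and
parents are made up here. It assumes revlog-style numbering, in which a
parent always has a smaller revision number than its children, so popping a
max-heap visits each node only after all of its walked descendants. A
production version would add refinements such as stopping early once every
node left to visit has been reached from both sides.

import heapq

# Sides from which a revision can be reached during the walk.
REVS, COMMON = 1, 2

def missing_ancestors(revs, common, parents):
    """Return ancestors of revs (inclusive) not reachable from common.

    parents(rev) is assumed to return the parent revision numbers of rev.
    """
    # Mark each starting revision with the side(s) it belongs to.
    seen = {}
    for r in revs:
        seen[r] = seen.get(r, 0) | REVS
    for r in common:
        seen[r] = seen.get(r, 0) | COMMON
    # Max-heap on revision number, simulated by negating the numbers.
    visit = [-r for r in seen]
    heapq.heapify(visit)
    missing = []
    while visit:
        r = -heapq.heappop(visit)
        mark = seen[r]
        # All walked descendants of r have higher revision numbers, so
        # they were popped earlier and have already propagated their
        # side(s) down to r: the mark is final at this point.
        if mark == REVS:
            missing.append(r)
        for p in parents(r):
            if p in seen:
                seen[p] |= mark
            else:
                seen[p] = mark
                heapq.heappush(visit, -p)
    return missing

if __name__ == '__main__':
    # Toy linear DAG: 0 <- 1 <- 2 <- 3 <- 4
    parentmap = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}
    # Ancestors of 4 that are not ancestors of 2: expect [4, 3]
    print(missing_ancestors([4], [2], parentmap.get))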
line source

#!/usr/bin/env python

# Filter pyflakes output to control which warnings we check

import sys, re, os

def makekey(message):
    # pyflakes messages look like "path/file:line: message"; build a sort
    # key that groups them by warning type rather than by file. Any
    # "line NNN" fragment in the message text is pulled out and appended
    # at the end of the key so that it does not influence the grouping.
    match = re.search(r"(line \d+)", message)
    line = ''
    if match:
        line = match.group(0)
        message = re.sub(r"(line \d+)", '', message)
    # Rearrange "path:lineno: text 'name' rest" into
    # "text:rest:'name':path:lineno:<line NNN>".
    return re.sub(r"([^:]*):([^:]+):([^']*)('[^']*')(.*)$",
                  r'\3:\5:\4:\1:\2:' + line,
                  message)
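
# Illustration (not part of the original script): a pyflakes line such as
#   "mercurial/foo.py:12: 'os' imported but unused"
# becomes the key
#   " : imported but unused:'os':mercurial/foo.py:12:"
# so sorting clusters warnings by type, then by name, then file and line.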

# Patterns of pyflakes warnings we whitelist; all other output is dropped.
pats = [
    r"imported but unused",
    r"local variable '.*' is assigned to but never used",
    r"unable to detect undefined names",
]

lines = []
for line in sys.stdin:
    if not re.search('|'.join(pats), line):
        continue
    fn = line.split(':', 1)[0]
    # Paths in pyflakes output are relative to the repository root, one
    # level above this script. Files carrying the no-check-code marker
    # opt out of this check entirely.
    f = open(os.path.join(os.path.dirname(os.path.dirname(__file__)), fn))
    data = f.read()
    f.close()
    if 'no-check-code' in data:
        continue
    lines.append(line)

for line in sorted(lines, key=makekey):
    sys.stdout.write(line)
print('')
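
The script reads pyflakes output on standard input, so a typical invocation
(illustrative only; the exact command used by Mercurial's test suite may
differ) looks like:

pyflakes mercurial/*.py | python tests/filterpyflakes.py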