view mercurial/parser.py @ 18792:10669e24eb6c

completion: add a debugpathcomplete command The bash_completion code uses "hg status" to generate a list of possible completions for commands that operate on files in the working directory. In a large working directory, this can result in a single tab-completion being very slow (several seconds) as a result of checking the status of every file, even when there is no need to check status or no possible matches. The new debugpathcomplete command gains performance in a few simple ways: * Allow completion to operate on just a single directory. When used to complete the right commands, this considerably reduces the number of completions returned, at no loss in functionality. * Never check the status of files. For completions that really must know if a file is modified, it is faster to use status: hg status -nm 'glob:myprefix**' Performance: Here are the commands used by bash_completion to complete, run in the root of the mozilla-central working dir (~77,000 files) and another repo (~165,000 files): All "normal state" files (used by e.g. remove, revert): mozilla other status -nmcd 'glob:**' 1.77 4.10 sec debugpathcomplete -f -n 0.53 1.26 debugpathcomplete -n 0.17 0.41 ("-f" means "complete full paths", rather than the current directory) Tracked files matching "a": mozilla other status -nmcd 'glob:a**' 0.26 0.47 debugpathcomplete -f -n a 0.10 0.24 debugpathcomplete -n a 0.10 0.22 We should be able to further improve completion performance once the critbit work lands. Right now, our performance is limited by the need to iterate over all keys in the dirstate.
author Bryan O'Sullivan <bryano@fb.com>
date Thu, 21 Mar 2013 16:31:28 -0700
parents 8ac8db8dc346
children 7c4778bc29f0
line wrap: on
line source

# parser.py - simple top-down operator precedence parser for mercurial
#
# Copyright 2010 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

# see http://effbot.org/zone/simple-top-down-parsing.htm and
# http://eli.thegreenplace.net/2010/01/02/top-down-operator-precedence-parsing/
# for background

# takes a tokenizer and elements
# tokenizer is an iterator that returns type, value pairs
# elements is a mapping of types to binding strength, prefix and infix actions
# an action is a tree node name, a tree label, and an optional match
# __call__(program) parses program into a labeled tree

import error
from i18n import _

class parser(object):
    def __init__(self, tokenizer, elements, methods=None):
        self._tokenizer = tokenizer
        self._elements = elements
        self._methods = methods
        self.current = None
    def _advance(self):
        'advance the tokenizer'
        t = self.current
        try:
            self.current = self._iter.next()
        except StopIteration:
            pass
        return t
    def _match(self, m, pos):
        'make sure the tokenizer matches an end condition'
        if self.current[0] != m:
            raise error.ParseError(_("unexpected token: %s") % self.current[0],
                                   self.current[2])
        self._advance()
    def _parse(self, bind=0):
        token, value, pos = self._advance()
        # handle prefix rules on current token
        prefix = self._elements[token][1]
        if not prefix:
            raise error.ParseError(_("not a prefix: %s") % token, pos)
        if len(prefix) == 1:
            expr = (prefix[0], value)
        else:
            if len(prefix) > 2 and prefix[2] == self.current[0]:
                self._match(prefix[2], pos)
                expr = (prefix[0], None)
            else:
                expr = (prefix[0], self._parse(prefix[1]))
                if len(prefix) > 2:
                    self._match(prefix[2], pos)
        # gather tokens until we meet a lower binding strength
        while bind < self._elements[self.current[0]][0]:
            token, value, pos = self._advance()
            e = self._elements[token]
            # check for suffix - next token isn't a valid prefix
            if len(e) == 4 and not self._elements[self.current[0]][1]:
                suffix = e[3]
                expr = (suffix[0], expr)
            else:
                # handle infix rules
                if len(e) < 3 or not e[2]:
                    raise error.ParseError(_("not an infix: %s") % token, pos)
                infix = e[2]
                if len(infix) == 3 and infix[2] == self.current[0]:
                    self._match(infix[2], pos)
                    expr = (infix[0], expr, (None))
                else:
                    expr = (infix[0], expr, self._parse(infix[1]))
                    if len(infix) == 3:
                        self._match(infix[2], pos)
        return expr
    def parse(self, message):
        'generate a parse tree from a message'
        self._iter = self._tokenizer(message)
        self._advance()
        res = self._parse()
        token, value, pos = self.current
        return res, pos
    def eval(self, tree):
        'recursively evaluate a parse tree using node methods'
        if not isinstance(tree, tuple):
            return tree
        return self._methods[tree[0]](*[self.eval(t) for t in tree[1:]])
    def __call__(self, message):
        'parse a message into a parse tree and evaluate if methods given'
        t = self.parse(message)
        if self._methods:
            return self.eval(t)
        return t