hgext/highlight/highlight.py
author Martin von Zweigbergk <martinvonz@google.com>
Tue, 28 Oct 2014 22:47:22 -0700
changeset 25233 9789b4a7c595
parent 23613 7b8ff3fd11d3
child 25867 a74e9806d17d
permissions -rw-r--r--
match: introduce boolean prefix() method tl;dr: This is another step towards a (previously unstated) goal of eliminating match.files() in conditions. There are four types of matchers: * always: Matches everything, checked with always(), files() is empty * exact: Matches exact set of files, checked with isexact(), files() contains the files to match * patterns: Matches more complex patterns, checked with anypats(), files() contains roots of the matched patterns * prefix: Matches simple 'path:' patterns as prefixes ('foo' matches both 'foo' and 'foo/bar'), no single method to check, files() contains the prefixes to match For completeness, it would be nice to have a method for checking for the "prefix" type of matcher as well, so let's add that, making it return True simply when none of the others do. The larger goal here is to eliminate uses of match.files() in conditions (i.e. bool(match.files())). The reason for this is that there are scenarios when you would like to create a "prefix" matcher that happens to match no files. One example is for 'hg files -I foo bar'. The narrowmatcher also restricts the set of files given and it would not surprise me if have bugs caused by that already. Note that 'if m.files() and not m.anypats()' and similar is sometimes used to catch the "exact" and "prefix" cases above.

# highlight.py - highlight extension implementation file
#
#  Copyright 2007-2009 Adam Hupp <adam@hupp.org> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
#
# The original module was split in an interface and an implementation
# file to defer pygments loading and speedup extension setup.

from mercurial import demandimport
demandimport.ignore.extend(['pkgutil', 'pkg_resources', '__main__'])
from mercurial import util, encoding

from pygments import highlight
from pygments.util import ClassNotFound
from pygments.lexers import guess_lexer, guess_lexer_for_filename, TextLexer
from pygments.formatters import HtmlFormatter

SYNTAX_CSS = ('\n<link rel="stylesheet" href="{url}highlightcss" '
              'type="text/css" />')

def pygmentize(field, fctx, style, tmpl):

    # append a <link ...> to the syntax highlighting css
    old_header = tmpl.load('header')
    if SYNTAX_CSS not in old_header:
        new_header =  old_header + SYNTAX_CSS
        tmpl.cache['header'] = new_header

    text = fctx.data()
    if util.binary(text):
        return

    # str.splitlines() != unicode.splitlines() because "reasons"
    for c in "\x0c\x1c\x1d\x1e":
        if c in text:
            text = text.replace(c, '')

    # Pygments is best used with Unicode strings:
    # <http://pygments.org/docs/unicode/>
    text = text.decode(encoding.encoding, 'replace')

    # To get multi-line strings right, we can't format line-by-line
    try:
        lexer = guess_lexer_for_filename(fctx.path(), text[:1024],
                                         stripnl=False)
    except (ClassNotFound, ValueError):
        try:
            lexer = guess_lexer(text[:1024], stripnl=False)
        except (ClassNotFound, ValueError):
            lexer = TextLexer(stripnl=False)

    formatter = HtmlFormatter(style=style)

    colorized = highlight(text, lexer, formatter)
    # strip wrapping div
    colorized = colorized[:colorized.find('\n</pre>')]
    colorized = colorized[colorized.find('<pre>') + 5:]
    coloriter = (s.encode(encoding.encoding, 'replace')
                 for s in colorized.splitlines())

    tmpl.filters['colorize'] = lambda x: coloriter.next()

    oldl = tmpl.cache[field]
    newl = oldl.replace('line|escape', 'line|colorize')
    tmpl.cache[field] = newl