comparison tests/test-highlight.t @ 26680:7a3f6490ef97

highlight: add option to prevent content-only based fallback

When Mozilla enabled Pygments on hg.mozilla.org, we got a lot of weirdly colorized files. Upon further investigation, the highlight extension first attempts a filename+content based match, then falls back to a purely content-driven detection mode in Pygments. Sounds good in theory.

Unfortunately, Pygments' content-driven detection establishes no minimum threshold for returning a lexer. Furthermore, the detection code for a number of languages is very liberal. For example, ActionScript 3 will return a confidence of 0.3 (out of 1.0) if the first 1k of the file we pass in matches the regex "\w+\s*:\s*\w"! Python matches on "import ". It's no coincidence that a number of our extension-less files were getting highlighted improperly.

This patch adds an option to have the highlighter not fall back to purely content-based detection when filename+content detection fails. It can be enabled to render unhighlighted text instead of taking the risk that unknown file types are highlighted incorrectly. The old behavior is still the default.
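To give a feel for how liberal the regex quoted in the description is, here is a minimal sketch using only Python's `re` module. The pattern is copied from the description; the helper name `looks_like_match`, the sample strings, and the choice to anchor the match at the start of the text are assumptions made for illustration, and this is not Pygments' own code.

```python
import re

# Pattern quoted in the commit description for the ActionScript 3 heuristic.
LIBERAL_PATTERN = re.compile(r"\w+\s*:\s*\w")


def looks_like_match(text: str) -> bool:
    """Return True if the first 1k of text starts with the liberal pattern.

    Anchoring at the start (re.match) is an assumption of this sketch.
    """
    return LIBERAL_PATTERN.match(text[:1024]) is not None


# Hypothetical samples; none of them is ActionScript.
samples = {
    "yaml": "name: example\n",
    "email header": "Subject: weekly report\n",
    "python": "def foo():\n    pass\n",
}

for label, text in samples.items():
    print(label, looks_like_match(text))
```

Any file that opens with `word: word` satisfies the pattern, which is why extension-less configuration files or mail-like text can end up with a lexer that has nothing to do with their content.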
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 14 Oct 2015 18:22:16 -0700
parents 4b0fc75f9403
children 834d27c4655d
comparison
26679:0d93df4d1e44 26680:7a3f6490ef97
642 $ hgserveget us-ascii eucjp.txt 642 $ hgserveget us-ascii eucjp.txt
643 % HGENCODING=us-ascii hg serve 643 % HGENCODING=us-ascii hg serve
644 % hgweb filerevision, html 644 % hgweb filerevision, html
645 % errors encountered 645 % errors encountered
646 646
647 We attempt to highlight unknown files by default
648
649 $ killdaemons.py
650
651 $ cat > .hg/hgrc << EOF
652 > [web]
653 > highlightfiles = **
654 > EOF
655
656 $ cat > unknownfile << EOF
657 > #!/usr/bin/python
658 > def foo():
659 >     pass
660 > EOF
661
662 $ hg add unknownfile
663 $ hg commit -m unknown unknownfile
664
665 $ hg serve -p $HGPORT -d -n test --pid-file=hg.pid
666 $ cat hg.pid >> $DAEMON_PIDS
667
668 $ get-with-headers.py localhost:$HGPORT 'file/tip/unknownfile' | grep l2
669 <span id="l2"><span class="k">def</span> <span class="nf">foo</span><span class="p">():</span></span><a href="#l2"></a>
670
671 We can prevent Pygments from falling back to a non filename-based
672 detection mode
673
674 $ cat > .hg/hgrc << EOF
675 > [web]
676 > highlightfiles = **
677 > highlightonlymatchfilename = true
678 > EOF
679
680 $ killdaemons.py
681 $ hg serve -p $HGPORT -d -n test --pid-file=hg.pid
682 $ cat hg.pid >> $DAEMON_PIDS
683 $ get-with-headers.py localhost:$HGPORT 'file/tip/unknownfile' | grep l2
684 <span id="l2">def foo():</span><a href="#l2"></a>
685
647 $ cd .. 686 $ cd ..
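
The two cases exercised by the new test can be sketched as a two-stage lookup. This is a simplified model, not the extension's actual code: `pick_lexer`, `NoLexerFound`, and the toy guessers are hypothetical names, standing in roughly for Pygments' `guess_lexer_for_filename`, `guess_lexer`, and `ClassNotFound`. Truncating the input to 1k for both stages is also an assumption; the description only mentions the 1k limit for content detection.

```python
from typing import Callable, Optional


class NoLexerFound(Exception):
    """Stand-in for the error a guesser raises when nothing matches."""


def pick_lexer(
    filename: str,
    text: str,
    by_filename: Callable[[str, str], str],
    by_content: Callable[[str], str],
    only_match_filename: bool = False,
) -> Optional[str]:
    """Return a lexer name, or None to leave the file unhighlighted."""
    sample = text[:1024]  # assumption: both stages see only the first 1k
    try:
        # Stage 1: filename + content based match.
        return by_filename(filename, sample)
    except NoLexerFound:
        if only_match_filename:
            # Option enabled: never guess from content alone.
            return None
    try:
        # Stage 2 (default behavior): purely content-driven detection.
        return by_content(sample)
    except NoLexerFound:
        return None


# Hypothetical toy guessers, just enough to exercise both paths.
def toy_by_filename(filename: str, sample: str) -> str:
    if filename.endswith(".py"):
        return "python"
    raise NoLexerFound(filename)


def toy_by_content(sample: str) -> str:
    if sample.startswith("#!/usr/bin/python"):
        return "python"
    raise NoLexerFound("no content match")


source = "#!/usr/bin/python\ndef foo():\n    pass\n"
default_result = pick_lexer("unknownfile", source, toy_by_filename, toy_by_content)
strict_result = pick_lexer(
    "unknownfile", source, toy_by_filename, toy_by_content,
    only_match_filename=True,
)
print(default_result, strict_result)
```

With the default settings the extension-less `unknownfile` is still recognized from its content, matching the highlighted `l2` line in the first test. With the option enabled the lookup stops after the filename stage, and the line is rendered as plain text, as in the second test.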