hgweb.cgi
author Jun Wu <quark@fb.com>
Mon, 09 Apr 2018 15:58:30 -0700
changeset 37732 35632d392279
parent 26421 4b0fc75f9403
child 43731 47ef023d0165
permissions -rwxr-xr-x
patch: implement a new worddiff algorithm

The previous worddiff algorithm has many problems. The major problem is it does
a "similarity check" that selects a subset of matched lines to do inline diffs.
It is a bad idea because:

  - The "similarity check" is non-obvious to users. For example, a simple
    change from "long long x" to "int64_t x" will fail the similarity check and
    won't be diff-ed as expected.
  - Selecting "lines" to diff won't work as people expect if there are line
    wrapping changes.
  - It has a sad time complexity if lines do not match, could be O(N^2)-ish.

There are other problems in implementation details.

  - Lines can match across distant hunks (if the next hunk does not have "-"
    lines).
  - "difflib" is slow.

The solution would be removing the "similarity check", and just diff all words
in a same hunk. So no content will be missed and everything will be diff-ed as
expected. This is similar to what code review tool like Phabricator does.

This diff implements the word diff algorithm as described above. It also avoids
difflib to be faster.

Note about colors: To be consistent, "changed inserted" parts and "purely
insertion blocks" should have a same color, since they do not exist in the
previous version. Instead of highlighting differences, this patch chooses to
dim common parts. This is also more consistent with Phabricator or GitHub
webpage. That said, the labels are defined in a way that people can still
highlight changed parts and leave purely inserted/deleted hunks use the
"non-highlighted" color.

As one example, running:

  hg log -pr df50b87d8f736aff8dc281f816bddcd6f306930c mercurial/commands.py \
    --config experimental.worddiff=1 --color=debug --config diff.unified=0

The previous algorithm outputs:

  [diff.file_a|--- a/mercurial/commands.py Fri Mar 09 15:53:41 2018 +0100]
  [diff.file_b|+++ b/mercurial/commands.py Sat Mar 10 12:33:19 2018 +0530]
  [diff.hunk|@@ -2039,1 +2039,4 @@]
  [diff.deleted|-][diff.deleted.highlight|@command('^forget',][diff.deleted| ][diff.deleted.highlight|walkopts,][diff.deleted| _('[OPTION]... FILE...'), inferrepo=True)]
  [diff.inserted|+@command(]
  [diff.inserted|+ '^forget',]
  [diff.inserted|+ walkopts + dryrunopts,]
  [diff.inserted|+ ][diff.inserted.highlight| ][diff.inserted| _('[OPTION]... FILE...'), inferrepo=True)]
  [diff.hunk|@@ -2074,1 +2077,3 @@]
  [diff.deleted|- rejected = cmdutil.forget(ui, repo, m, prefix="",][diff.deleted.highlight| explicitonly=False)[0]]
  [diff.inserted|+ dryrun = opts.get(r'dry_run')]
  [diff.inserted|+ rejected = cmdutil.forget(ui, repo, m, prefix="",]
  [diff.inserted|+ explicitonly=False, dryrun=dryrun)[0]]

The new algorithm outputs:

  [diff.file_a|--- a/mercurial/commands.py Fri Mar 09 15:53:41 2018 +0100]
  [diff.file_b|+++ b/mercurial/commands.py Sat Mar 10 12:33:19 2018 +0530]
  [diff.hunk|@@ -2039,1 +2039,4 @@]
  [diff.deleted|-][diff.deleted.unchanged|@command(][diff.deleted.unchanged|'^forget',][diff.deleted.unchanged| ][diff.deleted.changed|walkopts][diff.deleted.unchanged|,][diff.deleted.changed| ][diff.deleted.unchanged|_('[OPTION]... FILE...'), inferrepo=True)]
  [diff.inserted|+][diff.inserted.unchanged|@command(]
  [diff.inserted|+][diff.inserted.changed| ][diff.inserted.unchanged|'^forget',]
  [diff.inserted|+][diff.inserted.changed| walkopts][diff.inserted.unchanged| ][diff.inserted.changed|+ dryrunopts][diff.inserted.unchanged|,]
  [diff.inserted|+][diff.inserted.changed| ][diff.inserted.unchanged|_('[OPTION]... FILE...'), inferrepo=True)]
  [diff.hunk|@@ -2074,1 +2077,3 @@]
  [diff.deleted|-][diff.deleted.unchanged| rejected = cmdutil.forget(ui, repo, m, prefix="",][diff.deleted.changed| ][diff.deleted.unchanged|explicitonly=False][diff.deleted.unchanged|)[0]]
  [diff.inserted|+][diff.inserted.changed| dryrun = opts.get(r'dry_run')]
  [diff.inserted|+][diff.inserted.unchanged| rejected = cmdutil.forget(ui, repo, m, prefix="",]
  [diff.inserted|+][diff.inserted.changed| ][diff.inserted.unchanged|explicitonly=False][diff.inserted.changed|, dryrun=dryrun][diff.inserted.unchanged|)[0]]

Practically, when diffing a 8k line change, the time spent on worddiff reduces
from 4 seconds to 0.14 seconds.

Differential Revision: https://phab.mercurial-scm.org/D3212
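
The idea in the commit message (no similarity check, just diff every word of the deleted side of a hunk against every word of the inserted side) can be illustrated with a small sketch. This is not the code from the patch: the function names split_words, lcs_table, word_diff and render are hypothetical, the tokenizer is an assumption, and a plain quadratic LCS table stands in for whatever diff routine Mercurial actually uses. Only the "changed" / "unchanged" label suffixes are taken from the debug output shown in the commit message.

```python
import re


def split_words(text):
    # Assumed tokenizer: runs of word characters, runs of whitespace, and
    # single punctuation characters. Joining the tokens restores the text.
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def lcs_table(a, b):
    # table[i][j] holds the length of the longest common subsequence of
    # a[i:] and b[j:]. Quadratic, which is fine for an illustration only.
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def word_diff(old, new):
    # 'old' is all "-" text of one hunk, 'new' is all "+" text of that hunk.
    # Every token is labelled; nothing is skipped by a similarity check.
    a, b = split_words(old), split_words(new)
    table = lcs_table(a, b)
    old_out, new_out = [], []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            old_out.append(("unchanged", a[i]))
            new_out.append(("unchanged", b[j]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            old_out.append(("changed", a[i]))
            i += 1
        else:
            new_out.append(("changed", b[j]))
            j += 1
    # Whatever remains on either side has no counterpart.
    old_out.extend(("changed", tok) for tok in a[i:])
    new_out.extend(("changed", tok) for tok in b[j:])
    return old_out, new_out


def render(prefix, pieces):
    # Mimics the --color=debug notation, e.g. [diff.deleted.changed|long].
    return "".join("[%s.%s|%s]" % (prefix, label, tok) for label, tok in pieces)
```

For the "long long x" to "int64_t x" case mentioned above, word_diff("long long x", "int64_t x") labels the trailing " x" as unchanged on both sides and everything before it as changed, so the type change is still diffed at word level.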
Annotated source, shown as "revision: line" in the style of 'hg annotate':

    202: #!/usr/bin/env python
    159: #
  11000: # An example hgweb CGI script, edit as necessary
  26421: # See also https://mercurial-scm.org/wiki/PublishingRepositories
    159:
  11000: # Path to repo or hgweb config to serve (see 'hg help hgweb')
  11000: config = "/path/to/repo/or/config"
   5244:
  15475: # Uncomment and adjust if Mercurial is not installed system-wide
  15475: # (consult "installed modules" path from 'hg debuginstall'):
  11000: #import sys; sys.path.insert(0, "/path/to/python/lib")
   5197:
   6080: # Uncomment to send python tracebacks to the browser if an error occurs:
  11000: #import cgitb; cgitb.enable()
    391:
  11000: from mercurial import demandimport; demandimport.enable()
  11000: from mercurial.hgweb import hgweb, wsgicgi
  11000: application = hgweb(config)
   6141: wsgicgi.launch(application)

Changesets referenced above:

    159  f9d8620ef469  Add example CGI script
         mpm@selenic.com; parents: none listed
    202  e875a0cf7f3a  Call python via env in hgweb.cgi
         mpm@selenic.com; parents: 159
    391  5f65a108a559  hgweb: pull cgitb into CGI script example, where it can easily be disabled
         mpm@selenic.com; parents: 202
   5197  55860a45bbf2  Enable demandimport only in scripts, not in importable modules (issue605)
         Thomas Arendsen Hein <thomas@intevation.de>; parents: 3868
   5244  79279b5583c6  cgi: sys.path.insert should be before importing mercurial
         Benoit Boissinot <benoit.boissinot@ens-lyon.org>; parents: 5197
   6080  4baad19c4801  hgweb: disable cgitb by default
         Maxim Dounin <mdounin@mdounin.ru>; parents: 5995
   6141  90e5c82a3859  Backed out changeset b913d3aacddc (see issue971/msg5317)
         Thomas Arendsen Hein <thomas@intevation.de>; parents: 5995
  11000  338167735124  hgweb: simplify hgweb.cgi, add help pointer
         Matt Mackall <mpm@selenic.com>; parents: 6142
  15475  85cba926cb59  hgweb: add hint about finding library path with debuginstall
         Matt Mackall <mpm@selenic.com>; parents: 11503
  26421  4b0fc75f9403  urls: bulk-change primary website URLs
         Matt Mackall <mpm@selenic.com>; parents: 15475
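
The last two lines of the script build a WSGI application and hand it to a CGI launcher. As a rough illustration of what a launcher of that kind has to do, here is a minimal sketch modeled on the CGI gateway example in PEP 3333. It is not Mercurial's wsgicgi module: the names run_cgi and demo_app are hypothetical, it assumes Python 3, and it leaves out error handling that a real gateway needs.

```python
import sys


def run_cgi(application, environ, stdin, stdout):
    # Copy the CGI variables and add the keys a WSGI application expects.
    environ = dict(environ)
    https = environ.get("HTTPS", "off") in ("on", "1")
    environ.update({
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "https" if https else "http",
        "wsgi.input": stdin,
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": True,
        "wsgi.run_once": True,  # a CGI process serves exactly one request
    })

    state = {"response": None, "headers_sent": False}

    def write(data):
        # Headers go out lazily, right before the first body bytes.
        if not state["headers_sent"]:
            status, headers = state["response"]
            stdout.write(("Status: %s\r\n" % status).encode("latin-1"))
            for name, value in headers:
                stdout.write(("%s: %s\r\n" % (name, value)).encode("latin-1"))
            stdout.write(b"\r\n")
            state["headers_sent"] = True
        stdout.write(data)
        stdout.flush()

    def start_response(status, headers, exc_info=None):
        state["response"] = (status, headers)
        return write

    result = application(environ, start_response)
    try:
        for chunk in result:
            if chunk:
                write(chunk)
        if not state["headers_sent"]:
            write(b"")  # empty body: still emit the headers
    finally:
        if hasattr(result, "close"):
            result.close()


def demo_app(environ, start_response):
    # Hypothetical stand-in for the application object built by the script.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]
```

Under a real web server the call would look like run_cgi(demo_app, os.environ, sys.stdin.buffer, sys.stdout.buffer); passing in-memory byte streams instead makes the sketch easy to try without a server.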