view hgext/logtoprocess.py @ 30561:7c0c722d568d
bdiff: early pruning of common prefix before doing expensive computations
It seems quite common that files don't change completely. New lines are often
pretty much appended, and modifications will often only change a small section
of the file which on average will be in the middle.
There can thus be a big win by pruning a common prefix before starting the more
expensive search for longest common substrings.
Worst case, it will scan through a long sequence of similar bytes without
encountering a newline. Splitlines will then have to do the same again ...
twice for each side. If similar lines are found, splitlines will save the
double iteration and hashing of the lines ... plus there will be fewer lines to
find common substrings in.
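The pruning idea described above can be sketched in Python. This is only an illustration: it is not the bdiff implementation, the function names are hypothetical, and cutting the prefix back to the last newline is an assumption drawn from the description of line splitting above.

```python
def common_line_prefix_len(a, b):
    """Return the length of the longest common prefix of the byte strings
    a and b that ends right after a newline (0 if there is none)."""
    limit = min(len(a), len(b))
    i = 0
    last_newline_end = 0
    # One-byte slices compare the same way on Python 2 and Python 3.
    while i < limit and a[i:i + 1] == b[i:i + 1]:
        if a[i:i + 1] == b"\n":
            last_newline_end = i + 1
        i += 1
    return last_newline_end


def prune_common_prefix(a, b):
    """Split off the shared leading lines so that only the remainders
    need to be passed to the expensive line-matching step."""
    n = common_line_prefix_len(a, b)
    return n, a[n:], b[n:]
```

In the worst case described above (a long run of identical bytes with no newline), the loop scans the whole run and still returns 0, so the scan is wasted work.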
This change might in some cases make the algorithm pick shorter or less optimal
common substrings. We can't have the cake and eat it.
This makes hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
increase.
On mozilla-unified:
perfbdiff -m 3041e4d59df2
! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
perfbdiff 0e9928989e9c --alldata --count 10
! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)
author | Mads Kiilerich <madski@unity3d.com> |
---|---|
date | Wed, 16 Nov 2016 19:45:35 +0100 |
parents | 318a24b52eeb |
children | 1c5cbf28f007 |
# logtoprocess.py - send ui.log() data to a subprocess
#
# Copyright 2016 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
"""Send ui.log() data to a subprocess (EXPERIMENTAL)

This extension lets you specify a shell command per ui.log() event,
sending all remaining arguments as environment variables to that command.

Each positional argument to the method results in a `MSG[N]` key in the
environment, starting at 1 (so `MSG1`, `MSG2`, etc.). Each keyword argument
is set as a `OPT_UPPERCASE_KEY` variable (so the key is uppercased, and
prefixed with `OPT_`). The original event name is passed in the `EVENT`
environment variable, and the process ID of mercurial is given in `HGPID`.

So given a call `ui.log('foo', 'bar', 'baz', spam='eggs')`, a script
configured for the `foo` event can expect an environment with `MSG1=bar`,
`MSG2=baz`, and `OPT_SPAM=eggs`.

Scripts are configured in the `[logtoprocess]` section, each key an event name.
For example::

  [logtoprocess]
  commandexception = echo "$MSG2$MSG3" > /var/log/mercurial_exceptions.log

would log the warning message and traceback of any failed command dispatch.

Scripts are run asynchronously as detached daemon processes; mercurial will
not ensure that they exit cleanly.

"""

from __future__ import absolute_import

import itertools
import os
import platform
import subprocess
import sys

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

def uisetup(ui):
    if platform.system() == 'Windows':
        # no fork on Windows, but we can create a detached process
        # https://msdn.microsoft.com/en-us/library/windows/desktop/ms684863.aspx
        # No stdlib constant exists for this value
        DETACHED_PROCESS = 0x00000008
        _creationflags = DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP

        def runshellcommand(script, env):
            # we can't use close_fds *and* redirect stdin. I'm not sure that we
            # need to because the detached process has no console connection.
            subprocess.Popen(
                script, shell=True, env=env, close_fds=True,
                creationflags=_creationflags)
    else:
        def runshellcommand(script, env):
            # double-fork to completely detach from the parent process
            # based on http://code.activestate.com/recipes/278731
            pid = os.fork()
            if pid:
                # parent
                return
            # subprocess.Popen() forks again, all we need to add is
            # flag the new process as a new session.
            if sys.version_info < (3, 2):
                newsession = {'preexec_fn': os.setsid}
            else:
                newsession = {'start_new_session': True}
            try:
                # connect stdin to devnull to make sure the subprocess can't
                # muck up that stream for mercurial.
                subprocess.Popen(
                    script, shell=True, stdin=open(os.devnull, 'r'), env=env,
                    close_fds=True, **newsession)
            finally:
                # mission accomplished, this child needs to exit and not
                # continue the hg process here.
                os._exit(0)

    class logtoprocessui(ui.__class__):
        def log(self, event, *msg, **opts):
            """Map log events to external commands

            Arguments are passed on as environment variables.

            """
            script = self.config('logtoprocess', event)
            if script:
                if msg:
                    # try to format the log message given the remaining
                    # arguments
                    try:
                        # Python string formatting with % either uses a
                        # dictionary *or* tuple, but not both. If we have
                        # keyword options, assume we need a mapping.
                        formatted = msg[0] % (opts or msg[1:])
                    except (TypeError, KeyError):
                        # Failed to apply the arguments, ignore
                        formatted = msg[0]
                    messages = (formatted,) + msg[1:]
                else:
                    messages = msg
                # positional arguments are listed as MSG[N] keys in the
                # environment
                msgpairs = (
                    ('MSG{0:d}'.format(i), str(m))
                    for i, m in enumerate(messages, 1))
                # keyword arguments get prefixed with OPT_ and uppercased
                optpairs = (
                    ('OPT_{0}'.format(key.upper()), str(value))
                    for key, value in opts.iteritems())
                env = dict(itertools.chain(os.environ.items(),
                                           msgpairs, optpairs),
                           EVENT=event, HGPID=str(os.getpid()))
                # Connect stdin to /dev/null to prevent child processes messing
                # with mercurial's stdin.
                runshellcommand(script, env)
            return super(logtoprocessui, self).log(event, *msg, **opts)

    # Replace the class for this instance and all clones created from it:
    ui.__class__ = logtoprocessui
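To see how a ui.log() call maps to the environment described in the module docstring, the mapping can be pulled out into a standalone function. This is a sketch only: build_env is a hypothetical helper that is not part of the extension, and it uses dict.items() so that it also runs on Python 3.

```python
import os


def build_env(event, *msg, **opts):
    """Hypothetical helper mirroring the environment built in log()."""
    if msg:
        try:
            # % formatting takes a mapping *or* a tuple, never both;
            # keyword options win when present.
            formatted = msg[0] % (opts or msg[1:])
        except (TypeError, KeyError):
            # Formatting failed, so fall back to the raw message.
            formatted = msg[0]
        messages = (formatted,) + msg[1:]
    else:
        messages = msg
    env = dict(os.environ)
    # Positional arguments become MSG1, MSG2, ...
    for i, m in enumerate(messages, 1):
        env['MSG%d' % i] = str(m)
    # Keyword arguments become OPT_<UPPERCASED KEY>.
    for key, value in opts.items():
        env['OPT_' + key.upper()] = str(value)
    env['EVENT'] = event
    env['HGPID'] = str(os.getpid())
    return env


env = build_env('foo', 'bar', 'baz', spam='eggs')
print(env['MSG1'], env['MSG2'], env['OPT_SPAM'], env['EVENT'])
```

Note that the first message is formatted with the remaining arguments, and those arguments are still exported individually as MSG2, MSG3, and so on.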