view hgext/logtoprocess.py @ 47343:9f798c1b0d89 stable

cext: fix memory leak in phases computation Without this a buffer whose size in bytes is the number of changesets in the repository is leaked each time the repository is opened and changeset phases are computed. Impact: the current code in hgwebdir creates a new `localrepository` instance for each HTTP request. Since any pull or push is made of several requests, a team of 100 people can easily produce thousands of such requests per day. Being a low-level malloc, this leak can't be seen with the gc module and tools relying on that, but was spotted by valgrind immediately. Reproduction ------------ for i in range(cl_args.iterations): repo = hg.repository(baseui, repo_path) rev = repo.revs(rev).first() ctx = repo[rev] del ctx del repo # avoid any pollution by other type of leak # (that should be fixed in 5.8) repoview._filteredrepotypes.clear() gc.collect() Measurements ------------ Resident Set Size (RSS), taken on a clone of mozilla-central for performance analysis (440 000 changesets). before: 5.8+hg19.5ac0f2a8ba72 1000 iterations: 1606MB 5.8+hg19.5ac0f2a8ba72 10000 iterations: 5723MB after: 5.8+hg20.e2084d39e145 1000 iterations: 555MB 5.8+hg20.e2084d39e145 10000 iterations: 555MB (double checked, not a copy/paste error) (e2084d39e14 is the present changeset, before amendment of the message to add the measurements)
author Georges Racinet <georges.racinet@octobus.net>
date Sun, 06 Jun 2021 01:24:30 +0200
parents 7c0b8652fd8c
children 6000f5b25c9b
line wrap: on
line source

# logtoprocess.py - send ui.log() data to a subprocess
#
# Copyright 2016 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
"""send ui.log() data to a subprocess (EXPERIMENTAL)

This extension lets you specify a shell command per ui.log() event,
sending all remaining arguments to as environment variables to that command.

Positional arguments construct a log message, which is passed in the `MSG1`
environment variables. Each keyword argument is set as a `OPT_UPPERCASE_KEY`
variable (so the key is uppercased, and prefixed with `OPT_`). The original
event name is passed in the `EVENT` environment variable, and the process ID
of mercurial is given in `HGPID`.

So given a call `ui.log('foo', 'bar %s\n', 'baz', spam='eggs'), a script
configured for the `foo` event can expect an environment with `MSG1=bar baz`,
and `OPT_SPAM=eggs`.

Scripts are configured in the `[logtoprocess]` section, each key an event name.
For example::

  [logtoprocess]
  commandexception = echo "$MSG1" > /var/log/mercurial_exceptions.log

would log the warning message and traceback of any failed command dispatch.

Scripts are run asynchronously as detached daemon processes; mercurial will
not ensure that they exit cleanly.

"""

from __future__ import absolute_import

import os

from mercurial.utils import procutil

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = b'ships-with-hg-core'


class processlogger(object):
    """Map log events to external commands

    Arguments are passed on as environment variables.
    """

    def __init__(self, ui):
        self._scripts = dict(ui.configitems(b'logtoprocess'))

    def tracked(self, event):
        return bool(self._scripts.get(event))

    def log(self, ui, event, msg, opts):
        script = self._scripts[event]
        maxmsg = 100000
        if len(msg) > maxmsg:
            # Each env var has a 128KiB limit on linux. msg can be long, in
            # particular for command event, where it's the full command line.
            # Prefer truncating the message than raising "Argument list too
            # long" error.
            msg = msg[:maxmsg] + b' (truncated)'
        env = {
            b'EVENT': event,
            b'HGPID': os.getpid(),
            b'MSG1': msg,
        }
        # keyword arguments get prefixed with OPT_ and uppercased
        env.update(
            (b'OPT_%s' % key.upper(), value) for key, value in opts.items()
        )
        fullenv = procutil.shellenviron(env)
        procutil.runbgcommand(script, fullenv, shell=True)


def uipopulate(ui):
    ui.setlogger(b'logtoprocess', processlogger(ui))