tests/test-hgweb-non-interactive.t
author Gregory Szorc <gregory.szorc@gmail.com>
Mon, 22 Aug 2016 21:48:50 -0700
changeset 29830 92ac2baaea86
parent 28861 86db5cb55d46
child 31008 636cf3f7620d
permissions -rw-r--r--
revlog: use an LRU cache for delta chain bases

Profiling using statprof revealed a hotspot during changegroup application calculating delta chain bases on generaldelta repos. Essentially, revlog._addrevision() was performing a lot of redundant work tracing the delta chain as part of determining when the chain distance was acceptable. This was most pronounced when adding revisions to manifests, which can have delta chains thousands of revisions long.

There was a delta chain base cache on revlogs before, but it only captured a single revision. This was acceptable before generaldelta, when _addrevision would build deltas from the previous revision and thus we'd pretty much guarantee a cache hit when resolving the delta chain base on a subsequent _addrevision call. However, it isn't suitable for generaldelta because parent revisions aren't necessarily the last processed revision.

This patch converts the delta chain base cache to an LRU dict cache. The cache can hold multiple entries, so generaldelta repos have a higher chance of getting a cache hit.

The impact of this change when processing changegroup additions is significant. On a generaldelta conversion of the "mozilla-unified" repo (which contains heads of the main Firefox repositories in chronological order - this means there are lots of transitions between heads in revlog order), this change has the following impact when performing an `hg unbundle` of an uncompressed bundle of the repo:

before: 5:42 CPU time
after:  4:34 CPU time

Most of this time is saved when applying the changelog and manifest revlogs:

before: 2:30 CPU time
after:  1:17 CPU time

That's nearly a 50% reduction in CPU time applying changesets and manifests!

Applying a gzipped bundle of the same repo (effectively simulating a `hg clone` over HTTP) showed a similar speedup:

before: 5:53 CPU time
after:  4:46 CPU time

Wall time improvements were basically the same as CPU time.

I didn't measure explicitly, but it feels like most of the time is saved when processing manifests. This makes sense, as large manifests tend to have very long delta chains and thus benefit the most from this cache.

So, this change effectively makes changegroup application (which is used by `hg unbundle`, `hg clone`, `hg pull`, `hg unshelve`, and various other commands) significantly faster when delta chains are long (which can happen on repos with large numbers of files and thus large manifests).

In theory, this change can result in more memory utilization. However, we're caching a dict of ints. At most we have 200 ints + Python object overhead per revlog. And, the cache is really only populated when performing read-heavy operations, such as adding changegroups or scanning an individual revlog. For memory bloat to be an issue, we'd need to scan/read several revisions from several revlogs all while having active references to several revlogs. I don't think there are many operations that do this, so I don't think memory bloat from the cache will be an issue.
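The caching idea in the commit message can be illustrated with a minimal sketch. It assumes a simplified revlog in which `delta_parent[r]` names the revision that `r` is stored as a delta against, and a revision that is its own delta parent is a full snapshot. `LRUDict`, `chain_base`, and `delta_parent` are hypothetical names chosen for illustration; this is not Mercurial's actual code.

```python
from collections import OrderedDict


class LRUDict(object):
    """Tiny least-recently-used mapping (hypothetical helper)."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._data:
            return None
        # Re-insert so the key becomes the most recently used entry.
        value = self._data.pop(key)
        self._data[key] = value
        return value

    def put(self, key, value):
        self._data.pop(key, None)
        self._data[key] = value
        if len(self._data) > self._capacity:
            # Evict the least recently used entry (the oldest one).
            self._data.popitem(last=False)


def chain_base(delta_parent, rev, cache):
    """Return the base revision of rev's delta chain, consulting the cache."""
    cached = cache.get(rev)
    if cached is not None:
        return cached
    # Cache miss: walk the chain until we reach a full snapshot.
    current = rev
    while delta_parent[current] != current:
        current = delta_parent[current]
    cache.put(rev, current)
    return current


# Revisions 0 and 5 are full snapshots; the rest are deltas.
delta_parent = [0, 0, 1, 1, 3, 5, 5]
cache = LRUDict(2)
base_of_4 = chain_base(delta_parent, 4, cache)  # walks 4 -> 3 -> 1 -> 0
base_of_6 = chain_base(delta_parent, 6, cache)  # walks 6 -> 5
```

With a single-entry cache, alternating lookups between revisions 4 and 6 would miss every time; holding several entries is what lets revisions on different heads hit the cache.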
Tests if hgweb can run without touching sys.stdin, as is required
by the WSGI standard and strictly implemented by mod_wsgi.
  $ hg init repo
  $ cd repo
  $ echo foo > bar
  $ hg add bar
  $ hg commit -m "test"
  $ cat > request.py <<EOF
  > from __future__ import absolute_import
  > import os
  > import sys
  > from mercurial import (
  >     dispatch,
  >     hg,
  >     ui as uimod,
  >     util,
  > )
  > ui = uimod.ui
  > from mercurial.hgweb.hgweb_mod import (
  >     hgweb,
  > )
  > stringio = util.stringio
  > 
  > class FileLike(object):
  >     def __init__(self, real):
  >         self.real = real
  >     def fileno(self):
  >         print >> sys.__stdout__, 'FILENO'
  >         return self.real.fileno()
  >     def read(self):
  >         print >> sys.__stdout__, 'READ'
  >         return self.real.read()
  >     def readline(self):
  >         print >> sys.__stdout__, 'READLINE'
  >         return self.real.readline()
  > 
  > sys.stdin = FileLike(sys.stdin)
  > errors = stringio()
  > input = stringio()
  > output = stringio()
  > 
  > def startrsp(status, headers):
  >     print '---- STATUS'
  >     print status
  >     print '---- HEADERS'
  >     print [i for i in headers if i[0] != 'ETag']
  >     print '---- DATA'
  >     return output.write
  > 
  > env = {
  >     'wsgi.version': (1, 0),
  >     'wsgi.url_scheme': 'http',
  >     'wsgi.errors': errors,
  >     'wsgi.input': input,
  >     'wsgi.multithread': False,
  >     'wsgi.multiprocess': False,
  >     'wsgi.run_once': False,
  >     'REQUEST_METHOD': 'GET',
  >     'SCRIPT_NAME': '',
  >     'PATH_INFO': '',
  >     'QUERY_STRING': '',
  >     'SERVER_NAME': '127.0.0.1',
  >     'SERVER_PORT': os.environ['HGPORT'],
  >     'SERVER_PROTOCOL': 'HTTP/1.0'
  > }
  > 
  > i = hgweb('.')
  > for c in i(env, startrsp):
  >     pass
  > print '---- ERRORS'
  > print errors.getvalue()
  > print '---- OS.ENVIRON wsgi variables'
  > print sorted([x for x in os.environ if x.startswith('wsgi')])
  > print '---- request.ENVIRON wsgi variables'
  > with i._obtainrepo() as repo:
  >     print sorted([x for x in repo.ui.environ if x.startswith('wsgi')])
  > EOF
  $ python request.py
  ---- STATUS
  200 Script output follows
  ---- HEADERS
  [('Content-Type', 'text/html; charset=ascii')]
  ---- DATA
  ---- ERRORS
  
  ---- OS.ENVIRON wsgi variables
  []
  ---- request.ENVIRON wsgi variables
  ['wsgi.errors', 'wsgi.input', 'wsgi.multiprocess', 'wsgi.multithread', 'wsgi.run_once', 'wsgi.url_scheme', 'wsgi.version']
  $ cd ..
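
The technique this test relies on (replace `sys.stdin` with a wrapper that reports every access, then call a WSGI application directly with a hand-built environ) can be sketched with the standard library alone. This is an illustration under stated assumptions, not part of the test: `LoggingStdin`, `quiet_app`, `nosy_app`, and `run_once` are hypothetical names, the applications are stand-ins rather than hgweb, the wrapped stream is an empty in-memory buffer so nothing blocks, and the code targets Python 3.

```python
import io
import sys


class LoggingStdin(object):
    """Wrap a file object and record each method that touches it."""

    def __init__(self, real, log):
        self._real = real
        self._log = log

    def fileno(self):
        self._log.append('FILENO')
        return self._real.fileno()

    def read(self, *args):
        self._log.append('READ')
        return self._real.read(*args)

    def readline(self, *args):
        self._log.append('READLINE')
        return self._real.readline(*args)


def quiet_app(environ, start_response):
    """Stand-in WSGI app that never looks at sys.stdin."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello\n']


def nosy_app(environ, start_response):
    """Stand-in WSGI app that wrongly reads from sys.stdin."""
    sys.stdin.readline()
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'']


def run_once(app):
    """Call app once and return (status, body, list of stdin accesses)."""
    touched = []
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers
        return lambda data: None  # legacy write() callable, unused here

    environ = {
        'wsgi.version': (1, 0),
        'wsgi.url_scheme': 'http',
        'wsgi.errors': io.StringIO(),
        'wsgi.input': io.BytesIO(),
        'wsgi.multithread': False,
        'wsgi.multiprocess': False,
        'wsgi.run_once': False,
        'REQUEST_METHOD': 'GET',
        'SCRIPT_NAME': '',
        'PATH_INFO': '',
        'QUERY_STRING': '',
        'SERVER_NAME': '127.0.0.1',
        'SERVER_PORT': '8000',  # arbitrary port chosen for the sketch
        'SERVER_PROTOCOL': 'HTTP/1.0',
    }
    saved = sys.stdin
    sys.stdin = LoggingStdin(io.StringIO(u''), touched)
    try:
        body = b''.join(app(environ, start_response))
    finally:
        sys.stdin = saved  # always restore the real stdin
    return captured['status'], body, touched
```

An application that behaves like a well-formed WSGI app leaves the access list empty, while one that reads from `sys.stdin` shows up in it, which is the same signal the `FILENO`, `READ`, and `READLINE` markers give in the test output.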