view tests/test-wireproto-caching.t @ 44363:f7459da77f23

nodemap: introduce an option to use mmap to read the nodemap mapping The performance and memory benefit is much greater if we don't have to copy all the data in memory for each information. So we introduce an option (on by default) to read the data using mmap. This changeset is the last one definition the API for index support nodemap data. (they have to be able to use the mmaping). Below are some benchmark comparing the best we currently have in 5.3 with the final step of this series (using the persistent nodemap implementation in Rust). The benchmark run `hg perfindex` with various revset and the following variants: Before: * do not use the persistent nodemap * use the CPython implementation of the index for nodemap * use mmapping of the changelog index After: * use the MixedIndex Rust code, with the NodeTree object for nodemap access (still in review) * use the persistent nodemap data from disk * access the persistent nodemap data through mmap * use mmapping of the changelog index The persistent nodemap greatly speed up most operation on very large repositories. Some of the previously very fast lookup end up a bit slower because the persistent nodemap has to be setup. However the absolute slowdown is very small and won't matters in the big picture. Here are some numbers (in seconds) for the reference copy of mozilla-try: Revset Before After abs-change speedup -10000: 0.004622 0.005532 0.000910 × 0.83 -10: 0.000050 0.000132 0.000082 × 0.37 tip 0.000052 0.000085 0.000033 × 0.61 0 + (-10000:) 0.028222 0.005337 -0.022885 × 5.29 0 0.023521 0.000084 -0.023437 × 280.01 (-10000:) + 0 0.235539 0.005308 -0.230231 × 44.37 (-10:) + :9 0.232883 0.000180 -0.232703 ×1293.79 (-10000:) + (:99) 0.238735 0.005358 -0.233377 × 44.55 :99 + (-10000:) 0.317942 0.005593 -0.312349 × 56.84 :9 + (-10:) 0.313372 0.000179 -0.313193 ×1750.68 :9 0.316450 0.000143 -0.316307 ×2212.93 On smaller repositories, the cost of nodemap related operation is not as big, so the win is much more modest. Yet it helps shaving a handful of millisecond here and there. Here are some numbers (in seconds) for the reference copy of mercurial: Revset Before After abs-change speedup -10: 0.000065 0.000097 0.000032 × 0.67 tip 0.000063 0.000078 0.000015 × 0.80 0 0.000561 0.000079 -0.000482 × 7.10 -10000: 0.004609 0.003648 -0.000961 × 1.26 0 + (-10000:) 0.005023 0.003715 -0.001307 × 1.35 (-10:) + :9 0.002187 0.000108 -0.002079 ×20.25 (-10000:) + 0 0.006252 0.003716 -0.002536 × 1.68 (-10000:) + (:99) 0.006367 0.003707 -0.002660 × 1.71 :9 + (-10:) 0.003846 0.000110 -0.003736 ×34.96 :9 0.003854 0.000099 -0.003755 ×38.92 :99 + (-10000:) 0.007644 0.003778 -0.003866 × 2.02 Differential Revision: https://phab.mercurial-scm.org/D7894
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Tue, 11 Feb 2020 11:18:52 +0100
parents a0886a4d6dce
children 95c4cca641f6
line wrap: on
line source

  $ . $TESTDIR/wireprotohelpers.sh
  $ cat >> $HGRCPATH << EOF
  > [extensions]
  > blackbox =
  > [blackbox]
  > track = simplecache
  > EOF
  $ hg init server
  $ enablehttpv2 server
  $ cd server
  $ cat >> .hg/hgrc << EOF
  > [extensions]
  > simplecache = $TESTDIR/wireprotosimplecache.py
  > EOF

  $ echo a0 > a
  $ echo b0 > b
  $ hg -q commit -A -m 'commit 0'
  $ echo a1 > a
  $ hg commit -m 'commit 1'
  $ echo b1 > b
  $ hg commit -m 'commit 2'
  $ echo a2 > a
  $ echo b2 > b
  $ hg commit -m 'commit 3'

  $ hg log -G -T '{rev}:{node} {desc}'
  @  3:50590a86f3ff5d1e9a1624a7a6957884565cc8e8 commit 3
  |
  o  2:4d01eda50c6ac5f7e89cbe1880143a32f559c302 commit 2
  |
  o  1:4432d83626e8a98655f062ec1f2a43b07f7fbbb0 commit 1
  |
  o  0:3390ef850073fbc2f0dfff2244342c8e9229013a commit 0
  

  $ hg --debug debugindex -m
     rev linkrev nodeid                                   p1                                       p2
       0       0 992f4779029a3df8d0666d00bb924f69634e2641 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000
       1       1 a988fb43583e871d1ed5750ee074c6d840bbbfc8 992f4779029a3df8d0666d00bb924f69634e2641 0000000000000000000000000000000000000000
       2       2 a8853dafacfca6fc807055a660d8b835141a3bb4 a988fb43583e871d1ed5750ee074c6d840bbbfc8 0000000000000000000000000000000000000000
       3       3 3fe11dfbb13645782b0addafbe75a87c210ffddc a8853dafacfca6fc807055a660d8b835141a3bb4 0000000000000000000000000000000000000000

  $ hg serve -p $HGPORT -d --pid-file hg.pid -E error.log
  $ cat hg.pid > $DAEMON_PIDS

Performing the same request should result in same result, with 2nd response
coming from cache.

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\x99\x2f\x47\x79\x02\x9a\x3d\xf8\xd0\x66\x6d\x00\xbb\x92\x4f\x69\x63\x4e\x26\x41']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\x99\x2f\x47\x79\x02\x9a\x3d\xf8\xd0\x66\x6d\x00\xbb\x92\x4f\x69\x63\x4e\x26\x41']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

Sending different request doesn't yield cache hit.

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\x99\x2f\x47\x79\x02\x9a\x3d\xf8\xd0\x66\x6d\x00\xbb\x92\x4f\x69\x63\x4e\x26\x41', b'\xa9\x88\xfb\x43\x58\x3e\x87\x1d\x1e\xd5\x75\x0e\xe0\x74\xc6\xd8\x40\xbb\xbf\xc8']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  response: gen[
    {
      b'totalitems': 2
    },
    {
      b'node': b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    },
    {
      b'node': b'\xa9\x88\xfbCX>\x87\x1d\x1e\xd5u\x0e\xe0t\xc6\xd8@\xbb\xbf\xc8',
      b'parents': [
        b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

  $ cat .hg/blackbox.log
  *> cacher constructed for manifestdata (glob)
  *> cache miss for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)
  *> storing cache entry for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)
  *> cacher constructed for manifestdata (glob)
  *> cache hit for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)
  *> cacher constructed for manifestdata (glob)
  *> cache miss for 37326a83e9843f15161fce9d1e92d06b795d5e8e (glob)
  *> storing cache entry for 37326a83e9843f15161fce9d1e92d06b795d5e8e (glob)

  $ cat error.log

  $ killdaemons.py
  $ rm .hg/blackbox.log

Try with object caching mode

  $ cat >> .hg/hgrc << EOF
  > [simplecache]
  > cacheobjects = true
  > EOF

  $ hg serve -p $HGPORT -d --pid-file hg.pid -E error.log
  $ cat hg.pid > $DAEMON_PIDS

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\x99\x2f\x47\x79\x02\x9a\x3d\xf8\xd0\x66\x6d\x00\xbb\x92\x4f\x69\x63\x4e\x26\x41']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\x99\x2f\x47\x79\x02\x9a\x3d\xf8\xd0\x66\x6d\x00\xbb\x92\x4f\x69\x63\x4e\x26\x41']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\x99/Gy\x02\x9a=\xf8\xd0fm\x00\xbb\x92OicN&A',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

  $ cat .hg/blackbox.log
  *> cacher constructed for manifestdata (glob)
  *> cache miss for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)
  *> storing cache entry for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)
  *> cacher constructed for manifestdata (glob)
  *> cache hit for 47abb8efa5f01b8964d74917793ad2464db0fa2c (glob)

  $ cat error.log

  $ killdaemons.py
  $ rm .hg/blackbox.log

A non-cacheable command does not instantiate cacher

  $ hg serve -p $HGPORT -d --pid-file hg.pid -E error.log
  $ cat hg.pid > $DAEMON_PIDS
  $ sendhttpv2peer << EOF
  > command capabilities
  > EOF
  creating http peer for wire protocol version 2
  sending capabilities command
  response: gen[
    {
      b'commands': {
        b'branchmap': {
          b'args': {},
          b'permissions': [
            b'pull'
          ]
        },
        b'capabilities': {
          b'args': {},
          b'permissions': [
            b'pull'
          ]
        },
        b'changesetdata': {
          b'args': {
            b'fields': {
              b'default': set([]),
              b'required': False,
              b'type': b'set',
              b'validvalues': set([
                b'bookmarks',
                b'parents',
                b'phase',
                b'revision'
              ])
            },
            b'revisions': {
              b'required': True,
              b'type': b'list'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'filedata': {
          b'args': {
            b'fields': {
              b'default': set([]),
              b'required': False,
              b'type': b'set',
              b'validvalues': set([
                b'linknode',
                b'parents',
                b'revision'
              ])
            },
            b'haveparents': {
              b'default': False,
              b'required': False,
              b'type': b'bool'
            },
            b'nodes': {
              b'required': True,
              b'type': b'list'
            },
            b'path': {
              b'required': True,
              b'type': b'bytes'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'filesdata': {
          b'args': {
            b'fields': {
              b'default': set([]),
              b'required': False,
              b'type': b'set',
              b'validvalues': set([
                b'firstchangeset',
                b'linknode',
                b'parents',
                b'revision'
              ])
            },
            b'haveparents': {
              b'default': False,
              b'required': False,
              b'type': b'bool'
            },
            b'pathfilter': {
              b'default': None,
              b'required': False,
              b'type': b'dict'
            },
            b'revisions': {
              b'required': True,
              b'type': b'list'
            }
          },
          b'permissions': [
            b'pull'
          ],
          b'recommendedbatchsize': 50000
        },
        b'heads': {
          b'args': {
            b'publiconly': {
              b'default': False,
              b'required': False,
              b'type': b'bool'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'known': {
          b'args': {
            b'nodes': {
              b'default': [],
              b'required': False,
              b'type': b'list'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'listkeys': {
          b'args': {
            b'namespace': {
              b'required': True,
              b'type': b'bytes'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'lookup': {
          b'args': {
            b'key': {
              b'required': True,
              b'type': b'bytes'
            }
          },
          b'permissions': [
            b'pull'
          ]
        },
        b'manifestdata': {
          b'args': {
            b'fields': {
              b'default': set([]),
              b'required': False,
              b'type': b'set',
              b'validvalues': set([
                b'parents',
                b'revision'
              ])
            },
            b'haveparents': {
              b'default': False,
              b'required': False,
              b'type': b'bool'
            },
            b'nodes': {
              b'required': True,
              b'type': b'list'
            },
            b'tree': {
              b'required': True,
              b'type': b'bytes'
            }
          },
          b'permissions': [
            b'pull'
          ],
          b'recommendedbatchsize': 100000
        },
        b'pushkey': {
          b'args': {
            b'key': {
              b'required': True,
              b'type': b'bytes'
            },
            b'namespace': {
              b'required': True,
              b'type': b'bytes'
            },
            b'new': {
              b'required': True,
              b'type': b'bytes'
            },
            b'old': {
              b'required': True,
              b'type': b'bytes'
            }
          },
          b'permissions': [
            b'push'
          ]
        },
        b'rawstorefiledata': {
          b'args': {
            b'files': {
              b'required': True,
              b'type': b'list'
            },
            b'pathfilter': {
              b'default': None,
              b'required': False,
              b'type': b'list'
            }
          },
          b'permissions': [
            b'pull'
          ]
        }
      },
      b'framingmediatypes': [
        b'application/mercurial-exp-framing-0006'
      ],
      b'pathfilterprefixes': set([
        b'path:',
        b'rootfilesin:'
      ]),
      b'rawrepoformats': [
        b'generaldelta',
        b'revlogv1',
        b'sparserevlog'
      ]
    }
  ]

  $ test -f .hg/blackbox.log
  [1]

An error is not cached

  $ sendhttpv2peer << EOF
  > command manifestdata
  >     nodes eval:[b'\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa']
  >     tree eval:b''
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending manifestdata command
  abort: unknown node: \xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa! (esc)
  [255]

  $ cat .hg/blackbox.log
  *> cacher constructed for manifestdata (glob)
  *> cache miss for 2cba2a7d0d1575fea2fe68f597e97a7c2ac2f705 (glob)
  *> cacher exiting due to error (glob)

  $ killdaemons.py
  $ rm .hg/blackbox.log