view tests/test-wireproto-command-filedata.t @ 44363:f7459da77f23

nodemap: introduce an option to use mmap to read the nodemap mapping The performance and memory benefit is much greater if we don't have to copy all the data in memory for each information. So we introduce an option (on by default) to read the data using mmap. This changeset is the last one definition the API for index support nodemap data. (they have to be able to use the mmaping). Below are some benchmark comparing the best we currently have in 5.3 with the final step of this series (using the persistent nodemap implementation in Rust). The benchmark run `hg perfindex` with various revset and the following variants: Before: * do not use the persistent nodemap * use the CPython implementation of the index for nodemap * use mmapping of the changelog index After: * use the MixedIndex Rust code, with the NodeTree object for nodemap access (still in review) * use the persistent nodemap data from disk * access the persistent nodemap data through mmap * use mmapping of the changelog index The persistent nodemap greatly speed up most operation on very large repositories. Some of the previously very fast lookup end up a bit slower because the persistent nodemap has to be setup. However the absolute slowdown is very small and won't matters in the big picture. Here are some numbers (in seconds) for the reference copy of mozilla-try: Revset Before After abs-change speedup -10000: 0.004622 0.005532 0.000910 × 0.83 -10: 0.000050 0.000132 0.000082 × 0.37 tip 0.000052 0.000085 0.000033 × 0.61 0 + (-10000:) 0.028222 0.005337 -0.022885 × 5.29 0 0.023521 0.000084 -0.023437 × 280.01 (-10000:) + 0 0.235539 0.005308 -0.230231 × 44.37 (-10:) + :9 0.232883 0.000180 -0.232703 ×1293.79 (-10000:) + (:99) 0.238735 0.005358 -0.233377 × 44.55 :99 + (-10000:) 0.317942 0.005593 -0.312349 × 56.84 :9 + (-10:) 0.313372 0.000179 -0.313193 ×1750.68 :9 0.316450 0.000143 -0.316307 ×2212.93 On smaller repositories, the cost of nodemap related operation is not as big, so the win is much more modest. Yet it helps shaving a handful of millisecond here and there. Here are some numbers (in seconds) for the reference copy of mercurial: Revset Before After abs-change speedup -10: 0.000065 0.000097 0.000032 × 0.67 tip 0.000063 0.000078 0.000015 × 0.80 0 0.000561 0.000079 -0.000482 × 7.10 -10000: 0.004609 0.003648 -0.000961 × 1.26 0 + (-10000:) 0.005023 0.003715 -0.001307 × 1.35 (-10:) + :9 0.002187 0.000108 -0.002079 ×20.25 (-10000:) + 0 0.006252 0.003716 -0.002536 × 1.68 (-10000:) + (:99) 0.006367 0.003707 -0.002660 × 1.71 :9 + (-10:) 0.003846 0.000110 -0.003736 ×34.96 :9 0.003854 0.000099 -0.003755 ×38.92 :99 + (-10000:) 0.007644 0.003778 -0.003866 × 2.02 Differential Revision: https://phab.mercurial-scm.org/D7894
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Tue, 11 Feb 2020 11:18:52 +0100
parents ca6372b7e566
children 95c4cca641f6
line wrap: on
line source

  $ . $TESTDIR/wireprotohelpers.sh

  $ hg init server
  $ enablehttpv2 server
  $ cd server
  $ cat > a << EOF
  > a0
  > 00000000000000000000000000000000000000
  > 11111111111111111111111111111111111111
  > EOF
  $ echo b0 > b
  $ mkdir -p dir0/child0 dir0/child1 dir1
  $ echo c0 > dir0/c
  $ echo d0 > dir0/d
  $ echo e0 > dir0/child0/e
  $ echo f0 > dir0/child1/f
  $ hg -q commit -A -m 'commit 0'

  $ echo a1 >> a
  $ echo d1 > dir0/d
  $ hg commit -m 'commit 1'
  $ echo f1 > dir0/child1/f
  $ hg commit -m 'commit 2'

  $ hg -q up -r 0
  $ echo a2 >> a
  $ hg commit -m 'commit 3'
  created new head

Create multiple heads introducing the same changeset

  $ hg -q up -r 0
  $ echo foo > dupe-file
  $ hg commit -Am 'dupe 1'
  adding dupe-file
  created new head
  $ hg -q up -r 0
  $ echo foo > dupe-file
  $ hg commit -Am 'dupe 2'
  adding dupe-file
  created new head

  $ hg log -G -T '{rev}:{node} {desc}\n'
  @  5:732c3dd7bee94242de656000e5f458e7ccfe2828 dupe 2
  |
  | o  4:4334f10897d13c3e8beb4b636f7272b4ec2d0322 dupe 1
  |/
  | o  3:5ce944d7fece1252dae06c34422b573c191b9489 commit 3
  |/
  | o  2:b3c27db01410dae01e5485d425b1440078df540c commit 2
  | |
  | o  1:3ef5e551f219ba505481d34d6b0316b017fa3f00 commit 1
  |/
  o  0:91b232a2253ce0638496f67bdfd7a4933fb51b25 commit 0
  

  $ hg --debug debugindex a
     rev linkrev nodeid                                   p1                                       p2
       0       0 649d149df43d83882523b7fb1e6a3af6f1907b39 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000
       1       1 0a86321f1379d1a9ecd0579a22977af7a5acaf11 649d149df43d83882523b7fb1e6a3af6f1907b39 0000000000000000000000000000000000000000
       2       3 7e5801b6d5f03a5a54f3c47b583f7567aad43e5b 649d149df43d83882523b7fb1e6a3af6f1907b39 0000000000000000000000000000000000000000

  $ hg --debug debugindex dir0/child0/e
     rev linkrev nodeid                                   p1                                       p2
       0       0 bbba6c06b30f443d34ff841bc985c4d0827c6be4 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000

  $ hg --debug debugindex dupe-file
     rev linkrev nodeid                                   p1                                       p2
       0       4 2ed2a3912a0b24502043eae84ee4b279c18b90dd 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000

  $ hg serve -p $HGPORT -d --pid-file hg.pid -E error.log
  $ cat hg.pid > $DAEMON_PIDS

Missing arguments is an error

  $ sendhttpv2peer << EOF
  > command filedata
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  abort: missing required arguments: nodes, path!
  [255]

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[]
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  abort: missing required arguments: path!
  [255]

Unknown node is an error

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa']
  >     path eval:b'a'
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  abort: unknown file node: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!
  [255]

Fetching a single revision returns just metadata by default

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    }
  ]

Requesting parents works

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  >     fields eval:[b'parents']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11',
      b'parents': [
        b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    }
  ]

Requesting revision data works
(haveparents defaults to False, so fulltext is emitted)

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  >     fields eval:[b'revision']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          84
        ]
      ],
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    },
    b'a0\n00000000000000000000000000000000000000\n11111111111111111111111111111111111111\na1\n'
  ]

haveparents=False should be same as above

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  >     fields eval:[b'revision']
  >     haveparents eval:False
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          84
        ]
      ],
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    },
    b'a0\n00000000000000000000000000000000000000\n11111111111111111111111111111111111111\na1\n'
  ]

haveparents=True should emit a delta

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  >     fields eval:[b'revision']
  >     haveparents eval:True
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'deltabasenode': b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9',
      b'fieldsfollowing': [
        [
          b'delta',
          15
        ]
      ],
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    },
    b'\x00\x00\x00Q\x00\x00\x00Q\x00\x00\x00\x03a1\n'
  ]

Requesting multiple revisions works
(first revision is a fulltext since haveparents=False by default)

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x64\x9d\x14\x9d\xf4\x3d\x83\x88\x25\x23\xb7\xfb\x1e\x6a\x3a\xf6\xf1\x90\x7b\x39', b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11']
  >     path eval:b'a'
  >     fields eval:[b'revision']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 2
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          81
        ]
      ],
      b'node': b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9'
    },
    b'a0\n00000000000000000000000000000000000000\n11111111111111111111111111111111111111\n',
    {
      b'deltabasenode': b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9',
      b'fieldsfollowing': [
        [
          b'delta',
          15
        ]
      ],
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    },
    b'\x00\x00\x00Q\x00\x00\x00Q\x00\x00\x00\x03a1\n'
  ]

Revisions are sorted by DAG order, parents first

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x0a\x86\x32\x1f\x13\x79\xd1\xa9\xec\xd0\x57\x9a\x22\x97\x7a\xf7\xa5\xac\xaf\x11', b'\x64\x9d\x14\x9d\xf4\x3d\x83\x88\x25\x23\xb7\xfb\x1e\x6a\x3a\xf6\xf1\x90\x7b\x39']
  >     path eval:b'a'
  >     fields eval:[b'revision']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 2
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          81
        ]
      ],
      b'node': b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9'
    },
    b'a0\n00000000000000000000000000000000000000\n11111111111111111111111111111111111111\n',
    {
      b'deltabasenode': b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9',
      b'fieldsfollowing': [
        [
          b'delta',
          15
        ]
      ],
      b'node': b'\n\x862\x1f\x13y\xd1\xa9\xec\xd0W\x9a"\x97z\xf7\xa5\xac\xaf\x11'
    },
    b'\x00\x00\x00Q\x00\x00\x00Q\x00\x00\x00\x03a1\n'
  ]

Requesting parents and revision data works

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x7e\x58\x01\xb6\xd5\xf0\x3a\x5a\x54\xf3\xc4\x7b\x58\x3f\x75\x67\xaa\xd4\x3e\x5b']
  >     path eval:b'a'
  >     fields eval:[b'parents', b'revision']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          84
        ]
      ],
      b'node': b'~X\x01\xb6\xd5\xf0:ZT\xf3\xc4{X?ug\xaa\xd4>[',
      b'parents': [
        b'd\x9d\x14\x9d\xf4=\x83\x88%#\xb7\xfb\x1ej:\xf6\xf1\x90{9',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    },
    b'a0\n00000000000000000000000000000000000000\n11111111111111111111111111111111111111\na2\n'
  ]

Linknode for duplicate revision is the initial revision

  $ sendhttpv2peer << EOF
  > command filedata
  >     nodes eval:[b'\x2e\xd2\xa3\x91\x2a\x0b\x24\x50\x20\x43\xea\xe8\x4e\xe4\xb2\x79\xc1\x8b\x90\xdd']
  >     path eval:b'dupe-file'
  >     fields eval:[b'linknode', b'parents', b'revision']
  > EOF
  creating http peer for wire protocol version 2
  sending filedata command
  response: gen[
    {
      b'totalitems': 1
    },
    {
      b'fieldsfollowing': [
        [
          b'revision',
          4
        ]
      ],
      b'linknode': b'C4\xf1\x08\x97\xd1<>\x8b\xebKcorr\xb4\xec-\x03"',
      b'node': b'.\xd2\xa3\x91*\x0b$P C\xea\xe8N\xe4\xb2y\xc1\x8b\x90\xdd',
      b'parents': [
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      ]
    },
    b'foo\n'
  ]

  $ cat error.log