view tests/test-revlog-mmapindex.t @ 50400:95acba2c29f6

encoding: avoid quadratic time complexity when json-encoding non-UTF8 strings Apparently the code uses "+=" with a bytes object, which is linear-time, so the whole encoding is quadratic-time. This patch makes us use a bytearray object, instead, which has a(n amortized-)constant-time append operation. The encoding is still not particularly fast, but at least a 10MB file takes tens of seconds, not many hours to encode.
author Arseniy Alekseyev <aalekseyev@janestreet.com>
date Mon, 06 Mar 2023 11:27:57 +0000
parents 42d2b31cee0b
children dcaa2df1f688
line wrap: on
line source

create verbosemmap.py
  $ cat << EOF > verbosemmap.py
  > # extension to make util.mmapread verbose
  > 
  > 
  > from mercurial import (
  >     extensions,
  >     pycompat,
  >     util,
  > )
  > 
  > def extsetup(ui):
  >     def mmapread(orig, fp):
  >         ui.write(b"mmapping %s\n" % pycompat.bytestr(fp.name))
  >         ui.flush()
  >         return orig(fp)
  > 
  >     extensions.wrapfunction(util, 'mmapread', mmapread)
  > EOF

setting up base repo
  $ hg init a
  $ cd a
  $ touch a
  $ hg add a
  $ hg commit -qm base
  $ for i in `$TESTDIR/seq.py 1 100` ; do
  > echo $i > a
  > hg commit -qm $i
  > done

set up verbosemmap extension
  $ cat << EOF >> $HGRCPATH
  > [extensions]
  > verbosemmap=$TESTTMP/verbosemmap.py
  > EOF

mmap index which is now more than 4k long
  $ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=4k
  mmapping $TESTTMP/a/.hg/store/00changelog.i
  100
  99
  98
  97
  96

do not mmap index which is still less than 32k
  $ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=32k
  100
  99
  98
  97
  96

  $ cd ..