tests/test-transaction-safety.t
author Georges Racinet <georges.racinet@octobus.net>
Tue, 20 Jul 2021 17:20:19 +0200
changeset 47909 de2e04fe4897
parent 47324 b1ce93dcdf3c
child 47780 4870a8dc24d9
permissions -rw-r--r--
hgwebdir: avoid systematic full garbage collection

Forcing a systematic full garbage collection upon each request can seriously harm performance. This is reported as https://bz.mercurial-scm.org/show_bug.cgi?id=6075

With this change we're performing the full collection according to a new setting, `experimental.web.full-garbage-collection-rate`. The default value is 1, which doesn't change the behavior and will allow us to test on real use cases. If the value is 0, no full garbage collection occurs.

Regardless of the value of the setting, a partial garbage collection still occurs upon each request (not attempting to collect objects from the oldest generation). This should be enough to take care of reference cycles that have been created by the last request (assessing this requires setting the value to something other than 1).

In my experience chasing memory leaks in Mercurial servers, the full collection never reclaimed any memory, but this is with Python 3 and biased towards small repositories.

On the other hand, as explained in the Python developer docs [1], frequent full collections are very harmful in terms of performance if lots of objects survive the collection, and hence stay in the oldest generation. Note that `gc.collect()` is indeed trying to collect the oldest generation [2]. This usually happens in two cases:

- unwanted lingering objects (i.e., an actual memory leak that the GC cannot do anything about). Sadly, we have lots of those these days.
- desirable long-term objects, typically in caches (not inner caches carried by repositories, which should be collected with them). This is a subject of interest for the Heptapod project.

In short, the flat rate that this change still permits is probably a bad idea in most cases, and the default value can be tweaked later on (or even be set to 0) according to experiments in the wild.

The test is inspired by test-hgwebdir-paths.py

[1] https://devguide.python.org/garbage_collector/#collecting-the-oldest-generation
[2] https://docs.python.org/3/library/gc.html#gc.collect

Differential Revision: https://phab.mercurial-scm.org/D11204

Test transaction safety
=======================

#testcases revlogv1 revlogv2 changelogv2

#if revlogv1

  $ cat << EOF >> $HGRCPATH
  > [experimental]
  > revlogv2=no
  > EOF

#endif

#if revlogv2

  $ cat << EOF >> $HGRCPATH
  > [experimental]
  > revlogv2=enable-unstable-format-and-corrupt-my-data
  > EOF

#endif

#if changelogv2

  $ cat << EOF >> $HGRCPATH
  > [format]
  > exp-use-changelog-v2=enable-unstable-format-and-corrupt-my-data
  > EOF

#endif

This tests the basic case, making sure external processes do not see
transaction content until it is committed.

# TODO: also add an external reader accessing revlog files while they are written
#       (instead of during transaction finalisation)

# TODO: also add stream clone and hardlink clone happening during these transactions.

setup
-----

synchronisation+output script:

  $ mkdir sync
  $ mkdir output
  $ mkdir script
  $ HG_TEST_FILE_EXT_WAITING=$TESTTMP/sync/ext_waiting
  $ export HG_TEST_FILE_EXT_WAITING
  $ HG_TEST_FILE_EXT_UNLOCK=$TESTTMP/sync/ext_unlock
  $ export HG_TEST_FILE_EXT_UNLOCK
  $ HG_TEST_FILE_EXT_DONE=$TESTTMP/sync/ext_done
  $ export HG_TEST_FILE_EXT_DONE
  $ cat << EOF > script/external.sh
  > #!/bin/sh
  > "$RUNTESTDIR/testlib/wait-on-file" 5 "$HG_TEST_FILE_EXT_UNLOCK" "$HG_TEST_FILE_EXT_WAITING"
  > hg log --rev 'tip' -T 'external: {rev} {desc}\n' > "$TESTTMP/output/external.out"
  > touch "$HG_TEST_FILE_EXT_DONE"
  > EOF
  $ cat << EOF > script/internal.sh
  > #!/bin/sh
  > hg log --rev 'tip' -T 'internal: {rev} {desc}\n' > "$TESTTMP/output/internal.out"
  > "$RUNTESTDIR/testlib/wait-on-file" 5 "$HG_TEST_FILE_EXT_DONE" "$HG_TEST_FILE_EXT_UNLOCK"
  > EOF
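
The handshake between the two scripts above can be sketched in plain shell.
In this sketch, `wait_for_file` is a hypothetical stand-in for
`testlib/wait-on-file`, assumed here to create its optional third argument and
then poll for its second argument until a timeout; the real helper may differ
in details. The `echo` lines stand in for the `hg log` calls.

```shell
# Hypothetical stand-in for testlib/wait-on-file (assumed behaviour):
# create the optional marker file, then poll until WAIT_ON exists.
wait_for_file() {
    tries=$1
    wait_on=$2
    create=${3:-}
    if [ -n "$create" ]; then
        touch "$create"
    fi
    while [ "$tries" -gt 0 ] && [ ! -e "$wait_on" ]; do
        sleep 1
        tries=$((tries - 1))
    done
    [ -e "$wait_on" ]
}

sync_dir=$(mktemp -d)

# "external" reader: signal that it is waiting, block until unlocked,
# record its observation, then signal completion.
(
    wait_for_file 5 "$sync_dir/unlock" "$sync_dir/waiting"
    echo "external ran" >> "$sync_dir/log"
    touch "$sync_dir/done"
) &

# "internal" side (the hook in the test): record its observation first,
# then unlock the reader and wait until the reader is done.
echo "internal ran" >> "$sync_dir/log"
wait_for_file 5 "$sync_dir/done" "$sync_dir/unlock"
wait

result=$(cat "$sync_dir/log")
echo "$result"
rm -rf "$sync_dir"
```

Because the reader is only unlocked after the internal side has recorded its
view, and the internal side only returns once the reader is done, the reader
always runs while the internal side is still blocked. In the test this means
the external `hg log` runs while the transaction is still open.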


Automated commands:

  $ make_one_commit() {
  > rm -f $TESTTMP/sync/*
  > rm -f $TESTTMP/output/*
  > hg log --rev 'tip' -T 'pre-commit: {rev} {desc}\n'
  > echo x >> a
  > sh $TESTTMP/script/external.sh & hg commit -m "$1"
  > cat $TESTTMP/output/external.out
  > cat $TESTTMP/output/internal.out
  > hg log --rev 'tip' -T 'post-tr:  {rev} {desc}\n'
  > }


  $ make_one_pull() {
  > rm -f $TESTTMP/sync/*
  > rm -f $TESTTMP/output/*
  > hg log --rev 'tip' -T 'pre-commit: {rev} {desc}\n'
  > echo x >> a
  > sh $TESTTMP/script/external.sh & hg pull ../other-repo/ --rev "$1" --force --quiet
  > cat $TESTTMP/output/external.out
  > cat $TESTTMP/output/internal.out
  > hg log --rev 'tip' -T 'post-tr:  {rev} {desc}\n'
  > }

prepare a large source to pull from:

The source is large to ensure we no longer use inline revlogs after the pull

  $ hg init other-repo
  $ hg -R other-repo debugbuilddag .+500


prepare an empty repository in which to run the tests:

  $ hg init repo
  $ cd repo
  $ touch a
  $ hg add a

prepare a small extension to control inline size

  $ mkdir $TESTTMP/ext
  $ cat << EOF > $TESTTMP/ext/small_inline.py
  > from mercurial import revlog
  > revlog._maxinline = 64 * 100
  > EOF




  $ cat << EOF >> $HGRCPATH
  > [extensions]
  > small_inline=$TESTTMP/ext/small_inline.py
  > [hooks]
  > pretxnclose = sh $TESTTMP/script/internal.sh
  > EOF

check this is true for the initial commit (inline → inline)
-----------------------------------------------------------

the repository should still be inline (for relevant format)

  $ make_one_commit first
  pre-commit: -1 
  external: -1 
  internal: 0 first
  post-tr:  0 first

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

check this is true for extra commit (inline → inline)
-----------------------------------------------------

the repository should still be inline (for relevant format)

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

  $ make_one_commit second
  pre-commit: 0 first
  external: 0 first
  internal: 1 second
  post-tr:  1 second

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

check this is true for a small pull (inline → inline)
-----------------------------------------------------

the repository should still be inline (for relevant format)

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

  $ make_one_pull 3
  pre-commit: 1 second
  warning: repository is unrelated
  external: 1 second
  internal: 5 r3
  post-tr:  5 r3

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

Make a large pull (inline → no-inline)
---------------------------------------

the repository should no longer be inline (for relevant format)

#if revlogv1

  $ hg debugrevlog -c | grep inline
  flags  : inline

#endif

  $ make_one_pull 400
  pre-commit: 5 r3
  external: 5 r3
  internal: 402 r400
  post-tr:  402 r400

#if revlogv1

  $ hg debugrevlog -c | grep inline
  [1]

#endif
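
The `[1]` expected above is the exit status of the pipeline: `grep` exits with
status 1 when no line matches, which is how the test asserts that the `inline`
flag is gone. A minimal sketch of that behaviour, using made-up sample text
rather than real `hg debugrevlog` output:

```shell
# Sample text is illustrative only, not actual debugrevlog output.
printf 'format : 1\nflags  : inline\n' | grep inline
match_status=$?

printf 'format : 1\nflags  : (none)\n' | grep inline
nomatch_status=$?

echo "match: $match_status, no match: $nomatch_status"
```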

check this is true for extra commit (no-inline → no-inline)
-----------------------------------------------------------

the repository should no longer be inline (for relevant format)

#if revlogv1

  $ hg debugrevlog -c | grep inline
  [1]

#endif

  $ make_one_commit third
  pre-commit: 402 r400
  external: 402 r400
  internal: 403 third
  post-tr:  403 third

#if revlogv1

  $ hg debugrevlog -c | grep inline
  [1]

#endif


Make a pull (no-inline → no-inline)
-----------------------------------

the repository should no longer be inline (for relevant format)

#if revlogv1

  $ hg debugrevlog -c | grep inline
  [1]

#endif

  $ make_one_pull tip
  pre-commit: 403 third
  external: 403 third
  internal: 503 r500
  post-tr:  503 r500

#if revlogv1

  $ hg debugrevlog -c | grep inline
  [1]

#endif