view tests/test-histedit-fold.t @ 40326:fed697fa1734

sqlitestore: file storage backend using SQLite

This commit provides an extension which uses SQLite to store file
data (as opposed to revlogs).

As the inline documentation describes, there are still several
aspects to the extension that are incomplete. But it's a start.
The extension does support basic clone, checkout, and commit
workflows, which makes it suitable for simple use cases.

One notable missing feature is support for "bundlerepos." This is
probably responsible for the most test failures when the extension
is activated as part of the test suite.

All revision data is stored in SQLite. Data is stored as zstd
compressed chunks (default if zstd is available), zlib compressed
chunks (default if zstd is not available), or raw chunks (if
configured or if a compressed delta is not smaller than the raw
delta). This makes things very similar to revlogs.

Unlike revlogs, the extension doesn't yet enforce a limit on delta
chain length. This is an obvious limitation and should be addressed.
This is somewhat mitigated by the use of zstd, which is much faster
than zlib to decompress.

There is a dedicated table for storing deltas. Deltas are stored
by the SHA-1 hash of their uncompressed content. The "fileindex"
table has columns that reference the delta for each revision and
the base delta that delta should be applied against. A recursive
SQL query is used to resolve the delta chain along with the delta
data.

By storing deltas by hash, we are able to de-duplicate delta storage!
With revlogs, the same deltas in different revlogs would result in
duplicate storage of that delta. In this scheme, inserting the
duplicate delta is a no-op and delta chains simply reference the
existing delta.

When initially implementing this extension, I did not have
content-indexed deltas and deltas could be duplicated across files
(just like revlogs). When I implemented content-indexed deltas, the
size of the SQLite database for a full clone of mozilla-unified
dropped:

  before: 2,554,261,504 bytes
  after:  2,488,754,176 bytes

Surprisingly, this is still larger than the bytes size of revlog
files:

  revlog files: 2,104,861,230 bytes
  du -b:        2,254,381,614

I would have expected storage to be smaller since we're not limiting
delta chain length and since we're using zstd instead of zlib. I
suspect the SQLite indexes and per-column overhead account for the
bulk of the differences. (Keep in mind that revlog uses a 64-byte
packed struct for revision index data and deltas are stored without
padding. Aside from the 12 unused bytes in the 32 byte node field,
revlogs are pretty efficient.) Another source of overhead is file
name storage. With revlogs, file names are stored in the filesystem.
But with SQLite, we need to store file names in the database. This is
roughly equivalent to the size of the fncache file, which for the
mozilla-unified repository is ~34MB.

Since the SQLite database isn't append-only and since delta chains
can reference any delta, this opens some interesting possibilities.

For example, we could store deltas in reverse, such that fulltexts
are stored for newer revisions and deltas are applied to reconstruct
older revisions. This is likely a more optimal storage strategy for
version control, as new data tends to be more frequently accessed
than old data. We would obviously need wire protocol support for
transferring revision data from newest to oldest. And we would
probably need some kind of mechanism for "re-encoding" stores. But
it should be doable.

This extension is very much experimental quality. There are a handful
of features that don't work. It probably isn't suitable for day-to-day
use. But it could be used in limited cases (e.g. read-only checkouts
like in CI). And it is also a good proving ground for alternate
storage backends. As we continue to define interfaces for all things
storage, it will be useful to have a viable alternate storage backend
to see how things shake out in practice.

test-storage.py passes on Python 2 and introduces no new test failures
on Python 3.

Having the storage-level unit tests has proved to be insanely useful
when developing this extension. Those tests caught numerous bugs
during development and I'm convinced this style of testing is the way
forward for ensuring alternate storage backends work as intended. Of
course, test coverage isn't close to what it needs to be. But it is
a start. And what coverage we have gives me confidence that basic
store functionality is implemented properly.

Differential Revision: https://phab.mercurial-scm.org/D4928
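
The content-indexed delta scheme described in the commit message can be illustrated with a small, self-contained sketch. This is not the extension's code: the table layout, column names, and the helpers `store_delta`, `add_revision`, and `resolve` are hypothetical, and the "delta" is a toy (text appended to its base) rather than a real binary delta. It only shows the two ideas named above: inserting a duplicate delta is a no-op, and a recursive SQL query walks the delta chain.

```python
import hashlib
import sqlite3

# Hypothetical, simplified schema; the extension's real tables differ.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE delta (
        id INTEGER PRIMARY KEY,
        hash BLOB UNIQUE NOT NULL,   -- SHA-1 of the uncompressed content
        data BLOB NOT NULL
    );
    CREATE TABLE fileindex (
        rev INTEGER PRIMARY KEY,
        path TEXT NOT NULL,
        deltaid INTEGER NOT NULL REFERENCES delta(id),
        basedeltaid INTEGER REFERENCES delta(id)  -- NULL for a chain root
    );
    """
)


def store_delta(data):
    """Insert a delta keyed by its SHA-1; a duplicate insert is a no-op."""
    digest = hashlib.sha1(data).digest()
    db.execute(
        "INSERT OR IGNORE INTO delta (hash, data) VALUES (?, ?)", (digest, data)
    )
    row = db.execute("SELECT id FROM delta WHERE hash = ?", (digest,)).fetchone()
    return row[0]


def add_revision(rev, path, data, base_rev=None):
    """Record a revision whose toy "delta" is text appended to its base."""
    base = None
    if base_rev is not None:
        base = db.execute(
            "SELECT deltaid FROM fileindex WHERE rev = ?", (base_rev,)
        ).fetchone()[0]
    db.execute(
        "INSERT INTO fileindex (rev, path, deltaid, basedeltaid) "
        "VALUES (?, ?, ?, ?)",
        (rev, path, store_delta(data), base),
    )


def resolve(path, rev):
    """Walk the delta chain with a recursive query and rebuild the text."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(deltaid, baseid, depth) AS (
            SELECT deltaid, basedeltaid, 0
            FROM fileindex WHERE path = :path AND rev = :rev
            UNION ALL
            SELECT f.deltaid, f.basedeltaid, chain.depth + 1
            FROM fileindex AS f JOIN chain ON f.deltaid = chain.baseid
            WHERE f.path = :path
        )
        SELECT delta.data FROM chain JOIN delta ON delta.id = chain.deltaid
        ORDER BY chain.depth DESC
        """,
        {"path": path, "rev": rev},
    ).fetchall()
    # Rows come back root first, so concatenation applies the toy deltas in order.
    return b"".join(row[0] for row in rows)


# Two files share the delta b"world\n"; it is stored only once.
add_revision(0, "a.txt", b"hello\n")
add_revision(1, "a.txt", b"world\n", base_rev=0)
add_revision(2, "b.txt", b"hi\n")
add_revision(3, "b.txt", b"world\n", base_rev=2)

delta_count = db.execute("SELECT COUNT(*) FROM delta").fetchone()[0]
```

In this sketch the chain walk filters on the file path, because a de-duplicated delta can be referenced from several files, each with a different base. The toy assumes a delta appears at most once per path.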
author Gregory Szorc <gregory.szorc@gmail.com>
date Tue, 09 Oct 2018 08:50:13 -0700
parents 02b5b5c1bba8
children 704a3aa3dc0a

Test histedit extension: Fold commands
======================================

This test file is dedicated to testing the fold command in the
non-conflicting case.

Initialization
--------------


  $ . "$TESTDIR/histedit-helpers.sh"

  $ cat >> $HGRCPATH <<EOF
  > [alias]
  > logt = log --template '{rev}:{node|short} {desc|firstline}\n'
  > [extensions]
  > histedit=
  > EOF


Simple folding
--------------
  $ addwithdate ()
  > {
  >     echo $1 > $1
  >     hg add $1
  >     hg ci -m $1 -d "$2 0"
  > }

  $ initrepo ()
  > {
  >     hg init r
  >     cd r
  >     addwithdate a 1
  >     addwithdate b 2
  >     addwithdate c 3
  >     addwithdate d 4
  >     addwithdate e 5
  >     addwithdate f 6
  > }

  $ initrepo

log before edit
  $ hg logt --graph
  @  5:178e35e0ce73 f
  |
  o  4:1ddb6c90f2ee e
  |
  o  3:532247a8969b d
  |
  o  2:ff2c9fa2018b c
  |
  o  1:97d72e5f12c7 b
  |
  o  0:8580ff50825a a
  

  $ hg histedit ff2c9fa2018b --commands - 2>&1 <<EOF | fixbundle
  > pick 1ddb6c90f2ee e
  > pick 178e35e0ce73 f
  > fold ff2c9fa2018b c
  > pick 532247a8969b d
  > EOF

log after edit
  $ hg logt --graph
  @  4:c4d7f3def76d d
  |
  o  3:575228819b7e f
  |
  o  2:505a591af19e e
  |
  o  1:97d72e5f12c7 b
  |
  o  0:8580ff50825a a
  

post-fold manifest
  $ hg manifest
  a
  b
  c
  d
  e
  f


check histedit_source, and that the folded changeset uses the later date, from the first changeset

  $ hg log --debug --rev 3
  changeset:   3:575228819b7e6ed69e8c0a6a383ee59a80db7358
  phase:       draft
  parent:      2:505a591af19eed18f560af827b9e03d2076773dc
  parent:      -1:0000000000000000000000000000000000000000
  manifest:    3:81eede616954057198ead0b2c73b41d1f392829a
  user:        test
  date:        Thu Jan 01 00:00:06 1970 +0000
  files+:      c f
  extra:       branch=default
  extra:       histedit_source=7cad1d7030207872dfd1c3a7cb430f24f2884086,ff2c9fa2018b15fa74b33363bda9527323e2a99f
  description:
  f
  ***
  c
  
  

rollup will fold without preserving the folded commit's message or date

  $ OLDHGEDITOR=$HGEDITOR
  $ HGEDITOR=false
  $ hg histedit 97d72e5f12c7 --commands - 2>&1 <<EOF | fixbundle
  > pick 97d72e5f12c7 b
  > roll 505a591af19e e
  > pick 575228819b7e f
  > pick c4d7f3def76d d
  > EOF

  $ HGEDITOR=$OLDHGEDITOR

log after edit
  $ hg logt --graph
  @  3:bab801520cec d
  |
  o  2:58c8f2bfc151 f
  |
  o  1:5d939c56c72e b
  |
  o  0:8580ff50825a a
  

description is taken from rollup target commit

  $ hg log --debug --rev 1
  changeset:   1:5d939c56c72e77e29f5167696218e2131a40f5cf
  phase:       draft
  parent:      0:8580ff50825a50c8f716709acdf8de0deddcd6ab
  parent:      -1:0000000000000000000000000000000000000000
  manifest:    1:b5e112a3a8354e269b1524729f0918662d847c38
  user:        test
  date:        Thu Jan 01 00:00:02 1970 +0000
  files+:      b e
  extra:       branch=default
  extra:       histedit_source=97d72e5f12c7e84f85064aa72e5a297142c36ed9,505a591af19eed18f560af827b9e03d2076773dc
  description:
  b
  
  

check saving last-message.txt

  $ cat > $TESTTMP/abortfolding.py <<EOF
  > from mercurial import util
  > def abortfolding(ui, repo, hooktype, **kwargs):
  >     ctx = repo[kwargs.get('node')]
  >     if set(ctx.files()) == {b'c', b'd', b'f'}:
  >         return True # abort folding commit only
  >     ui.warn(b'allow non-folding commit\\n')
  > EOF
  $ cat > .hg/hgrc <<EOF
  > [hooks]
  > pretxncommit.abortfolding = python:$TESTTMP/abortfolding.py:abortfolding
  > EOF

  $ cat > $TESTTMP/editor.sh << EOF
  > echo "==== before editing"
  > cat \$1
  > echo "===="
  > echo "check saving last-message.txt" >> \$1
  > EOF

  $ rm -f .hg/last-message.txt
  $ hg status --rev '58c8f2bfc151^1::bab801520cec'
  A c
  A d
  A f
  $ HGEDITOR="sh $TESTTMP/editor.sh" hg histedit 58c8f2bfc151 --commands - 2>&1 <<EOF
  > pick 58c8f2bfc151 f
  > fold bab801520cec d
  > EOF
  allow non-folding commit
  ==== before editing
  f
  ***
  c
  ***
  d
  
  
  
  HG: Enter commit message.  Lines beginning with 'HG:' are removed.
  HG: Leave message empty to abort commit.
  HG: --
  HG: user: test
  HG: branch 'default'
  HG: added c
  HG: added d
  HG: added f
  ====
  transaction abort!
  rollback completed
  abort: pretxncommit.abortfolding hook failed
  [255]

  $ cat .hg/last-message.txt
  f
  ***
  c
  ***
  d
  
  
  
  check saving last-message.txt

  $ cd ..
  $ rm -r r

folding preserves initial author but uses later date
----------------------------------------------------

  $ initrepo

  $ hg ci -d '7 0' --user "someone else" --amend --quiet

tip before edit
  $ hg log --rev .
  changeset:   5:10c36dd37515
  tag:         tip
  user:        someone else
  date:        Thu Jan 01 00:00:07 1970 +0000
  summary:     f
  

  $ hg --config progress.debug=1 --debug \
  > histedit 1ddb6c90f2ee --commands - 2>&1 <<EOF | \
  > egrep 'editing|unresolved'
  > pick 1ddb6c90f2ee e
  > fold 10c36dd37515 f
  > EOF
  editing: pick 1ddb6c90f2ee 4 e 1/2 changes (50.00%)
  editing: fold 10c36dd37515 5 f 2/2 changes (100.00%)

tip after edit, which should use the later date, from the second changeset
  $ hg log --rev .
  changeset:   4:e4f3ec5d0b40
  tag:         tip
  user:        test
  date:        Thu Jan 01 00:00:07 1970 +0000
  summary:     e
  

  $ cd ..
  $ rm -r r

folding and creating no new change doesn't break:
-------------------------------------------------

folded content is dropped during a merge. The folded commit should properly disappear.

  $ mkdir fold-to-empty-test
  $ cd fold-to-empty-test
  $ hg init
  $ printf "1\n2\n3\n" > file
  $ hg add file
  $ hg commit -m '1+2+3'
  $ echo 4 >> file
  $ hg commit -m '+4'
  $ echo 5 >> file
  $ hg commit -m '+5'
  $ echo 6 >> file
  $ hg commit -m '+6'
  $ hg logt --graph
  @  3:251d831eeec5 +6
  |
  o  2:888f9082bf99 +5
  |
  o  1:617f94f13c0f +4
  |
  o  0:0189ba417d34 1+2+3
  

  $ hg histedit 1 --commands - << EOF
  > pick 617f94f13c0f 1 +4
  > drop 888f9082bf99 2 +5
  > fold 251d831eeec5 3 +6
  > EOF
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  merging file
  warning: conflicts while merging file! (edit, then use 'hg resolve --mark')
  Fix up the change (fold 251d831eeec5)
  (hg histedit --continue to resume)
  [1]
There were conflicts; we keep the P1 content. This
should effectively drop the changes from +6.

  $ hg status -v
  M file
  ? file.orig
  # The repository is in an unfinished *histedit* state.
  
  # Unresolved merge conflicts:
  # 
  #     file
  # 
  # To mark files as resolved:  hg resolve --mark FILE
  
  # To continue:    hg histedit --continue
  # To abort:       hg histedit --abort
  
  $ hg resolve -l
  U file
  $ hg revert -r 'p1()' file
  $ hg resolve --mark file
  (no more unresolved files)
  continue: hg histedit --continue
  $ hg histedit --continue
  251d831eeec5: empty changeset
  saved backup bundle to $TESTTMP/fold-to-empty-test/.hg/strip-backup/888f9082bf99-daa0b8b3-histedit.hg
  $ hg logt --graph
  @  1:617f94f13c0f +4
  |
  o  0:0189ba417d34 1+2+3
  

  $ cd ..


Test fold through dropped
-------------------------


Test the corner case where the folded revision is separated from its parent
by a dropped revision.


  $ hg init fold-with-dropped
  $ cd fold-with-dropped
  $ printf "1\n2\n3\n" > file
  $ hg commit -Am '1+2+3'
  adding file
  $ echo 4 >> file
  $ hg commit -m '+4'
  $ echo 5 >> file
  $ hg commit -m '+5'
  $ echo 6 >> file
  $ hg commit -m '+6'
  $ hg logt -G
  @  3:251d831eeec5 +6
  |
  o  2:888f9082bf99 +5
  |
  o  1:617f94f13c0f +4
  |
  o  0:0189ba417d34 1+2+3
  
  $ hg histedit 1 --commands -  << EOF
  > pick 617f94f13c0f 1 +4
  > drop 888f9082bf99 2 +5
  > fold 251d831eeec5 3 +6
  > EOF
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  merging file
  warning: conflicts while merging file! (edit, then use 'hg resolve --mark')
  Fix up the change (fold 251d831eeec5)
  (hg histedit --continue to resume)
  [1]
  $ cat > file << EOF
  > 1
  > 2
  > 3
  > 4
  > 5
  > EOF
  $ hg resolve --mark file
  (no more unresolved files)
  continue: hg histedit --continue
  $ hg commit -m '+5.2'
  created new head
  $ echo 6 >> file
  $ HGEDITOR=cat hg histedit --continue
  +4
  ***
  +5.2
  ***
  +6
  
  
  
  HG: Enter commit message.  Lines beginning with 'HG:' are removed.
  HG: Leave message empty to abort commit.
  HG: --
  HG: user: test
  HG: branch 'default'
  HG: changed file
  saved backup bundle to $TESTTMP/fold-with-dropped/.hg/strip-backup/617f94f13c0f-3d69522c-histedit.hg
  $ hg logt -G
  @  1:10c647b2cdd5 +4
  |
  o  0:0189ba417d34 1+2+3
  
  $ hg export tip
  # HG changeset patch
  # User test
  # Date 0 0
  #      Thu Jan 01 00:00:00 1970 +0000
  # Node ID 10c647b2cdd54db0603ecb99b2ff5ce66d5a5323
  # Parent  0189ba417d34df9dda55f88b637dcae9917b5964
  +4
  ***
  +5.2
  ***
  +6
  
  diff -r 0189ba417d34 -r 10c647b2cdd5 file
  --- a/file	Thu Jan 01 00:00:00 1970 +0000
  +++ b/file	Thu Jan 01 00:00:00 1970 +0000
  @@ -1,3 +1,6 @@
   1
   2
   3
  +4
  +5
  +6
  $ cd ..


Folding with initial rename (issue3729)
---------------------------------------

  $ hg init fold-rename
  $ cd fold-rename
  $ echo a > a.txt
  $ hg add a.txt
  $ hg commit -m a
  $ hg rename a.txt b.txt
  $ hg commit -m rename
  $ echo b >> b.txt
  $ hg commit -m b

  $ hg logt --follow b.txt
  2:e0371e0426bc b
  1:1c4f440a8085 rename
  0:6c795aa153cb a

  $ hg histedit 1c4f440a8085 --commands - 2>&1 << EOF | fixbundle
  > pick 1c4f440a8085 rename
  > fold e0371e0426bc b
  > EOF

  $ hg logt --follow b.txt
  1:cf858d235c76 rename
  0:6c795aa153cb a

  $ cd ..

Folding with swapping
---------------------

This is an excuse to test hooks with histedit's temporary commits (issue4422)


  $ hg init issue4422
  $ cd issue4422
  $ echo a > a.txt
  $ hg add a.txt
  $ hg commit -m a
  $ echo b > b.txt
  $ hg add b.txt
  $ hg commit -m b
  $ echo c > c.txt
  $ hg add c.txt
  $ hg commit -m c

  $ hg logt
  2:a1a953ffb4b0 c
  1:199b6bb90248 b
  0:6c795aa153cb a

  $ hg histedit 6c795aa153cb --config hooks.commit='echo commit $HG_NODE' --config hooks.tonative.commit=True \
  >     --commands - 2>&1 << EOF | fixbundle
  > pick 199b6bb90248 b
  > fold a1a953ffb4b0 c
  > pick 6c795aa153cb a
  > EOF
  commit 9599899f62c05f4377548c32bf1c9f1a39634b0c

  $ hg logt
  1:9599899f62c0 a
  0:79b99e9c8e49 b

Test unix -> windows style variable substitution in external hooks.

  $ cat > $TESTTMP/tmp.hgrc <<'EOF'
  > [hooks]
  > pre-add = echo no variables
  > post-add = echo ran $HG_ARGS, literal \$non-var, 'also $non-var', $HG_RESULT
  > tonative.post-add = True
  > EOF

  $ echo "foo" > amended.txt
  $ HGRCPATH=$TESTTMP/tmp.hgrc hg add -v amended.txt
  running hook pre-add: echo no variables
  no variables
  adding amended.txt
  converting hook "post-add" to native (windows !)
  running hook post-add: echo ran %HG_ARGS%, literal $non-var, "also $non-var", %HG_RESULT% (windows !)
  running hook post-add: echo ran $HG_ARGS, literal \$non-var, 'also $non-var', $HG_RESULT (no-windows !)
  ran add -v amended.txt, literal $non-var, "also $non-var", 0 (windows !)
  ran add -v amended.txt, literal $non-var, also $non-var, 0 (no-windows !)
  $ hg ci -q --config extensions.largefiles= --amend -I amended.txt
  The fsmonitor extension is incompatible with the largefiles extension and has been disabled. (fsmonitor !)

Test that folding multiple changes in a row doesn't show multiple
editors.

  $ echo foo >> foo
  $ hg add foo
  $ hg ci -m foo1
  $ echo foo >> foo
  $ hg ci -m foo2
  $ echo foo >> foo
  $ hg ci -m foo3
  $ hg logt
  4:21679ff7675c foo3
  3:b7389cc4d66e foo2
  2:0e01aeef5fa8 foo1
  1:578c7455730c a
  0:79b99e9c8e49 b
  $ cat > "$TESTTMP/editor.sh" <<EOF
  > echo ran editor >> "$TESTTMP/editorlog.txt"
  > cat \$1 >> "$TESTTMP/editorlog.txt"
  > echo END >> "$TESTTMP/editorlog.txt"
  > echo merged foos > \$1
  > EOF
  $ HGEDITOR="sh \"$TESTTMP/editor.sh\"" hg histedit 1 --commands - 2>&1 <<EOF | fixbundle
  > pick 578c7455730c 1 a
  > pick 0e01aeef5fa8 2 foo1
  > fold b7389cc4d66e 3 foo2
  > fold 21679ff7675c 4 foo3
  > EOF
  $ hg logt
  2:e8bedbda72c1 merged foos
  1:578c7455730c a
  0:79b99e9c8e49 b
Editor should have run only once
  $ cat $TESTTMP/editorlog.txt
  ran editor
  foo1
  ***
  foo2
  ***
  foo3
  
  
  
  HG: Enter commit message.  Lines beginning with 'HG:' are removed.
  HG: Leave message empty to abort commit.
  HG: --
  HG: user: test
  HG: branch 'default'
  HG: added foo
  END

  $ cd ..

Test rolling into a commit with multiple children (issue5498)

  $ hg init roll
  $ cd roll
  $ echo a > a
  $ hg commit -qAm aa
  $ echo b > b
  $ hg commit -qAm bb
  $ hg up -q ".^"
  $ echo c > c
  $ hg commit -qAm cc
  $ hg log -G -T '{node|short} {desc}'
  @  5db65b93a12b cc
  |
  | o  301d76bdc3ae bb
  |/
  o  8f0162e483d0 aa
  

  $ hg histedit . --commands - << EOF
  > r 5db65b93a12b
  > EOF
  hg: parse error: first changeset cannot use verb "roll"
  [255]
  $ hg log -G -T '{node|short} {desc}'
  @  5db65b93a12b cc
  |
  | o  301d76bdc3ae bb
  |/
  o  8f0162e483d0 aa