view tests/test-fetch.t @ 40021:c537144fdbef

wireprotov2: support response caching

One of the things I've learned from managing VCS servers over the years is that they are hard to scale. It is well known that some companies have very beefy (read: very expensive) servers to power their VCS needs. It is also known that specialized servers for various VCS exist in order to facilitate scaling servers. (Mercurial is in this boat.)

One of the aspects that make a VCS server hard to scale is the high CPU load incurred by constant client clone/pull operations. To alleviate the scaling pain associated with data retrieval operations, I want to integrate caching into the Mercurial wire protocol server as robustly as possible such that servers can aggressively cache responses and defer as much server load as possible.

This commit represents the initial implementation of a general caching layer in wire protocol version 2.

We define a new interface and behavior for a wire protocol cacher in repository.py. (This is probably where a reviewer should look first to understand what is going on.)

The bulk of the added code is in wireprotov2server.py, where we define how a command can opt in to being cached and integrate caching into command dispatching.

From a very high-level:

* A command can declare itself as cacheable by providing a callable that can be used to derive a cache key.
* At dispatch time, if a command is cacheable, we attempt to construct a cacher and use it for serving the request and/or caching the request.
* The dispatch layer handles the bulk of the business logic for caching, making cachers mostly "dumb content stores."
* The mechanism for invalidating cached entries (one of the harder parts about caching in general) is by varying the cache key when state changes. As such, cachers don't need to be concerned with cache invalidation.

Initially, we've hooked up support for caching "manifestdata" and "filedata" commands. These are the simplest to cache, as they should be immutable over time. Caching of commands related to changeset data is a bit harder (because cache validation is impacted by changes to bookmarks, phases, etc). This will be implemented later.

(Strictly speaking, censoring a file should invalidate caches. I've added an inline TODO to track this edge case.)

To prove it works, this commit implements a test-only extension providing in-memory caching backed by an lrucachedict. A new test showing this extension behaving properly is added. FWIW, the cacher is ~50 lines of code, demonstrating the relative ease with which a cache can be added to a server.

While the test cacher is not suitable for production workloads, just for kicks I performed a clone of just the changeset and manifest data for the mozilla-unified repository. With a fully warmed cache (of just the manifest data since changeset data is not cached), server-side CPU usage dropped from ~73s to ~28s. That's pretty significant and demonstrates the potential that response caching has on server scalability!

Differential Revision: https://phab.mercurial-scm.org/D4773
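
The dispatch flow outlined in the commit message above can be illustrated with a minimal sketch. This is not the code from wireprotov2server.py or the interface defined in repository.py: every name here (LRUCacher, command, derive_key, dispatch, and the "state" argument standing in for repository state) is hypothetical and chosen only for illustration. It assumes a cacheable command registers a key-derivation callable, that the dispatcher owns all caching logic, and that invalidation happens only because the key changes when state changes.

```python
import hashlib
from collections import OrderedDict


class LRUCacher:
    """Hypothetical "dumb content store": it never invalidates on its own."""

    def __init__(self, maxsize=128):
        self._maxsize = maxsize
        self._entries = OrderedDict()

    def lookup(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def store(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._maxsize:
            self._entries.popitem(last=False)  # evict least recently used


COMMANDS = {}  # hypothetical registry: name -> (function, cache key callable)
CALLS = []     # records real executions so cache hits can be observed


def command(name, cachekeyfn=None):
    """Register a command; passing cachekeyfn opts the command in to caching."""
    def register(func):
        COMMANDS[name] = (func, cachekeyfn)
        return func
    return register


def derive_key(name, state, args):
    # The key covers the command, its arguments, and the state it depends on,
    # so a state change yields a new key instead of an explicit invalidation.
    raw = repr((name, state, sorted(args.items()))).encode('utf-8')
    return hashlib.sha1(raw).hexdigest()


@command('filedata', cachekeyfn=derive_key)
def filedata(state, path):
    CALLS.append(path)
    return 'contents of %s at %s' % (path, state)


def dispatch(cacher, name, state, **args):
    """Serve from the cache when possible; otherwise run and store the result."""
    func, cachekeyfn = COMMANDS[name]
    if cachekeyfn is None:
        return func(state, **args)  # command did not opt in to caching
    key = cachekeyfn(name, state, args)
    cached = cacher.lookup(key)
    if cached is not None:
        return cached
    result = func(state, **args)
    cacher.store(key, result)
    return result
```

With this sketch, calling dispatch(cacher, 'filedata', 'rev1', path='a') twice runs the command once, while the same call with 'rev2' misses the cache because the derived key differs. Stale entries are never deleted explicitly; they simply age out of the LRU store.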
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 26 Sep 2018 17:16:56 -0700
parents eb586ed5d8ce
children 5c2a4f37eace

#require serve

  $ echo "[extensions]" >> $HGRCPATH
  $ echo "fetch=" >> $HGRCPATH

test fetch with default branches only

  $ hg init a
  $ echo a > a/a
  $ hg --cwd a commit -Ama
  adding a
  $ hg clone a b
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone a c
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo b > a/b
  $ hg --cwd a commit -Amb
  adding b
  $ hg --cwd a parents -q
  1:d2ae7f538514

should pull one change

  $ hg --cwd b fetch ../a
  pulling from ../a
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets d2ae7f538514
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg --cwd b parents -q
  1:d2ae7f538514
  $ echo c > c/c
  $ hg --cwd c commit -Amc
  adding c
  $ hg clone c d
  updating to branch default
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone c e
  updating to branch default
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved

We cannot use the default commit message if fetching from a local
repo, because the path of the repo will be included in the commit
message, making every commit appear different.

should merge c into a

  $ hg --cwd c fetch -d '0 0' -m 'automated merge' ../a
  pulling from ../a
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets d2ae7f538514
  updating to 2:d2ae7f538514
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  merging with 1:d36c0562f908
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  new changeset 3:a323a0c43ec4 merges remote changes with local
  $ ls c
  a
  b
  c
  $ hg serve --cwd a -a localhost -p $HGPORT -d --pid-file=hg.pid
  $ cat a/hg.pid >> "$DAEMON_PIDS"

fetch over http, no auth
(this also tests that editor is invoked if '--edit' is specified)

  $ HGEDITOR=cat hg --cwd d fetch --edit http://localhost:$HGPORT/
  pulling from http://localhost:$HGPORT/
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets d2ae7f538514
  updating to 2:d2ae7f538514
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  merging with 1:d36c0562f908
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  Automated merge with http://localhost:$HGPORT/
  
  
  HG: Enter commit message.  Lines beginning with 'HG:' are removed.
  HG: Leave message empty to abort commit.
  HG: --
  HG: user: test
  HG: branch merge
  HG: branch 'default'
  HG: changed c
  new changeset 3:* merges remote changes with local (glob)
  $ hg --cwd d tip --template '{desc}\n'
  Automated merge with http://localhost:$HGPORT/
  $ hg --cwd d status --rev 'tip^1' --rev tip
  A c
  $ hg --cwd d status --rev 'tip^2' --rev tip
  A b

fetch over http with auth (should be hidden in desc)
(this also tests that editor is not invoked if '--edit' is not
specified, even though commit message is not specified explicitly)

  $ HGEDITOR=cat hg --cwd e fetch http://user:password@localhost:$HGPORT/
  pulling from http://user:***@localhost:$HGPORT/
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets d2ae7f538514
  updating to 2:d2ae7f538514
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  merging with 1:d36c0562f908
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  new changeset 3:* merges remote changes with local (glob)
  $ hg --cwd e tip --template '{desc}\n'
  Automated merge with http://localhost:$HGPORT/
  $ hg clone a f
  updating to branch default
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone a g
  updating to branch default
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo f > f/f
  $ hg --cwd f ci -Amf
  adding f
  $ echo g > g/g
  $ hg --cwd g ci -Amg
  adding g
  $ hg clone -q f h
  $ hg clone -q g i

should merge f into g

  $ hg --cwd g fetch -d '0 0' --switch -m 'automated merge' ../f
  pulling from ../f
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets 6343ca3eff20
  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
  merging with 3:6343ca3eff20
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  new changeset 4:f7faa0b7d3c6 merges remote changes with local
  $ rm i/g

should abort, because i is modified

  $ hg --cwd i fetch ../h
  abort: uncommitted changes
  [255]

test fetch with named branches

  $ hg init nbase
  $ echo base > nbase/a
  $ hg -R nbase ci -Am base
  adding a
  $ hg -R nbase branch a
  marked working directory as branch a
  (branches are permanent and global, did you want a bookmark?)
  $ echo a > nbase/a
  $ hg -R nbase ci -m a
  $ hg -R nbase up -C 0
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R nbase branch b
  marked working directory as branch b
  $ echo b > nbase/b
  $ hg -R nbase ci -Am b
  adding b

pull in change on foreign branch

  $ hg clone nbase n1
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone nbase n2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n1 up -C a
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo aa > n1/a
  $ hg -R n1 ci -m a1
  $ hg -R n2 up -C b
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n2 fetch -m 'merge' n1
  pulling from n1
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets 8fdc9284bbc5

parent should be 2 (no automatic update)

  $ hg -R n2 parents --template '{rev}\n'
  2
  $ rm -fr n1 n2

pull in changes on both foreign and local branches

  $ hg clone nbase n1
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone nbase n2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n1 up -C a
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo aa > n1/a
  $ hg -R n1 ci -m a1
  $ hg -R n1 up -C b
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo bb > n1/b
  $ hg -R n1 ci -m b1
  $ hg -R n2 up -C b
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n2 fetch -m 'merge' n1
  pulling from n1
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 2 changesets with 2 changes to 2 files
  new changesets 8fdc9284bbc5:3c4a837a864f
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

parent should be 4 (fast forward)

  $ hg -R n2 parents --template '{rev}\n'
  4
  $ rm -fr n1 n2

pull changes on foreign (2 new heads) and local (1 new head) branches
with a local change

  $ hg clone nbase n1
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone nbase n2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n1 up -C a
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo a1 > n1/a
  $ hg -R n1 ci -m a1
  $ hg -R n1 up -C b
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo bb > n1/b
  $ hg -R n1 ci -m b1
  $ hg -R n1 up -C 1
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  $ echo a2 > n1/a
  $ hg -R n1 ci -m a2
  created new head
  $ hg -R n2 up -C b
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo change >> n2/c
  $ hg -R n2 ci -A -m local
  adding c
  $ hg -R n2 fetch -d '0 0' -m 'merge' n1
  pulling from n1
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 3 changesets with 3 changes to 2 files (+2 heads)
  new changesets d05ce59ff88d:a7954de24e4c
  updating to 5:3c4a837a864f
  1 files updated, 0 files merged, 1 files removed, 0 files unresolved
  merging with 3:1267f84a9ea5
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  new changeset 7:2cf2a1261f21 merges remote changes with local

parent should be 7 (new merge changeset)

  $ hg -R n2 parents --template '{rev}\n'
  7
  $ rm -fr n1 n2

pull in changes on foreign (merge of local branch) and local (2 new
heads) with a local change

  $ hg clone nbase n1
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg clone nbase n2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n1 up -C a
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg -R n1 merge b
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  $ hg -R n1 ci -m merge
  $ hg -R n1 up -C 2
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo c > n1/a
  $ hg -R n1 ci -m c
  $ hg -R n1 up -C 2
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo cc > n1/a
  $ hg -R n1 ci -m cc
  created new head
  $ hg -R n2 up -C b
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo change >> n2/b
  $ hg -R n2 ci -A -m local
  $ hg -R n2 fetch -m 'merge' n1
  pulling from n1
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 3 changesets with 2 changes to 1 files (+2 heads)
  new changesets b84e8d0f020f:3d3bf54f99c0
  not merging with 1 other new branch heads (use "hg heads ." and "hg merge" to merge them)
  [1]

parent should be 3 (fetch did not merge anything)

  $ hg -R n2 parents --template '{rev}\n'
  3
  $ rm -fr n1 n2

pull in change on different branch than dirstate

  $ hg init n1
  $ echo a > n1/a
  $ hg -R n1 ci -Am initial
  adding a
  $ hg clone n1 n2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo b > n1/a
  $ hg -R n1 ci -m next
  $ hg -R n2 branch topic
  marked working directory as branch topic
  (branches are permanent and global, did you want a bookmark?)
  $ hg -R n2 fetch -m merge n1
  abort: working directory not at branch tip
  (use 'hg update' to check out branch tip)
  [255]

parent should be 0 (fetch did not update or merge anything)

  $ hg -R n2 parents --template '{rev}\n'
  0
  $ rm -fr n1 n2

test fetch with inactive branches

  $ hg init ib1
  $ echo a > ib1/a
  $ hg --cwd ib1 ci -Am base
  adding a
  $ hg --cwd ib1 branch second
  marked working directory as branch second
  (branches are permanent and global, did you want a bookmark?)
  $ echo b > ib1/b
  $ hg --cwd ib1 ci -Am onsecond
  adding b
  $ hg --cwd ib1 branch -f default
  marked working directory as branch default
  $ echo c > ib1/c
  $ hg --cwd ib1 ci -Am newdefault
  adding c
  created new head
  $ hg clone ib1 ib2
  updating to branch default
  3 files updated, 0 files merged, 0 files removed, 0 files unresolved

fetch should succeed

  $ hg --cwd ib2 fetch ../ib1
  pulling from ../ib1
  searching for changes
  no changes found
  $ rm -fr ib1 ib2

test issue1726

  $ hg init i1726r1
  $ echo a > i1726r1/a
  $ hg --cwd i1726r1 ci -Am base
  adding a
  $ hg clone i1726r1 i1726r2
  updating to branch default
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo b > i1726r1/a
  $ hg --cwd i1726r1 ci -m second
  $ echo c > i1726r2/a
  $ hg --cwd i1726r2 ci -m third
  $ HGMERGE=true hg --cwd i1726r2 fetch ../i1726r1
  pulling from ../i1726r1
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files (+1 heads)
  new changesets 7837755a2789
  updating to 2:7837755a2789
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  merging with 1:d1f0c6c48ebd
  merging a
  0 files updated, 1 files merged, 0 files removed, 0 files unresolved
  new changeset 3:* merges remote changes with local (glob)
  $ hg --cwd i1726r2 heads default --template '{rev}\n'
  3

test issue2047

  $ hg -q init i2047a
  $ cd i2047a
  $ echo a > a
  $ hg -q ci -Am a
  $ hg -q branch stable
  $ echo b > b
  $ hg -q ci -Am b
  $ cd ..
  $ hg -q clone -r 0 i2047a i2047b
  $ cd i2047b
  $ hg fetch ../i2047a
  pulling from ../i2047a
  searching for changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  new changesets c8735224de5c

  $ cd ..