comparison hgext/largefiles/design.txt @ 15168:cfccd3bee7b3

hgext: add largefiles extension This code has a number of contributors and a complicated history prior to its introduction that can be seen by visiting: https://developers.kilnhg.com/Repo/Kiln/largefiles/largefiles http://hg.gerg.ca/hg-bfiles and looking at the included copyright notices and contributors list.
author various
date Sat, 24 Sep 2011 17:35:45 +0200
parents
children ca51a5dd5d0b
comparison
equal deleted inserted replaced
15167:8df4166b6f63 15168:cfccd3bee7b3
1 = largefiles - manage large binary files =
2 This extension is based off of Greg Ward's bfiles extension which can be found
3 at http://mercurial.selenic.com/wiki/BfilesExtension.
4
5 == The largefile store ==
6
7 largefile stores are, in the typical use case, centralized servers that have
8 every past revision of a given binary file. Each largefile is identified by
9 its sha1 hash, and all interactions with the store take one of the following
10 forms.
11
12 -Download a bfile with this hash
13 -Upload a bfile with this hash
14 -Check if the store has a bfile with this hash
15
16 largefiles stores can take one of two forms:
17
18 -Directories on a network file share
19 -Mercurial wireproto servers, either via ssh or http (hgweb)
20
21 == The Local Repository ==
22
23 The local repository has a largefile cache in .hg/largefiles which holds a
24 subset of the largefiles needed. On a clone only the largefiles at tip are
25 downloaded. When largefiles are downloaded from the central store, a copy is
26 saved in this store.
27
28 == The Global Cache ==
29
30 largefiles in a local repository cache are hardlinked to files in the global
31 cache. Before a file is downloaded we check if it is in the global cache.
32
33 == Implementation Details ==
34
35 Each largefile has a standin which is in .hglf. The standin is tracked by
36 Mercurial. The standin contains the SHA1 hash of the largefile. When a
37 largefile is added/removed/copied/renamed/etc the same operation is applied to
38 the standin. Thus the history of the standin is the history of the largefile.
39
40 For performance reasons, the contents of a standin are only updated before a
41 commit. Standins are added/removed/copied/renamed from add/remove/copy/rename
42 Mercurial commands but their contents will not be updated. The contents of a
43 standin will always be the hash of the largefile as of the last commit. To
44 support some commands (revert) some standins are temporarily updated but will
45 be changed back after the command is finished.
46
47 A Mercurial dirstate object tracks the state of the largefiles. The dirstate
48 uses the last modified time and current size to detect if a file has changed
49 (without reading the entire contents of the file).