annotate contrib/synthrepo.py @ 23819:6bf93440a717
branchmap: add seek() to end of file before calling tell() on append open()
This is similar to 48c232873a54, which was subsequently modified in 19f5dec2d61f
for 2.4. Unexpected test changes on Windows occurred without this.
author | Matt Harbison <matt_harbison@yahoo.com> |
---|---|
date | Sat, 10 Jan 2015 12:00:03 -0500 |
parents | a5dbec255f14 |
children | 6ddc86eedc3b |

Changesets first referenced in this part of the listing:

rev | node | summary | author | parents |
---|---|---|---|---|
17734 | 619068c280fd | contrib: add a commit synthesizer for reproducing scaling problems | Bryan O'Sullivan <bryano@fb.com> | |
22709 | 889789a2ca9f | contrib/synthrepo: walk a repo's directory structure during analysis | Mike Edgar <adgar@google.com> | 22708 |

rev: line source

    17734: # synthrepo.py - repo synthesis
    17734: #
    17734: # Copyright 2012 Facebook
    17734: #
    17734: # This software may be used and distributed according to the terms of the
    17734: # GNU General Public License version 2 or any later version.
    17734:
    17734: '''synthesize structurally interesting change history
    17734:
    17734: This extension is useful for creating a repository with properties
    17734: that are statistically similar to an existing repository. During
    17734: analysis, a simple probability table is constructed from the history
    17734: of an existing repository. During synthesis, these properties are
    17734: reconstructed.
    17734:
    17734: Properties that are analyzed and synthesized include the following:
    17734:
    17734: - Lines added or removed when an existing file is modified
    17734: - Number and sizes of files added
    17734: - Number of files removed
    17734: - Line lengths
    17734: - Topological distance to parent changeset(s)
    17734: - Probability of a commit being a merge
    17734: - Probability of a newly added file being added to a new directory
    17734: - Interarrival time, and time zone, of commits
    22709: - Number of files in each directory
    17734:
    17734: A few obvious properties that are not currently handled realistically:
    17734:
    17734: - Merges are treated as regular commits with two parents, which is not
    17734:   realistic
    17734: - Modifications are not treated as operations on hunks of lines, but
    17734:   as insertions and deletions of randomly chosen single lines
    17734: - Committer ID (always random)
    17734: - Executability of files
    17734: - Symlinks and binary files are ignored
    17734: '''
    17734:
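
The module docstring describes building a probability table during analysis and reconstructing the same properties during synthesis. The synthesis code is not part of this listing, so the following is only a minimal Python 3 sketch of that general idea: count observed values, then draw new values in proportion to those counts using a cumulative table and `bisect`. The names `build_table` and `make_sampler` and the sample data are hypothetical.

```python
import bisect
import collections
import random

def build_table(observations):
    """Count how often each value occurs."""
    counts = collections.defaultdict(int)
    for value in observations:
        counts[value] += 1
    return counts

def make_sampler(counts, rng):
    """Return a function that draws values in proportion to their counts."""
    values = sorted(counts)
    cumulative = []
    total = 0
    for value in values:
        total += counts[value]
        cumulative.append(total)

    def sample():
        # Pick a point in [0, total) and find the bucket that contains it.
        point = rng.random() * total
        return values[bisect.bisect_right(cumulative, point)]
    return sample

# Hypothetical "files changed per commit" observations.
table = build_table([1, 1, 1, 2, 5])
sample = make_sampler(table, random.Random(0))
draws = [sample() for _ in range(1000)]
```

Values that were observed more often (here `1`) come back more often from `sample()`, which is what "statistically similar" amounts to in this simplified picture.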

Changesets first referenced in this part of the listing:

rev | node | summary | author | parents |
---|---|---|---|---|
22708 | 4c66e70c3488 | contrib/synthrepo: generate initial repo contents using directory shape model | Mike Edgar <adgar@google.com> | 22473 |
19322 | ff1586a3adc5 | cleanup: remove unused imports | Simon Heimberg <simohe@besonet.ch> | 18927 |

    22708: import bisect, collections, itertools, json, os, random, time, sys
    19322: from mercurial import cmdutil, context, patch, scmutil, util, hg
    17734: from mercurial.i18n import _
    22708: from mercurial.node import nullrev, nullid, short
    17734:
    17734: testedwith = 'internal'
    17734:
    17734: cmdtable = {}
    17734: command = cmdutil.command(cmdtable)
    17734:
    17734: newfile = set(('new fi', 'rename', 'copy f', 'copy t'))
    17734:
    17734: def zerodict():
    17734:     return collections.defaultdict(lambda: 0)
    17734:
    17734: def roundto(x, k):
    17734:     if x > k * 2:
    17734:         return int(round(x / float(k)) * k)
    17734:     return int(round(x))
    17734:
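
To make the bucketing done by `roundto` concrete, here is a small Python 3 sketch that copies the two helpers from the listing and feeds a few made-up line lengths through them. Values up to `2 * k` are kept exact and larger values are rounded to the nearest multiple of `k`. One caveat: the listing targets Python 2, whose `round()` rounds halves away from zero, while Python 3 rounds halves to even, so exact ties may land in a different bucket.

```python
import collections

def zerodict():
    # A dict whose missing keys start at zero, as in the listing.
    return collections.defaultdict(lambda: 0)

def roundto(x, k):
    # Small values are kept exact; larger ones snap to a multiple of k.
    if x > k * 2:
        return int(round(x / float(k)) * k)
    return int(round(x))

# Hypothetical line lengths, bucketed with k=5.
histogram = zerodict()
for length in [3, 9, 11, 12, 13, 47, 79]:
    histogram[roundto(length, 5)] += 1
```

Keeping small values exact preserves detail where most observations fall, while coarse buckets keep the table small for the long tail.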

    17734: def parsegitdiff(lines):
    17734:     filename, mar, lineadd, lineremove = None, None, zerodict(), 0
    17734:     binary = False
    17734:     for line in lines:
    17734:         start = line[:6]
    17734:         if start == 'diff -':
    17734:             if filename:
    17734:                 yield filename, mar, lineadd, lineremove, binary
    17734:             mar, lineadd, lineremove, binary = 'm', zerodict(), 0, False
    17734:             filename = patch.gitre.match(line).group(1)
    17734:         elif start in newfile:
    17734:             mar = 'a'
    17734:         elif start == 'GIT bi':
    17734:             binary = True
    17734:         elif start == 'delete':
    17734:             mar = 'r'
    17734:         elif start:
    17734:             s = start[0]
    17734:             if s == '-' and not line.startswith('--- '):
    17734:                 lineremove += 1
    17734:             elif s == '+' and not line.startswith('+++ '):
    17734:                 lineadd[roundto(len(line) - 1, 5)] += 1
    17734:     if filename:
    17734:         yield filename, mar, lineadd, lineremove, binary
    17734:
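
`parsegitdiff` classifies each file in a git-style diff by looking only at the first six characters of every line. The Python 3 sketch below follows the same approach outside Mercurial. It is a simplification: `GIT_HEADER` is an assumed stand-in for `patch.gitre`, the function name `summarize` is hypothetical, and added lines are simply counted rather than bucketed by length as in the listing.

```python
import re

# Assumed stand-in for mercurial.patch.gitre.
GIT_HEADER = re.compile(r'diff --git a/(.*) b/(.*)')
NEWFILE = {'new fi', 'rename', 'copy f', 'copy t'}

def summarize(lines):
    """Yield (filename, kind, added, removed, binary) per file in a git diff."""
    filename, kind, added, removed, binary = None, None, 0, 0, False
    for line in lines:
        start = line[:6]
        if start == 'diff -':
            # A new file header closes the previous file's record.
            if filename:
                yield filename, kind, added, removed, binary
            kind, added, removed, binary = 'm', 0, 0, False
            filename = GIT_HEADER.match(line).group(1)
        elif start in NEWFILE:
            kind = 'a'
        elif start == 'GIT bi':
            binary = True
        elif start == 'delete':
            kind = 'r'
        elif line.startswith('-') and not line.startswith('--- '):
            removed += 1
        elif line.startswith('+') and not line.startswith('+++ '):
            added += 1
    if filename:
        yield filename, kind, added, removed, binary

sample_diff = '''\
diff --git a/README b/README
--- a/README
+++ b/README
@@ -1,2 +1,2 @@
-old line
+new line
 unchanged
diff --git a/docs/new.txt b/docs/new.txt
new file mode 100644
--- /dev/null
+++ b/docs/new.txt
@@ -0,0 +1,2 @@
+first
+second
diff --git a/old.txt b/old.txt
deleted file mode 100644
--- a/old.txt
+++ /dev/null
@@ -1,1 +0,0 @@
-gone
'''.splitlines()

results = list(summarize(sample_diff))
```

Each tuple in `results` marks a file as modified (`'m'`), added (`'a'`), or removed (`'r'`), which is the classification the `analyze` command later switches on.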

    17734: @command('analyze',
    22709:          [('o', 'output', '', _('write output to given file'), _('FILE')),
    17734:           ('r', 'rev', [], _('analyze specified revisions'), _('REV'))],
    22709:          _('hg analyze'), optionalrepo=True)
    17734: def analyze(ui, repo, *revs, **opts):
    17734:     '''create a simple model of a repository to use for later synthesis
    17734:
    17734:     This command examines every changeset in the given range (or all
    17734:     of history if none are specified) and creates a simple statistical
    22709:     model of the history of the repository. It also measures the directory
    22709:     structure of the repository as checked out.
    17734:
    17734:     The model is written out to a JSON file, and can be used by
    17734:     :hg:`synthesize` to create or augment a repository with synthetic
    17734:     commits that have a structure that is statistically similar to the
    17734:     analyzed repository.
    17734:     '''
    22709:     root = repo.root
    22709:     if not root.endswith(os.path.sep):
    22709:         root += os.path.sep
    17734:
    17734:     revs = list(revs)
    17734:     revs.extend(opts['rev'])
    17734:     if not revs:
    17734:         revs = [':']
    17734:
    17734:     output = opts['output']
    17734:     if not output:
    22709:         output = os.path.basename(root) + '.json'
    17734:
    17734:     if output == '-':
    17734:         fp = sys.stdout
    17734:     else:
    17734:         fp = open(output, 'w')
    17734:
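
The default output name depends on how `os.path.basename` treats a trailing separator. The sketch below isolates just that step. `default_output` mirrors the two statements from the listing, while `default_output_stripped` is a hypothetical variant shown only for comparison; the repository path is made up.

```python
import os

def default_output(root):
    """Mirror the listing: ensure a trailing separator, then take the basename."""
    if not root.endswith(os.path.sep):
        root += os.path.sep
    return os.path.basename(root) + '.json'

def default_output_stripped(root):
    """Hypothetical variant: strip trailing separators before taking the basename."""
    return os.path.basename(root.rstrip(os.path.sep)) + '.json'

repo_root = os.path.join(os.path.sep, 'tmp', 'myrepo')  # made-up path
as_listed = default_output(repo_root)
stripped = default_output_stripped(repo_root)
```

Because `os.path.basename` returns an empty string for a path that ends in a separator, `as_listed` is `'.json'` here, whereas `stripped` is `'myrepo.json'`. Passing `-o FILE` explicitly avoids relying on the default name.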

    22709:     # Always obtain file counts of each directory in the given root directory.
    22709:     def onerror(e):
    22709:         ui.warn(_('error walking directory structure: %s\n') % e)
    22709:
    22709:     dirs = {}
    22709:     rootprefixlen = len(root)
    22709:     for dirpath, dirnames, filenames in os.walk(root, onerror=onerror):
    22709:         dirpathfromroot = dirpath[rootprefixlen:]
    22709:         dirs[dirpathfromroot] = len(filenames)
    22709:         if '.hg' in dirnames:
    22709:             dirnames.remove('.hg')
    17734:
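
The directory walk can be tried outside Mercurial. This Python 3 sketch builds a throwaway tree and counts files per directory the same way the listing does, pruning `.hg` by editing `dirnames` in place (which works because `os.walk` is top-down by default). The function name `count_files` and the file layout are hypothetical.

```python
import os
import tempfile

def count_files(root):
    """Map each directory (relative to root) to the number of files in it."""
    if not root.endswith(os.path.sep):
        root += os.path.sep
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        counts[dirpath[len(root):]] = len(filenames)
        # Removing an entry in place stops os.walk from descending into it.
        if '.hg' in dirnames:
            dirnames.remove('.hg')
    return counts

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, '.hg', 'store'))
    os.makedirs(os.path.join(tmp, 'src', 'lib'))
    for relpath in ['README', 'src/a.py', 'src/b.py', 'src/lib/c.py',
                    '.hg/requires']:
        with open(os.path.join(tmp, *relpath.split('/')), 'w') as f:
            f.write('x\n')
    counts = count_files(tmp)
```

The repository root appears under the empty-string key, and nothing below `.hg` is counted.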

    17734:     lineschanged = zerodict()
    17734:     children = zerodict()
    17734:     p1distance = zerodict()
    17734:     p2distance = zerodict()
    17734:     linesinfilesadded = zerodict()
    17734:     fileschanged = zerodict()
    17734:     filesadded = zerodict()
    17734:     filesremoved = zerodict()
    17734:     linelengths = zerodict()
    17734:     interarrival = zerodict()
    17734:     parents = zerodict()
    17734:     dirsadded = zerodict()
    17734:     tzoffset = zerodict()
    17734:
    22709:     # If a mercurial repo is available, also model the commit history.
    22709:     if repo:
    22709:         revs = scmutil.revrange(repo, revs)
    22709:         revs.sort()
    22709:
    22709:         progress = ui.progress
    22709:         _analyzing = _('analyzing')
    22709:         _changesets = _('changesets')
    22709:         _total = len(revs)
    17734:

    22709:         for i, rev in enumerate(revs):
    22709:             progress(_analyzing, i, unit=_changesets, total=_total)
    22709:             ctx = repo[rev]
    22709:             pl = ctx.parents()
    22709:             pctx = pl[0]
    22709:             prev = pctx.rev()
    22709:             children[prev] += 1
    22709:             p1distance[rev - prev] += 1
    22709:             parents[len(pl)] += 1
    22709:             tzoffset[ctx.date()[1]] += 1
    22709:             if len(pl) > 1:
    22709:                 p2distance[rev - pl[1].rev()] += 1
    22709:             if prev == rev - 1:
    22709:                 lastctx = pctx
    22709:             else:
    22709:                 lastctx = repo[rev - 1]
    22709:             if lastctx.rev() != nullrev:
    22709:                 timedelta = ctx.date()[0] - lastctx.date()[0]
    22709:                 interarrival[roundto(timedelta, 300)] += 1
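
The per-changeset bookkeeping above can be followed on a tiny made-up history without a repository. In this Python 3 sketch, the `history` dict is a hypothetical stand-in for `repo[rev]`: it maps a revision number to its first parent, optional second parent, and commit time. Parent distance is the difference in revision numbers, and interarrival time is measured against the previous revision number rather than against the parent.

```python
import collections

def roundto(x, k):
    if x > k * 2:
        return int(round(x / float(k)) * k)
    return int(round(x))

# Hypothetical history: rev -> (p1 rev, p2 rev or None, unix timestamp).
# A first parent of -1 plays the role of the null revision.
history = {
    0: (-1, None, 1000),
    1: (0, None, 1400),
    2: (0, None, 5000),
    3: (2, 1, 5060),
}

p1distance = collections.defaultdict(int)
p2distance = collections.defaultdict(int)
parents = collections.defaultdict(int)
interarrival = collections.defaultdict(int)

for rev in sorted(history):
    p1, p2, when = history[rev]
    p1distance[rev - p1] += 1
    parents[1 if p2 is None else 2] += 1
    if p2 is not None:
        p2distance[rev - p2] += 1
    if rev - 1 in history:
        # Compare with the previous revision number, bucketed to 5 minutes
        # once the gap exceeds 10 minutes.
        delta = when - history[rev - 1][2]
        interarrival[roundto(delta, 300)] += 1
```

The share of entries under key `2` in `parents` gives the merge probability mentioned in the module docstring.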

    22709:             diff = sum((d.splitlines() for d in ctx.diff(pctx, git=True)), [])
    22709:             fileadds, diradds, fileremoves, filechanges = 0, 0, 0, 0
    22709:             for filename, mar, lineadd, lineremove, isbin in parsegitdiff(diff):
    22709:                 if isbin:
    22709:                     continue
    22709:                 added = sum(lineadd.itervalues(), 0)
    22709:                 if mar == 'm':
    22709:                     if added and lineremove:
    22709:                         lineschanged[roundto(added, 5),
    22709:                                      roundto(lineremove, 5)] += 1
    22709:                         filechanges += 1
    22709:                 elif mar == 'a':
    22709:                     fileadds += 1
    22709:                     if '/' in filename:
    22709:                         filedir = filename.rsplit('/', 1)[0]
    22709:                         if filedir not in pctx.dirs():
    22709:                             diradds += 1
    22709:                     linesinfilesadded[roundto(added, 5)] += 1
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
192 elif mar == 'r': |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
193 fileremoves += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
194 for length, count in lineadd.iteritems(): |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
195 linelengths[length] += count |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
196 fileschanged[filechanges] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
197 filesadded[fileadds] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
198 dirsadded[diradds] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
199 filesremoved[fileremoves] += 1 |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
200 |
201     invchildren = zerodict() |
202 |
203     for rev, count in children.iteritems(): |
204         invchildren[count] += 1 |
205 |
206     if output != '-': |
207         ui.status(_('writing output to %s\n') % output) |
208 |
209     def pronk(d): |
210         return sorted(d.iteritems(), key=lambda x: x[1], reverse=True) |
211 |
20672
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
212     json.dump({'revs': len(revs), |
22709
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
213                'initdirs': pronk(dirs), |
20672
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
214                'lineschanged': pronk(lineschanged), |
215                'children': pronk(invchildren), |
216                'fileschanged': pronk(fileschanged), |
217                'filesadded': pronk(filesadded), |
218                'linesinfilesadded': pronk(linesinfilesadded), |
219                'dirsadded': pronk(dirsadded), |
220                'filesremoved': pronk(filesremoved), |
221                'linelengths': pronk(linelengths), |
222                'parents': pronk(parents), |
223                'p1distance': pronk(p1distance), |
224                'p2distance': pronk(p2distance), |
225                'interarrival': pronk(interarrival), |
226                'tzoffset': pronk(tzoffset), |
227                }, |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
228               fp) |
229     fp.close() |
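The `pronk` helper above turns each histogram into a list of (key, count) pairs, most frequent first, before it is passed to `json.dump`. A side effect worth noting is that tuple-keyed histograms such as `lineschanged` become serializable, since JSON objects cannot have tuple keys. The following is a minimal sketch of that behaviour, written for Python 3 (`items()` rather than `iteritems()`); the sample histogram is made up for illustration and is not taken from any real model.

```python
import json


def pronk(d):
    # Same idea as the helper in the listing, adapted to Python 3:
    # list of (key, count) pairs sorted by descending count.
    return sorted(d.items(), key=lambda x: x[1], reverse=True)


# Hypothetical histogram: (lines added, lines removed) -> number of files.
lineschanged = {(5, 0): 2, (10, 5): 7, (0, 5): 1}

# A dict with tuple keys cannot be dumped directly.
try:
    json.dumps(lineschanged)
    dict_ok = True
except TypeError:
    dict_ok = False

# The list-of-pairs form serializes; tuples come back as JSON arrays.
model = json.dumps({'lineschanged': pronk(lineschanged)})
roundtrip = json.loads(model)['lineschanged']
```

After a round trip through JSON the pairs are nested lists rather than tuples, which is the shape the `cdf` helper in `synthesize` receives.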
230 |
231 @command('synthesize', |
232          [('c', 'count', 0, _('create given number of commits'), _('COUNT')), |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
233           ('', 'dict', '', _('path to a dictionary of words'), _('FILE')), |
234           ('', 'initfiles', 0, _('initial file count to create'), _('COUNT'))], |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
235          _('hg synthesize [OPTION].. DESCFILE')) |
236 def synthesize(ui, repo, descpath, **opts): |
237     '''synthesize commits based on a model of an existing repository |
238 |
239     The model must have been generated by :hg:`analyze`. Commits will |
240     be generated randomly according to the probabilities described in |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
241     the model. If --initfiles is set, the repository will be seeded with |
242     the given number of files following the modeled repository's directory |
243     structure. |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
244 |
245     When synthesizing new content, commit descriptions, and user |
246     names, words will be chosen randomly from a dictionary that is |
247     presumed to contain one word per line. Use --dict to specify the |
248     path to an alternate dictionary to use. |
249     ''' |
250     try: |
17887
0e2846b2482c
url: use open and not url.open for local files (issue3624)
Siddharth Agarwal <sid0@fb.com>
parents:
17734
diff
changeset
|
251         fp = hg.openpath(ui, descpath) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
252     except Exception, err: |
253         raise util.Abort('%s: %s' % (descpath, err[0].strerror)) |
254     desc = json.load(fp) |
255     fp.close() |
256 |
257     def cdf(l): |
18047
9196638b08ce
synthrepo: do not crash if a list is empty
Bryan O'Sullivan <bryano@fb.com>
parents:
17887
diff
changeset
|
258         if not l: |
259             return [], [] |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
260         vals, probs = zip(*sorted(l, key=lambda x: x[1], reverse=True)) |
261         t = float(sum(probs, 0)) |
262         s, cdfs = 0, [] |
263         for v in probs: |
264             s += v |
265             cdfs.append(s / t) |
266         return vals, cdfs |
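To make the `cdf` helper concrete, here is a standalone sketch of the same computation with a small worked input. It is a restatement for illustration, not the annotated source: the parameter and variable names are changed for readability, and the sample pairs are invented.

```python
def cdf(pairs):
    # pairs: iterable of (value, count) as stored in the JSON model.
    if not pairs:
        return [], []
    # Most frequent values first, then split into parallel sequences.
    vals, counts = zip(*sorted(pairs, key=lambda x: x[1], reverse=True))
    total = float(sum(counts))
    running, cdfs = 0, []
    for c in counts:
        running += c
        # Cumulative share of all observations up to this value.
        cdfs.append(running / total)
    return vals, cdfs


# Hypothetical 'parents' histogram: 6 commits with one parent,
# 3 merges with two parents, 1 root commit with none.
vals, cdfs = cdf([[1, 6], [2, 3], [0, 1]])
```

With this input `vals` is `(1, 2, 0)` and `cdfs` is `[0.6, 0.9, 1.0]`: the last entry is 1.0 whenever the input is non-empty, which is what allows a uniform random number in [0, 1) to be mapped onto a value.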
267 |
268     lineschanged = cdf(desc['lineschanged']) |
269     fileschanged = cdf(desc['fileschanged']) |
270     filesadded = cdf(desc['filesadded']) |
271     dirsadded = cdf(desc['dirsadded']) |
272     filesremoved = cdf(desc['filesremoved']) |
273     linelengths = cdf(desc['linelengths']) |
274     parents = cdf(desc['parents']) |
275     p1distance = cdf(desc['p1distance']) |
276     p2distance = cdf(desc['p2distance']) |
277     interarrival = cdf(desc['interarrival']) |
278     linesinfilesadded = cdf(desc['linesinfilesadded']) |
279     tzoffset = cdf(desc['tzoffset']) |
280 |
281     dictfile = opts.get('dict') or '/usr/share/dict/words' |
282     try: |
283         fp = open(dictfile, 'rU') |
284     except IOError, err: |
285         raise util.Abort('%s: %s' % (dictfile, err.strerror)) |
286     words = fp.read().splitlines() |
287     fp.close() |
288 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
289     initdirs = {} |
290     if desc['initdirs']: |
291         for k, v in desc['initdirs']: |
292             initdirs[k.encode('utf-8').replace('.hg', '_hg')] = v |
293         initdirs = renamedirs(initdirs, words) |
294     initdirscdf = cdf(initdirs) |
295 |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
296     def pick(cdf): |
297         return cdf[0][bisect.bisect_left(cdf[1], random.random())] |
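`pick` draws a value by binary-searching the cumulative probabilities for a uniform random number, a form of inverse transform sampling. The sketch below shows the idea in isolation. The `rng` parameter and the sample model are assumptions added so the example can be seeded and run on its own; the listing itself calls `random.random()` directly.

```python
import bisect
import random


def pick(cdf, rng=random):
    # cdf is a (values, cumulative_probabilities) pair as built by cdf().
    vals, cdfs = cdf
    # First index whose cumulative probability is >= the random draw.
    return vals[bisect.bisect_left(cdfs, rng.random())]


# Hypothetical model: 'a' 60% of the time, 'b' 30%, 'c' 10%.
model = (('a', 'b', 'c'), [0.6, 0.9, 1.0])

rng = random.Random(0)  # seeded so the run is repeatable
counts = {'a': 0, 'b': 0, 'c': 0}
for _ in range(10000):
    counts[pick(model, rng)] += 1
```

Over many draws the counts approach the modeled proportions. Note that an empty model, which `cdf` returns as `([], [])`, cannot be sampled this way: indexing the empty value list raises `IndexError`.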
298 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
299     def pickpath(): |
300         return os.path.join(pick(initdirscdf), random.choice(words)) |
301 |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
302     def makeline(minimum=0): |
303         total = max(minimum, pick(linelengths)) |
304         c, l = 0, [] |
305         while c < total: |
306             w = random.choice(words) |
307             c += len(w) + 1 |
308             l.append(w) |
309         return ' '.join(l) |
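`makeline` keeps appending random dictionary words until the running character count, with one extra character per word for the separator, reaches the target length. A standalone sketch follows. Unlike the listing, it takes the word list and target length as parameters (`words`, `target`, and `rng` are names introduced here for illustration) instead of closing over `words` and sampling the length from `linelengths`.

```python
import random


def makeline(words, target, rng=random):
    count, line = 0, []
    while count < target:
        w = rng.choice(words)
        # Each word costs its length plus one separator character.
        count += len(w) + 1
        line.append(w)
    return ' '.join(line)


rng = random.Random(1)  # seeded so the run is repeatable
line = makeline(['alpha', 'be', 'gamma'], 20, rng)
```

Because the loop only stops once the count reaches the target, the joined line is at least `target - 1` characters long and overshoots by less than the longest word plus its separator. A target of 0 yields an empty line.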
310 |
311     wlock = repo.wlock() |
312     lock = repo.lock() |
313 |
314     nevertouch = set(('.hgsub', '.hgignore', '.hgtags')) |
315 |
316     progress = ui.progress |
317     _synthesizing = _('synthesizing') |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
318     _files = _('initial files') |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
319     _changesets = _('changesets') |
320 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
321     # Synthesize a single initial revision adding files to the repo according |
322     # to the modeled directory structure. |
323     initcount = int(opts['initfiles']) |
324     if initcount and initdirs: |
325         pctx = repo[None].parents()[0] |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
326         dirs = set(pctx.dirs()) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
327         files = {} |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
328 |
329         def validpath(path): |
330             # Don't pick filenames which are already directory names. |
331             if path in dirs: |
332                 return False |
333             # Don't pick directories which were used as file names. |
334             while path: |
335                 if path in files: |
336                     return False |
337                 path = os.path.dirname(path) |
338             return True |
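`validpath` rejects a candidate path in two cases: it is already in use as a directory, or the path itself or one of its ancestors is already in use as a file. The sketch below restates that check as a standalone function. Passing `dirs` and `files` as arguments and using `posixpath` (so the example behaves the same on every platform) are choices made for this illustration; the listing closes over both collections and uses `os.path`.

```python
import posixpath


def validpath(path, dirs, files):
    # A new file must not reuse the name of an existing directory.
    if path in dirs:
        return False
    # Neither the path nor any ancestor may already be a file.
    while path:
        if path in files:
            return False
        path = posixpath.dirname(path)
    return True


# Hypothetical state partway through seeding a repository.
dirs = {'src', 'src/lib'}
files = {'src/main', 'README'}

results = {
    'src/lib': validpath('src/lib', dirs, files),
    'src/main/util': validpath('src/main/util', dirs, files),
    'src/lib/util': validpath('src/lib/util', dirs, files),
    'README': validpath('README', dirs, files),
}
```

Only `src/lib/util` is accepted here. `src/lib` is a directory, `src/main/util` would require the file `src/main` to be a directory, and `README` already exists as a file, so the check also prevents the same path from being chosen twice.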
339 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
340         for i in xrange(0, initcount): |
341             ui.progress(_synthesizing, i, unit=_files, total=initcount) |
342 |
343             path = pickpath() |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
344             while not validpath(path): |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
345                 path = pickpath() |
346             data = '%s contents\n' % path |
347             files[path] = context.memfilectx(repo, path, data) |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
348             dir = os.path.dirname(path) |
349             while dir and dir not in dirs: |
350                 dirs.add(dir) |
351                 dir = os.path.dirname(dir) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
352 |
353         def filectxfn(repo, memctx, path): |
354             return files[path] |
355 |
356         ui.progress(_synthesizing, None) |
357         message = 'synthesized wide repo with %d files' % (len(files),) |
358         mc = context.memctx(repo, [pctx.node(), nullid], message, |
359                             files.iterkeys(), filectxfn, ui.username(), |
360                             '%d %d' % util.makedate()) |
361         initnode = mc.commit() |
362         hexfn = ui.debugflag and hex or short |
363         ui.status(_('added commit %s with %d files\n') |
364                   % (hexfn(initnode), len(files))) |
365 |
366     # Synthesize incremental revisions to the repository, adding repo depth. |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
367 count = int(opts['count']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
368 heads = set(map(repo.changelog.rev, repo.heads())) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
369 for i in xrange(count): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
370 progress(_synthesizing, i, unit=_changesets, total=count) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
371 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
372 node = repo.changelog.node |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
373 revs = len(repo) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
374 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
375 def pickhead(heads, distance): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
376 if heads: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
377 lheads = sorted(heads) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
378 rev = revs - min(pick(distance), revs) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
379 if rev < lheads[-1]: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
380 rev = lheads[bisect.bisect_left(lheads, rev)] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
381 else: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
382 rev = lheads[-1] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
383 return rev, node(rev) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
384 return nullrev, nullid |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
385 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
386 r1 = revs - min(pick(p1distance), revs) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
387 p1 = node(r1) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
388 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
389 # the number of heads will grow without bound if we use a pure |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
390 # model, so artificially constrain their proliferation |
22472
2e2577b0ccbe
contrib/synthrepo: only generate 2 parents if model contains merges
Mike Edgar <adgar@google.com>
parents:
22446
diff
changeset
|
391 toomanyheads = len(heads) > random.randint(1, 20) |
2e2577b0ccbe
contrib/synthrepo: only generate 2 parents if model contains merges
Mike Edgar <adgar@google.com>
parents:
22446
diff
changeset
|
392 if p2distance[0] and (pick(parents) == 2 or toomanyheads): |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
393 r2, p2 = pickhead(heads.difference([r1]), p2distance) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
394 else: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
395 r2, p2 = nullrev, nullid |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
396 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
397 pl = [p1, p2] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
398 pctx = repo[r1] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
399 mf = pctx.manifest() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
400 mfk = mf.keys() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
401 changes = {} |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
402 if mfk: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
403 for __ in xrange(pick(fileschanged)): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
404 for __ in xrange(10): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
405 fctx = pctx.filectx(random.choice(mfk)) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
406 path = fctx.path() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
407 if not (path in nevertouch or fctx.isbinary() or |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
408 'l' in fctx.flags()): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
409 break |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
410 lines = fctx.data().splitlines() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
411 add, remove = pick(lineschanged) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
412 for __ in xrange(remove): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
413 if not lines: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
414 break |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
415 del lines[random.randrange(0, len(lines))] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
416 for __ in xrange(add): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
417 lines.insert(random.randint(0, len(lines)), makeline()) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
418 path = fctx.path() |
21689
503bb3af70fe
memfilectx: call super.__init__ instead of duplicating code
Sean Farley <sean.michael.farley@gmail.com>
parents:
20672
diff
changeset
|
419 changes[path] = context.memfilectx(repo, path, |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
420 '\n'.join(lines) + '\n') |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
421 for __ in xrange(pick(filesremoved)): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
422 path = random.choice(mfk) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
423 for __ in xrange(10): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
424 path = random.choice(mfk) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
425 if path not in changes: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
426 changes[path] = None |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
427 break |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
428 if filesadded: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
429 dirs = list(pctx.dirs()) |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
430 dirs.insert(0, '') |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
431 for __ in xrange(pick(filesadded)): |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
432 pathstr = '' |
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
433 while pathstr in dirs: |
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
434 path = [random.choice(dirs)] |
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
435 if pick(dirsadded): |
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
436 path.append(random.choice(words)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
437 path.append(random.choice(words)) |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
438 pathstr = '/'.join(filter(None, path)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
439 data = '\n'.join(makeline() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
440 for __ in xrange(pick(linesinfilesadded))) + '\n' |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
441 changes[pathstr] = context.memfilectx(repo, pathstr, data) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
442 def filectxfn(repo, memctx, path): |
22446
054ec0212718
contrib/synthrepo: return None to delete files on commit, don't raise IOError
Mike Edgar <adgar@google.com>
parents:
21689
diff
changeset
|
443 return changes[path] |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
444 if not changes: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
445 continue |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
446 if revs: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
447 date = repo['tip'].date()[0] + pick(interarrival) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
448 else: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
449 date = time.time() - (86400 * count) |
23234
944d6cfbe166
synthrepo: synthesized dates must be positive, fit in 32-bit signed ints
Mike Edgar <adgar@google.com>
parents:
22709
diff
changeset
|
450 # dates in Mercurial must be non-negative and fit in 32-bit signed integers. |
944d6cfbe166
synthrepo: synthesized dates must be positive, fit in 32-bit signed ints
Mike Edgar <adgar@google.com>
parents:
22709
diff
changeset
|
451 date = min(0x7fffffff, max(0, date)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
452 user = random.choice(words) + '@' + random.choice(words) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
453 mc = context.memctx(repo, pl, makeline(minimum=2), |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
454 sorted(changes.iterkeys()), |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
455 filectxfn, user, '%d %d' % (date, pick(tzoffset))) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
456 newnode = mc.commit() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
457 heads.add(repo.changelog.rev(newnode)) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
458 heads.discard(r1) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
459 heads.discard(r2) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
460 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
461 lock.release() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
462 wlock.release() |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
463 |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
464 def renamedirs(dirs, words): |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
465 '''Randomly rename the directory names in the per-dir file count dict.''' |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
466 wordgen = itertools.cycle(words) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
467 replacements = {'': ''} |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
468 def rename(dirpath): |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
469 '''Recursively rename the directory and all path prefixes. |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
470 |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
471 The mapping from path to renamed path is stored for all path prefixes |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
472 as in dynamic programming, ensuring linear runtime and consistent |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
473 renaming regardless of iteration order through the model. |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
474 ''' |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
475 if dirpath in replacements: |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
476 return replacements[dirpath] |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
477 head, _ = os.path.split(dirpath) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
478 head = head and rename(head) or '' |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
479 renamed = os.path.join(head, wordgen.next()) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
480 replacements[dirpath] = renamed |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
481 return renamed |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
482 result = [] |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
483 for dirpath, count in dirs.iteritems(): |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
484 result.append([rename(dirpath.lstrip(os.sep)), count]) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
485 return result |
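The memoized renaming described in the `rename` docstring can be tried in isolation. The following is a minimal sketch, not the code from this revision: it is a Python 3 port (`next(wordgen)` and `dirs.items()` replace `wordgen.next()` and `dirs.iteritems()`), it assumes `posixpath` instead of `os.path` so the result is the same on every platform, and the `model` dict and word list are hypothetical sample data.

```python
import itertools
import posixpath


def renamedirs(dirs, words):
    """Rename each directory in a {dirpath: filecount} dict.

    Python 3 sketch of the function above; posixpath is an assumption
    made here so that '/' is always the separator.
    """
    wordgen = itertools.cycle(words)
    # Cache of original path -> renamed path; the root maps to itself.
    replacements = {'': ''}

    def rename(dirpath):
        # Reuse an earlier answer so every shared prefix is renamed once.
        if dirpath in replacements:
            return replacements[dirpath]
        head, _ = posixpath.split(dirpath)
        # Rename the parent first, then append a fresh word for this level.
        head = rename(head) if head else ''
        renamed = posixpath.join(head, next(wordgen))
        replacements[dirpath] = renamed
        return renamed

    return [[rename(dirpath.lstrip('/')), count]
            for dirpath, count in dirs.items()]


# Hypothetical directory shape model: path -> number of files.
model = {'src': 3, 'src/lib': 5, 'src/lib/util': 2, 'docs': 1}
renamed = renamedirs(model, ['alpha', 'beta', 'gamma', 'delta'])
print(renamed)
```

Because each prefix is cached in `replacements`, a child visited before its parent still ends up under the same renamed parent, which is the consistency property the docstring refers to.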