annotate contrib/synthrepo.py @ 26009:bbb698697efc

reachableroots: fix transposition of set and list types in PyArg_ParseTuple This is being masked by the function not properly returning NULL when it raises an exception, so the client code was just falling back to the native codepath when it got None back. A future change removes all reason for this C function to return None, which exposed this problem during development.
author Augie Fackler <augie@google.com>
date Tue, 11 Aug 2015 15:34:10 -0400
parents 328739ea70c3
children 56b2bcea2529
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
1 # synthrepo.py - repo synthesis
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
2 #
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
3 # Copyright 2012 Facebook
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
4 #
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
7
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
8 '''synthesize structurally interesting change history
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
9
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
10 This extension is useful for creating a repository with properties
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
11 that are statistically similar to an existing repository. During
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
12 analysis, a simple probability table is constructed from the history
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
13 of an existing repository. During synthesis, these properties are
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
14 reconstructed.
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
15
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
16 Properties that are analyzed and synthesized include the following:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
17
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
18 - Lines added or removed when an existing file is modified
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
19 - Number and sizes of files added
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
20 - Number of files removed
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
21 - Line lengths
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
22 - Topological distance to parent changeset(s)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
23 - Probability of a commit being a merge
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
24 - Probability of a newly added file being added to a new directory
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
25 - Interarrival time, and time zone, of commits
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
26 - Number of files in each directory
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
27
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
28 A few obvious properties that are not currently handled realistically:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
29
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
30 - Merges are treated as regular commits with two parents, which is not
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
31 realistic
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
32 - Modifications are not treated as operations on hunks of lines, but
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
33 as insertions and deletions of randomly chosen single lines
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
34 - Committer ID (always random)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
35 - Executability of files
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
36 - Symlinks and binary files are ignored
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
37 '''
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
38
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
39 import bisect, collections, itertools, json, os, random, time, sys
19322
ff1586a3adc5 cleanup: remove unused imports
Simon Heimberg <simohe@besonet.ch>
parents: 18927
diff changeset
40 from mercurial import cmdutil, context, patch, scmutil, util, hg
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
41 from mercurial.i18n import _
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
42 from mercurial.node import nullrev, nullid, short
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
43
25186
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24306
diff changeset
44 # Note for extension authors: ONLY specify testedwith = 'internal' for
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24306
diff changeset
45 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24306
diff changeset
46 # be specifying the version(s) of Mercurial they are tested with, or
80c5b2666a96 extensions: document that `testedwith = 'internal'` is special
Augie Fackler <augie@google.com>
parents: 24306
diff changeset
47 # leave the attribute unspecified.
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
48 testedwith = 'internal'
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
49
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
50 cmdtable = {}
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
51 command = cmdutil.command(cmdtable)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
52
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
53 newfile = set(('new fi', 'rename', 'copy f', 'copy t'))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
54
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
55 def zerodict():
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
56 return collections.defaultdict(lambda: 0)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
57
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
58 def roundto(x, k):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
59 if x > k * 2:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
60 return int(round(x / float(k)) * k)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
61 return int(round(x))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
62
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
63 def parsegitdiff(lines):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
64 filename, mar, lineadd, lineremove = None, None, zerodict(), 0
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
65 binary = False
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
66 for line in lines:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
67 start = line[:6]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
68 if start == 'diff -':
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
69 if filename:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
70 yield filename, mar, lineadd, lineremove, binary
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
71 mar, lineadd, lineremove, binary = 'm', zerodict(), 0, False
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
72 filename = patch.gitre.match(line).group(1)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
73 elif start in newfile:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
74 mar = 'a'
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
75 elif start == 'GIT bi':
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
76 binary = True
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
77 elif start == 'delete':
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
78 mar = 'r'
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
79 elif start:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
80 s = start[0]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
81 if s == '-' and not line.startswith('--- '):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
82 lineremove += 1
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
83 elif s == '+' and not line.startswith('+++ '):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
84 lineadd[roundto(len(line) - 1, 5)] += 1
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
85 if filename:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
86 yield filename, mar, lineadd, lineremove, binary
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
87
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
88 @command('analyze',
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
89 [('o', 'output', '', _('write output to given file'), _('FILE')),
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
90 ('r', 'rev', [], _('analyze specified revisions'), _('REV'))],
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
91 _('hg analyze'), optionalrepo=True)
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
92 def analyze(ui, repo, *revs, **opts):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
93 '''create a simple model of a repository to use for later synthesis
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
94
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
95 This command examines every changeset in the given range (or all
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
96 of history if none are specified) and creates a simple statistical
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
97 model of the history of the repository. It also measures the directory
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
98 structure of the repository as checked out.
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
99
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
100 The model is written out to a JSON file, and can be used by
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
101 :hg:`synthesize` to create or augment a repository with synthetic
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
102 commits that have a structure that is statistically similar to the
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
103 analyzed repository.
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
104 '''
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
105 root = repo.root
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
106 if not root.endswith(os.path.sep):
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
107 root += os.path.sep
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
108
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
109 revs = list(revs)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
110 revs.extend(opts['rev'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
111 if not revs:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
112 revs = [':']
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
113
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
114 output = opts['output']
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
115 if not output:
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
116 output = os.path.basename(root) + '.json'
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
117
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
118 if output == '-':
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
119 fp = sys.stdout
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
120 else:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
121 fp = open(output, 'w')
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
122
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
123 # Always obtain file counts of each directory in the given root directory.
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
124 def onerror(e):
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
125 ui.warn(_('error walking directory structure: %s\n') % e)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
126
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
127 dirs = {}
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
128 rootprefixlen = len(root)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
129 for dirpath, dirnames, filenames in os.walk(root, onerror=onerror):
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
130 dirpathfromroot = dirpath[rootprefixlen:]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
131 dirs[dirpathfromroot] = len(filenames)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
132 if '.hg' in dirnames:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
133 dirnames.remove('.hg')
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
134
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
135 lineschanged = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
136 children = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
137 p1distance = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
138 p2distance = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
139 linesinfilesadded = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
140 fileschanged = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
141 filesadded = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
142 filesremoved = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
143 linelengths = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
144 interarrival = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
145 parents = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
146 dirsadded = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
147 tzoffset = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
148
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
149 # If a mercurial repo is available, also model the commit history.
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
150 if repo:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
151 revs = scmutil.revrange(repo, revs)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
152 revs.sort()
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
153
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
154 progress = ui.progress
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
155 _analyzing = _('analyzing')
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
156 _changesets = _('changesets')
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
157 _total = len(revs)
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
158
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
159 for i, rev in enumerate(revs):
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
160 progress(_analyzing, i, unit=_changesets, total=_total)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
161 ctx = repo[rev]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
162 pl = ctx.parents()
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
163 pctx = pl[0]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
164 prev = pctx.rev()
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
165 children[prev] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
166 p1distance[rev - prev] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
167 parents[len(pl)] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
168 tzoffset[ctx.date()[1]] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
169 if len(pl) > 1:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
170 p2distance[rev - pl[1].rev()] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
171 if prev == rev - 1:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
172 lastctx = pctx
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
173 else:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
174 lastctx = repo[rev - 1]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
175 if lastctx.rev() != nullrev:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
176 timedelta = ctx.date()[0] - lastctx.date()[0]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
177 interarrival[roundto(timedelta, 300)] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
178 diff = sum((d.splitlines() for d in ctx.diff(pctx, git=True)), [])
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
179 fileadds, diradds, fileremoves, filechanges = 0, 0, 0, 0
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
180 for filename, mar, lineadd, lineremove, isbin in parsegitdiff(diff):
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
181 if isbin:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
182 continue
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
183 added = sum(lineadd.itervalues(), 0)
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
184 if mar == 'm':
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
185 if added and lineremove:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
186 lineschanged[roundto(added, 5),
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
187 roundto(lineremove, 5)] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
188 filechanges += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
189 elif mar == 'a':
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
190 fileadds += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
191 if '/' in filename:
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
192 filedir = filename.rsplit('/', 1)[0]
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
193 if filedir not in pctx.dirs():
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
194 diradds += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
195 linesinfilesadded[roundto(added, 5)] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
196 elif mar == 'r':
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
197 fileremoves += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
198 for length, count in lineadd.iteritems():
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
199 linelengths[length] += count
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
200 fileschanged[filechanges] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
201 filesadded[fileadds] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
202 dirsadded[diradds] += 1
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
203 filesremoved[fileremoves] += 1
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
204
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
205 invchildren = zerodict()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
206
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
207 for rev, count in children.iteritems():
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
208 invchildren[count] += 1
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
209
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
210 if output != '-':
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
211 ui.status(_('writing output to %s\n') % output)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
212
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
213 def pronk(d):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
214 return sorted(d.iteritems(), key=lambda x: x[1], reverse=True)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
215
20672
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
216 json.dump({'revs': len(revs),
22709
889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents: 22708
diff changeset
217 'initdirs': pronk(dirs),
20672
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
218 'lineschanged': pronk(lineschanged),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
219 'children': pronk(invchildren),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
220 'fileschanged': pronk(fileschanged),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
221 'filesadded': pronk(filesadded),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
222 'linesinfilesadded': pronk(linesinfilesadded),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
223 'dirsadded': pronk(dirsadded),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
224 'filesremoved': pronk(filesremoved),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
225 'linelengths': pronk(linelengths),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
226 'parents': pronk(parents),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
227 'p1distance': pronk(p1distance),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
228 'p2distance': pronk(p2distance),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
229 'interarrival': pronk(interarrival),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
230 'tzoffset': pronk(tzoffset),
05e58b08fdfe synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents: 19322
diff changeset
231 },
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
232 fp)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
233 fp.close()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
234
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
235 @command('synthesize',
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
236 [('c', 'count', 0, _('create given number of commits'), _('COUNT')),
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
237 ('', 'dict', '', _('path to a dictionary of words'), _('FILE')),
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
238 ('', 'initfiles', 0, _('initial file count to create'), _('COUNT'))],
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
239 _('hg synthesize [OPTION].. DESCFILE'))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
240 def synthesize(ui, repo, descpath, **opts):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
241 '''synthesize commits based on a model of an existing repository
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
242
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
243 The model must have been generated by :hg:`analyze`. Commits will
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
244 be generated randomly according to the probabilities described in
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
245 the model. If --initfiles is set, the repository will be seeded with
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
246 the given number files following the modeled repository's directory
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
247 structure.
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
248
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
249 When synthesizing new content, commit descriptions, and user
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
250 names, words will be chosen randomly from a dictionary that is
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
251 presumed to contain one word per line. Use --dict to specify the
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
252 path to an alternate dictionary to use.
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
253 '''
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
254 try:
17887
0e2846b2482c url: use open and not url.open for local files (issue3624)
Siddharth Agarwal <sid0@fb.com>
parents: 17734
diff changeset
255 fp = hg.openpath(ui, descpath)
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25186
diff changeset
256 except Exception as err:
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
257 raise util.Abort('%s: %s' % (descpath, err[0].strerror))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
258 desc = json.load(fp)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
259 fp.close()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
260
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
261 def cdf(l):
18047
9196638b08ce synthrepo: do not crash if a list is empty
Bryan O'Sullivan <bryano@fb.com>
parents: 17887
diff changeset
262 if not l:
9196638b08ce synthrepo: do not crash if a list is empty
Bryan O'Sullivan <bryano@fb.com>
parents: 17887
diff changeset
263 return [], []
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
264 vals, probs = zip(*sorted(l, key=lambda x: x[1], reverse=True))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
265 t = float(sum(probs, 0))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
266 s, cdfs = 0, []
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
267 for v in probs:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
268 s += v
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
269 cdfs.append(s / t)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
270 return vals, cdfs
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
271
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
272 lineschanged = cdf(desc['lineschanged'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
273 fileschanged = cdf(desc['fileschanged'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
274 filesadded = cdf(desc['filesadded'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
275 dirsadded = cdf(desc['dirsadded'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
276 filesremoved = cdf(desc['filesremoved'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
277 linelengths = cdf(desc['linelengths'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
278 parents = cdf(desc['parents'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
279 p1distance = cdf(desc['p1distance'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
280 p2distance = cdf(desc['p2distance'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
281 interarrival = cdf(desc['interarrival'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
282 linesinfilesadded = cdf(desc['linesinfilesadded'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
283 tzoffset = cdf(desc['tzoffset'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
284
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
285 dictfile = opts.get('dict') or '/usr/share/dict/words'
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
286 try:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
287 fp = open(dictfile, 'rU')
25660
328739ea70c3 global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25186
diff changeset
288 except IOError as err:
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
289 raise util.Abort('%s: %s' % (dictfile, err.strerror))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
290 words = fp.read().splitlines()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
291 fp.close()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
292
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
293 initdirs = {}
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
294 if desc['initdirs']:
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
295 for k, v in desc['initdirs']:
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
296 initdirs[k.encode('utf-8').replace('.hg', '_hg')] = v
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
297 initdirs = renamedirs(initdirs, words)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
298 initdirscdf = cdf(initdirs)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
299
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
300 def pick(cdf):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
301 return cdf[0][bisect.bisect_left(cdf[1], random.random())]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
302
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
303 def pickpath():
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
304 return os.path.join(pick(initdirscdf), random.choice(words))
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
305
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
306 def makeline(minimum=0):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
307 total = max(minimum, pick(linelengths))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
308 c, l = 0, []
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
309 while c < total:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
310 w = random.choice(words)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
311 c += len(w) + 1
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
312 l.append(w)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
313 return ' '.join(l)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
314
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
315 wlock = repo.wlock()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
316 lock = repo.lock()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
317
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
318 nevertouch = set(('.hgsub', '.hgignore', '.hgtags'))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
319
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
320 progress = ui.progress
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
321 _synthesizing = _('synthesizing')
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
322 _files = _('initial files')
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
323 _changesets = _('changesets')
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
324
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
325 # Synthesize a single initial revision adding files to the repo according
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
326 # to the modeled directory structure.
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
327 initcount = int(opts['initfiles'])
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
328 if initcount and initdirs:
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
329 pctx = repo[None].parents()[0]
23778
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
330 dirs = set(pctx.dirs())
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
331 files = {}
23778
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
332
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
333 def validpath(path):
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
334 # Don't pick filenames which are already directory names.
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
335 if path in dirs:
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
336 return False
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
337 # Don't pick directories which were used as file names.
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
338 while path:
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
339 if path in files:
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
340 return False
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
341 path = os.path.dirname(path)
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
342 return True
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
343
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
344 for i in xrange(0, initcount):
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
345 ui.progress(_synthesizing, i, unit=_files, total=initcount)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
346
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
347 path = pickpath()
23778
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
348 while not validpath(path):
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
349 path = pickpath()
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
350 data = '%s contents\n' % path
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
351 files[path] = context.memfilectx(repo, path, data)
23778
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
352 dir = os.path.dirname(path)
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
353 while dir and dir not in dirs:
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
354 dirs.add(dir)
a5dbec255f14 synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents: 23235
diff changeset
355 dir = os.path.dirname(dir)
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
356
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
357 def filectxfn(repo, memctx, path):
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
358 return files[path]
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
359
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
360 ui.progress(_synthesizing, None)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
361 message = 'synthesized wide repo with %d files' % (len(files),)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
362 mc = context.memctx(repo, [pctx.node(), nullid], message,
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
363 files.iterkeys(), filectxfn, ui.username(),
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
364 '%d %d' % util.makedate())
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
365 initnode = mc.commit()
24306
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
366 if ui.debugflag:
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
367 hexfn = hex
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
368 else:
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
369 hexfn = short
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
370 ui.status(_('added commit %s with %d files\n')
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
371 % (hexfn(initnode), len(files)))
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
372
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
373 # Synthesize incremental revisions to the repository, adding repo depth.
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
374 count = int(opts['count'])
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
375 heads = set(map(repo.changelog.rev, repo.heads()))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
376 for i in xrange(count):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
377 progress(_synthesizing, i, unit=_changesets, total=count)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
378
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
379 node = repo.changelog.node
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
380 revs = len(repo)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
381
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
382 def pickhead(heads, distance):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
383 if heads:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
384 lheads = sorted(heads)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
385 rev = revs - min(pick(distance), revs)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
386 if rev < lheads[-1]:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
387 rev = lheads[bisect.bisect_left(lheads, rev)]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
388 else:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
389 rev = lheads[-1]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
390 return rev, node(rev)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
391 return nullrev, nullid
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
392
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
393 r1 = revs - min(pick(p1distance), revs)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
394 p1 = node(r1)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
395
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
396 # the number of heads will grow without bound if we use a pure
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
397 # model, so artificially constrain their proliferation
22472
2e2577b0ccbe contrib/synthrepo: only generate 2 parents if model contains merges
Mike Edgar <adgar@google.com>
parents: 22446
diff changeset
398 toomanyheads = len(heads) > random.randint(1, 20)
2e2577b0ccbe contrib/synthrepo: only generate 2 parents if model contains merges
Mike Edgar <adgar@google.com>
parents: 22446
diff changeset
399 if p2distance[0] and (pick(parents) == 2 or toomanyheads):
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
400 r2, p2 = pickhead(heads.difference([r1]), p2distance)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
401 else:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
402 r2, p2 = nullrev, nullid
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
403
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
404 pl = [p1, p2]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
405 pctx = repo[r1]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
406 mf = pctx.manifest()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
407 mfk = mf.keys()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
408 changes = {}
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
409 if mfk:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
410 for __ in xrange(pick(fileschanged)):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
411 for __ in xrange(10):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
412 fctx = pctx.filectx(random.choice(mfk))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
413 path = fctx.path()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
414 if not (path in nevertouch or fctx.isbinary() or
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
415 'l' in fctx.flags()):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
416 break
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
417 lines = fctx.data().splitlines()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
418 add, remove = pick(lineschanged)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
419 for __ in xrange(remove):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
420 if not lines:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
421 break
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
422 del lines[random.randrange(0, len(lines))]
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
423 for __ in xrange(add):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
424 lines.insert(random.randint(0, len(lines)), makeline())
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
425 path = fctx.path()
21689
503bb3af70fe memfilectx: call super.__init__ instead of duplicating code
Sean Farley <sean.michael.farley@gmail.com>
parents: 20672
diff changeset
426 changes[path] = context.memfilectx(repo, path,
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
427 '\n'.join(lines) + '\n')
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
428 for __ in xrange(pick(filesremoved)):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
429 path = random.choice(mfk)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
430 for __ in xrange(10):
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
431 path = random.choice(mfk)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
432 if path not in changes:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
433 changes[path] = None
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
434 break
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
435 if filesadded:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
436 dirs = list(pctx.dirs())
23235
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
437 dirs.insert(0, '')
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
438 for __ in xrange(pick(filesadded)):
23235
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
439 pathstr = ''
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
440 while pathstr in dirs:
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
441 path = [random.choice(dirs)]
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
442 if pick(dirsadded):
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
443 path.append(random.choice(words))
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
444 path.append(random.choice(words))
23235
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
445 pathstr = '/'.join(filter(None, path))
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
446 data = '\n'.join(makeline()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
447 for __ in xrange(pick(linesinfilesadded))) + '\n'
23235
4cdc3e2810b9 synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents: 23234
diff changeset
448 changes[pathstr] = context.memfilectx(repo, pathstr, data)
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
449 def filectxfn(repo, memctx, path):
22446
054ec0212718 contrib/synthrepo: return None to delete files on commit, don't raise IOError
Mike Edgar <adgar@google.com>
parents: 21689
diff changeset
450 return changes[path]
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
451 if not changes:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
452 continue
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
453 if revs:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
454 date = repo['tip'].date()[0] + pick(interarrival)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
455 else:
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
456 date = time.time() - (86400 * count)
23234
944d6cfbe166 synthrepo: synthesized dates must be positive, fit in 32-bit signed ints
Mike Edgar <adgar@google.com>
parents: 22709
diff changeset
457 # dates in mercurial must be positive, fit in 32-bit signed integers.
944d6cfbe166 synthrepo: synthesized dates must be positive, fit in 32-bit signed ints
Mike Edgar <adgar@google.com>
parents: 22709
diff changeset
458 date = min(0x7fffffff, max(0, date))
17734
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
459 user = random.choice(words) + '@' + random.choice(words)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
460 mc = context.memctx(repo, pl, makeline(minimum=2),
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
461 sorted(changes.iterkeys()),
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
462 filectxfn, user, '%d %d' % (date, pick(tzoffset)))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
463 newnode = mc.commit()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
464 heads.add(repo.changelog.rev(newnode))
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
465 heads.discard(r1)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
466 heads.discard(r2)
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
467
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
468 lock.release()
619068c280fd contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
469 wlock.release()
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
470
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
471 def renamedirs(dirs, words):
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
472 '''Randomly rename the directory names in the per-dir file count dict.'''
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
473 wordgen = itertools.cycle(words)
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
474 replacements = {'': ''}
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
475 def rename(dirpath):
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
476 '''Recursively rename the directory and all path prefixes.
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
477
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
478 The mapping from path to renamed path is stored for all path prefixes
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
479 as in dynamic programming, ensuring linear runtime and consistent
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
480 renaming regardless of iteration order through the model.
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
481 '''
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
482 if dirpath in replacements:
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
483 return replacements[dirpath]
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
484 head, _ = os.path.split(dirpath)
24306
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
485 if head:
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
486 head = rename(head)
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
487 else:
6ddc86eedc3b style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents: 23778
diff changeset
488 head = ''
22708
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
489 renamed = os.path.join(head, wordgen.next())
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
490 replacements[dirpath] = renamed
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
491 return renamed
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
492 result = []
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
493 for dirpath, count in dirs.iteritems():
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
494 result.append([rename(dirpath.lstrip(os.sep)), count])
4c66e70c3488 contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents: 22473
diff changeset
495 return result