annotate contrib/synthrepo.py @ 39318:c03c5f528e9b
perf: use storage API for resolving manifest node
lookup() isn't part of the storage API. And this code shouldn't
be accessing manifestlog._revlog directly for the modern code base.
So let's port it to the modern API.
Note that the previous code was busted for cases where we needed
to call lookup() because lookup() isn't exposed by manifestrevlog
any more.
This change is strictly BC breaking because we no longer support
resolving partial nodes. But it is a perf* command and I don't
think we should flag the change as such.
Differential Revision: https://phab.mercurial-scm.org/D4390
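
As a rough illustration of the behaviour change described in this commit message, the sketch below resolves a revision argument either as an integer revision number or as a full 40-character hex node, and rejects anything else, including partial node prefixes. `ToyStorage` and `resolve_node` are hypothetical names made up for this illustration; they are not Mercurial's storage API and not the actual perf command code.

```python
import binascii


class ToyStorage(object):
    """Hypothetical stand-in for a manifest store (not Mercurial's API)."""

    def __init__(self, nodes):
        self._nodes = list(nodes)  # list index == revision number

    def node(self, rev):
        """Return the binary node for an integer revision number."""
        return self._nodes[rev]


def resolve_node(storage, spec):
    """Resolve a string to a binary node.

    Accepts a full 40-character hex node or a revision number.
    Partial hex prefixes are deliberately not resolved.
    """
    if len(spec) == 40:
        try:
            return binascii.unhexlify(spec)
        except (TypeError, binascii.Error):
            raise ValueError('not a hex node: %r' % spec)
    try:
        rev = int(spec)
    except ValueError:
        raise ValueError('partial nodes are not supported: %r' % spec)
    return storage.node(rev)
```

With this shape, a caller that used to pass a short hash prefix gets an explicit error instead of a prefix lookup, which is the backwards-compatibility break the message mentions.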
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 15 Aug 2018 19:45:39 +0000 |
parents | 1c93e0237a24 |
children | 876494fd967d |
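
The docstring of the annotated module (in the listing that follows) describes building a simple probability table during analysis and reconstructing the same properties during synthesis. A minimal sketch of that idea, assuming a plain frequency count that is turned into a cumulative distribution and sampled with `bisect`. `roundto` mirrors the helper shown in the listing; `build_cdf` and `sample` are hypothetical names for this illustration and are not copied from synthrepo.py.

```python
import bisect
import collections
import random


def roundto(x, k):
    """Round x to a multiple of k once x exceeds 2*k (mirrors the listing)."""
    if x > k * 2:
        return int(round(x / float(k)) * k)
    return int(round(x))


def build_cdf(counts):
    """Turn {value: count} into (values, cumulative probabilities)."""
    items = sorted(counts.items())
    total = float(sum(count for _, count in items))
    values, cumulative, running = [], [], 0
    for value, count in items:
        running += count
        values.append(value)
        cumulative.append(running / total)
    return values, cumulative


def sample(cdf, rng=random):
    """Draw one value with probability proportional to its count."""
    values, cumulative = cdf
    i = bisect.bisect_left(cumulative, rng.random())
    return values[min(i, len(values) - 1)]


# "Analysis": count bucketed line lengths seen in some added lines.
observed = ['x = 1', 'import os', 'def f(a, b, c):', 'return a + b + c']
lengths = collections.defaultdict(int)
for line in observed:
    lengths[roundto(len(line), 5)] += 1

# "Synthesis": draw new line lengths from the same distribution.
cdf = build_cdf(lengths)
synthetic = [sample(cdf) for _ in range(5)]
```

Bucketing with `roundto` keeps the table small: long lines collapse into multiples of five, while short lines keep their exact length.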
rev | line source |
---|---|
17734:619068c280fd contrib: add a commit synthesizer for reproducing scaling problems (Bryan O'Sullivan <bryano@fb.com>) | 1 # synthrepo.py - repo synthesis |
17734 | 2 # |
17734 | 3 # Copyright 2012 Facebook |
17734 | 4 # |
17734 | 5 # This software may be used and distributed according to the terms of the |
17734 | 6 # GNU General Public License version 2 or any later version. |
17734 | 7 |
17734 | 8 '''synthesize structurally interesting change history |
17734 | 9 |
17734 | 10 This extension is useful for creating a repository with properties |
17734 | 11 that are statistically similar to an existing repository. During |
17734 | 12 analysis, a simple probability table is constructed from the history |
17734 | 13 of an existing repository. During synthesis, these properties are |
17734 | 14 reconstructed. |
17734 | 15 |
17734 | 16 Properties that are analyzed and synthesized include the following: |
17734 | 17 |
17734 | 18 - Lines added or removed when an existing file is modified |
17734 | 19 - Number and sizes of files added |
17734 | 20 - Number of files removed |
17734 | 21 - Line lengths |
17734 | 22 - Topological distance to parent changeset(s) |
17734 | 23 - Probability of a commit being a merge |
17734 | 24 - Probability of a newly added file being added to a new directory |
17734 | 25 - Interarrival time, and time zone, of commits |
22709:889789a2ca9f contrib/synthrepo: walk a repo's directory structure during analysis (Mike Edgar <adgar@google.com>; parents: 22708) | 26 - Number of files in each directory |
17734 | 27 |
17734 | 28 A few obvious properties that are not currently handled realistically: |
17734 | 29 |
17734 | 30 - Merges are treated as regular commits with two parents, which is not |
17734 | 31   realistic |
17734 | 32 - Modifications are not treated as operations on hunks of lines, but |
17734 | 33   as insertions and deletions of randomly chosen single lines |
17734 | 34 - Committer ID (always random) |
17734 | 35 - Executability of files |
17734 | 36 - Symlinks and binary files are ignored |
17734 | 37 ''' |
17734 | 38 |
28563:62250a48dc7f contrib: synthrepo use absolute_import (Pulkit Goyal <7895pulkit@gmail.com>; parents: 26587) | 39 from __future__ import absolute_import |
28563 | 40 import bisect |
28563 | 41 import collections |
28563 | 42 import itertools |
28563 | 43 import json |
28563 | 44 import os |
28563 | 45 import random |
28563 | 46 import sys |
28563 | 47 import time |
29205:a0939666b836 py3: move up symbol imports to enforce import-checker rules (Yuya Nishihara <yuya@tcha.org>; parents: 28563) | 48 |
29205 | 49 from mercurial.i18n import _ |
29205 | 50 from mercurial.node import ( |
29205 | 51     nullid, |
29205 | 52     nullrev, |
29205 | 53     short, |
29205 | 54 ) |
28563 | 55 from mercurial import ( |
28563 | 56     context, |
38588:1c93e0237a24 diffutil: move the module out of utils package (Yuya Nishihara <yuya@tcha.org>; parents: 38587) | 57     diffutil, |
28563 | 58     error, |
28563 | 59     hg, |
28563 | 60     patch, |
32337:46ba2cdda476 registrar: move cmdutil.command to registrar module (API) (Yuya Nishihara <yuya@tcha.org>; parents: 32291) | 61     registrar, |
28563 | 62     scmutil, |
28563 | 63 ) |
38567:97469c5430cd synthrepo: pass a diffopts object to context.diff (Boris Feld <boris.feld@octobus.net>; parents: 38519) | 64 from mercurial.utils import ( |
38567 | 65     dateutil, |
38567 | 66 ) |
17734 | 67 |
29841:d5883fd055c6 extensions: change magic "shipped with hg" string (Augie Fackler <augie@google.com>; parents: 29216) | 68 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for |
25186:80c5b2666a96 extensions: document that `testedwith = 'internal'` is special (Augie Fackler <augie@google.com>; parents: 24306) | 69 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should |
25186 | 70 # be specifying the version(s) of Mercurial they are tested with, or |
25186 | 71 # leave the attribute unspecified. |
29841 | 72 testedwith = 'ships-with-hg-core' |
17734 | 73 |
17734 | 74 cmdtable = {} |
32337 | 75 command = registrar.command(cmdtable) |
17734 | 76 |
32291:bd872f64a8ba cleanup: use set literals (Martin von Zweigbergk <martinvonz@google.com>; parents: 29841) | 77 newfile = {'new fi', 'rename', 'copy f', 'copy t'} |
17734 | 78 |
17734 | 79 def zerodict(): |
17734 | 80     return collections.defaultdict(lambda: 0) |
17734 | 81 |
17734 | 82 def roundto(x, k): |
17734 | 83     if x > k * 2: |
17734 | 84         return int(round(x / float(k)) * k) |
17734 | 85     return int(round(x)) |
17734 | 86 |
17734 | 87 def parsegitdiff(lines): |
17734 | 88     filename, mar, lineadd, lineremove = None, None, zerodict(), 0 |
17734 | 89     binary = False |
17734 | 90     for line in lines: |
17734 | 91         start = line[:6] |
17734 | 92         if start == 'diff -': |
17734 | 93             if filename: |
17734 | 94                 yield filename, mar, lineadd, lineremove, binary |
17734 | 95             mar, lineadd, lineremove, binary = 'm', zerodict(), 0, False |
17734 | 96             filename = patch.gitre.match(line).group(1) |
17734 | 97         elif start in newfile: |
17734 | 98             mar = 'a' |
17734 | 99         elif start == 'GIT bi': |
17734 | 100             binary = True |
17734 | 101         elif start == 'delete': |
17734 | 102             mar = 'r' |
17734 | 103         elif start: |
17734 | 104             s = start[0] |
17734 | 105             if s == '-' and not line.startswith('--- '): |
17734 | 106                 lineremove += 1 |
17734 | 107             elif s == '+' and not line.startswith('+++ '): |
17734 | 108                 lineadd[roundto(len(line) - 1, 5)] += 1 |
17734 | 109     if filename: |
17734 | 110         yield filename, mar, lineadd, lineremove, binary |
17734 | 111 |
17734 | 112 @command('analyze', |
22709 | 113          [('o', 'output', '', _('write output to given file'), _('FILE')), |
17734 | 114           ('r', 'rev', [], _('analyze specified revisions'), _('REV'))], |
22709 | 115          _('hg analyze'), optionalrepo=True) |
17734 | 116 def analyze(ui, repo, *revs, **opts): |
17734 | 117     '''create a simple model of a repository to use for later synthesis |
17734 | 118 |
17734 | 119     This command examines every changeset in the given range (or all |
17734 | 120     of history if none are specified) and creates a simple statistical |
22709 | 121     model of the history of the repository. It also measures the directory |
22709 | 122     structure of the repository as checked out. |
17734 | 123 |
17734 | 124     The model is written out to a JSON file, and can be used by |
17734 | 125     :hg:`synthesize` to create or augment a repository with synthetic |
17734 | 126     commits that have a structure that is statistically similar to the |
17734 | 127     analyzed repository. |
17734 | 128     ''' |
22709 | 129     root = repo.root |
22709 | 130     if not root.endswith(os.path.sep): |
22709 | 131         root += os.path.sep |
17734 | 132 |
17734 | 133     revs = list(revs) |
17734 | 134     revs.extend(opts['rev']) |
17734 | 135     if not revs: |
17734 | 136         revs = [':'] |
17734 | 137 |
17734 | 138     output = opts['output'] |
17734 | 139     if not output: |
22709 | 140         output = os.path.basename(root) + '.json' |
17734 | 141 |
17734 | 142     if output == '-': |
17734 | 143         fp = sys.stdout |
17734 | 144     else: |
17734 | 145         fp = open(output, 'w') |
17734 | 146 |
22709 | 147     # Always obtain file counts of each directory in the given root directory. |
22709 | 148     def onerror(e): |
22709 | 149         ui.warn(_('error walking directory structure: %s\n') % e) |
22709 | 150 |
22709 | 151     dirs = {} |
22709 | 152     rootprefixlen = len(root) |
22709 | 153     for dirpath, dirnames, filenames in os.walk(root, onerror=onerror): |
22709 | 154         dirpathfromroot = dirpath[rootprefixlen:] |
22709 | 155         dirs[dirpathfromroot] = len(filenames) |
22709 | 156         if '.hg' in dirnames: |
22709 | 157             dirnames.remove('.hg') |
17734 | 158 |
17734 | 159     lineschanged = zerodict() |
17734 | 160     children = zerodict() |
17734 | 161     p1distance = zerodict() |
17734 | 162     p2distance = zerodict() |
17734 | 163     linesinfilesadded = zerodict() |
17734 | 164     fileschanged = zerodict() |
17734 | 165     filesadded = zerodict() |
17734 | 166     filesremoved = zerodict() |
17734 | 167     linelengths = zerodict() |
17734 | 168     interarrival = zerodict() |
17734 | 169     parents = zerodict() |
17734 | 170     dirsadded = zerodict() |
17734 | 171     tzoffset = zerodict() |
17734 | 172 |
22709 | 173     # If a mercurial repo is available, also model the commit history. |
22709 | 174     if repo: |
22709 | 175         revs = scmutil.revrange(repo, revs) |
22709 | 176         revs.sort() |
22709 | 177 |
38408:6540333acb95 synthrepo: use progress helper (Martin von Zweigbergk <martinvonz@google.com>; parents: 36607) | 178         progress = ui.makeprogress(_('analyzing'), unit=_('changesets'), |
38408 | 179                                    total=len(revs)) |
22709 | 180         for i, rev in enumerate(revs): |
38408 | 181             progress.update(i) |
22709 | 182             ctx = repo[rev] |
22709 | 183             pl = ctx.parents() |
22709 | 184             pctx = pl[0] |
22709 | 185             prev = pctx.rev() |
22709 | 186             children[prev] += 1 |
22709 | 187             p1distance[rev - prev] += 1 |
22709 | 188             parents[len(pl)] += 1 |
22709 | 189             tzoffset[ctx.date()[1]] += 1 |
22709 | 190             if len(pl) > 1: |
22709 | 191                 p2distance[rev - pl[1].rev()] += 1 |
22709 | 192             if prev == rev - 1: |
22709 | 193                 lastctx = pctx |
22709 | 194             else: |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
195 lastctx = repo[rev - 1] |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
196 if lastctx.rev() != nullrev: |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
197 timedelta = ctx.date()[0] - lastctx.date()[0] |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
198 interarrival[roundto(timedelta, 300)] += 1 |
38587
b62000a28812
diffutil: remove diffopts() in favor of diffallopts()
Yuya Nishihara <yuya@tcha.org>
parents:
38584
diff
changeset
|
199 diffopts = diffutil.diffallopts(ui, {'git': True}) |
38519
4455e5d4d59c
context: explicitly take diffopts in `context.diff` (API)
Boris Feld <boris.feld@octobus.net>
parents:
38409
diff
changeset
|
200 diff = sum((d.splitlines() |
38567
97469c5430cd
synthrepo: pass a diffopts object to context.diff
Boris Feld <boris.feld@octobus.net>
parents:
38519
diff
changeset
|
201 for d in ctx.diff(pctx, opts=diffopts)), []) |
22709
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
202 fileadds, diradds, fileremoves, filechanges = 0, 0, 0, 0 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
203 for filename, mar, lineadd, lineremove, isbin in parsegitdiff(diff): |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
204 if isbin: |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
205 continue |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
206 added = sum(lineadd.itervalues(), 0) |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
207 if mar == 'm': |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
208 if added and lineremove: |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
209 lineschanged[roundto(added, 5), |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
210 roundto(lineremove, 5)] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
211 filechanges += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
212 elif mar == 'a': |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
213 fileadds += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
214 if '/' in filename: |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
215 filedir = filename.rsplit('/', 1)[0] |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
216 if filedir not in pctx.dirs(): |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
217 diradds += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
218 linesinfilesadded[roundto(added, 5)] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
219 elif mar == 'r': |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
220 fileremoves += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
221 for length, count in lineadd.iteritems(): |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
222 linelengths[length] += count |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
223 fileschanged[filechanges] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
224 filesadded[fileadds] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
225 dirsadded[diradds] += 1 |
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
226 filesremoved[fileremoves] += 1 |
38409
ce65c25dc161
synthrepo: close progress topics
Martin von Zweigbergk <martinvonz@google.com>
parents:
38408
diff
changeset
|
227 progress.complete() |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
228 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
229 invchildren = zerodict() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
230 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
231 for rev, count in children.iteritems(): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
232 invchildren[count] += 1 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
233 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
234 if output != '-': |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
235 ui.status(_('writing output to %s\n') % output) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
236 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
237 def pronk(d): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
238 return sorted(d.iteritems(), key=lambda x: x[1], reverse=True) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
239 |
20672
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
240 json.dump({'revs': len(revs), |
22709
889789a2ca9f
contrib/synthrepo: walk a repo's directory structure during analysis
Mike Edgar <adgar@google.com>
parents:
22708
diff
changeset
|
241 'initdirs': pronk(dirs), |
20672
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
242 'lineschanged': pronk(lineschanged), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
243 'children': pronk(invchildren), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
244 'fileschanged': pronk(fileschanged), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
245 'filesadded': pronk(filesadded), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
246 'linesinfilesadded': pronk(linesinfilesadded), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
247 'dirsadded': pronk(dirsadded), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
248 'filesremoved': pronk(filesremoved), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
249 'linelengths': pronk(linelengths), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
250 'parents': pronk(parents), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
251 'p1distance': pronk(p1distance), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
252 'p2distance': pronk(p2distance), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
253 'interarrival': pronk(interarrival), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
254 'tzoffset': pronk(tzoffset), |
05e58b08fdfe
synthrepo: move from dict() construction to {} literals
Augie Fackler <raf@durin42.com>
parents:
19322
diff
changeset
|
255 }, |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
256 fp) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
257 fp.close() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
258 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
259 @command('synthesize', |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
260 [('c', 'count', 0, _('create given number of commits'), _('COUNT')), |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
261 ('', 'dict', '', _('path to a dictionary of words'), _('FILE')), |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
262 ('', 'initfiles', 0, _('initial file count to create'), _('COUNT'))], |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
263 _('hg synthesize [OPTION].. DESCFILE')) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
264 def synthesize(ui, repo, descpath, **opts): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
265 '''synthesize commits based on a model of an existing repository |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
266 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
267 The model must have been generated by :hg:`analyze`. Commits will |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
268 be generated randomly according to the probabilities described in |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
269 the model. If --initfiles is set, the repository will be seeded with |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
270 the given number of files following the modeled repository's directory |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
271 structure. |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
272 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
273 When synthesizing new content, commit descriptions, and user |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
274 names, words will be chosen randomly from a dictionary that is |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
275 presumed to contain one word per line. Use --dict to specify the |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
276 path to an alternate dictionary to use. |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
277 ''' |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
278 try: |
17887
0e2846b2482c
url: use open and not url.open for local files (issue3624)
Siddharth Agarwal <sid0@fb.com>
parents:
17734
diff
changeset
|
279 fp = hg.openpath(ui, descpath) |
25660
328739ea70c3
global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents:
25186
diff
changeset
|
280 except Exception as err: |
26587
56b2bcea2529
error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
25660
diff
changeset
|
281 raise error.Abort('%s: %s' % (descpath, err[0].strerror)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
282 desc = json.load(fp) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
283 fp.close() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
284 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
285 def cdf(l): |
18047
9196638b08ce
synthrepo: do not crash if a list is empty
Bryan O'Sullivan <bryano@fb.com>
parents:
17887
diff
changeset
|
286 if not l: |
9196638b08ce
synthrepo: do not crash if a list is empty
Bryan O'Sullivan <bryano@fb.com>
parents:
17887
diff
changeset
|
287 return [], [] |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
288 vals, probs = zip(*sorted(l, key=lambda x: x[1], reverse=True)) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
289 t = float(sum(probs, 0)) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
290 s, cdfs = 0, [] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
291 for v in probs: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
292 s += v |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
293 cdfs.append(s / t) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
294 return vals, cdfs |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
295 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
296 lineschanged = cdf(desc['lineschanged']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
297 fileschanged = cdf(desc['fileschanged']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
298 filesadded = cdf(desc['filesadded']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
299 dirsadded = cdf(desc['dirsadded']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
300 filesremoved = cdf(desc['filesremoved']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
301 linelengths = cdf(desc['linelengths']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
302 parents = cdf(desc['parents']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
303 p1distance = cdf(desc['p1distance']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
304 p2distance = cdf(desc['p2distance']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
305 interarrival = cdf(desc['interarrival']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
306 linesinfilesadded = cdf(desc['linesinfilesadded']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
307 tzoffset = cdf(desc['tzoffset']) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
308 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
309 dictfile = opts.get('dict') or '/usr/share/dict/words' |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
310 try: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
311 fp = open(dictfile, 'rU') |
25660
328739ea70c3
global: mass rewrite to use modern exception syntax
Gregory Szorc <gregory.szorc@gmail.com>
parents:
25186
diff
changeset
|
312 except IOError as err: |
26587
56b2bcea2529
error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
25660
diff
changeset
|
313 raise error.Abort('%s: %s' % (dictfile, err.strerror)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
314 words = fp.read().splitlines() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
315 fp.close() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
316 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
317 initdirs = {} |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
318 if desc['initdirs']: |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
319 for k, v in desc['initdirs']: |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
320 initdirs[k.encode('utf-8').replace('.hg', '_hg')] = v |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
321 initdirs = renamedirs(initdirs, words) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
322 initdirscdf = cdf(initdirs) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
323 |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
324 def pick(cdf): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
325 return cdf[0][bisect.bisect_left(cdf[1], random.random())] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
326 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
327 def pickpath(): |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
328 return os.path.join(pick(initdirscdf), random.choice(words)) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
329 |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
330 def makeline(minimum=0): |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
331 total = max(minimum, pick(linelengths)) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
332 c, l = 0, [] |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
333 while c < total: |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
334 w = random.choice(words) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
335 c += len(w) + 1 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
336 l.append(w) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
337 return ' '.join(l) |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
338 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
339 wlock = repo.wlock() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
340 lock = repo.lock() |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
341 |
32291
bd872f64a8ba
cleanup: use set literals
Martin von Zweigbergk <martinvonz@google.com>
parents:
29841
diff
changeset
|
342 nevertouch = {'.hgsub', '.hgignore', '.hgtags'} |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
343 |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
344 _synthesizing = _('synthesizing') |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
345 _files = _('initial files') |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
346 _changesets = _('changesets') |
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
347 |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
348 # Synthesize a single initial revision adding files to the repo according |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
349 # to the modeled directory structure. |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
350 initcount = int(opts['initfiles']) |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
351 if initcount and initdirs: |
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
352 pctx = repo[None].parents()[0] |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
353 dirs = set(pctx.dirs()) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
354 files = {} |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
355 |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
356 def validpath(path): |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
357 # Don't pick filenames which are already directory names. |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
358 if path in dirs: |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
359 return False |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
360 # Don't pick directories which were used as file names. |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
361 while path: |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
362 if path in files: |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
363 return False |
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
364 path = os.path.dirname(path) |
365 return True |
366 |
38408
6540333acb95
synthrepo: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents:
36607
diff
changeset
|
367 progress = ui.makeprogress(_synthesizing, unit=_files, total=initcount) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
368 for i in xrange(0, initcount): |
38408
6540333acb95
synthrepo: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents:
36607
diff
changeset
|
369 progress.update(i) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
370 |
371 path = pickpath() |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
372 while not validpath(path): |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
373 path = pickpath() |
374 data = '%s contents\n' % path |
35398
2123e7629ec0
synthrepo: create filectx instance in 'filectxfn' callback
Martin von Zweigbergk <martinvonz@google.com>
parents:
34023
diff
changeset
|
375 files[path] = data |
23778
a5dbec255f14
synthrepo: new filenames must not also be new directories, and vice-versa
Mike Edgar <adgar@google.com>
parents:
23235
diff
changeset
|
376 dir = os.path.dirname(path) |
377 while dir and dir not in dirs: |
378 dirs.add(dir) |
379 dir = os.path.dirname(dir) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
380 |
381 def filectxfn(repo, memctx, path): |
35400
8a0cac20a1ad
memfilectx: make changectx argument mandatory in constructor (API)
Martin von Zweigbergk <martinvonz@google.com>
parents:
35398
diff
changeset
|
382 return context.memfilectx(repo, memctx, path, files[path]) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
383 |
38408
6540333acb95
synthrepo: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents:
36607
diff
changeset
|
384 progress.complete() |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
385 message = 'synthesized wide repo with %d files' % (len(files),) |
386 mc = context.memctx(repo, [pctx.node(), nullid], message, |
36299
238646784294
py3: use default dict iterator instead of iterkeys
Augie Fackler <augie@google.com>
parents:
35400
diff
changeset
|
387 files, filectxfn, ui.username(), |
36607
c6061cadb400
util: extract all date-related utils in utils/dateutil module
Boris Feld <boris.feld@octobus.net>
parents:
36299
diff
changeset
|
388 '%d %d' % dateutil.makedate()) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
389 initnode = mc.commit() |
24306
6ddc86eedc3b
style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents:
23778
diff
changeset
|
390 if ui.debugflag: |
391 hexfn = hex |
392 else: |
393 hexfn = short |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
394 ui.status(_('added commit %s with %d files\n') |
395 % (hexfn(initnode), len(files))) |
396 |
397 # Synthesize incremental revisions to the repository, adding repo depth. |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
398 count = int(opts['count']) |
399 heads = set(map(repo.changelog.rev, repo.heads())) |
38408
6540333acb95
synthrepo: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents:
36607
diff
changeset
|
400 progress = ui.makeprogress(_synthesizing, unit=_changesets, total=count) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
401 for i in xrange(count): |
38408
6540333acb95
synthrepo: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents:
36607
diff
changeset
|
402 progress.update(i) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
403 |
404 node = repo.changelog.node |
405 revs = len(repo) |
406 |
407 def pickhead(heads, distance): |
408 if heads: |
409 lheads = sorted(heads) |
410 rev = revs - min(pick(distance), revs) |
411 if rev < lheads[-1]: |
412 rev = lheads[bisect.bisect_left(lheads, rev)] |
413 else: |
414 rev = lheads[-1] |
415 return rev, node(rev) |
416 return nullrev, nullid |
417 |
418 r1 = revs - min(pick(p1distance), revs) |
419 p1 = node(r1) |
420 |
421 # the number of heads will grow without bound if we use a pure |
422 # model, so artificially constrain their proliferation |
22472
2e2577b0ccbe
contrib/synthrepo: only generate 2 parents if model contains merges
Mike Edgar <adgar@google.com>
parents:
22446
diff
changeset
|
423 toomanyheads = len(heads) > random.randint(1, 20) |
424 if p2distance[0] and (pick(parents) == 2 or toomanyheads): |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
425 r2, p2 = pickhead(heads.difference([r1]), p2distance) |
426 else: |
427 r2, p2 = nullrev, nullid |
428 |
429 pl = [p1, p2] |
430 pctx = repo[r1] |
431 mf = pctx.manifest() |
432 mfk = mf.keys() |
433 changes = {} |
434 if mfk: |
435 for __ in xrange(pick(fileschanged)): |
436 for __ in xrange(10): |
437 fctx = pctx.filectx(random.choice(mfk)) |
438 path = fctx.path() |
439 if not (path in nevertouch or fctx.isbinary() or |
440 'l' in fctx.flags()): |
441 break |
442 lines = fctx.data().splitlines() |
443 add, remove = pick(lineschanged) |
444 for __ in xrange(remove): |
445 if not lines: |
446 break |
447 del lines[random.randrange(0, len(lines))] |
448 for __ in xrange(add): |
449 lines.insert(random.randint(0, len(lines)), makeline()) |
450 path = fctx.path() |
35398
2123e7629ec0
synthrepo: create filectx instance in 'filectxfn' callback
Martin von Zweigbergk <martinvonz@google.com>
parents:
34023
diff
changeset
|
451 changes[path] = '\n'.join(lines) + '\n' |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
452 for __ in xrange(pick(filesremoved)): |
453 path = random.choice(mfk) |
454 for __ in xrange(10): |
455 path = random.choice(mfk) |
456 if path not in changes: |
457 break |
458 if filesadded: |
459 dirs = list(pctx.dirs()) |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
460 dirs.insert(0, '') |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
461 for __ in xrange(pick(filesadded)): |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
462 pathstr = '' |
463 while pathstr in dirs: |
464 path = [random.choice(dirs)] |
465 if pick(dirsadded): |
466 path.append(random.choice(words)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
467 path.append(random.choice(words)) |
23235
4cdc3e2810b9
synthrepo: when adding files, ensure new path is not a directory
Mike Edgar <adgar@google.com>
parents:
23234
diff
changeset
|
468 pathstr = '/'.join(filter(None, path)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
469 data = '\n'.join(makeline() |
470 for __ in xrange(pick(linesinfilesadded))) + '\n' |
35398
2123e7629ec0
synthrepo: create filectx instance in 'filectxfn' callback
Martin von Zweigbergk <martinvonz@google.com>
parents:
34023
diff
changeset
|
471 changes[pathstr] = data |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
472 def filectxfn(repo, memctx, path): |
35398
2123e7629ec0
synthrepo: create filectx instance in 'filectxfn' callback
Martin von Zweigbergk <martinvonz@google.com>
parents:
34023
diff
changeset
|
473 if path not in changes: |
474 return None |
35400
8a0cac20a1ad
memfilectx: make changectx argument mandatory in constructor (API)
Martin von Zweigbergk <martinvonz@google.com>
parents:
35398
diff
changeset
|
475 return context.memfilectx(repo, memctx, path, changes[path]) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
476 if not changes: |
477 continue |
478 if revs: |
479 date = repo['tip'].date()[0] + pick(interarrival) |
480 else: |
481 date = time.time() - (86400 * count) |
23234
944d6cfbe166
synthrepo: synthesized dates must be positive, fit in 32-bit signed ints
Mike Edgar <adgar@google.com>
parents:
22709
diff
changeset
|
482 # dates in mercurial must be positive, fit in 32-bit signed integers. |
483 date = min(0x7fffffff, max(0, date)) |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
484 user = random.choice(words) + '@' + random.choice(words) |
485 mc = context.memctx(repo, pl, makeline(minimum=2), |
34023
ba479850c9c7
python3: replace sorted(<dict>.iterkeys()) with sorted(<dict>)
Augie Fackler <raf@durin42.com>
parents:
32337
diff
changeset
|
486 sorted(changes), |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
487 filectxfn, user, '%d %d' % (date, pick(tzoffset))) |
488 newnode = mc.commit() |
489 heads.add(repo.changelog.rev(newnode)) |
490 heads.discard(r1) |
491 heads.discard(r2) |
38409
ce65c25dc161
synthrepo: close progress topics
Martin von Zweigbergk <martinvonz@google.com>
parents:
38408
diff
changeset
|
492 progress.complete() |
17734
619068c280fd
contrib: add a commit synthesizer for reproducing scaling problems
Bryan O'Sullivan <bryano@fb.com>
parents:
diff
changeset
|
493 |
494 lock.release() |
495 wlock.release() |
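The file/directory clash check performed by `validpath` in the listing above can be tried in isolation. The following is a minimal sketch, not the code from synthrepo.py: it passes `dirs` and `files` explicitly instead of closing over them, uses `posixpath` rather than `os.path` so the behaviour is the same on every platform, and the `register` helper is a hypothetical name for the bookkeeping the loop does inline.

```python
import posixpath


def validpath(path, dirs, files):
    """Return True if `path` may be added as a new file."""
    # A new file must not reuse the name of an existing directory.
    if path in dirs:
        return False
    # Neither the path itself nor any of its ancestors may already be a file.
    while path:
        if path in files:
            return False
        path = posixpath.dirname(path)
    return True


def register(path, dirs, files, data=''):
    """Record a new file and every directory that now exists because of it."""
    files[path] = data
    parent = posixpath.dirname(path)
    while parent and parent not in dirs:
        dirs.add(parent)
        parent = posixpath.dirname(parent)


dirs, files = set(), {}
register('a/b/c.txt', dirs, files)
```

After registering `a/b/c.txt`, the names `a` and `a/b` are directories, so neither can be picked as a file name, and nothing can be placed underneath `a/b/c.txt`.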
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
496 |
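The head selection inside `pickhead` relies on `bisect.bisect_left` over the sorted head revisions: a target revision is snapped to the first head at or after it, or to the newest head when the target lies beyond every head. The sketch below isolates just that step; `snap_to_head` is a hypothetical name, and the random draw from the distance model (`pick(distance)`) is replaced by an explicit `rev` argument.

```python
import bisect


def snap_to_head(heads, rev):
    """Map a target revision number onto one of the existing heads."""
    lheads = sorted(heads)
    if rev < lheads[-1]:
        # First head whose revision number is >= rev.
        return lheads[bisect.bisect_left(lheads, rev)]
    # Target is at or past the newest head: use the newest head.
    return lheads[-1]


heads = {3, 7, 12}
chosen = [snap_to_head(heads, rev) for rev in (0, 5, 7, 12, 20)]
```

With heads at revisions 3, 7 and 12, the targets 0, 5, 7, 12 and 20 resolve to 3, 7, 7, 12 and 12 respectively.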
497 def renamedirs(dirs, words): |
498 '''Randomly rename the directory names in the per-dir file count dict.''' |
499 wordgen = itertools.cycle(words) |
500 replacements = {'': ''} |
501 def rename(dirpath): |
502 '''Recursively rename the directory and all path prefixes. |
503 |
504 The mapping from path to renamed path is stored for all path prefixes |
505 as in dynamic programming, ensuring linear runtime and consistent |
506 renaming regardless of iteration order through the model. |
507 ''' |
508 if dirpath in replacements: |
509 return replacements[dirpath] |
510 head, _ = os.path.split(dirpath) |
24306
6ddc86eedc3b
style: kill ersatz if-else ternary operators
Jordi Gutiérrez Hermoso <jordigh@octave.org>
parents:
23778
diff
changeset
|
511 if head: |
512 head = rename(head) |
513 else: |
514 head = '' |
29216
ead25aa27a43
py3: convert to next() function
timeless <timeless@mozdev.org>
parents:
29205
diff
changeset
|
515 renamed = os.path.join(head, next(wordgen)) |
22708
4c66e70c3488
contrib/synthrepo: generate initial repo contents using directory shape model
Mike Edgar <adgar@google.com>
parents:
22473
diff
changeset
|
516 replacements[dirpath] = renamed |
517 return renamed |
518 result = [] |
519 for dirpath, count in dirs.iteritems(): |
520 result.append([rename(dirpath.lstrip(os.sep)), count]) |
521 return result |
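The memoised renaming that the `rename` docstring describes can be exercised on a small model. What follows is a Python 3 sketch of `renamedirs` under a few assumptions: `dict.items()` stands in for `iteritems()`, `posixpath` replaces `os.path` so separators are always `/`, and the sample directory names and word list are invented for illustration.

```python
import itertools
import posixpath


def renamedirs(dirs, words):
    """Rename every directory in a {dirpath: filecount} model."""
    wordgen = itertools.cycle(words)
    replacements = {'': ''}  # cache: original prefix -> renamed prefix

    def rename(dirpath):
        # Reuse the earlier answer so a shared prefix is renamed only once.
        if dirpath in replacements:
            return replacements[dirpath]
        head, _ = posixpath.split(dirpath)
        head = rename(head) if head else ''
        renamed = posixpath.join(head, next(wordgen))
        replacements[dirpath] = renamed
        return renamed

    return [[rename(dirpath.lstrip('/')), count]
            for dirpath, count in dirs.items()]


words = ['alpha', 'beta', 'gamma', 'delta']
model = {'src': 3, 'src/lib': 5, 'src/lib/util': 1, 'docs': 2}
renamed = renamedirs(model, words)

# Visiting a child before its parent still yields a consistent prefix.
reordered = renamedirs({'src/lib': 5, 'src': 3}, words)
```

Because each prefix is cached the first time it is renamed, `src/lib` always ends up underneath whatever `src` became, regardless of which entry the loop visits first. Note that `itertools.cycle` repeats the word list, so a list shorter than the number of directories can hand the same name to two siblings.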