Mercurial > hg-stable
view contrib/pylintrc @ 49773:523cacdfd324
delta-find: set the default candidate chunk size to 10
I ran performance and storage tests on repositories of various sizes and shapes
for the following values of the config : 5, 10, 20, 50, 100, no-chunking
The performance tests do not show any statistical impact on computation
times for large pushes and pulls.
For searching for an individual delta, this can provide a significant
performance improvement with a minor degradation of space-quality on the
result. (see data at the end of the commit).
For overall store size, the change :
- does not have any impact on many small repositories,
- has an observable, but very negligible impact on most larger repositories.
- One private repository we use for testing sees a small increase in size
(1%) in the narrower version.
We will try to get more numbers on a larger version of that repository to
make sure nothing pathological happens.
We pick "10" as the limit as "5" seems a bit more risky.
There are room to improve the current code, by using more aggressive filtering
and better (i.e any) sorting of the candidates. However this is already a large
improvement for pathological cases, with little impact in the common
situations.
The initial motivation for this change is to fix performance of delta
computation for a file where the previous code ended up testing 20 000 possible
candidate-bases in one go, which is… slow. This affected about ½ of the file
revisions leading to atrocious performance, especially during some push/pull
operations.
Details about individual delta finding timing:
----------------------------------------------
The vast majority of benchmark cases are unchanged but the three below. The first
two do not see any impact on the final delta. The last one sees a change in
delta-size that is negligible compared to the full text size.
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# benchmark.variants.rev = manifest-snapshot-many-tries-a (revision 756096)
∞: 5.844783
5: 4.473523 (-23.46%)
10: 4.970053 (-14.97%)
20: 5.770386 (-1.27%)
50 5.821358
100: 5.834887
MANIFESTLOG: rev = 756096: (no-limit)
delta-base = 301840
search-rounds = 6
try-count = 60
delta-type = snapshot
snap-depth = 7
delta-size = 179
MANIFESTLOG: rev=756096: (limit = 10)
delta-base=301840
search-rounds=9
try-count=51
delta-type=snapshot
snap-depth=7
delta-size=179
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# benchmark.variants.rev = manifest-snapshot-many-tries-d (revision 754060)
∞: 5.017663
5: 3.655931 (-27.14%)
10: 4.095436 (-18.38%)
20: 4.828949 (-3.76%)
50 4.987574
100: 4.994889
MANIFESTLOG: rev=754060: (no limit)
delta-base=301840
search-rounds=5
try-count=53
delta-type=snapshot
snap-depth=7
delta-size = 179
MANIFESTLOG: rev=754060: (limite = 10)
delta-base=301840
search-rounds=8
try-count=45
delta-type=snapshot
snap-depth=7
delta-size = 179
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# bin-env-vars.hg.flavor = rust
# benchmark.variants.rev = manifest-snapshot-many-tries-e (revision 693368)
∞: 4.869282
5: 2.039732 (-58.11%)
10: 2.413537 (-50.43%)
20: 4.449639 (-8.62%)
50 4.865863
100: 4.882649
MANIFESTLOG: rev=693368:
delta-base=693336
search-rounds=6
try-count=53
delta-type=snapshot
snap-depth=6
full-test-size=131065
delta-size=199
MANIFESTLOG: rev=693368:
delta-base=278023
search-rounds=5
try-count=21
delta-type=snapshot
snap-depth=4
full-test-size=131065
delta-size=278
Raw data for store size (in bytes) for various chunk size value below:
----------------------------------------------------------------------
440 134 384 5 pypy/.hg/store/
440 134 384 10 pypy/.hg/store/
440 134 384 20 pypy/.hg/store/
440 134 384 50 pypy/.hg/store/
440 134 384 100 pypy/.hg/store/
440 134 384 ... pypy/.hg/store/
666 987 471 5 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 10 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 20 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 50 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 100 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 ... netbsd-xsrc-2022-11-15/.hg/store/
852 844 884 5 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 10 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 20 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 50 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 100 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 ... netbsd-pkgsrc-2022-11-15/.hg/store/
1 504 227 981 5 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 871 10 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 20 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 50 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 100 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 ... netbeans-2018-08-01-sparse-zstd/.hg/store/
3 875 801 068 5 netbsd-src-2022-11-15/.hg/store/
3 875 696 767 10 netbsd-src-2022-11-15/.hg/store/
3 875 696 757 20 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 50 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 100 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 ... netbsd-src-2022-11-15/.hg/store/
4 531 441 314 5 mozilla-central/.hg/store/
4 531 435 157 10 mozilla-central/.hg/store/
4 531 432 045 20 mozilla-central/.hg/store/
4 531 429 119 50 mozilla-central/.hg/store/
4 531 429 119 100 mozilla-central/.hg/store/
4 531 429 119 ... mozilla-central/.hg/store/
4 875 861 390 5 mozilla-unified/.hg/store/
4 875 855 155 10 mozilla-unified/.hg/store/
4 875 852 027 20 mozilla-unified/.hg/store/
4 875 848 851 50 mozilla-unified/.hg/store/
4 875 848 851 100 mozilla-unified/.hg/store/
4 875 848 851 ... mozilla-unified/.hg/store/
11 498 764 601 5 mozilla-try/.hg/store/
11 497 968 858 10 mozilla-try/.hg/store/
11 497 958 730 20 mozilla-try/.hg/store/
11 497 927 156 50 mozilla-try/.hg/store/
11 497 925 963 100 mozilla-try/.hg/store/
11 497 923 428 ... mozilla-try/.hg/store/
10 047 914 031 5 private-repo
9 969 132 101 10 private-repo
9 944 745 015 20 private-repo
9 939 756 703 50 private-repo
9 939 833 016 100 private-repo
9 939 822 035 ... private-repo
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Wed, 23 Nov 2022 19:08:27 +0100 |
parents | 86531a7038ed |
children |
line wrap: on
line source
# lint Python modules using external checkers. # # This is the main checker controlling the other ones and the reports # generation. It is itself both a raw checker and an astng checker in order # to: # * handle message activation / deactivation at the module level # * handle some basic but necessary stats'data (number of classes, methods...) # [MASTER] # Specify a configuration file. #rcfile= # Python code to execute, usually for sys.path manipulation such as # pygtk.require(). #init-hook= # Profiled execution. profile=no # Add <file or directory> to the black list. It should be a base name, not a # path. You may set this option multiple times. ignore=CVS # Pickle collected data for later comparisons. persistent=yes # Set the cache size for astng objects. cache-size=500 # List of plugins (as comma separated values of python modules names) to load, # usually to register additional checkers. load-plugins= [MESSAGES CONTROL] # Enable only checker(s) with the given id(s). This option conflicts with the # disable-checker option #enable-checker= # Enable all checker(s) except those with the given id(s). This option # conflicts with the enable-checker option #disable-checker= # Enable all messages in the listed categories (IRCWEF). #enable-msg-cat= # Disable all messages in the listed categories (IRCWEF). disable-msg-cat=I # Enable the message(s) with the given id(s). #enable-msg= # Disable the message(s) with the given id(s). # W0704: except: pass # C0111: missing docstring # W0403: for the time being absolute imports don't play nice with demandimport disable-msg=W0704,C0111,W0403 [REPORTS] # Set the output format. Available formats are text, parseable, colorized, msvs # (visual studio) and html output-format=text # Include message's id in output include-ids=yes # Put messages in a separate file for each module / package specified on the # command line instead of printing them on stdout. Reports (if any) will be # written in a file name "pylint_global.[txt|html]". files-output=no # Tells whether to display a full report or only the messages reports=yes # Python expression which should return a note less than 10 (10 is the highest # note). You have access to the variables errors warning, statement which # respectively contain the number of errors / warnings messages and the total # number of statements analyzed. This is used by the global evaluation report # (R0004). evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10) # Add a comment according to your evaluation note. This is used by the global # evaluation report (R0004). comment=no # Enable the report(s) with the given id(s). #enable-report= # Disable the report(s) with the given id(s). #disable-report= # try to find bugs in the code using type inference # [TYPECHECK] # Tells whether missing members accessed in mixin class should be ignored. A # mixin class is detected if its name ends with "mixin" (case insensitive). ignore-mixin-members=yes # List of classes names for which member attributes should not be checked # (useful for classes with attributes dynamically set). ignored-classes=SQLObject # When zope mode is activated, add a predefined set of Zope acquired attributes # to generated-members. zope=no # List of members which are set dynamically and missed by pylint inference # system, and so shouldn't trigger E0201 when accessed. generated-members=REQUEST,acl_users,aq_parent # checks for # * unused variables / imports # * undefined variables # * redefinition of variable from builtins or from an outer scope # * use of variable before assignment # [VARIABLES] # Tells whether we should check for unused import in __init__ files. init-import=yes # A regular expression matching names used for dummy variables (i.e. not used). dummy-variables-rgx=dummy # List of additional names supposed to be defined in builtins. Remember that # you should avoid to define new builtins when possible. additional-builtins= # checks for : # * doc strings # * modules / classes / functions / methods / arguments / variables name # * number of arguments, local variables, branches, returns and statements in # functions, methods # * required module attributes # * dangerous default values as arguments # * redefinition of function / method / class # * uses of the global statement # [BASIC] # Required attributes for module, separated by a comma required-attributes= # Regular expression which should only match functions or classes name which do # not require a docstring no-docstring-rgx=__.*__ # Regular expression which should only match correct module names module-rgx=(([a-z_][a-z0-9_]*)|([A-Z][a-zA-Z0-9]+))$ # Regular expression which should only match correct module level names const-rgx=(([a-zA-Z_][a-zA-Z0-9_]*)|(__.*__))$ # Regular expression which should only match correct class names class-rgx=[a-zA-Z_][a-zA-Z0-9]+$ # Regular expression which should only match correct function names function-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct method names method-rgx=[a-z_][a-z0-9_]{2,30}$ # Regular expression which should only match correct instance attribute names attr-rgx=[a-z_][a-z0-9_]{1,30}$ # Regular expression which should only match correct argument names argument-rgx=[a-z_][a-z0-9_]{0,30}$ # Regular expression which should only match correct variable names variable-rgx=[a-z_][a-z0-9_]{0,30}$ # Regular expression which should only match correct list comprehension / # generator expression variable names inlinevar-rgx=[A-Za-z_][A-Za-z0-9_]*$ # Good variable names which should always be accepted, separated by a comma good-names=i,j,k,ex,Run,_,ui,c,fn,f,fd,l # Bad variable names which should always be refused, separated by a comma bad-names=foo,bar,baz,toto,tutu,tata # List of builtins function names that should not be used, separated by a comma #bad-functions=map,filter,apply,input bad-functions=map,filter,apply,input # checks for # * external modules dependencies # * relative / wildcard imports # * cyclic imports # * uses of deprecated modules # [IMPORTS] # Deprecated modules which should not be used, separated by a comma deprecated-modules=regsub,TERMIOS,Bastion,rexec # Create a graph of every (i.e. internal and external) dependencies in the # given file (report R0402 must not be disabled) import-graph= # Create a graph of external dependencies in the given file (report R0402 must # not be disabled) ext-import-graph= # Create a graph of internal dependencies in the given file (report R0402 must # not be disabled) int-import-graph= # checks for sign of poor/misdesign: # * number of methods, attributes, local variables... # * size, complexity of functions, methods # [DESIGN] # Maximum number of arguments for function / method max-args=5 # Maximum number of locals for function / method body max-locals=15 # Maximum number of return / yield for function / method body max-returns=6 # Maximum number of branch for function / method body max-branchs=12 # Maximum number of statements in function / method body max-statements=50 # Maximum number of parents for a class (see R0901). max-parents=7 # Maximum number of attributes for a class (see R0902). max-attributes=7 # Minimum number of public methods for a class (see R0903). min-public-methods=2 # Maximum number of public methods for a class (see R0904). max-public-methods=20 # checks for : # * methods without self as first argument # * overridden methods signature # * access only to existent members via self # * attributes not defined in the __init__ method # * supported interfaces implementation # * unreachable code # [CLASSES] # List of interface methods to ignore, separated by a comma. This is used for # instance to not check methods defines in Zope's Interface base class. ignore-iface-methods=isImplementedBy,deferred,extends,names,namesAndDescriptions,queryDescriptionFor,getBases,getDescriptionFor,getDoc,getName,getTaggedValue,getTaggedValueTags,isEqualOrExtendedBy,setTaggedValue,isImplementedByInstancesOf,adaptWith,is_implemented_by # List of method names used to declare (i.e. assign) instance attributes. defining-attr-methods=__init__,__new__,setUp # checks for : # * unauthorized constructions # * strict indentation # * line length # * use of <> instead of != # [FORMAT] # Maximum number of characters on a single line. max-line-length=80 # Maximum number of lines in a module max-module-lines=1000 # String used as indentation unit. This is usually " " (4 spaces) or "\t" (1 # tab). indent-string=' ' # checks for: # * warning notes in the code like FIXME, XXX # * PEP 263: source code with non ascii character but no encoding declaration # [MISCELLANEOUS] # List of note tags to take in consideration, separated by a comma. notes=FIXME,XXX,TODO # checks for similarities and duplicated code. This computation may be # memory / CPU intensive, so you should disable it if you experiments some # problems. # [SIMILARITIES] # Minimum lines number of a similarity. min-similarity-lines=4 # Ignore comments when computing similarities. ignore-comments=yes # Ignore docstrings when computing similarities. ignore-docstrings=yes