comparison mercurial/match.py @ 40242:19ed212de2d1

match: optimize matcher when all patterns are of rootfilesin kind

Internally at Google, we use narrowspecs with only rootfilesin-kind patterns. Sometimes there are thousands of such patterns (i.e. thousands of tracked directories). In such cases, it can take quite a long time to build and evaluate the resulting matcher.

This patch optimizes matchers that have only rootfilesin patterns: instead of creating a regular expression, the matcher checks the given file's directory against the set of directories.

In a repo with ~3600 tracked directories, it takes about 1.35 s to build the matcher and 2.7 s to walk the dirstate before this patch. After, it takes 0.04 s to create the matcher and 0.87 s to walk the dirstate.

It may be worthwhile to do similar optimizations for e.g. patterns of type "kind:", but that's not a priority for us right now.

Differential Revision: https://phab.mercurial-scm.org/D5058
author Martin von Zweigbergk <martinvonz@google.com>
date Sat, 13 Oct 2018 00:22:05 -0700
parents 35ecaa999a12
children d30a19d10441
--- mercurial/match.py  40241:81e4f039a0cd
+++ mercurial/match.py  40242:19ed212de2d1
@@ -1162,12 +1162,24 @@
             return False
         matchfuncs.append(matchsubinclude)
 
     regex = ''
     if kindpats:
-        regex, mf = _buildregexmatch(kindpats, globsuffix)
-        matchfuncs.append(mf)
+        if all(k == 'rootfilesin' for k, p, s in kindpats):
+            dirs = {p for k, p, s in kindpats}
+            def mf(f):
+                i = f.rfind('/')
+                if i >= 0:
+                    dir = f[:i]
+                else:
+                    dir = '.'
+                return dir in dirs
+            regex = b'rootfilesin: %s' % sorted(dirs)
+            matchfuncs.append(mf)
+        else:
+            regex, mf = _buildregexmatch(kindpats, globsuffix)
+            matchfuncs.append(mf)
 
     if len(matchfuncs) == 1:
         return regex, matchfuncs[0]
     else:
         return regex, lambda f: any(mf(f) for mf in matchfuncs)
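
The idea behind the fast path can be tried outside Mercurial. The following is a minimal standalone sketch, not the code from this changeset: the helper names `build_rootfilesin_matcher` and `build_regex_matcher` are hypothetical, paths are plain `str` rather than the byte strings Mercurial uses, and the regex variant is a naive stand-in for comparison, not what `_buildregexmatch` actually generates.

```python
import re


def build_rootfilesin_matcher(dirs):
    """Return a predicate that is true for files directly inside one of dirs.

    Hypothetical helper mirroring the set-based check in the diff above.
    """
    dirset = set(dirs)

    def matchfn(path):
        # Everything before the last '/' is the containing directory;
        # files at the repository root are reported as '.'.
        i = path.rfind('/')
        parent = path[:i] if i >= 0 else '.'
        return parent in dirset

    return matchfn


def build_regex_matcher(dirs):
    """Naive regex counterpart, for comparison only (an assumption)."""
    alternatives = []
    for d in dirs:
        prefix = '' if d == '.' else re.escape(d) + '/'
        # One path component after the directory, i.e. a direct child.
        alternatives.append(prefix + '[^/]+')
    pattern = re.compile('(?:%s)$' % '|'.join(alternatives))
    return lambda path: pattern.match(path) is not None


tracked = ['.', 'src', 'src/lib']
fast = build_rootfilesin_matcher(tracked)
slow = build_regex_matcher(tracked)

samples = [
    'README',
    'src/main.py',
    'src/lib/util.py',
    'src/lib/deep/x.py',
    'docs/index.rst',
]
results = {p: fast(p) for p in samples}
```

With the set-based version, each lookup costs one `rfind`, one slice, and one hash probe regardless of how many directories are tracked, whereas the regex alternation grows with the number of patterns. That is consistent with the build and walk times quoted in the commit message, though this sketch does not attempt to reproduce those measurements.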