largefiles: optimize status when files are specified (issue3144)
This fixes a performance issue with 'hg status' when files are specified
on the command-line. Previously, a large amount of largefiles code was
executed, even if files were specified on the command-line and those files
were not largefiles. This patch fixes the problem by first checking if
non-largefiles were specified on the command-line and, just letting the
normal status function handle the case if they were.
On a brand new machine, the execution time for 'hg status filename' on
a repository with largefiles was:
real 0m0.636s
user 0m0.512s
sys 0m0.120s
versus the following (the same repository, with largefiles disabled):
real 0m0.215s
user 0m0.180s
sys 0m0.032s
After this patch, the performance of 'hg status filename' on the same
repository, with largefiles enabled is:
real 0m0.228s
user 0m0.189s
sys 0m0.036s
This performance boost is also true when patterns (rather than specific
files) are specified on the command-line.
In the case where patterns are specified in addition to a file list, we
just defer to the normal codepath in order to not spend extra time
expanding the patterns to just risk having to expand them again later.
#!/usr/bin/env python
#
# posplit - split messages in paragraphs on .po/.pot files
#
# license: MIT/X11/Expat
#
import sys
import polib
def addentry(po, entry, cache):
e = cache.get(entry.msgid)
if e:
e.occurrences.extend(entry.occurrences)
else:
po.append(entry)
cache[entry.msgid] = entry
def mkentry(orig, delta, msgid, msgstr):
entry = polib.POEntry()
entry.merge(orig)
entry.msgid = msgid or orig.msgid
entry.msgstr = msgstr or orig.msgstr
entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
return entry
if __name__ == "__main__":
po = polib.pofile(sys.argv[1])
cache = {}
entries = po[:]
po[:] = []
for entry in entries:
msgids = entry.msgid.split(u'\n\n')
if entry.msgstr:
msgstrs = entry.msgstr.split(u'\n\n')
else:
msgstrs = [u''] * len(msgids)
if len(msgids) != len(msgstrs):
# places the whole existing translation as a fuzzy
# translation for each paragraph, to give the
# translator a chance to recover part of the old
# translation - erasing extra paragraphs is
# probably better than retranslating all from start
if 'fuzzy' not in entry.flags:
entry.flags.append('fuzzy')
msgstrs = [entry.msgstr] * len(msgids)
delta = 0
for msgid, msgstr in zip(msgids, msgstrs):
if msgid:
newentry = mkentry(entry, delta, msgid, msgstr)
addentry(po, newentry, cache)
delta += 2 + msgid.count('\n')
po.save()