strip: make --keep option not set all dirstate times to 0
hg strip -k was using dirstate.rebuild() which reset all the dirstate
entries timestamps to 0. This meant that the next time hg status was
run every file was considered to be 'unsure', which caused it to do
expensive read operations on every filelog. On a repo with >150,000
files it took 70 seconds when everything was in memory. From a cold
cache it took several minutes.
The fix is to only reset files that have changed between the working
context and the destination context.
For reference, --keep means the working directory is left alone during
the strip. We have users wanting to use this operation to store their
work-in-progress as a commit on a branch while they go work on another
branch, then come back later and be able to uncommit that work and
continue working. They currently use 'git reset HARD^' to accomplish
this in git.
#!/usr/bin/env python
#
# posplit - split messages in paragraphs on .po/.pot files
#
# license: MIT/X11/Expat
#
import sys
import polib
def addentry(po, entry, cache):
e = cache.get(entry.msgid)
if e:
e.occurrences.extend(entry.occurrences)
else:
po.append(entry)
cache[entry.msgid] = entry
def mkentry(orig, delta, msgid, msgstr):
entry = polib.POEntry()
entry.merge(orig)
entry.msgid = msgid or orig.msgid
entry.msgstr = msgstr or orig.msgstr
entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
return entry
if __name__ == "__main__":
po = polib.pofile(sys.argv[1])
cache = {}
entries = po[:]
po[:] = []
for entry in entries:
msgids = entry.msgid.split(u'\n\n')
if entry.msgstr:
msgstrs = entry.msgstr.split(u'\n\n')
else:
msgstrs = [u''] * len(msgids)
if len(msgids) != len(msgstrs):
# places the whole existing translation as a fuzzy
# translation for each paragraph, to give the
# translator a chance to recover part of the old
# translation - erasing extra paragraphs is
# probably better than retranslating all from start
if 'fuzzy' not in entry.flags:
entry.flags.append('fuzzy')
msgstrs = [entry.msgstr] * len(msgids)
delta = 0
for msgid, msgstr in zip(msgids, msgstrs):
if msgid:
newentry = mkentry(entry, delta, msgid, msgstr)
addentry(po, newentry, cache)
delta += 2 + msgid.count('\n')
po.save()