revset: make use of natively-computed set for 'draft()' and 'secret()'
If the computation of a set for each phase (done in C) is available,
we use it directly instead of applying a simple filter. This give a
massive speed-up in the vast majority of cases.
On my mercurial repo with about 15000 out of 40000 draft changesets:
revset: draft()
plain min first last
0) 0.011201 0.019950 0.009844 0.000074
1) 0.000284 2% 0.000312 1% 0.000314 3% 0.000315 x4.3
Bad performance for "last" come from the handling of the 15000 elements set
(memory allocation, filtering hidden changesets (99% of it) etc. compared to
applying the filter only on a handfuld of revisions (the first draft changesets
being close of tip).
This is not seen as an issue since:
* Timing is still pretty good and in line with all the other one,
* Current user of Vanilla Mercurial will not have 1/3 of their repo draft,
This bad effect disappears when phase's set is smaller. (about 200 secrets):
revset: secret()
plain min first last
0) 0.011181 0.022228 0.010851 0.000452
1) 0.000058 0% 0.000084 0% 0.000087 0% 0.000087 19%
#!/usr/bin/env python
#
# posplit - split messages in paragraphs on .po/.pot files
#
# license: MIT/X11/Expat
#
import re
import sys
import polib
def addentry(po, entry, cache):
e = cache.get(entry.msgid)
if e:
e.occurrences.extend(entry.occurrences)
else:
po.append(entry)
cache[entry.msgid] = entry
def mkentry(orig, delta, msgid, msgstr):
entry = polib.POEntry()
entry.merge(orig)
entry.msgid = msgid or orig.msgid
entry.msgstr = msgstr or orig.msgstr
entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
return entry
if __name__ == "__main__":
po = polib.pofile(sys.argv[1])
cache = {}
entries = po[:]
po[:] = []
findd = re.compile(r' *\.\. (\w+)::') # for finding directives
for entry in entries:
msgids = entry.msgid.split(u'\n\n')
if entry.msgstr:
msgstrs = entry.msgstr.split(u'\n\n')
else:
msgstrs = [u''] * len(msgids)
if len(msgids) != len(msgstrs):
# places the whole existing translation as a fuzzy
# translation for each paragraph, to give the
# translator a chance to recover part of the old
# translation - erasing extra paragraphs is
# probably better than retranslating all from start
if 'fuzzy' not in entry.flags:
entry.flags.append('fuzzy')
msgstrs = [entry.msgstr] * len(msgids)
delta = 0
for msgid, msgstr in zip(msgids, msgstrs):
if msgid and msgid != '::':
newentry = mkentry(entry, delta, msgid, msgstr)
mdirective = findd.match(msgid)
if mdirective:
if not msgid[mdirective.end():].rstrip():
# only directive, nothing to translate here
continue
directive = mdirective.group(1)
if directive in ('container', 'include'):
if msgid.rstrip('\n').count('\n') == 0:
# only rst syntax, nothing to translate
continue
else:
# lines following directly, unexpected
print 'Warning: text follows line with directive' \
' %s' % directive
comment = 'do not translate: .. %s::' % directive
if not newentry.comment:
newentry.comment = comment
elif comment not in newentry.comment:
newentry.comment += '\n' + comment
addentry(po, newentry, cache)
delta += 2 + msgid.count('\n')
po.save()