view i18n/posplit @ 42001:624d6683c705
branchmap: remove the dict interface from the branchcache class (API)
The current branchmap computation reads the whole branchmap from disk and
validates all the nodes, even those that are not required. This takes a lot of
time on repos which have a large branchmap or many branches. On large repos,
this can mean validating thousands of nodes.
Some operations, like finding whether a branch exists or updating heads for a
single branch, don't need to validate all the nodes.
Before this patch, the branchcache class had a dict interface, which made it
hard to keep track of reads.
This patch removes the dict interface. Upcoming patches will implement lazy
loading and validation of data and implement better APIs.
Differential Revision: https://phab.mercurial-scm.org/D6151
author | Pulkit Goyal <pulkit@yandex-team.ru> |
---|---|
date | Mon, 18 Mar 2019 18:59:38 +0300 |
parents | aaad36b88298 |
children | 47ef023d0165 |
#!/usr/bin/env python
#
# posplit - split messages in paragraphs on .po/.pot files
#
# license: MIT/X11/Expat
#

from __future__ import absolute_import, print_function

import polib
import re
import sys

def addentry(po, entry, cache):
    e = cache.get(entry.msgid)
    if e:
        e.occurrences.extend(entry.occurrences)

        # merge comments from entry
        for comment in entry.comment.split('\n'):
            if comment and comment not in e.comment:
                if not e.comment:
                    e.comment = comment
                else:
                    e.comment += '\n' + comment
    else:
        po.append(entry)
        cache[entry.msgid] = entry

def mkentry(orig, delta, msgid, msgstr):
    entry = polib.POEntry()
    entry.merge(orig)
    entry.msgid = msgid or orig.msgid
    entry.msgstr = msgstr or orig.msgstr
    entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
    return entry

if __name__ == "__main__":
    po = polib.pofile(sys.argv[1])

    cache = {}
    entries = po[:]
    po[:] = []
    findd = re.compile(r' *\.\. (\w+)::') # for finding directives
    for entry in entries:
        msgids = entry.msgid.split(u'\n\n')
        if entry.msgstr:
            msgstrs = entry.msgstr.split(u'\n\n')
        else:
            msgstrs = [u''] * len(msgids)

        if len(msgids) != len(msgstrs):
            # places the whole existing translation as a fuzzy
            # translation for each paragraph, to give the
            # translator a chance to recover part of the old
            # translation - erasing extra paragraphs is
            # probably better than retranslating all from start
            if 'fuzzy' not in entry.flags:
                entry.flags.append('fuzzy')
            msgstrs = [entry.msgstr] * len(msgids)

        delta = 0
        for msgid, msgstr in zip(msgids, msgstrs):
            if msgid and msgid != '::':
                newentry = mkentry(entry, delta, msgid, msgstr)
                mdirective = findd.match(msgid)
                if mdirective:
                    if not msgid[mdirective.end():].rstrip():
                        # only directive, nothing to translate here
                        delta += 2
                        continue
                    directive = mdirective.group(1)
                    if directive in ('container', 'include'):
                        if msgid.rstrip('\n').count('\n') == 0:
                            # only rst syntax, nothing to translate
                            delta += 2
                            continue
                        else:
                            # lines following directly, unexpected
                            print('Warning: text follows line with directive'
                                  ' %s' % directive)
                    comment = 'do not translate: .. %s::' % directive
                    if not newentry.comment:
                        newentry.comment = comment
                    elif comment not in newentry.comment:
                        newentry.comment += '\n' + comment
                addentry(po, newentry, cache)
            delta += 2 + msgid.count('\n')
    po.save()
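
The line-offset bookkeeping in the main loop is easier to follow in isolation. The sketch below is a simplified illustration and not part of posplit: it drops the polib dependency, and `split_paragraphs` and `first_line` are hypothetical names introduced here. It reuses the directive pattern from the script and only skips paragraphs that consist of a bare directive; the special handling of `container` and `include` and the "do not translate" comments are left out.

```python
import re

# Same pattern as the script's `findd`: matches an rst directive like ".. note::".
DIRECTIVE = re.compile(r' *\.\. (\w+)::')


def split_paragraphs(msgid, first_line):
    """Return (paragraph, line) pairs for each translatable paragraph.

    Hypothetical helper for illustration only; `first_line` is the line
    number at which the unsplit message starts.
    """
    result = []
    delta = 0  # line offset of the current paragraph within the message
    for para in msgid.split('\n\n'):
        if para and para != '::':
            m = DIRECTIVE.match(para)
            # A paragraph holding nothing but a directive has no text to translate.
            only_directive = m is not None and not para[m.end():].rstrip()
            if not only_directive:
                result.append((para, first_line + delta))
        # Advance past the paragraph's own lines plus the blank separator line.
        delta += 2 + para.count('\n')
    return result


pairs = split_paragraphs('first\nparagraph\n\n.. note::\n\nsecond', 10)
```

Here the first paragraph keeps the original line number, the bare `.. note::` paragraph is skipped but still advances the offset, and the last paragraph is reported five lines further down.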