view i18n/posplit @ 17758:5863f0e4cd3a

histedit: replace various nodes lists with replacement graph (and issue3582)

This changeset rewrites the change tracking logic of histedit to record every operation it performs. Each tracked operation maps the full list of "old" nodes that will eventually be removed to the list of new nodes that replace them. Operations on temporary nodes are tracked too. Dropped changesets are also recorded, as an "old" node replaced by nothing. This logic is similar to that of obsolescence markers and will be used for that purpose in a later commit.

The new logic requires a large number of changes in the histedit code base. histedit action functions now always return a tuple of (new-ctx, [list of rewriting operations]). The old `created`, `replaced` and `tmpnodes` lists are no longer returned and stored during histedit operation. When such information is needed, it is computed from the replacement graph by the `processreplacement` function. The `replacemap` is also dropped; it is computed from the graph at the end of the command. The `bootstrapcontinue` methods are altered to compute this different kind of information. This new mechanism requires much less information to be written to disk.

Note: this change allows more accurate bookmark movement. Bookmarks on dropped changesets are now moved to their parent (or the replacement of their parent) instead of their children. This fixes issue3582.
author Pierre-Yves David <pierre-yves.david@ens-lyon.org>
date Thu, 11 Oct 2012 08:36:50 +0200
parents 4fd49329a1b5
children ff6ab0b2ebf7

#!/usr/bin/env python
#
# posplit - split messages into paragraphs in .po/.pot files
#
# license: MIT/X11/Expat
#

import sys
import polib

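def addentry(po, entry, cache):
    # entries sharing a msgid are merged: the first one seen is kept
    # in the file and later duplicates only add their occurrences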
    e = cache.get(entry.msgid)
    if e:
        e.occurrences.extend(entry.occurrences)
    else:
        po.append(entry)
        cache[entry.msgid] = entry

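def mkentry(orig, delta, msgid, msgstr):
    # build the entry for one paragraph: start from the metadata of
    # orig, override msgid/msgstr when given, and shift the line
    # number of every occurrence by delta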
    entry = polib.POEntry()
    entry.merge(orig)
    entry.msgid = msgid or orig.msgid
    entry.msgstr = msgstr or orig.msgstr
    entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
    return entry

if __name__ == "__main__":
    po = polib.pofile(sys.argv[1])

    cache = {}
    entries = po[:]
    po[:] = []
    for entry in entries:
        msgids = entry.msgid.split(u'\n\n')
        if entry.msgstr:
            msgstrs = entry.msgstr.split(u'\n\n')
        else:
            msgstrs = [u''] * len(msgids)

        if len(msgids) != len(msgstrs):
            # places the whole existing translation as a fuzzy
            # translation for each paragraph, to give the
            # translator a chance to recover part of the old
            # translation - erasing extra paragraphs is
            # probably better than retranslating all from start
            if 'fuzzy' not in entry.flags:
                entry.flags.append('fuzzy')
            msgstrs = [entry.msgstr] * len(msgids)

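        # delta is the line offset of the current paragraph inside
        # the original message: a paragraph spans count('\n') + 1
        # lines and is followed by one blank separator line
        delta = 0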
        for msgid, msgstr in zip(msgids, msgstrs):
            if msgid:
                newentry = mkentry(entry, delta, msgid, msgstr)
                addentry(po, newentry, cache)
            delta += 2 + msgid.count('\n')
    po.save()
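
The paragraph splitting and line-offset bookkeeping in the main loop can be sketched without polib. This is only an illustration under stated assumptions: `split_paragraphs` is a hypothetical helper that is not part of posplit, it works on plain strings and a single integer line number instead of `POEntry` objects, and it leaves out the `msgstr or orig.msgstr` fallback that `mkentry` applies.

```python
def split_paragraphs(msgid, msgstr, lineno):
    """Hypothetical helper mirroring the posplit main loop on plain strings.

    Returns a list of (msgid, msgstr, lineno, fuzzy) tuples, one per
    non-empty paragraph of msgid.
    """
    msgids = msgid.split('\n\n')
    if msgstr:
        msgstrs = msgstr.split('\n\n')
    else:
        msgstrs = [''] * len(msgids)

    # When the paragraph counts differ, every paragraph gets the whole
    # old translation and is marked fuzzy, as in the script above.
    fuzzy = len(msgids) != len(msgstrs)
    if fuzzy:
        msgstrs = [msgstr] * len(msgids)

    result = []
    delta = 0
    for mid, mstr in zip(msgids, msgstrs):
        if mid:
            result.append((mid, mstr, lineno + delta, fuzzy))
        # lines in this paragraph (count + 1) plus one blank separator
        delta += 2 + mid.count('\n')
    return result


matched = split_paragraphs("first line\nsecond line\n\nnext paragraph",
                           "uno\ndos\n\ntres", 10)
mismatched = split_paragraphs("a\n\nb", "x", 1)
```

Here `matched` holds two entries, at lines 10 and 13, each paired with its own translated paragraph. `mismatched` holds two fuzzy entries that both carry the whole old translation `"x"`, so a translator can trim it instead of starting over.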