i18n/posplit
author Matt Harbison <matt_harbison@yahoo.com>
Fri, 29 Dec 2017 23:50:42 -0500
changeset 35518 5880318624c9
parent 29153 90d84e1e427a
child 39296 d0e8933d6dad
permissions -rwxr-xr-x
debugfs: display the tested path and mount point of the filesystem, if known

While implementing win32.getfstype(), I noticed that MSYS path mangling is getting in the way. Given a path \\host\share\dir:

- If strong quoted, hg receives it unchanged, and it works as expected
- If double quoted, it converts to \host\share\dir
- If unquoted, it converts to \hostsharedir

The second and third cases are problematic because those are valid paths relative to the current drive letter, so os.path.realpath() will expand it as such. The net effect is to silently turn a network path test into (typically) a "C:\" test. Additionally, the command hangs after printing out 'symlink: no' for the third case (but is interruptible with Ctrl + C).

This path mangling only comes into play because of the command line arguments; it won't affect internally obtained paths. Therefore, the simplest thing to do is to provide feedback on what the command is acting on. I also added the mount point, because Windows supports nesting [1] volumes (see the examples in "Junction Points and Mounted Folders"), and it was a useful diagnostic for figuring out why the wrong filesystem was printed out in the cases above.

I opted not to call os.path.realpath() on the path argument, to make it clearer that the mangling isn't being done by Mercurial.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/aa364996(v=vs.85).aspx

#!/usr/bin/env python
#
# posplit - split messages into paragraphs in .po/.pot files
#
# license: MIT/X11/Expat
#

from __future__ import absolute_import, print_function

import polib
import re
import sys

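def addentry(po, entry, cache):
    # merge occurrences into an already-seen entry with the same msgid,
    # otherwise append the entry and remember it in the cache
    e = cache.get(entry.msgid)
    if e:
        e.occurrences.extend(entry.occurrences)
    else:
        po.append(entry)
        cache[entry.msgid] = entry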

def mkentry(orig, delta, msgid, msgstr):
    # copy orig, substitute the paragraph's text, and shift every
    # occurrence's line number by delta lines
    entry = polib.POEntry()
    entry.merge(orig)
    entry.msgid = msgid or orig.msgid
    entry.msgstr = msgstr or orig.msgstr
    entry.occurrences = [(path, int(line) + delta)
                         for (path, line) in orig.occurrences]
    return entry

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit('usage: posplit FILE')
    po = polib.pofile(sys.argv[1])

    cache = {}
    entries = po[:]
    po[:] = []
    # matches a reStructuredText directive such as ".. note::"
    findd = re.compile(r' *\.\. (\w+)::')
    for entry in entries:
        msgids = entry.msgid.split(u'\n\n')
        if entry.msgstr:
            msgstrs = entry.msgstr.split(u'\n\n')
        else:
            msgstrs = [u''] * len(msgids)

        if len(msgids) != len(msgstrs):
            # paragraph counts differ: reuse the whole existing
            # translation as a fuzzy translation for each paragraph,
            # so the translator can recover parts of it; deleting
            # the extra paragraphs is probably easier than
            # retranslating everything from scratch
            if 'fuzzy' not in entry.flags:
                entry.flags.append('fuzzy')
            msgstrs = [entry.msgstr] * len(msgids)

        delta = 0
        for msgid, msgstr in zip(msgids, msgstrs):
            if msgid and msgid != '::':
                newentry = mkentry(entry, delta, msgid, msgstr)
                mdirective = findd.match(msgid)
                if mdirective:
                    if not msgid[mdirective.end():].rstrip():
                        # only directive, nothing to translate here
                        delta += 2
                        continue
                    directive = mdirective.group(1)
                    if directive in ('container', 'include'):
                        if msgid.rstrip('\n').count('\n') == 0:
                            # only rst syntax, nothing to translate
                            delta += 2
                            continue
                        else:
                            # lines following directly, unexpected
                            print('Warning: text follows line with directive'
                                  ' %s' % directive)
                    comment = 'do not translate: .. %s::' % directive
                    if not newentry.comment:
                        newentry.comment = comment
                    elif comment not in newentry.comment:
                        newentry.comment += '\n' + comment
                addentry(po, newentry, cache)
            # skip this paragraph's lines plus the blank separator line
            delta += 2 + msgid.count('\n')
    po.save()
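
The line-offset bookkeeping in the main loop can be tried without polib. The following is a minimal sketch, assuming a message whose paragraphs are separated by exactly one blank line. `split_paragraphs` and its `first_line` parameter are hypothetical names introduced for illustration and are not part of posplit; directive handling and entry merging are left out.

```python
def split_paragraphs(msgid, first_line):
    """Return (line, paragraph) pairs for each translatable paragraph.

    Hypothetical helper, not part of posplit: it mirrors only the
    delta bookkeeping of the main loop.
    """
    result = []
    delta = 0
    for para in msgid.split('\n\n'):
        # skip empty paragraphs and bare literal-block markers
        if para and para != '::':
            result.append((first_line + delta, para))
        # advance past the paragraph's own lines and the blank separator
        delta += 2 + para.count('\n')
    return result
```

For a message starting at line 10, `split_paragraphs('a\nb\n\nc', 10)` returns `[(10, 'a\nb'), (13, 'c')]`: the first paragraph occupies lines 10 and 11, line 12 is the blank separator, and the second paragraph starts at line 13.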