view i18n/posplit @ 47315:825d5a5907b4

exewrapper: avoid directly linking against python3X.dll Subsequent code calls `LoadLibrary()` to attempt to load the DLL, but because of this symbol reference, there is an attempt to load the DLL used during the build prior to `_main()` running. This causes the whole process to fail if the DLL isn't in the standard search path. That also means it will never load the DLL for HackableMercurial. (Maybe we should get rid of that for py3, since you can install python for a user without admin rights?) This could also be resolved by calling `GetProcAddress()` on the symbol and dereferencing it, but using the environment variable is consistent with the *.bat file since fc8a5c9ecee0. (The environment variable persists after the interpreter is initialized.) Far more concerning is somehow I've gotten my system into a state where setting the flag causes any output to the pager to be lost (as if it wasn't set at all) in MSYS, cmd.exe, WSL, and PowerShell using py3.9.0, but the environment variable works properly. I'm sure this flag worked on some versions of py3, so I'm not sure what's going on here. This is might be related to init config related changes in 3.8[1], since it works with 3.7.8, but fails with 3.8.1. Somebody who understands encoding issues better than I do should give some thought to if we need to make some changes to our encoding strategy on Windows with py3. With or without the flag/envvar, there is proper output if the command is directly paged by piping to `more.com` (in any environment) or `less` (in MSYS and WSL), or if paging is disabled with `--pager=no`. Legacy mode is required though when Mercurial decides to spin up a pager. [1] https://bugs.python.org/issue41941 Differential Revision: https://phab.mercurial-scm.org/D10756
author Matt Harbison <matt_harbison@yahoo.com>
date Tue, 11 May 2021 01:05:38 -0400
parents c102b704edb5
children 6000f5b25c9b
line wrap: on
line source

#!/usr/bin/env python3
#
# posplit - split messages in paragraphs on .po/.pot files
#
# license: MIT/X11/Expat
#

from __future__ import absolute_import, print_function

import polib
import re
import sys


def addentry(po, entry, cache):
    e = cache.get(entry.msgid)
    if e:
        e.occurrences.extend(entry.occurrences)

        # merge comments from entry
        for comment in entry.comment.split('\n'):
            if comment and comment not in e.comment:
                if not e.comment:
                    e.comment = comment
                else:
                    e.comment += '\n' + comment
    else:
        po.append(entry)
        cache[entry.msgid] = entry


def mkentry(orig, delta, msgid, msgstr):
    entry = polib.POEntry()
    entry.merge(orig)
    entry.msgid = msgid or orig.msgid
    entry.msgstr = msgstr or orig.msgstr
    entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences]
    return entry


if __name__ == "__main__":
    po = polib.pofile(sys.argv[1])

    cache = {}
    entries = po[:]
    po[:] = []
    findd = re.compile(r' *\.\. (\w+)::')  # for finding directives
    for entry in entries:
        msgids = entry.msgid.split(u'\n\n')
        if entry.msgstr:
            msgstrs = entry.msgstr.split(u'\n\n')
        else:
            msgstrs = [u''] * len(msgids)

        if len(msgids) != len(msgstrs):
            # places the whole existing translation as a fuzzy
            # translation for each paragraph, to give the
            # translator a chance to recover part of the old
            # translation - erasing extra paragraphs is
            # probably better than retranslating all from start
            if 'fuzzy' not in entry.flags:
                entry.flags.append('fuzzy')
            msgstrs = [entry.msgstr] * len(msgids)

        delta = 0
        for msgid, msgstr in zip(msgids, msgstrs):
            if msgid and msgid != '::':
                newentry = mkentry(entry, delta, msgid, msgstr)
                mdirective = findd.match(msgid)
                if mdirective:
                    if not msgid[mdirective.end() :].rstrip():
                        # only directive, nothing to translate here
                        delta += 2
                        continue
                    directive = mdirective.group(1)
                    if directive in ('container', 'include'):
                        if msgid.rstrip('\n').count('\n') == 0:
                            # only rst syntax, nothing to translate
                            delta += 2
                            continue
                        else:
                            # lines following directly, unexpected
                            print(
                                'Warning: text follows line with directive'
                                ' %s' % directive
                            )
                    comment = 'do not translate: .. %s::' % directive
                    if not newentry.comment:
                        newentry.comment = comment
                    elif comment not in newentry.comment:
                        newentry.comment += '\n' + comment
                addentry(po, newentry, cache)
            delta += 2 + msgid.count('\n')
    po.save()