Mercurial > hg
view i18n/posplit @ 35810:113a30b87716 stable
lazymanifest: avoid reading uninitialized memory
I got errors running tests with clang UBSAN [1] enabled. One of them is:
```
--- test-dirstate.t
+++ test-dirstate.t.err
@@ -85,9 +85,115 @@
$ echo "[extensions]" >> .hg/hgrc
$ echo "dirstateex=../dirstateexception.py" >> .hg/hgrc
$ hg up 0
- abort: simulated error while recording dirstateupdates
- [255]
+ mercurial/cext/manifest.c:781:13: runtime error: load of value 190, which is not a valid value for type 'bool'
+ #0 0x7f668a8cf748 in lazymanifest_diff mercurial/cext/manifest.c:781
+ #1 0x7f6692fc1dc4 in call_function Python-2.7.11/Python/ceval.c:4350
+ .......
+ SUMMARY: UndefinedBehaviorSanitizer: invalid-bool-load mercurial/cext/manifest.c:781:13 in
+ [1]
$ hg log -r . -T '{rev}\n'
1
$ hg status
- ? a
```
While the code is not technically wrong, but switching the condition would
make clang UBSAN happy. So let's do it.
The uninitialized memory could come from, for example, `lazymanifest_copy`
allocates `self->maxlines` items but only writes the first `self->lines`
items.
[1]: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
Test Plan:
Run `test-dirstate.t` with UBSAN and it no longer reports the issue.
Differential Revision: https://phab.mercurial-scm.org/D1948
author | Jun Wu <quark@fb.com> |
---|---|
date | Tue, 30 Jan 2018 20:32:48 -0800 |
parents | 90d84e1e427a |
children | d0e8933d6dad |
line wrap: on
line source
#!/usr/bin/env python # # posplit - split messages in paragraphs on .po/.pot files # # license: MIT/X11/Expat # from __future__ import absolute_import, print_function import polib import re import sys def addentry(po, entry, cache): e = cache.get(entry.msgid) if e: e.occurrences.extend(entry.occurrences) else: po.append(entry) cache[entry.msgid] = entry def mkentry(orig, delta, msgid, msgstr): entry = polib.POEntry() entry.merge(orig) entry.msgid = msgid or orig.msgid entry.msgstr = msgstr or orig.msgstr entry.occurrences = [(p, int(l) + delta) for (p, l) in orig.occurrences] return entry if __name__ == "__main__": po = polib.pofile(sys.argv[1]) cache = {} entries = po[:] po[:] = [] findd = re.compile(r' *\.\. (\w+)::') # for finding directives for entry in entries: msgids = entry.msgid.split(u'\n\n') if entry.msgstr: msgstrs = entry.msgstr.split(u'\n\n') else: msgstrs = [u''] * len(msgids) if len(msgids) != len(msgstrs): # places the whole existing translation as a fuzzy # translation for each paragraph, to give the # translator a chance to recover part of the old # translation - erasing extra paragraphs is # probably better than retranslating all from start if 'fuzzy' not in entry.flags: entry.flags.append('fuzzy') msgstrs = [entry.msgstr] * len(msgids) delta = 0 for msgid, msgstr in zip(msgids, msgstrs): if msgid and msgid != '::': newentry = mkentry(entry, delta, msgid, msgstr) mdirective = findd.match(msgid) if mdirective: if not msgid[mdirective.end():].rstrip(): # only directive, nothing to translate here delta += 2 continue directive = mdirective.group(1) if directive in ('container', 'include'): if msgid.rstrip('\n').count('\n') == 0: # only rst syntax, nothing to translate delta += 2 continue else: # lines following directly, unexpected print('Warning: text follows line with directive' \ ' %s' % directive) comment = 'do not translate: .. %s::' % directive if not newentry.comment: newentry.comment = comment elif comment not in newentry.comment: newentry.comment += '\n' + comment addentry(po, newentry, cache) delta += 2 + msgid.count('\n') po.save()