view doc/docchecker @ 33698:3748098d072a

releasenotes: add similarity check function to compare incoming notes It is possible that the incoming note fragments have some similar content as the existing release notes. In case of a bug fix, we match for issueNNNN in the existing notes. For other general cases, it makes use of fuzzywuzzy library to get a similarity score. If the score is above a certain threshold, we ignore the fragment, otherwise add it. But the score might be misleading for small commit messages. So, it uses similarity function only if the length of string (in words) is above a certain value. The patch adds tests related to its usage. But it needs improvement in the sense of combining incoming notes. We can use interactive mode for adding notes. Maybe we can do this if similarity is under a certain range.
author Rishabh Madan <rishabhmadan96@gmail.com>
date Sat, 05 Aug 2017 05:25:36 +0530
parents c9ab5a0bc7c5
children 9bfbb9fc5871
line wrap: on
line source

#!/usr/bin/env python
#
# docchecker - look for problematic markup
#
# Copyright 2016 timeless <timeless@mozdev.org> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import, print_function

import re
import sys

leadingline = re.compile(r'(^\s*)(\S.*)$')

checks = [
  (r""":hg:`[^`]*'[^`]*`""",
    """warning: please avoid nesting ' in :hg:`...`"""),
  (r'\w:hg:`',
    'warning: please have a space before :hg:'),
  (r"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
    '''warning: please use " instead of ' for hg ... "..."'''),
]

def check(line):
    messages = []
    for match, msg in checks:
        if re.search(match, line):
            messages.append(msg)
    if messages:
        print(line)
        for msg in messages:
            print(msg)

def work(file):
    (llead, lline) = ('', '')

    for line in file:
        # this section unwraps lines
        match = leadingline.match(line)
        if not match:
            check(lline)
            (llead, lline) = ('', '')
            continue

        lead, line = match.group(1), match.group(2)
        if (lead == llead):
            if (lline != ''):
                lline += ' ' + line
            else:
                lline = line
        else:
            check(lline)
            (llead, lline) = (lead, line)
    check(lline)

def main():
    for f in sys.argv[1:]:
        try:
            with open(f) as file:
                work(file)
        except BaseException as e:
            print("failed to process %s: %s" % (f, e))

main()