author | Matt Mackall <mpm@selenic.com> |
Mon, 12 Oct 2009 18:25:46 -0500 | |
changeset 9584 | 17da88da1abd |
parent 9540 | cad36e496640 |
child 9623 | 32727ce029de |
permissions | -rw-r--r-- |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
1 |
# minirst.py - minimal reStructuredText parser |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
2 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
3 |
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
4 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
5 |
# This software may be used and distributed according to the terms of the |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
6 |
# GNU General Public License version 2, incorporated herein by reference. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
7 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
8 |
"""simplified reStructuredText parser. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
9 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
10 |
This parser knows just enough about reStructuredText to parse the |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
11 |
Mercurial docstrings. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
12 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
13 |
It cheats in a major way: nested blocks are not really nested. They |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
14 |
are just indented blocks that look like they are nested. This relies |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
15 |
on the user to keep the right indentation for the blocks. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
16 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
17 |
It only supports a small subset of reStructuredText: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
18 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
19 |
- paragraphs |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
20 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
21 |
- definition lists (must use ' ' to indent definitions) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
22 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
23 |
- lists (items must start with '-') |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
24 |
|
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
25 |
- field lists (colons cannot be escaped) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
26 |
|
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
27 |
- literal blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
28 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
29 |
- option lists (supports only long options without arguments) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
30 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
31 |
- inline markup is not recognized at all. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
32 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
33 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
34 |
import re, sys, textwrap |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
35 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
36 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
37 |
def findblocks(text): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
38 |
"""Find continuous blocks of lines in text. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
39 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
40 |
Returns a list of dictionaries representing the blocks. Each block |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
41 |
has an 'indent' field and a 'lines' field. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
42 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
43 |
blocks = [[]] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
44 |
lines = text.splitlines() |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
45 |
for line in lines: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
46 |
if line.strip(): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
47 |
blocks[-1].append(line) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
48 |
elif blocks[-1]: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
49 |
blocks.append([]) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
50 |
if not blocks[-1]: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
51 |
del blocks[-1] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
52 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
53 |
for i, block in enumerate(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
54 |
indent = min((len(l) - len(l.lstrip())) for l in block) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
55 |
blocks[i] = dict(indent=indent, lines=[l[indent:] for l in block]) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
56 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
57 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
58 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
59 |
def findliteralblocks(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
60 |
"""Finds literal blocks and adds a 'type' field to the blocks. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
61 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
62 |
Literal blocks are given the type 'literal', all other blocks are |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
63 |
given type the 'paragraph'. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
64 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
65 |
i = 0 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
66 |
while i < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
67 |
# Searching for a block that looks like this: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
68 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
69 |
# +------------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
70 |
# | paragraph | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
71 |
# | (ends with "::") | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
72 |
# +------------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
73 |
# +---------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
74 |
# | indented literal block | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
75 |
# +---------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
76 |
blocks[i]['type'] = 'paragraph' |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
77 |
if blocks[i]['lines'][-1].endswith('::') and i+1 < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
78 |
indent = blocks[i]['indent'] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
79 |
adjustment = blocks[i+1]['indent'] - indent |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
80 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
81 |
if blocks[i]['lines'] == ['::']: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
82 |
# Expanded form: remove block |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
83 |
del blocks[i] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
84 |
i -= 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
85 |
elif blocks[i]['lines'][-1].endswith(' ::'): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
86 |
# Partially minimized form: remove space and both |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
87 |
# colons. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
88 |
blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-3] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
89 |
else: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
90 |
# Fully minimized form: remove just one colon. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
91 |
blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-1] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
92 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
93 |
# List items are formatted with a hanging indent. We must |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
94 |
# correct for this here while we still have the original |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
95 |
# information on the indentation of the subsequent literal |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
96 |
# blocks available. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
97 |
if blocks[i]['lines'][0].startswith('- '): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
98 |
indent += 2 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
99 |
adjustment -= 2 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
100 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
101 |
# Mark the following indented blocks. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
102 |
while i+1 < len(blocks) and blocks[i+1]['indent'] > indent: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
103 |
blocks[i+1]['type'] = 'literal' |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
104 |
blocks[i+1]['indent'] -= adjustment |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
105 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
106 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
107 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
108 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
109 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
110 |
def findsections(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
111 |
"""Finds sections. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
112 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
113 |
The blocks must have a 'type' field, i.e., they should have been |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
114 |
run through findliteralblocks first. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
115 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
116 |
for block in blocks: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
117 |
# Searching for a block that looks like this: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
118 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
119 |
# +------------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
120 |
# | Section title | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
121 |
# | ------------- | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
122 |
# +------------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
123 |
if (block['type'] == 'paragraph' and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
124 |
len(block['lines']) == 2 and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
125 |
block['lines'][1] == '-' * len(block['lines'][0])): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
126 |
block['type'] = 'section' |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
127 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
128 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
129 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
130 |
def findbulletlists(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
131 |
"""Finds bullet lists. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
132 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
133 |
The blocks must have a 'type' field, i.e., they should have been |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
134 |
run through findliteralblocks first. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
135 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
136 |
i = 0 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
137 |
while i < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
138 |
# Searching for a paragraph that looks like this: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
139 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
140 |
# +------+-----------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
141 |
# | "- " | list item | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
142 |
# +------| (body elements)+ | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
143 |
# +-----------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
144 |
if (blocks[i]['type'] == 'paragraph' and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
145 |
blocks[i]['lines'][0].startswith('- ')): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
146 |
items = [] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
147 |
for line in blocks[i]['lines']: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
148 |
if line.startswith('- '): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
149 |
items.append(dict(type='bullet', lines=[], |
9292
01e580143423
minirst: simplify bullet list indentation computation
Martin Geisler <mg@lazybytes.net>
parents:
9291
diff
changeset
|
150 |
indent=blocks[i]['indent'])) |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
151 |
line = line[2:] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
152 |
items[-1]['lines'].append(line) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
153 |
blocks[i:i+1] = items |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
154 |
i += len(items) - 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
155 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
156 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
157 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
158 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
159 |
_optionre = re.compile(r'^(--[a-z-]+)((?:[ =][a-zA-Z][\w-]*)? +)(.*)$') |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
160 |
def findoptionlists(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
161 |
"""Finds option lists. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
162 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
163 |
The blocks must have a 'type' field, i.e., they should have been |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
164 |
run through findliteralblocks first. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
165 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
166 |
i = 0 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
167 |
while i < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
168 |
# Searching for a paragraph that looks like this: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
169 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
170 |
# +----------------------------+-------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
171 |
# | "--" option " " | description | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
172 |
# +-------+--------------------+ | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
173 |
# | (body elements)+ | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
174 |
# +----------------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
175 |
if (blocks[i]['type'] == 'paragraph' and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
176 |
_optionre.match(blocks[i]['lines'][0])): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
177 |
options = [] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
178 |
for line in blocks[i]['lines']: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
179 |
m = _optionre.match(line) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
180 |
if m: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
181 |
option, arg, rest = m.groups() |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
182 |
width = len(option) + len(arg) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
183 |
options.append(dict(type='option', lines=[], |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
184 |
indent=blocks[i]['indent'], |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
185 |
width=width)) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
186 |
options[-1]['lines'].append(line) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
187 |
blocks[i:i+1] = options |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
188 |
i += len(options) - 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
189 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
190 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
191 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
192 |
|
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
193 |
_fieldre = re.compile(r':(?![: ])([^:]*)(?<! ):( +)(.*)') |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
194 |
def findfieldlists(blocks): |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
195 |
"""Finds fields lists. |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
196 |
|
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
197 |
The blocks must have a 'type' field, i.e., they should have been |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
198 |
run through findliteralblocks first. |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
199 |
""" |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
200 |
i = 0 |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
201 |
while i < len(blocks): |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
202 |
# Searching for a paragraph that looks like this: |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
203 |
# |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
204 |
# |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
205 |
# +--------------------+----------------------+ |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
206 |
# | ":" field name ":" | field body | |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
207 |
# +-------+------------+ | |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
208 |
# | (body elements)+ | |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
209 |
# +-----------------------------------+ |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
210 |
if (blocks[i]['type'] == 'paragraph' and |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
211 |
_fieldre.match(blocks[i]['lines'][0])): |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
212 |
indent = blocks[i]['indent'] |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
213 |
fields = [] |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
214 |
for line in blocks[i]['lines']: |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
215 |
m = _fieldre.match(line) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
216 |
if m: |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
217 |
key, spaces, rest = m.groups() |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
218 |
width = 2 + len(key) + len(spaces) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
219 |
fields.append(dict(type='field', lines=[], |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
220 |
indent=indent, width=width)) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
221 |
# Turn ":foo: bar" into "foo bar". |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
222 |
line = '%s %s%s' % (key, spaces, rest) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
223 |
fields[-1]['lines'].append(line) |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
224 |
blocks[i:i+1] = fields |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
225 |
i += len(fields) - 1 |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
226 |
i += 1 |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
227 |
return blocks |
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
228 |
|
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
229 |
|
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
230 |
def finddefinitionlists(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
231 |
"""Finds definition lists. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
232 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
233 |
The blocks must have a 'type' field, i.e., they should have been |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
234 |
run through findliteralblocks first. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
235 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
236 |
i = 0 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
237 |
while i < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
238 |
# Searching for a paragraph that looks like this: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
239 |
# |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
240 |
# +----------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
241 |
# | term | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
242 |
# +--+-------------------------+--+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
243 |
# | definition | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
244 |
# | (body elements)+ | |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
245 |
# +----------------------------+ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
246 |
if (blocks[i]['type'] == 'paragraph' and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
247 |
len(blocks[i]['lines']) > 1 and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
248 |
not blocks[i]['lines'][0].startswith(' ') and |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
249 |
blocks[i]['lines'][1].startswith(' ')): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
250 |
definitions = [] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
251 |
for line in blocks[i]['lines']: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
252 |
if not line.startswith(' '): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
253 |
definitions.append(dict(type='definition', lines=[], |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
254 |
indent=blocks[i]['indent'])) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
255 |
definitions[-1]['lines'].append(line) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
256 |
definitions[-1]['hang'] = len(line) - len(line.lstrip()) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
257 |
blocks[i:i+1] = definitions |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
258 |
i += len(definitions) - 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
259 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
260 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
261 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
262 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
263 |
def addmargins(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
264 |
"""Adds empty blocks for vertical spacing. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
265 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
266 |
This groups bullets, options, and definitions together with no vertical |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
267 |
space between them, and adds an empty block between all other blocks. |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
268 |
""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
269 |
i = 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
270 |
while i < len(blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
271 |
if (blocks[i]['type'] == blocks[i-1]['type'] and |
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
272 |
blocks[i]['type'] in ('bullet', 'option', 'field', 'definition')): |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
273 |
i += 1 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
274 |
else: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
275 |
blocks.insert(i, dict(lines=[''], indent=0, type='margin')) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
276 |
i += 2 |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
277 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
278 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
279 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
280 |
def formatblock(block, width): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
281 |
"""Format a block according to width.""" |
9417
4c3fb45123e5
util, minirst: do not crash with COLUMNS=0
Martin Geisler <mg@lazybytes.net>
parents:
9293
diff
changeset
|
282 |
if width <= 0: |
4c3fb45123e5
util, minirst: do not crash with COLUMNS=0
Martin Geisler <mg@lazybytes.net>
parents:
9293
diff
changeset
|
283 |
width = 78 |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
284 |
indent = ' ' * block['indent'] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
285 |
if block['type'] == 'margin': |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
286 |
return '' |
9291
cd5b6a11b607
minirst: indent literal blocks with two spaces
Martin Geisler <mg@lazybytes.net>
parents:
9156
diff
changeset
|
287 |
elif block['type'] == 'literal': |
cd5b6a11b607
minirst: indent literal blocks with two spaces
Martin Geisler <mg@lazybytes.net>
parents:
9156
diff
changeset
|
288 |
indent += ' ' |
cd5b6a11b607
minirst: indent literal blocks with two spaces
Martin Geisler <mg@lazybytes.net>
parents:
9156
diff
changeset
|
289 |
return indent + ('\n' + indent).join(block['lines']) |
cd5b6a11b607
minirst: indent literal blocks with two spaces
Martin Geisler <mg@lazybytes.net>
parents:
9156
diff
changeset
|
290 |
elif block['type'] == 'section': |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
291 |
return indent + ('\n' + indent).join(block['lines']) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
292 |
elif block['type'] == 'definition': |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
293 |
term = indent + block['lines'][0] |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
294 |
defindent = indent + block['hang'] * ' ' |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
295 |
text = ' '.join(map(str.strip, block['lines'][1:])) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
296 |
return "%s\n%s" % (term, textwrap.fill(text, width=width, |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
297 |
initial_indent=defindent, |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
298 |
subsequent_indent=defindent)) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
299 |
else: |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
300 |
initindent = subindent = indent |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
301 |
text = ' '.join(map(str.strip, block['lines'])) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
302 |
if block['type'] == 'bullet': |
9292
01e580143423
minirst: simplify bullet list indentation computation
Martin Geisler <mg@lazybytes.net>
parents:
9291
diff
changeset
|
303 |
initindent = indent + '- ' |
01e580143423
minirst: simplify bullet list indentation computation
Martin Geisler <mg@lazybytes.net>
parents:
9291
diff
changeset
|
304 |
subindent = indent + ' ' |
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
305 |
elif block['type'] in ('option', 'field'): |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
306 |
subindent = indent + block['width'] * ' ' |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
307 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
308 |
return textwrap.fill(text, width=width, |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
309 |
initial_indent=initindent, |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
310 |
subsequent_indent=subindent) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
311 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
312 |
|
9540
cad36e496640
help: un-indent help topics
Martin Geisler <mg@lazybytes.net>
parents:
9417
diff
changeset
|
313 |
def format(text, width, indent=0): |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
314 |
"""Parse and format the text according to width.""" |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
315 |
blocks = findblocks(text) |
9540
cad36e496640
help: un-indent help topics
Martin Geisler <mg@lazybytes.net>
parents:
9417
diff
changeset
|
316 |
for b in blocks: |
cad36e496640
help: un-indent help topics
Martin Geisler <mg@lazybytes.net>
parents:
9417
diff
changeset
|
317 |
b['indent'] += indent |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
318 |
blocks = findliteralblocks(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
319 |
blocks = findsections(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
320 |
blocks = findbulletlists(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
321 |
blocks = findoptionlists(blocks) |
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
322 |
blocks = findfieldlists(blocks) |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
323 |
blocks = finddefinitionlists(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
324 |
blocks = addmargins(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
325 |
return '\n'.join(formatblock(b, width) for b in blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
326 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
327 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
328 |
if __name__ == "__main__": |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
329 |
from pprint import pprint |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
330 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
331 |
def debug(func, blocks): |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
332 |
blocks = func(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
333 |
print "*** after %s:" % func.__name__ |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
334 |
pprint(blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
335 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
336 |
return blocks |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
337 |
|
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
338 |
text = open(sys.argv[1]).read() |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
339 |
blocks = debug(findblocks, text) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
340 |
blocks = debug(findliteralblocks, blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
341 |
blocks = debug(findsections, blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
342 |
blocks = debug(findbulletlists, blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
343 |
blocks = debug(findoptionlists, blocks) |
9293
e48a48b754d3
minirst: parse field lists
Martin Geisler <mg@lazybytes.net>
parents:
9292
diff
changeset
|
344 |
blocks = debug(findfieldlists, blocks) |
9156
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
345 |
blocks = debug(finddefinitionlists, blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
346 |
blocks = debug(addmargins, blocks) |
c9c7e8cdac9c
minimal reStructuredText parser
Martin Geisler <mg@lazybytes.net>
parents:
diff
changeset
|
347 |
print '\n'.join(formatblock(b, 30) for b in blocks) |