Mercurial > hg
annotate mercurial/mpatch.h @ 37711:65a23cc8e75b
cborutil: implement support for streaming encoding, bytestring decoding
The vendored cbor2 package is... a bit disappointing.
On the encoding side, it insists that you pass it something with
a write() to send data to. That means if you want to emit data to
a generator, you have to construct an e.g. io.BytesIO(), write()
to it, then get the data back out. There can be non-trivial overhead
involved.
The encoder also doesn't support indefinite types - bytestrings, arrays,
and maps that don't have a known length. Again, this is really
unfortunate because it requires you to buffer the entire source and
destination in memory to encode large things.
On the decoding side, it supports reading indefinite length types.
But it buffers them completely before returning. More sadness.
This commit implements "streaming" encoders for various CBOR types.
Encoding emits a generator of hunks. So you can efficiently stream
encoded data elsewhere.
It also implements support for emitting indefinite length bytestrings,
arrays, and maps.
On the decoding side, we only implement support for decoding an
indefinite length bytestring from a file object. It will emit a
generator of raw chunks from the source.
I didn't want to reinvent so many wheels. But profiling the wire
protocol revealed that the overhead of constructing io.BytesIO()
instances to temporarily hold results has a non-trivial overhead.
We're talking >15% of execution time for operations like
"transfer the fulltexts of all files in a revision." So I can
justify this effort.
Fortunately, CBOR is a relatively straightforward format. And we have
a reference implementation in the repo we can test against.
Differential Revision: https://phab.mercurial-scm.org/D3303
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 14 Apr 2018 16:36:15 -0700 |
parents | 761355833867 |
children | d86908050375 |
rev | line source |
---|---|
29693
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
1 #ifndef _HG_MPATCH_H_ |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
2 #define _HG_MPATCH_H_ |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
3 |
29694
55dd12204b8e
mpatch: remove dependency on Python.h in mpatch.c
Maciej Fijalkowski <fijall@gmail.com>
parents:
29693
diff
changeset
|
4 #define MPATCH_ERR_NO_MEM -3 |
55dd12204b8e
mpatch: remove dependency on Python.h in mpatch.c
Maciej Fijalkowski <fijall@gmail.com>
parents:
29693
diff
changeset
|
5 #define MPATCH_ERR_CANNOT_BE_DECODED -2 |
55dd12204b8e
mpatch: remove dependency on Python.h in mpatch.c
Maciej Fijalkowski <fijall@gmail.com>
parents:
29693
diff
changeset
|
6 #define MPATCH_ERR_INVALID_PATCH -1 |
55dd12204b8e
mpatch: remove dependency on Python.h in mpatch.c
Maciej Fijalkowski <fijall@gmail.com>
parents:
29693
diff
changeset
|
7 |
29693
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
8 struct mpatch_frag { |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
9 int start, end, len; |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
10 const char *data; |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
11 }; |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
12 |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
13 struct mpatch_flist { |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
14 struct mpatch_frag *base, *head, *tail; |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
15 }; |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
16 |
34800
761355833867
mpatch: reformat function prototypes with clang-format
Augie Fackler <augie@google.com>
parents:
29749
diff
changeset
|
17 int mpatch_decode(const char *bin, ssize_t len, struct mpatch_flist **res); |
29693
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
18 ssize_t mpatch_calcsize(ssize_t len, struct mpatch_flist *l); |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
19 void mpatch_lfree(struct mpatch_flist *a); |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
20 int mpatch_apply(char *buf, const char *orig, ssize_t len, |
34800
761355833867
mpatch: reformat function prototypes with clang-format
Augie Fackler <augie@google.com>
parents:
29749
diff
changeset
|
21 struct mpatch_flist *l); |
761355833867
mpatch: reformat function prototypes with clang-format
Augie Fackler <augie@google.com>
parents:
29749
diff
changeset
|
22 struct mpatch_flist * |
761355833867
mpatch: reformat function prototypes with clang-format
Augie Fackler <augie@google.com>
parents:
29749
diff
changeset
|
23 mpatch_fold(void *bins, struct mpatch_flist *(*get_next_item)(void *, ssize_t), |
761355833867
mpatch: reformat function prototypes with clang-format
Augie Fackler <augie@google.com>
parents:
29749
diff
changeset
|
24 ssize_t start, ssize_t end); |
29693
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
25 |
b9b9f9a92481
mpatch: split mpatch into two files
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff
changeset
|
26 #endif |