view contrib/fuzz/dirstate_corpus.py @ 50400:95acba2c29f6
encoding: avoid quadratic time complexity when json-encoding non-UTF8 strings
Apparently the code uses "+=" with a bytes object, which copies the whole
buffer on each append and is therefore linear-time, so the whole encoding is
quadratic-time. This patch makes us use a bytearray object instead, which has
an amortized constant-time append operation.
The encoding is still not particularly fast, but at least a 10MB file
takes tens of seconds to encode, not many hours.
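
The difference between the two append strategies can be sketched as follows. This is a minimal illustration, not the code from the patch: the function names `concat_with_bytes` and `concat_with_bytearray` are hypothetical, and the input is assumed to be an iterable of small bytes chunks.

```python
def concat_with_bytes(chunks):
    """Hypothetical helper: accumulate into an immutable bytes object."""
    out = b""
    for chunk in chunks:
        # bytes is immutable, so "+=" builds a new object and copies
        # everything accumulated so far: O(n) per append, O(n^2) overall.
        out += chunk
    return out


def concat_with_bytearray(chunks):
    """Hypothetical helper: accumulate into a mutable bytearray."""
    buf = bytearray()
    for chunk in chunks:
        # bytearray grows in place with over-allocation, so each append
        # is amortized constant-time per byte added.
        buf += chunk
    # Convert once at the end so callers still receive bytes.
    return bytes(buf)


if __name__ == "__main__":
    pieces = [b"\\x", b"ff", b"abc"]
    print(concat_with_bytes(pieces) == concat_with_bytearray(pieces))
```

Both helpers return the same bytes; only the cost of building the result differs, which becomes noticeable once the input reaches megabytes.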
author | Arseniy Alekseyev <aalekseyev@janestreet.com> |
---|---|
date | Mon, 06 Mar 2023 11:27:57 +0000 |
parents | 6000f5b25c9b |
children |
import argparse
import os
import zipfile

ap = argparse.ArgumentParser()
ap.add_argument("out", metavar="some.zip", type=str, nargs=1)
args = ap.parse_args()

reporoot = os.path.normpath(os.path.join(os.path.dirname(__file__), '..', '..'))
dirstate = os.path.join(reporoot, '.hg', 'dirstate')

with zipfile.ZipFile(args.out[0], "w", zipfile.ZIP_STORED) as zf:
    if os.path.exists(dirstate):
        with open(dirstate, 'rb') as f:
            zf.writestr("dirstate", f.read())