view tests/test-mactext.t @ 50400:95acba2c29f6

encoding: avoid quadratic time complexity when json-encoding non-UTF8 strings

Apparently the code uses "+=" with a bytes object, which is linear-time, so the whole encoding is quadratic-time. This patch makes us use a bytearray object, instead, which has a(n amortized-)constant-time append operation.

The encoding is still not particularly fast, but at least a 10MB file takes tens of seconds, not many hours to encode.
author Arseniy Alekseyev <aalekseyev@janestreet.com>
date Mon, 06 Mar 2023 11:27:57 +0000
parents 768056549737
children
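The changeset description above contrasts repeated "+=" on a bytes object with appending to a bytearray. The following is a minimal sketch of that difference, not the code from the patch itself; the function names `join_with_bytes` and `join_with_bytearray` are hypothetical and chosen only for illustration.

```python
def join_with_bytes(chunks):
    """Accumulate with `+=` on an immutable bytes object.

    Each `+=` builds a new bytes object and copies everything collected
    so far, so the total work grows quadratically with the output size.
    """
    out = b""
    for chunk in chunks:
        out += chunk
    return out


def join_with_bytearray(chunks):
    """Accumulate into a mutable bytearray and convert once at the end.

    `+=` on a bytearray extends the buffer in place, which is amortized
    constant time per appended byte.
    """
    out = bytearray()
    for chunk in chunks:
        out += chunk
    return bytes(out)


if __name__ == "__main__":
    # Both approaches produce the same result; only the cost differs.
    chunks = [b"ab", b"cd", b"ef"]
    assert join_with_bytes(chunks) == join_with_bytearray(chunks)
```

Both functions return identical bytes; the second only changes how the intermediate buffer grows.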


  $ cat > unix2mac.py <<EOF
  > import sys
  > 
  > for path in sys.argv[1:]:
  >     data = open(path, 'rb').read()
  >     data = data.replace(b'\n', b'\r')
  >     open(path, 'wb').write(data)
  > EOF
  $ hg init
  $ echo '[hooks]' >> .hg/hgrc
  $ echo 'pretxncommit.cr = python:hgext.win32text.forbidcr' >> .hg/hgrc
  $ echo 'pretxnchangegroup.cr = python:hgext.win32text.forbidcr' >> .hg/hgrc
  $ cat .hg/hgrc
  [hooks]
  pretxncommit.cr = python:hgext.win32text.forbidcr
  pretxnchangegroup.cr = python:hgext.win32text.forbidcr

  $ echo hello > f
  $ hg add f
  $ hg ci -m 1

  $ "$PYTHON" unix2mac.py f
  $ hg ci -m 2
  attempt to commit or push text file(s) using CR line endings
  in dea860dc51ec: f
  transaction abort!
  rollback completed
  abort: pretxncommit.cr hook failed
  [40]
  $ hg cat f | f --hexdump
  
  0000: 68 65 6c 6c 6f 0a                               |hello.|
  $ f --hexdump f
  f:
  0000: 68 65 6c 6c 6f 0d                               |hello.|
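
The session above shows the hook rejecting a commit once `unix2mac.py` has turned the trailing LF into a CR. The following is a rough sketch of the kind of check such a hook could perform, assuming a file counts as text when it contains no NUL byte. The helper name `has_cr_line_endings` is hypothetical and is not part of `hgext.win32text`.

```python
def has_cr_line_endings(data: bytes) -> bool:
    """Return True if `data` looks like text and contains a CR byte.

    Assumption for this sketch: a NUL byte marks the content as binary,
    in which case it is never flagged.
    """
    is_binary = b"\0" in data
    return not is_binary and b"\r" in data


if __name__ == "__main__":
    unix = b"hello\n"                 # what `echo hello > f` writes
    mac = unix.replace(b"\n", b"\r")  # what unix2mac.py leaves on disk
    print(has_cr_line_endings(unix))
    print(has_cr_line_endings(mac))
```

This matches the two hexdumps: the committed revision still ends in `0a`, while the working copy that the hook refused ends in `0d`.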