Fri, 14 Aug 2020 20:45:49 -0700 worker: don't expose readinto() on _blockingreader since pickle is picky
Martin von Zweigbergk <martinvonz@google.com> [Fri, 14 Aug 2020 20:45:49 -0700] rev 45409
worker: don't expose readinto() on _blockingreader since pickle is picky The `pickle` module expects the input to be buffered and a whole object to be available when `pickle.load()` is called, which is not necessarily true when we send data from workers back to the parent process (i.e., it seems like a bad assumption for the `pickle` module to make). We added a workaround for that in https://phab.mercurial-scm.org/D8076, which made `read()` continue until all the requested bytes have been read. As we found out at work after a lot of investigation (I've spent the last two days on this), the native version of `pickle.load()` has started calling `readinto()` on the input since Python 3.8. That started being called in https://github.com/python/cpython/commit/91f4380cedbae32b49adbea2518014a5624c6523 (and only by the C version of `pickle.load()`)). Before that, it was only `read()` and `readline()` that were called. The problem with that was that `readinto()` on our `_blockingreader` was simply delegating to the underlying, *unbuffered* object. The symptom we saw was that `hg fix` started failing sometimes on Python 3.8 on Mac. It failed very relyable in some cases. I still haven't figured out under what circumstances it fails and I've been unable to reproduce it in test cases (I've tried writing larger amounts of data, using different numbers of workers, and making the formatters sleep). I have, however, been able to reproduce it 3-4 times on Linux, but then it stopped reproducing on the following few hundred attempts. To fix the problem, we can simply remove the implementation of `readinto()`, since the unpickler will then fall back to calling `read()`. The fallback was added a bit later, in https://github.com/python/cpython/commit/b19f7ecfa3adc6ba1544225317b9473649815b38. However, that commit also added checking that what `read()` returns is a `bytes`, so we also need to convert the `bytearray` we use into that. I was able to add a test for that failure at least. Differential Revision: https://phab.mercurial-scm.org/D8928
Tue, 18 Aug 2020 15:03:57 -0700 commit: clear mergestate also with --amend (issue6304)
Martin von Zweigbergk <martinvonz@google.com> [Tue, 18 Aug 2020 15:03:57 -0700] rev 45408
commit: clear mergestate also with --amend (issue6304) The `hg commit --amend` uses the in-memory code, which naturally doesn't touch the merge state (well, it shouldn't anyway; I think I've fixed bugs in that area recently). We therefore need to clear the mergestate after calling `repo.commitctx()` since we expect that from `hg commit --amend`. Differential Revision: https://phab.mercurial-scm.org/D8932
Tue, 18 Aug 2020 14:26:49 -0700 tests: add test showing that merge state is not cleared by amend
Martin von Zweigbergk <martinvonz@google.com> [Tue, 18 Aug 2020 14:26:49 -0700] rev 45407
tests: add test showing that merge state is not cleared by amend This is slightly modified version of the test case I provided in issue6304. Differential Revision: https://phab.mercurial-scm.org/D8931
Tue, 11 Aug 2020 13:43:43 +0530 requirements: introduce constants for `shared` and `relshared` requirements
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 11 Aug 2020 13:43:43 +0530] rev 45406
requirements: introduce constants for `shared` and `relshared` requirements We add them to `WORKING_DIR_REQUIREMENTS` too as they should be stored in `.hg/requires` and have information about the type of working copy. Differential Revision: https://phab.mercurial-scm.org/D8926
Mon, 10 Aug 2020 15:47:21 +0530 mergestate: replace `addmergedother()` with generic `addcommitinfo()` (API)
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 10 Aug 2020 15:47:21 +0530] rev 45405
mergestate: replace `addmergedother()` with generic `addcommitinfo()` (API) Storing that a file is resolved for the other parent while merging is just one case of things we will like to store in the mergestate. There are more which we will like to store. This patch replaces `addmergedother()` with a much more generic `addcommitinfo()`. Doing this, we also blinding stores the same key value pair generated by the merge code instead of touching them. Differential Revision: https://phab.mercurial-scm.org/D8923
Mon, 10 Aug 2020 15:38:45 +0530 merge: introduce `addcommitinfo()` on mergeresult object
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 10 Aug 2020 15:38:45 +0530] rev 45404
merge: introduce `addcommitinfo()` on mergeresult object This makes code little bit nicer as we directly update information in the mergeresult object instead of building up a dict first and then setting it. Differential Revision: https://phab.mercurial-scm.org/D8922
Mon, 10 Aug 2020 15:34:27 +0530 merge: use collections.defaultdict() for mergeresult.commitinfo
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 10 Aug 2020 15:34:27 +0530] rev 45403
merge: use collections.defaultdict() for mergeresult.commitinfo We will be storing info from mergeresult.commitinfo to mergestate._stateextras in upcoming patches, let's make them use same structure so that we don't have to make much efforts in transferring info from one to other. Differential Revision: https://phab.mercurial-scm.org/D8921
Mon, 10 Aug 2020 15:29:02 +0530 mergestate: use _stateextras instead of merge records for commit related info
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 10 Aug 2020 15:29:02 +0530] rev 45402
mergestate: use _stateextras instead of merge records for commit related info There is a set of information related to a merge which is needed on commit. We want to store such information in the mergestate so that we can read it while committing. For this purpose, we are using merge records and introduced a merge entry state for that. However this won't scale and is not clean way to implement this. This patch reworks the existing logic related to this to use _stateextras and read from it. Right now the information stored is not very descriptive but it will be in next patch. Using _stateextras also makes MERGE_RECORD_MERGED_OTHER useless and only to be kept for BC. Differential Revision: https://phab.mercurial-scm.org/D8920
Mon, 10 Aug 2020 15:09:44 +0530 mergestate: use collections.defaultdict(dict) for _stateextras
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 10 Aug 2020 15:09:44 +0530] rev 45401
mergestate: use collections.defaultdict(dict) for _stateextras I want to use this _stateextras more in upcoming patches to store some commit time related information. Using defaultdict will help in cleaner code around checking whether a file exists or not. Differential Revision: https://phab.mercurial-scm.org/D8919
Mon, 03 Aug 2020 23:41:50 -0700 hgweb: minimize scope of a try-block in staticfile()
Martin von Zweigbergk <martinvonz@google.com> [Mon, 03 Aug 2020 23:41:50 -0700] rev 45400
hgweb: minimize scope of a try-block in staticfile() I think the exceptions are only relevant for the `os.stat()` and `open()` calls, and maybe to the `fh.read()` call. Differential Revision: https://phab.mercurial-scm.org/D8936
Mon, 03 Aug 2020 23:38:50 -0700 hgweb: ignore web.templates config when guessing mime type for static content
Martin von Zweigbergk <martinvonz@google.com> [Mon, 03 Aug 2020 23:38:50 -0700] rev 45399
hgweb: ignore web.templates config when guessing mime type for static content Frozen binaries won't have a file-system path for static content, so I'd like to remove dependence on that. From the documentation, it seems like `mimetypes.guess_type()` only cares about the suffix, so I think it should be enough to pass in just path under the `web.templates` directory. Differential Revision: https://phab.mercurial-scm.org/D8935
Sat, 22 Aug 2020 16:03:44 -0700 hgweb: let staticfile() look up path from default location unless provided
Martin von Zweigbergk <martinvonz@google.com> [Sat, 22 Aug 2020 16:03:44 -0700] rev 45398
hgweb: let staticfile() look up path from default location unless provided This reduces duplication between the two callers. Differential Revision: https://phab.mercurial-scm.org/D8934
Mon, 03 Aug 2020 22:40:05 -0700 hgweb: handle None from templatedir() equally bad in webcommands.py
Martin von Zweigbergk <martinvonz@google.com> [Mon, 03 Aug 2020 22:40:05 -0700] rev 45397
hgweb: handle None from templatedir() equally bad in webcommands.py The following paragraph is based just on my reading of the code; I have not tried to test it. Before my recent work on templates in frozen binaries, it seems both `hgwebdir_mod.py` and `webcommands.py` would pass in an empty list into `staticfile()` when running in a frozen binary. That would then result in a variable in that function (`path`) not getting bound before its first use. I then changed that without thinking in D8786 so we passed a `None` value into the function, which made it break in another way (trying to iterate over `None`). Then I tried to fix it up in D8810, but I only changed `hgwebdir_mod.py` for some reason, and it still doesn't actually work in frozen binaries (which seems fair, since was broken before my changes too). This patch just replicates the half-assed "fix" from D8810 in `webcommands.py`, so they look more similar so I can start refactoring them in the same way. Differential Revision: https://phab.mercurial-scm.org/D8933
Thu, 13 Aug 2020 10:37:25 -0700 posixworker: avoid creating workers that end up getting no work
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Aug 2020 10:37:25 -0700] rev 45396
posixworker: avoid creating workers that end up getting no work If `workers` (the detected or configured number of CPUs) is greater than the number of work items, then some of the workers end up getting 0 work items. Let's not create such workers. Differential Revision: https://phab.mercurial-scm.org/D8927
(0) -30000 -10000 -3000 -1000 -300 -100 -14 +14 +100 +300 +1000 +3000 tip