changeset 33389:7e89bd0cfb86
localrepo: cache types for filtered repos (issue5043)
Python introduces a reference cycle on dynamically created types
via __mro__, making them very easy to leak. See
https://bugs.python.org/issue17950.
Previously, repo.filtered() created a type on every invocation.
Long-running processes (like `hg convert`) could call this
function thousands of times, leading to a steady memory leak.
Since we are unable to stop the leak because this is a bug in
Python, the next best thing is to contain it.
This patch adds a cache of the dynamically generated repoview/filter
types on the localrepo object. Since we only generate each type
once, we cap the amount of memory that can leak to something
reasonable.
After this change, `hg convert` no longer leaks memory on every
revision. The process will likely grow memory usage over time due
to e.g. larger manifests. But there are no leaks.
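The caching approach described above can be sketched in isolation. This is a minimal illustration, not the Mercurial code itself: `RepoView`, `Repo`, and `_filteredtypes` are hypothetical stand-ins for `repoview.repoview`, `localrepository`, and `_filteredrepotypes`, and the real classes carry far more behavior.

```python
import weakref


class RepoView(object):
    """Hypothetical stand-in for repoview.repoview (the filtering mixin)."""

    def __init__(self, repo, filtername):
        self._unfilteredrepo = repo
        self.filtername = filtername


class Repo(object):
    """Hypothetical stand-in for localrepository."""

    def __init__(self):
        # Cache of dynamically created types, keyed by the unfiltered class.
        self._filteredtypes = weakref.WeakKeyDictionary()

    def unfiltered(self):
        return self

    def filtered(self, name):
        key = self.unfiltered().__class__
        if key not in self._filteredtypes:
            # Create the combined type only once per base class instead
            # of on every call.
            bases = (RepoView, key)
            self._filteredtypes[key] = type('filteredrepo', bases, {})
        return self._filteredtypes[key](self, name)


repo = Repo()
first = repo.filtered('visible')
second = repo.filtered('served')
# Both views share one cached type, so repeated calls create no new types.
assert type(first) is type(second)
```

Because the type is created once per base class, the number of dynamically created types stays bounded no matter how often `filtered()` is called.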
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 01 Jul 2017 20:51:19 -0700 |
parents | 0823f0983eaa |
children | 32331f54930c |
files | mercurial/localrepo.py mercurial/statichttprepo.py |
diffstat | 2 files changed, 20 insertions(+), 5 deletions(-) |
--- a/mercurial/localrepo.py Tue Jul 11 02:10:04 2017 +0900
+++ b/mercurial/localrepo.py Sat Jul 01 20:51:19 2017 -0700
@@ -430,6 +430,9 @@
         # post-dirstate-status hooks
         self._postdsstatus = []
 
+        # Cache of types representing filtered repos.
+        self._filteredrepotypes = weakref.WeakKeyDictionary()
+
         # generic mapping between names and nodes
         self.names = namespaces.namespaces()
 
@@ -539,11 +542,21 @@
 
     def filtered(self, name):
         """Return a filtered version of a repository"""
-        # build a new class with the mixin and the current class
-        # (possibly subclass of the repo)
-        class filteredrepo(repoview.repoview, self.unfiltered().__class__):
-            pass
-        return filteredrepo(self, name)
+        # Python <3.4 easily leaks types via __mro__. See
+        # https://bugs.python.org/issue17950. We cache dynamically
+        # created types so this method doesn't leak on every
+        # invocation.
+
+        key = self.unfiltered().__class__
+        if key not in self._filteredrepotypes:
+            # Build a new type with the repoview mixin and the base
+            # class of this repo. Give it a name containing the
+            # filter name to aid debugging.
+            bases = (repoview.repoview, key)
+            cls = type('%sfilteredrepo' % name, bases, {})
+            self._filteredrepotypes[key] = cls
+
+        return self._filteredrepotypes[key](self, name)
 
     @repofilecache('bookmarks', 'bookmarks.current')
     def _bookmarks(self):
--- a/mercurial/statichttprepo.py Tue Jul 11 02:10:04 2017 +0900
+++ b/mercurial/statichttprepo.py Sat Jul 01 20:51:19 2017 -0700
@@ -165,6 +165,8 @@
         self.encodepats = None
         self.decodepats = None
         self._transref = None
+        # Cache of types representing filtered repos.
+        self._filteredrepotypes = {}
 
     def _restrictcapabilities(self, caps):
         caps = super(statichttprepository, self)._restrictcapabilities(caps)