changeset 47246:02a4463565ea

revlog: improve documentation of the entry tuple

The code in revlog, and outside revlog, directly uses the index's entry tuple, with direct integer indexing. This is a voluntary trade-off to obtain better performance from the Python code at the expense of the developers' sanity. Let's at least have clear and central documentation about what this tuple is about.

Differential Revision: https://phab.mercurial-scm.org/D10643
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Mon, 03 May 2021 16:52:38 +0200
parents de63be070e02
children ba21cfd9b044
files mercurial/revlog.py
diffstat 1 files changed, 52 insertions(+), 0 deletions(-)
--- a/mercurial/revlog.py	Mon May 03 23:45:05 2021 +0200
+++ b/mercurial/revlog.py	Mon May 03 16:52:38 2021 +0200
@@ -284,6 +284,58 @@
     file handle, a filename, and an expected position. It should check whether
     the current position in the file handle is valid, and log/warn/fail (by
     raising).
+
+
+    Internal details
+    ----------------
+
+    A large part of the revlog logic deals with revisions' "index entries", tuple
+    objects that contain the same "items" whatever the revlog version.
+    Different versions will have different ways of storing these items (sometimes
+    not having them at all), but the tuple will always be the same. New fields
+    are usually added at the end to avoid breaking existing code that relies
+    on the existing order. The fields are defined as follows:
+
+    [0] offset:
+            The byte index of the start of the revision data chunk.
+            That value is shifted up by 16 bits. Use "offset = field >> 16" to
+            retrieve it.
+
+        flags:
+            A flag field that carries special information or changes the behavior
+            of the revision. (see `REVIDX_*` constants for details)
+            The flag field only occupies the lowest 16 bits of this field,
+            use "flags = field & 0xFFFF" to retrieve the value.
+
+    [1] compressed length:
+            The size, in bytes, of the chunk on disk.
+
+    [2] uncompressed length:
+            The size, in bytes, of the full revision once reconstructed.
+
+    [3] base rev:
+            Either the base of the revision delta chain (without general
+            delta), or the base of the delta (stored in the data chunk)
+            with general delta.
+
+    [4] link rev:
+            Changelog revision number of the changeset introducing this
+            revision.
+
+    [5] parent 1 rev:
+            Revision number of the first parent.
+
+    [6] parent 2 rev:
+            Revision number of the second parent.
+
+    [7] node id:
+            The node id of the current revision.
+
+    [8] sidedata offset:
+            The byte index of the start of the revision's side-data chunk.
+
+    [9] sidedata chunk length:
+            The size, in bytes, of the revision's side-data chunk.
     """
 
     _flagserrorclass = error.RevlogError
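
The tuple layout documented in this change can be sketched in standalone Python. This is only an illustration of the layout, not code from Mercurial: the `IndexEntry` name, its field names, and the `pack_offset_flags` / `unpack_offset_flags` helpers are hypothetical, and the sample values are made up. The revlog code itself indexes the plain tuple by integer, as the commit message explains.

```python
from collections import namedtuple

# Hypothetical field names, for illustration only. The documented entry is a
# plain tuple accessed by integer index; the position is given in brackets.
IndexEntry = namedtuple(
    "IndexEntry",
    [
        "offset_flags",     # [0] offset (upper bits) and flags (lowest 16 bits)
        "comp_len",         # [1] compressed length
        "uncomp_len",       # [2] uncompressed length
        "base_rev",         # [3] base rev
        "link_rev",         # [4] link rev
        "p1_rev",           # [5] parent 1 rev
        "p2_rev",           # [6] parent 2 rev
        "node_id",          # [7] node id
        "sidedata_offset",  # [8] sidedata offset
        "sidedata_len",     # [9] sidedata chunk length
    ],
)


def pack_offset_flags(offset, flags):
    """Combine a byte offset and 16-bit flags into the value stored at [0]."""
    return (offset << 16) | (flags & 0xFFFF)


def unpack_offset_flags(field):
    """Split the value stored at [0] back into (offset, flags)."""
    return field >> 16, field & 0xFFFF


# Made-up sample values, not taken from a real revlog.
# -1 is used here to mean "no second parent" (the null revision).
raw_entry = (
    pack_offset_flags(4096, 0x0001),
    120,
    300,
    5,
    7,
    5,
    -1,
    b"\x12" * 20,
    0,
    0,
)

entry = IndexEntry(*raw_entry)
offset, flags = unpack_offset_flags(entry[0])
```

A namedtuple is still a tuple, so `entry[4]` and `entry.link_rev` read the same item. Attribute access is slower than integer indexing, which is the trade-off the commit message describes.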