comparison mercurial/helptext/internals/dirstate-v2.txt @ 48188:77fc340acad7

dirstate-v2: Document flags/mode/size/mtime fields of tree nodes

This file format modification was previously left incomplete because of planned upcoming changes. Not all of these changes have been made yet, but documenting what exists today will help talking more widely about it.

Differential Revision: https://phab.mercurial-scm.org/D11625
author Simon Sapin <simon.sapin@octobus.net>
date Mon, 11 Oct 2021 18:23:17 +0200
parents eb8092f9304f
children 6e01bcd111d2
48187:b669e40fbbd6 48188:77fc340acad7
369 not including this node itself, 369 not including this node itself,
370 that represent files tracked in the working directory. 370 that represent files tracked in the working directory.
371 (For example, `hg rm` makes a file untracked.) 371 (For example, `hg rm` makes a file untracked.)
372 This counter is used to implement `has_tracked_dir`. 372 This counter is used to implement `has_tracked_dir`.
373 373
374 * Offset 30 and more: 374 * Offset 30:
375 **TODO:** docs not written yet 375 Some boolean values packed as bits of a single byte.
376 as this part of the format might be changing soon. 376 Starting from least-significant, bit masks are::
377
378 WDIR_TRACKED = 1 << 0
379 P1_TRACKED = 1 << 1
380 P2_INFO = 1 << 2
381 HAS_MODE_AND_SIZE = 1 << 3
382 HAS_MTIME = 1 << 4
383
384 Other bits are unset. The meanings of these bits are:
385
386 `WDIR_TRACKED`
387 Set if the working directory contains a tracked file at this node’s path.
388 This is typically set and unset by `hg add` and `hg rm`.
389
390 `P1_TRACKED`
391 Set if the working directory’s first parent changeset
392 (whose node identifier is found in tree metadata)
393 contains a tracked file at this node’s path.
394 This is a cache to reduce manifest lookups.
395
396 `P2_INFO`
397 Set if the file has been involved in some merge operation.
398 Either because it was actually merged,
399 or because the version in the second parent (p2) was ahead,
400 or because some rename moved it there.
401 In any of these cases `hg status` will want it displayed as modified.
402
403 Files that would be mentioned at all in the `dirstate-v1` file format
404 have a node with at least one of the above three bits set in `dirstate-v2`.
405 Let’s call these files "tracked anywhere",
406 and "untracked" the nodes with all three of these bits unset.
407 Untracked nodes are typically for directories:
408 they hold child nodes and form the tree structure.
409 Although implementations should strive to clean up nodes
410 that are entirely unused,
411 additional untracked nodes may also exist.
412 For example, a future version of Mercurial might in some cases
413 add nodes for untracked and/or ignored files in the working directory
414 in order to optimize `hg status`
415 by enabling it to skip `readdir` in more cases.
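
The flag byte and the "tracked anywhere" rule described above can be sketched as follows. This is a minimal illustration, not Mercurial's implementation: the bit masks are copied from the list above, while `TRACKED_ANYWHERE`, `flag_names` and `is_tracked_anywhere` are hypothetical names introduced here.

```python
# Bit masks copied from the documented list (byte at offset 30 of a node).
WDIR_TRACKED = 1 << 0
P1_TRACKED = 1 << 1
P2_INFO = 1 << 2
HAS_MODE_AND_SIZE = 1 << 3
HAS_MTIME = 1 << 4

# Hypothetical helper: the three bits that make a node "tracked anywhere".
TRACKED_ANYWHERE = WDIR_TRACKED | P1_TRACKED | P2_INFO

_FLAG_NAMES = [
    (WDIR_TRACKED, "WDIR_TRACKED"),
    (P1_TRACKED, "P1_TRACKED"),
    (P2_INFO, "P2_INFO"),
    (HAS_MODE_AND_SIZE, "HAS_MODE_AND_SIZE"),
    (HAS_MTIME, "HAS_MTIME"),
]


def flag_names(flags):
    """Return the names of the documented bits set in a flag byte."""
    return [name for mask, name in _FLAG_NAMES if flags & mask]


def is_tracked_anywhere(flags):
    """True if at least one of the first three bits is set."""
    return bool(flags & TRACKED_ANYWHERE)
```

With these helpers, a flag byte of `HAS_MTIME` alone describes an untracked node (typically a directory), while any byte with `P2_INFO` set describes a file that is tracked anywhere.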
416
417 When a node is for a file tracked anywhere,
418 the rest of the node data is three fields:
419
420 * Offset 31:
421 If `HAS_MODE_AND_SIZE` is unset, four zero bytes.
422 Otherwise, a 32-bit integer for the Unix mode (as in `stat_result.st_mode`)
423 expected for this file to be considered clean.
424 Only the `S_IXUSR` bit (owner has execute permission) is considered.
425
426 * Offset 35:
427 If `HAS_MTIME` is unset, four zero bytes.
428 Otherwise, a 32-bit integer for the expected modification time of the file
429 (as in `stat_result.st_mtime`),
430 truncated to its 31 least-significant bits.
431 Unlike in dirstate-v1, negative values are not used.
432
433 * Offset 39:
434 If `HAS_MODE_AND_SIZE` is unset, four zero bytes.
435 Otherwise, a 32-bit integer for the expected size of the file,
436 truncated to its 31 least-significant bits.
437 Unlike in dirstate-v1, negative values are not used.
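
As a rough sketch of how the three file fields above could be read, the following assumes that `node` holds the raw bytes of one node starting at its offset 0, and that integers are stored big-endian. The byte order is not stated in this section, so it is an assumption here. `read_file_fields` and `truncate_31_bits` are hypothetical names.

```python
import stat
import struct

HAS_MODE_AND_SIZE = 1 << 3
HAS_MTIME = 1 << 4


def truncate_31_bits(value):
    """Keep the 31 least-significant bits, as done for mtime and size."""
    return value & 0x7FFFFFFF


def read_file_fields(node):
    """Decode the fields at offsets 31, 35 and 39 of a file node.

    Assumes big-endian integers (an assumption, see the text above).
    A field is reported as None when its flag bit is unset.
    """
    flags = node[30]
    # Offset 31: mode, offset 35: mtime, offset 39: size.
    mode, mtime, size = struct.unpack_from(">III", node, 31)
    fields = {"executable": None, "size": None, "mtime": None}
    if flags & HAS_MODE_AND_SIZE:
        # Only the owner-execute bit of the mode is considered.
        fields["executable"] = bool(mode & stat.S_IXUSR)
        fields["size"] = size
    if flags & HAS_MTIME:
        fields["mtime"] = mtime
    return fields
```

Returning `None` for absent fields keeps "no cached value" distinct from a genuine zero size or a zero modification time.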
438
439 If an untracked node has `HAS_MTIME` *unset*, this space is unused:
440
441 * Offset 31:
442 12 bytes set to zero
443
444 If an untracked node has `HAS_MTIME` *set*,
445 what follows is the modification time of a directory
446 represented with separated second and sub-second components
447 since the Unix epoch:
448
449 * Offset 31:
450 The number of seconds as a signed (two’s complement) 64-bit integer.
451
452 * Offset 39:
453 The number of nanoseconds as a 32-bit integer.
454 Always greater than or equal to zero, and strictly less than a billion.
455 Increasing this component makes the modification time
456 go forward or backward in time depending
457 on the sign of the integral seconds component.
458 (Note: this is buggy because there is no negative zero integer,
459 but will be changed soon.)
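
The directory modification time layout above could be decoded roughly as follows. This is a sketch under assumptions: big-endian byte order is assumed (it is not stated in this section), and `read_directory_mtime` and `to_total_nanoseconds` are hypothetical names. The second helper follows the sign rule described above, including its inability to represent times between -1 s and 0 s.

```python
import struct

WDIR_TRACKED = 1 << 0
P1_TRACKED = 1 << 1
P2_INFO = 1 << 2
HAS_MTIME = 1 << 4


def read_directory_mtime(node):
    """Return (seconds, nanoseconds) for an untracked node, or None.

    Assumes big-endian integers (an assumption, see the text above).
    """
    flags = node[30]
    if flags & (WDIR_TRACKED | P1_TRACKED | P2_INFO):
        raise ValueError("not an untracked node")
    if not flags & HAS_MTIME:
        return None
    # Offset 31: signed 64-bit seconds, offset 39: 32-bit nanoseconds.
    seconds, nanoseconds = struct.unpack_from(">qI", node, 31)
    if nanoseconds >= 1_000_000_000:
        raise ValueError("nanoseconds out of range")
    return seconds, nanoseconds


def to_total_nanoseconds(seconds, nanoseconds):
    """Combine both components; the sub-second part follows the sign of seconds."""
    if seconds < 0:
        return seconds * 1_000_000_000 - nanoseconds
    return seconds * 1_000_000_000 + nanoseconds
```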
460
461 The presence of a directory modification time means that at some point,
462 this path in the working directory was observed:
463
464 - To be a directory
465 - With the given modification time
466 - That time was already strictly in the past when observed,
467 meaning that later changes cannot happen in the same clock tick
468 and must cause a different modification time
469 (unless the system clock jumps back and we get unlucky,
470 which is not impossible but deemed unlikely enough).
471 - All direct children of this directory
472 (as returned by `std::fs::read_dir`)
473 either have a corresponding dirstate node,
474 or are ignored by ignore patterns whose hash is in tree metadata.
475
476 This means that if `std::fs::symlink_metadata` later reports
477 the same modification time
478 and ignore patterns haven’t changed,
479 a run of status that is not listing ignored files
480 can skip calling `std::fs::read_dir` again for this directory,
481 and iterate child dirstate nodes instead.
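
The decision described above can be summarized as a small predicate. This is only an illustration of the stated conditions, not the code used by Mercurial; the function name and all of its parameters are hypothetical.

```python
def can_skip_read_dir(
    cached_mtime,
    observed_mtime,
    cached_ignore_hash,
    current_ignore_hash,
    listing_ignored,
):
    """Decide whether status may iterate child nodes instead of reading the directory.

    cached_mtime: (seconds, nanoseconds) stored in the node, or None if absent.
    observed_mtime: (seconds, nanoseconds) reported by the filesystem now.
    cached_ignore_hash / current_ignore_hash: hash of the ignore patterns,
        as stored in tree metadata and as computed for this run.
    listing_ignored: True if this status run must list ignored files.
    """
    if cached_mtime is None:
        return False  # no directory modification time was recorded
    if listing_ignored:
        return False  # ignored children have no node, so they must be read
    if cached_ignore_hash != current_ignore_hash:
        return False  # the set of ignored children may have changed
    return cached_mtime == observed_mtime
```

Any single failed condition falls back to reading the directory, which is always correct and only costs time.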
482
483
484 * (Offset 43: end of this node)