Autogenerated HTML docs for v2.45.1-204-gd8ab1
1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
3 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
5 <head>
6 <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
7 <meta name="generator" content="AsciiDoc 10.2.0" />
8 <title>gitformat-pack(5)</title>
9 <style type="text/css">
10 /* Shared CSS for AsciiDoc xhtml11 and html5 backends */
12 /* Default font. */
13 body {
14 font-family: Georgia,serif;
17 /* Title font. */
18 h1, h2, h3, h4, h5, h6,
19 div.title, caption.title,
20 thead, p.table.header,
21 #toctitle,
22 #author, #revnumber, #revdate, #revremark,
23 #footer {
24 font-family: Arial,Helvetica,sans-serif;
27 body {
28 margin: 1em 5% 1em 5%;
31 a {
32 color: blue;
33 text-decoration: underline;
35 a:visited {
36 color: fuchsia;
39 em {
40 font-style: italic;
41 color: navy;
44 strong {
45 font-weight: bold;
46 color: #083194;
49 h1, h2, h3, h4, h5, h6 {
50 color: #527bbd;
51 margin-top: 1.2em;
52 margin-bottom: 0.5em;
53 line-height: 1.3;
56 h1, h2, h3 {
57 border-bottom: 2px solid silver;
59 h2 {
60 padding-top: 0.5em;
62 h3 {
63 float: left;
65 h3 + * {
66 clear: left;
68 h5 {
69 font-size: 1.0em;
72 div.sectionbody {
73 margin-left: 0;
76 hr {
77 border: 1px solid silver;
80 p {
81 margin-top: 0.5em;
82 margin-bottom: 0.5em;
85 ul, ol, li > p {
86 margin-top: 0;
88 ul > li { color: #aaa; }
89 ul > li > * { color: black; }
91 .monospaced, code, pre {
92 font-family: "Courier New", Courier, monospace;
93 font-size: inherit;
94 color: navy;
95 padding: 0;
96 margin: 0;
98 pre {
99 white-space: pre-wrap;
102 #author {
103 color: #527bbd;
104 font-weight: bold;
105 font-size: 1.1em;
107 #email {
109 #revnumber, #revdate, #revremark {
112 #footer {
113 font-size: small;
114 border-top: 2px solid silver;
115 padding-top: 0.5em;
116 margin-top: 4.0em;
118 #footer-text {
119 float: left;
120 padding-bottom: 0.5em;
122 #footer-badges {
123 float: right;
124 padding-bottom: 0.5em;
127 #preamble {
128 margin-top: 1.5em;
129 margin-bottom: 1.5em;
131 div.imageblock, div.exampleblock, div.verseblock,
132 div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
133 div.admonitionblock {
134 margin-top: 1.0em;
135 margin-bottom: 1.5em;
137 div.admonitionblock {
138 margin-top: 2.0em;
139 margin-bottom: 2.0em;
140 margin-right: 10%;
141 color: #606060;
144 div.content { /* Block element content. */
145 padding: 0;
148 /* Block element titles. */
149 div.title, caption.title {
150 color: #527bbd;
151 font-weight: bold;
152 text-align: left;
153 margin-top: 1.0em;
154 margin-bottom: 0.5em;
156 div.title + * {
157 margin-top: 0;
160 td div.title:first-child {
161 margin-top: 0.0em;
163 div.content div.title:first-child {
164 margin-top: 0.0em;
166 div.content + div.title {
167 margin-top: 0.0em;
170 div.sidebarblock > div.content {
171 background: #ffffee;
172 border: 1px solid #dddddd;
173 border-left: 4px solid #f0f0f0;
174 padding: 0.5em;
177 div.listingblock > div.content {
178 border: 1px solid #dddddd;
179 border-left: 5px solid #f0f0f0;
180 background: #f8f8f8;
181 padding: 0.5em;
184 div.quoteblock, div.verseblock {
185 padding-left: 1.0em;
186 margin-left: 1.0em;
187 margin-right: 10%;
188 border-left: 5px solid #f0f0f0;
189 color: #888;
192 div.quoteblock > div.attribution {
193 padding-top: 0.5em;
194 text-align: right;
197 div.verseblock > pre.content {
198 font-family: inherit;
199 font-size: inherit;
201 div.verseblock > div.attribution {
202 padding-top: 0.75em;
203 text-align: left;
205 /* DEPRECATED: Pre version 8.2.7 verse style literal block. */
206 div.verseblock + div.attribution {
207 text-align: left;
210 div.admonitionblock .icon {
211 vertical-align: top;
212 font-size: 1.1em;
213 font-weight: bold;
214 text-decoration: underline;
215 color: #527bbd;
216 padding-right: 0.5em;
218 div.admonitionblock td.content {
219 padding-left: 0.5em;
220 border-left: 3px solid #dddddd;
223 div.exampleblock > div.content {
224 border-left: 3px solid #dddddd;
225 padding-left: 0.5em;
228 div.imageblock div.content { padding-left: 0; }
229 span.image img { border-style: none; vertical-align: text-bottom; }
230 a.image:visited { color: white; }
232 dl {
233 margin-top: 0.8em;
234 margin-bottom: 0.8em;
236 dt {
237 margin-top: 0.5em;
238 margin-bottom: 0;
239 font-style: normal;
240 color: navy;
242 dd > *:first-child {
243 margin-top: 0.1em;
246 ul, ol {
247 list-style-position: outside;
249 ol.arabic {
250 list-style-type: decimal;
252 ol.loweralpha {
253 list-style-type: lower-alpha;
255 ol.upperalpha {
256 list-style-type: upper-alpha;
258 ol.lowerroman {
259 list-style-type: lower-roman;
261 ol.upperroman {
262 list-style-type: upper-roman;
265 div.compact ul, div.compact ol,
266 div.compact p, div.compact p,
267 div.compact div, div.compact div {
268 margin-top: 0.1em;
269 margin-bottom: 0.1em;
272 tfoot {
273 font-weight: bold;
275 td > div.verse {
276 white-space: pre;
279 div.hdlist {
280 margin-top: 0.8em;
281 margin-bottom: 0.8em;
283 div.hdlist tr {
284 padding-bottom: 15px;
286 dt.hdlist1.strong, td.hdlist1.strong {
287 font-weight: bold;
289 td.hdlist1 {
290 vertical-align: top;
291 font-style: normal;
292 padding-right: 0.8em;
293 color: navy;
295 td.hdlist2 {
296 vertical-align: top;
298 div.hdlist.compact tr {
299 margin: 0;
300 padding-bottom: 0;
303 .comment {
304 background: yellow;
307 .footnote, .footnoteref {
308 font-size: 0.8em;
311 span.footnote, span.footnoteref {
312 vertical-align: super;
315 #footnotes {
316 margin: 20px 0 20px 0;
317 padding: 7px 0 0 0;
320 #footnotes div.footnote {
321 margin: 0 0 5px 0;
324 #footnotes hr {
325 border: none;
326 border-top: 1px solid silver;
327 height: 1px;
328 text-align: left;
329 margin-left: 0;
330 width: 20%;
331 min-width: 100px;
334 div.colist td {
335 padding-right: 0.5em;
336 padding-bottom: 0.3em;
337 vertical-align: top;
339 div.colist td img {
340 margin-top: 0.3em;
343 @media print {
344 #footer-badges { display: none; }
347 #toc {
348 margin-bottom: 2.5em;
351 #toctitle {
352 color: #527bbd;
353 font-size: 1.1em;
354 font-weight: bold;
355 margin-top: 1.0em;
356 margin-bottom: 0.1em;
359 div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
360 margin-top: 0;
361 margin-bottom: 0;
363 div.toclevel2 {
364 margin-left: 2em;
365 font-size: 0.9em;
367 div.toclevel3 {
368 margin-left: 4em;
369 font-size: 0.9em;
371 div.toclevel4 {
372 margin-left: 6em;
373 font-size: 0.9em;
376 span.aqua { color: aqua; }
377 span.black { color: black; }
378 span.blue { color: blue; }
379 span.fuchsia { color: fuchsia; }
380 span.gray { color: gray; }
381 span.green { color: green; }
382 span.lime { color: lime; }
383 span.maroon { color: maroon; }
384 span.navy { color: navy; }
385 span.olive { color: olive; }
386 span.purple { color: purple; }
387 span.red { color: red; }
388 span.silver { color: silver; }
389 span.teal { color: teal; }
390 span.white { color: white; }
391 span.yellow { color: yellow; }
393 span.aqua-background { background: aqua; }
394 span.black-background { background: black; }
395 span.blue-background { background: blue; }
396 span.fuchsia-background { background: fuchsia; }
397 span.gray-background { background: gray; }
398 span.green-background { background: green; }
399 span.lime-background { background: lime; }
400 span.maroon-background { background: maroon; }
401 span.navy-background { background: navy; }
402 span.olive-background { background: olive; }
403 span.purple-background { background: purple; }
404 span.red-background { background: red; }
405 span.silver-background { background: silver; }
406 span.teal-background { background: teal; }
407 span.white-background { background: white; }
408 span.yellow-background { background: yellow; }
410 span.big { font-size: 2em; }
411 span.small { font-size: 0.6em; }
413 span.underline { text-decoration: underline; }
414 span.overline { text-decoration: overline; }
415 span.line-through { text-decoration: line-through; }
417 div.unbreakable { page-break-inside: avoid; }
421 * xhtml11 specific
423 * */
425 div.tableblock {
426 margin-top: 1.0em;
427 margin-bottom: 1.5em;
429 div.tableblock > table {
430 border: 3px solid #527bbd;
432 thead, p.table.header {
433 font-weight: bold;
434 color: #527bbd;
436 p.table {
437 margin-top: 0;
439 /* Because the table frame attribute is overridden by CSS in most browsers. */
440 div.tableblock > table[frame="void"] {
441 border-style: none;
443 div.tableblock > table[frame="hsides"] {
444 border-left-style: none;
445 border-right-style: none;
447 div.tableblock > table[frame="vsides"] {
448 border-top-style: none;
449 border-bottom-style: none;
454 * html5 specific
456 * */
458 table.tableblock {
459 margin-top: 1.0em;
460 margin-bottom: 1.5em;
462 thead, p.tableblock.header {
463 font-weight: bold;
464 color: #527bbd;
466 p.tableblock {
467 margin-top: 0;
469 table.tableblock {
470 border-width: 3px;
471 border-spacing: 0px;
472 border-style: solid;
473 border-color: #527bbd;
474 border-collapse: collapse;
476 th.tableblock, td.tableblock {
477 border-width: 1px;
478 padding: 4px;
479 border-style: solid;
480 border-color: #527bbd;
483 table.tableblock.frame-topbot {
484 border-left-style: hidden;
485 border-right-style: hidden;
487 table.tableblock.frame-sides {
488 border-top-style: hidden;
489 border-bottom-style: hidden;
491 table.tableblock.frame-none {
492 border-style: hidden;
495 th.tableblock.halign-left, td.tableblock.halign-left {
496 text-align: left;
498 th.tableblock.halign-center, td.tableblock.halign-center {
499 text-align: center;
501 th.tableblock.halign-right, td.tableblock.halign-right {
502 text-align: right;
505 th.tableblock.valign-top, td.tableblock.valign-top {
506 vertical-align: top;
508 th.tableblock.valign-middle, td.tableblock.valign-middle {
509 vertical-align: middle;
511 th.tableblock.valign-bottom, td.tableblock.valign-bottom {
512 vertical-align: bottom;
517 * manpage specific
519 * */
521 body.manpage h1 {
522 padding-top: 0.5em;
523 padding-bottom: 0.5em;
524 border-top: 2px solid silver;
525 border-bottom: 2px solid silver;
527 body.manpage h2 {
528 border-style: none;
530 body.manpage div.sectionbody {
531 margin-left: 3em;
534 @media print {
535 body.manpage div#toc { display: none; }
539 </style>
540 <script type="text/javascript">
541 /*<![CDATA[*/
542 var asciidoc = { // Namespace.
544 /////////////////////////////////////////////////////////////////////
545 // Table Of Contents generator
546 /////////////////////////////////////////////////////////////////////
548 /* Author: Mihai Bazon, September 2002
549 * http://students.infoiasi.ro/~mishoo
551 * Table Of Content generator
552 * Version: 0.4
554 * Feel free to use this script under the terms of the GNU General Public
555 * License, as long as you do not remove or alter this notice.
558 /* modified by Troy D. Hanson, September 2006. License: GPL */
559 /* modified by Stuart Rackham, 2006, 2009. License: GPL */
561 // toclevels = 1..4.
562 toc: function (toclevels) {
564 function getText(el) {
565 var text = "";
566 for (var i = el.firstChild; i != null; i = i.nextSibling) {
567 if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
568 text += i.data;
569 else if (i.firstChild != null)
570 text += getText(i);
572 return text;
575 function TocEntry(el, text, toclevel) {
576 this.element = el;
577 this.text = text;
578 this.toclevel = toclevel;
581 function tocEntries(el, toclevels) {
582 var result = new Array;
583 var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
584 // Function that scans the DOM tree for header elements (the DOM2
585 // nodeIterator API would be a better technique but not supported by all
586 // browsers).
587 var iterate = function (el) {
588 for (var i = el.firstChild; i != null; i = i.nextSibling) {
589 if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
590 var mo = re.exec(i.tagName);
591 if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
592 result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
594 iterate(i);
598 iterate(el);
599 return result;
602 var toc = document.getElementById("toc");
603 if (!toc) {
604 return;
607 // Delete existing TOC entries in case we're reloading the TOC.
608 var tocEntriesToRemove = [];
609 var i;
610 for (i = 0; i < toc.childNodes.length; i++) {
611 var entry = toc.childNodes[i];
612 if (entry.nodeName.toLowerCase() == 'div'
613 && entry.getAttribute("class")
614 && entry.getAttribute("class").match(/^toclevel/))
615 tocEntriesToRemove.push(entry);
617 for (i = 0; i < tocEntriesToRemove.length; i++) {
618 toc.removeChild(tocEntriesToRemove[i]);
621 // Rebuild TOC entries.
622 var entries = tocEntries(document.getElementById("content"), toclevels);
623 for (var i = 0; i < entries.length; ++i) {
624 var entry = entries[i];
625 if (entry.element.id == "")
626 entry.element.id = "_toc_" + i;
627 var a = document.createElement("a");
628 a.href = "#" + entry.element.id;
629 a.appendChild(document.createTextNode(entry.text));
630 var div = document.createElement("div");
631 div.appendChild(a);
632 div.className = "toclevel" + entry.toclevel;
633 toc.appendChild(div);
635 if (entries.length == 0)
636 toc.parentNode.removeChild(toc);
640 /////////////////////////////////////////////////////////////////////
641 // Footnotes generator
642 /////////////////////////////////////////////////////////////////////
644 /* Based on footnote generation code from:
645 * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
648 footnotes: function () {
649 // Delete existing footnote entries in case we're reloading the footnodes.
650 var i;
651 var noteholder = document.getElementById("footnotes");
652 if (!noteholder) {
653 return;
655 var entriesToRemove = [];
656 for (i = 0; i < noteholder.childNodes.length; i++) {
657 var entry = noteholder.childNodes[i];
658 if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
659 entriesToRemove.push(entry);
661 for (i = 0; i < entriesToRemove.length; i++) {
662 noteholder.removeChild(entriesToRemove[i]);
665 // Rebuild footnote entries.
666 var cont = document.getElementById("content");
667 var spans = cont.getElementsByTagName("span");
668 var refs = {};
669 var n = 0;
670 for (i=0; i<spans.length; i++) {
671 if (spans[i].className == "footnote") {
672 n++;
673 var note = spans[i].getAttribute("data-note");
674 if (!note) {
675 // Use [\s\S] in place of . so multi-line matches work.
676 // Because JavaScript has no s (dotall) regex flag.
677 note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
678 spans[i].innerHTML =
679 "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
680 "' title='View footnote' class='footnote'>" + n + "</a>]";
681 spans[i].setAttribute("data-note", note);
683 noteholder.innerHTML +=
684 "<div class='footnote' id='_footnote_" + n + "'>" +
685 "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
686 n + "</a>. " + note + "</div>";
687 var id =spans[i].getAttribute("id");
688 if (id != null) refs["#"+id] = n;
691 if (n == 0)
692 noteholder.parentNode.removeChild(noteholder);
693 else {
694 // Process footnoterefs.
695 for (i=0; i<spans.length; i++) {
696 if (spans[i].className == "footnoteref") {
697 var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
698 href = href.match(/#.*/)[0]; // Because IE return full URL.
699 n = refs[href];
700 spans[i].innerHTML =
701 "[<a href='#_footnote_" + n +
702 "' title='View footnote' class='footnote'>" + n + "</a>]";
708 install: function(toclevels) {
709 var timerId;
711 function reinstall() {
712 asciidoc.footnotes();
713 if (toclevels) {
714 asciidoc.toc(toclevels);
718 function reinstallAndRemoveTimer() {
719 clearInterval(timerId);
720 reinstall();
723 timerId = setInterval(reinstall, 500);
724 if (document.addEventListener)
725 document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
726 else
727 window.onload = reinstallAndRemoveTimer;
731 asciidoc.install();
732 /*]]>*/
733 </script>
734 </head>
<body class="manpage">
<div id="header">
<h1>
gitformat-pack(5) Manual Page
</h1>
<h2>NAME</h2>
<div class="sectionbody">
<p>gitformat-pack -
   Git pack format
</p>
</div>
</div>
<div id="content">
<div class="sect1">
<h2 id="_synopsis">SYNOPSIS</h2>
<div class="sectionbody">
<div class="verseblock">
<pre class="content">$GIT_DIR/objects/pack/pack-*.{pack,idx}
$GIT_DIR/objects/pack/pack-*.rev
$GIT_DIR/objects/pack/pack-*.mtimes
$GIT_DIR/objects/pack/multi-pack-index</pre>
<div class="attribution">
</div></div>
</div>
</div>
<div class="sect1">
<h2 id="_description">DESCRIPTION</h2>
<div class="sectionbody">
<div class="paragraph"><p>The Git pack format is how Git stores most of its primary repository
data. Over the lifetime of a repository, loose objects (if any) and
smaller packs are consolidated into larger pack(s). See
<a href="git-gc.html">git-gc(1)</a> and <a href="git-pack-objects.html">git-pack-objects(1)</a>.</p></div>
<div class="paragraph"><p>The pack format is also used over the wire (see
e.g. <a href="gitprotocol-v2.html">gitprotocol-v2(5)</a>) and as a part of
other container formats, as in <a href="gitformat-bundle.html">gitformat-bundle(5)</a>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_checksums_and_object_ids">Checksums and object IDs</h2>
<div class="sectionbody">
<div class="paragraph"><p>In a repository using the traditional SHA-1, pack checksums, index checksums,
and object IDs (object names) mentioned below are all computed using SHA-1.
Similarly, in SHA-256 repositories, these values are computed using SHA-256.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_pack_pack_files_have_the_following_format">pack-*.pack files have the following format:</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
A header appears at the beginning and consists of the following:
</p>
<div class="literalblock">
<div class="content">
<pre><code>4-byte signature:
    The signature is: {'P', 'A', 'C', 'K'}</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>4-byte version number (network byte order):
    Git currently accepts version number 2 or 3 but
    generates version 2 only.</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>4-byte number of objects contained in the pack (network byte order)</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>Observation: we cannot have more than 4G versions ;-) and
more than 4G objects in a pack.</code></pre>
</div></div>
</li>
<li>
<p>
The header is followed by a number of object entries, each of
which looks like this:
</p>
<div class="literalblock">
<div class="content">
<pre><code>(undeltified representation)
n-byte type and length (3-bit type, (n-1)*7+4-bit length)
compressed data</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>(deltified representation)
n-byte type and length (3-bit type, (n-1)*7+4-bit length)
base object name if OBJ_REF_DELTA or a negative relative
    offset from the delta object's position in the pack if this
    is an OBJ_OFS_DELTA object
compressed delta data</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>Observation: the length of each object is encoded in a variable
length format and is not constrained to 32-bit or anything.</code></pre>
</div></div>
</li>
<li>
<p>
The trailer records a pack checksum of all of the above.
</p>
</li>
</ul></div>
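<div class="paragraph"><p>As an illustration of the header layout in the list above, the following is a minimal Python sketch that unpacks the 12-byte header. The function name <code>parse_pack_header</code> is hypothetical (it is not part of Git), and the sample bytes are made up for the example.</p></div>

```python
import struct

def parse_pack_header(data: bytes) -> tuple[int, int]:
    """Return (version, object_count) from the first 12 bytes of a pack."""
    if len(data) < 12:
        raise ValueError("a pack header is 12 bytes long")
    # ">4sII": 4-byte signature, then two 4-byte integers in network byte order.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("bad pack signature")
    if version not in (2, 3):
        raise ValueError(f"unsupported pack version {version}")
    return version, count

# Made-up header: version 2, three objects.
sample_header = b"PACK" + struct.pack(">II", 2, 3)
print(parse_pack_header(sample_header))
```

<div class="paragraph"><p>The sketch only validates the header; it does not read the object entries or verify the trailing pack checksum.</p></div>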
<div class="sect2">
<h3 id="_object_types">Object types</h3>
<div class="paragraph"><p>Valid object types are:</p></div>
<div class="ulist"><ul>
<li>
<p>
OBJ_COMMIT (1)
</p>
</li>
<li>
<p>
OBJ_TREE (2)
</p>
</li>
<li>
<p>
OBJ_BLOB (3)
</p>
</li>
<li>
<p>
OBJ_TAG (4)
</p>
</li>
<li>
<p>
OBJ_OFS_DELTA (6)
</p>
</li>
<li>
<p>
OBJ_REF_DELTA (7)
</p>
</li>
</ul></div>
<div class="paragraph"><p>Type 5 is reserved for future expansion. Type 0 is invalid.</p></div>
</div>
<div class="sect2">
<h3 id="_size_encoding">Size encoding</h3>
<div class="paragraph"><p>This document uses the following "size encoding" of non-negative
integers: From each byte, the seven least significant bits are
used to form the resulting integer. As long as the most significant
bit is 1, this process continues; the byte with MSB 0 provides the
last seven bits. The seven-bit chunks are concatenated. Later
chunks are more significant.</p></div>
<div class="paragraph"><p>This size encoding should not be confused with the "offset encoding",
which is also used in this document.</p></div>
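<div class="paragraph"><p>The size encoding described above can be sketched in Python as follows. The helper name <code>decode_size</code> and the sample bytes are assumptions made for illustration only.</p></div>

```python
def decode_size(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one size-encoded integer; return (value, next_position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        # The low seven bits carry data; later bytes are more significant.
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # MSB 0 marks the last byte
            return value, pos

# 0x90 contributes 0x10 with the continuation bit set; 0x01 contributes 1 << 7.
print(decode_size(bytes([0x90, 0x01])))
```

<div class="paragraph"><p>Returning the next position lets a caller decode two sizes back to back, as the delta data does for the base size and the target size.</p></div>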
</div>
<div class="sect2">
<h3 id="_deltified_representation">Deltified representation</h3>
<div class="paragraph"><p>Conceptually there are only four object types: commit, tree, tag and
blob. However, to save space, an object could be stored as a "delta" of
another "base" object. These representations are assigned new types
ofs-delta and ref-delta, which are only valid in a pack file.</p></div>
<div class="paragraph"><p>Both ofs-delta and ref-delta store the "delta" to be applied to
another object (called <em>base object</em>) to reconstruct the object. The
difference between them is that ref-delta directly encodes the base object
name. If the base object is in the same pack, ofs-delta can instead encode
the offset of the base object in the pack.</p></div>
<div class="paragraph"><p>The base object could also be deltified if it&#8217;s in the same pack.
Ref-delta can also refer to an object outside the pack (i.e. the
so-called "thin pack"). When stored on disk, however, the pack should
be self-contained to avoid cyclic dependencies.</p></div>
<div class="paragraph"><p>The delta data starts with the size of the base object and the
size of the object to be reconstructed. These sizes are
encoded using the size encoding from above. The remainder of
the delta data is a sequence of instructions to reconstruct the object
from the base object. If the base object is deltified, it must be
converted to canonical form first. Each instruction appends more and
more data to the target object until it&#8217;s complete. There are two
supported instructions so far: one for copying a byte range from the
source object and one for inserting new data embedded in the
instruction itself.</p></div>
<div class="paragraph"><p>Each instruction has a variable length. The instruction type is determined
by the most significant bit (bit 7, counting from zero) of the first octet.
The following diagrams follow
the convention in RFC 1951 (Deflate compressed data format).</p></div>
<div class="sect3">
<h4 id="_instruction_to_copy_from_base_object">Instruction to copy from base object</h4>
<div class="literalblock">
<div class="content">
<pre><code>+----------+---------+---------+---------+---------+-------+-------+-------+
| 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
+----------+---------+---------+---------+---------+-------+-------+-------+</code></pre>
</div></div>
<div class="paragraph"><p>This is the instruction format to copy a byte range from the source
object. It encodes the offset to copy from and the number of bytes to
copy. Offset and size are in little-endian order.</p></div>
<div class="paragraph"><p>All offset and size bytes are optional. This is to reduce the
instruction size when encoding small offsets or sizes. The low seven
bits of the first octet determine which of the next seven octets are
present. If bit zero is set, offset1 is present. If bit one is set,
offset2 is present, and so on.</p></div>
<div class="paragraph"><p>Note that a more compact instruction does not change offset and size
encoding. For example, if only offset2 is omitted like below, offset3
still contains bits 16-23. It does not become offset2 and does not contain
bits 8-15, even though it&#8217;s right next to offset1.</p></div>
<div class="literalblock">
<div class="content">
<pre><code>+----------+---------+---------+
| 10000101 | offset1 | offset3 |
+----------+---------+---------+</code></pre>
</div></div>
<div class="paragraph"><p>In its most compact form, this instruction only takes up one byte
(0x80) with both offset and size omitted, in which case both default
to zero. There is one exception: a size of zero is automatically
converted to 0x10000.</p></div>
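<div class="paragraph"><p>To make the optional-byte rule concrete, here is a small Python sketch that decodes a single copy instruction. The name <code>decode_copy</code> and the sample bytes are hypothetical; the first sample uses the <code>10000101</code> layout shown above.</p></div>

```python
def decode_copy(data: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Decode a copy instruction at data[pos]; return (offset, size, next_pos)."""
    cmd = data[pos]
    pos += 1
    if not cmd & 0x80:
        raise ValueError("not a copy instruction")
    offset = 0
    size = 0
    for i in range(4):  # bits 0-3 select offset1..offset4
        if cmd & (1 << i):
            # Byte i always lands at bits 8*i..8*i+7, even if earlier bytes are absent.
            offset |= data[pos] << (8 * i)
            pos += 1
    for i in range(3):  # bits 4-6 select size1..size3
        if cmd & (0x10 << i):
            size |= data[pos] << (8 * i)
            pos += 1
    if size == 0:
        size = 0x10000  # the documented exception for size zero
    return offset, size, pos

# 0x85 = 10000101: offset1 and offset3 present, offset2 and all size bytes omitted.
offset, size, end = decode_copy(bytes([0x85, 0x34, 0x12]))
print(hex(offset), hex(size), end)
```

<div class="paragraph"><p>In the sample, <code>0x12</code> is placed at bits 16-23 of the offset, not bits 8-15, which is the point made in the note above.</p></div>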
</div>
<div class="sect3">
<h4 id="_instruction_to_add_new_data">Instruction to add new data</h4>
<div class="literalblock">
<div class="content">
<pre><code>+----------+============+
| 0xxxxxxx |    data    |
+----------+============+</code></pre>
</div></div>
<div class="paragraph"><p>This is the instruction to construct the target object without the base
object. The following data is appended to the target object. The low
seven bits of the first octet determine the size of data in
bytes. The size must be non-zero.</p></div>
</div>
<div class="sect3">
<h4 id="_reserved_instruction">Reserved instruction</h4>
<div class="literalblock">
<div class="content">
<pre><code>+----------+============
| 00000000 |
+----------+============</code></pre>
</div></div>
<div class="paragraph"><p>This is the instruction reserved for future expansion.</p></div>
</div>
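<div class="paragraph"><p>Putting the pieces of this section together, the following Python sketch applies already-inflated delta data to a base object. It is an illustration under stated assumptions, not Git's implementation: the name <code>apply_delta</code> is hypothetical, the sample delta is hand-built, and rejecting the reserved <code>0x00</code> instruction is a choice made for the example.</p></div>

```python
def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target object from `base` and inflated `delta` data."""
    pos = 0

    def read_size() -> int:
        # Size encoding: seven bits per byte, later bytes more significant.
        nonlocal pos
        value = shift = 0
        while True:
            byte = delta[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                return value

    base_size = read_size()
    target_size = read_size()
    if base_size != len(base):
        raise ValueError("base size mismatch")

    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:  # copy a byte range from the base object
            offset = size = 0
            for i in range(4):
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000
            if offset + size > len(base):
                raise ValueError("copy range exceeds base object")
            out += base[offset:offset + size]
        elif cmd:  # insert the next `cmd` bytes of literal data
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("reserved instruction 0x00")
    if len(out) != target_size:
        raise ValueError("target size mismatch")
    return bytes(out)

# Hand-built delta: sizes 11 and 11, copy 5 bytes from offset 0, insert 6 bytes.
sample_delta = bytes([11, 11, 0x90, 5, 6]) + b", git!"
print(apply_delta(b"hello world", sample_delta))
```

<div class="paragraph"><p>The two size checks are cheap sanity tests: the first confirms the delta was made against this base, the second that the instructions produced exactly the announced number of bytes.</p></div>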
</div>
</div>
</div>
<div class="sect1">
<h2 id="_original_version_1_pack_idx_files_have_the_following_format">Original (version 1) pack-*.idx files have the following format:</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
The header consists of 256 4-byte network byte order
integers. The N-th entry of this table records the number of
objects in the corresponding pack, the first byte of whose
object name is less than or equal to N. This is called the
<em>first-level fan-out</em> table.
</p>
</li>
<li>
<p>
The header is followed by sorted 24-byte entries, one entry
per object in the pack. Each entry is:
</p>
<div class="literalblock">
<div class="content">
<pre><code>4-byte network byte order integer, recording where the
object is stored in the packfile as the offset from the
beginning.</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>one object name of the appropriate size.</code></pre>
</div></div>
</li>
<li>
<p>
The file is concluded with a trailer:
</p>
<div class="literalblock">
<div class="content">
<pre><code>A copy of the pack checksum at the end of the corresponding
packfile.</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>Index checksum of all of the above.</code></pre>
</div></div>
</li>
</ul></div>
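<div class="paragraph"><p>The fan-out table narrows a lookup to the entries that share a first byte. A minimal Python sketch of that step, assuming a version 1 index; the names <code>read_fanout</code> and <code>fanout_range</code> are hypothetical, and the sample table is made up for four objects whose names start with 00, 00, 01 and ff.</p></div>

```python
import struct

def read_fanout(idx: bytes) -> list[int]:
    """Unpack the 256-entry first-level fan-out table of a v1 index."""
    return list(struct.unpack_from(">256I", idx, 0))

def fanout_range(fanout: list[int], first_byte: int) -> tuple[int, int]:
    """Return (lo, hi): sorted entries lo..hi-1 have names starting with first_byte."""
    # fanout[N] counts objects whose first name byte is <= N.
    lo = fanout[first_byte - 1] if first_byte > 0 else 0
    hi = fanout[first_byte]
    return lo, hi

# Made-up table: fanout[0] = 2, fanout[1..254] = 3, fanout[255] = 4.
sample = struct.pack(">256I", 2, *([3] * 254), 4)
fanout = read_fanout(sample)
print(fanout_range(fanout, 0x01))
```

<div class="paragraph"><p>A binary search over the object names then only needs to cover entries <code>lo</code> to <code>hi - 1</code>. With the 24-byte entries described above, entry <code>i</code> would start at byte <code>1024 + 24 * i</code> of the file.</p></div>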
<div class="paragraph"><p>Pack Idx file:</p></div>
<div class="literalblock">
<div class="content">
<pre><code>        --  +--------------------------------+
fanout      | fanout[0] = 2 (for example)    |-.
table       +--------------------------------+ |
            | fanout[1]                      | |
            +--------------------------------+ |
            | fanout[2]                      | |
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
            | fanout[255] = total objects    |---.
        --  +--------------------------------+ | |
main        | offset                         | | |
index       | object name 00XXXXXXXXXXXXXXXX | | |
table       +--------------------------------+ | |
            | offset                         | | |
            | object name 00XXXXXXXXXXXXXXXX | | |
            +--------------------------------+&lt;+ |
          .-| offset                         |   |
          | | object name 01XXXXXXXXXXXXXXXX |   |
          | +--------------------------------+   |
          | | offset                         |   |
          | | object name 01XXXXXXXXXXXXXXXX |   |
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   |
          | | offset                         |   |
          | | object name FFXXXXXXXXXXXXXXXX |   |
        --| +--------------------------------+&lt;--+
trailer   | | packfile checksum              |
          | +--------------------------------+
          | | idxfile checksum               |
          | +--------------------------------+
          .-------.
                  |
Pack file entry: &lt;+</code></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><code>packed object header:
   1-byte size extension bit (MSB)
          type (next 3 bit)
          size0 (lower 4-bit)
   n-byte sizeN (as long as MSB is set, each 7-bit)
           size0..sizeN form 4+7+7+..+7 bit integer, size0
           is the least significant part, and sizeN is the
           most significant part.
packed object data:
   If it is not DELTA, then deflated bytes (the size above
           is the size before compression).
   If it is REF_DELTA, then
     base object name (the size above is the
           size of the delta data that follows).
     delta data, deflated.
   If it is OFS_DELTA, then
     n-byte offset (see below) interpreted as a negative
           offset from the type-byte of the header of the
           ofs-delta entry (the size above is the size of
           the delta data that follows).
     delta data, deflated.</code></pre>
</div></div>
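<div class="paragraph"><p>The packed object header above can be decoded with a few lines of Python. This is a sketch for illustration; <code>decode_object_header</code> is a hypothetical name and the sample bytes were built by hand for a blob (type 3) of size 300.</p></div>

```python
def decode_object_header(data: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Decode a packed object header; return (type, size, next_position)."""
    byte = data[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07  # three type bits below the MSB
    size = byte & 0x0F             # size0: the low four bits
    shift = 4
    while byte & 0x80:             # MSB set: another size byte follows
        byte = data[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos

# 0xBC = 1 011 1100: continuation bit, type 3, size0 = 12; 0x12 adds 18 << 4.
print(decode_object_header(bytes([0xBC, 0x12])))
```

<div class="paragraph"><p>The returned position is where the object data begins: deflated bytes for a non-delta object, or the base object name or offset for the two delta types.</p></div>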
<div class="literalblock">
<div class="content">
<pre><code>offset encoding:
     n bytes with MSB set in all but the last one.
     The offset is then the number constructed by
     concatenating the lower 7 bits of each byte, and
     for n &gt;= 2 adding 2^7 + 2^14 + ... + 2^(7*(n-1))
     to the result.</code></pre>
</div></div>
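<div class="paragraph"><p>The offset encoding differs from the size encoding in two ways, which the following Python sketch tries to show. It assumes that earlier bytes are more significant (the text above does not state the order explicitly), and it folds the added 2^7 + 2^14 + ... terms into the loop by adding one before each shift. The name <code>decode_offset</code> is hypothetical.</p></div>

```python
def decode_offset(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an ofs-delta offset; return (offset, next_position)."""
    byte = data[pos]
    pos += 1
    offset = byte & 0x7F
    while byte & 0x80:
        byte = data[pos]
        pos += 1
        # Adding 1 before each shift accumulates 2^7 + 2^14 + ... over n bytes.
        offset = ((offset + 1) << 7) | (byte & 0x7F)
    return offset, pos

# The smallest two-byte value is 128, one more than the largest one-byte value.
print(decode_offset(bytes([0x7F])), decode_offset(bytes([0x80, 0x00])))
```

<div class="paragraph"><p>The extra terms mean that no value has two different encodings: every n-byte sequence decodes to a number that no shorter sequence can produce.</p></div>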
1086 </div>
1087 </div>
1088 <div class="sect1">
1089 <h2 id="_version_2_pack_idx_files_support_packs_larger_than_4_gib_and">Version 2 pack-*.idx files support packs larger than 4 GiB, and</h2>
1090 <div class="sectionbody">
1091 <div class="literalblock">
1092 <div class="content">
1093 <pre><code>have some other reorganizations. They have the format:</code></pre>
1094 </div></div>
1095 <div class="ulist"><ul>
1096 <li>
1098 A 4-byte magic number <em>\377tOc</em> which is an unreasonable
1099 fanout[0] value.
1100 </p>
1101 </li>
1102 <li>
1104 A 4-byte version number (= 2)
1105 </p>
1106 </li>
1107 <li>
1109 A 256-entry fan-out table just like v1.
1110 </p>
1111 </li>
1112 <li>
1114 A table of sorted object names. These are packed together
1115 without offset values to reduce the cache footprint of the
1116 binary search for a specific object name.
1117 </p>
1118 </li>
1119 <li>
1121 A table of 4-byte CRC32 values of the packed object data.
1122 This is new in v2 so compressed data can be copied directly
1123 from pack to pack during repacking without undetected
1124 data corruption.
1125 </p>
1126 </li>
1127 <li>
1129 A table of 4-byte offset values (in network byte order).
1130 These are usually 31-bit pack file offsets, but large
1131 offsets are encoded as an index into the next table with
1132 the msbit set.
1133 </p>
1134 </li>
1135 <li>
1137 A table of 8-byte offset entries (empty for pack files less
1138 than 2 GiB). Pack files are organized with heavily used
1139 objects toward the front, so most object references should
1140 not need to refer to this table.
1141 </p>
1142 </li>
1143 <li>
1145 The same trailer as a v1 pack file:
1146 </p>
1147 <div class="literalblock">
1148 <div class="content">
1149 <pre><code>A copy of the pack checksum at the end of the
1150 corresponding packfile.</code></pre>
1151 </div></div>
1152 <div class="literalblock">
1153 <div class="content">
1154 <pre><code>Index checksum of all of the above.</code></pre>
1155 </div></div>
1156 </li>
1157 </ul></div>
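<div class="paragraph"><p>The layout above can be walked with fixed-size reads. The following Python sketch is illustrative only: <code>parse_idx_v2</code> and its <code>hash_len</code> parameter are hypothetical names, a 20-byte (SHA-1) object name is assumed by default, and the trailer checksums are not verified.</p></div>

```python
import struct

IDX_MAGIC = b"\377tOc"  # an unreasonable fanout[0] value


def parse_idx_v2(data, hash_len=20):
    """Return a list of (oid_hex, crc32, pack_offset) from a v2 .idx image."""
    if data[:4] != IDX_MAGIC:
        raise ValueError("missing v2 magic number")
    (version,) = struct.unpack_from(">I", data, 4)
    if version != 2:
        raise ValueError("unsupported index version")
    fanout = struct.unpack_from(">256I", data, 8)
    count = fanout[255]  # the last fan-out entry is the total object count

    names_at = 8 + 256 * 4
    crcs_at = names_at + count * hash_len
    offsets_at = crcs_at + count * 4
    large_at = offsets_at + count * 4

    entries = []
    for i in range(count):
        oid = data[names_at + i * hash_len:names_at + (i + 1) * hash_len]
        (crc,) = struct.unpack_from(">I", data, crcs_at + i * 4)
        (offset,) = struct.unpack_from(">I", data, offsets_at + i * 4)
        if offset & 0x80000000:  # msbit set: low 31 bits index the 8-byte table
            (offset,) = struct.unpack_from(
                ">Q", data, large_at + (offset & 0x7FFFFFFF) * 8
            )
        entries.append((oid.hex(), crc, offset))
    return entries
```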
1158 </div>
1159 </div>
1160 <div class="sect1">
1161 <h2 id="_pack_rev_files_have_the_format">pack-*.rev files have the format:</h2>
1162 <div class="sectionbody">
1163 <div class="ulist"><ul>
1164 <li>
1166 A 4-byte magic number <em>0x52494458</em> (<em>RIDX</em>).
1167 </p>
1168 </li>
1169 <li>
1171 A 4-byte version identifier (= 1).
1172 </p>
1173 </li>
1174 <li>
1176 A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
1177 </p>
1178 </li>
1179 <li>
1181 A table of index positions (one per packed object, num_objects in
1182 total, each a 4-byte unsigned integer in network order), sorted by
1183 their corresponding offsets in the packfile.
1184 </p>
1185 </li>
1186 <li>
1188 A trailer, containing a:
1189 </p>
1190 <div class="literalblock">
1191 <div class="content">
1192 <pre><code>checksum of the corresponding packfile, and</code></pre>
1193 </div></div>
1194 <div class="literalblock">
1195 <div class="content">
1196 <pre><code>a checksum of all of the above.</code></pre>
1197 </div></div>
1198 </li>
1199 </ul></div>
1200 <div class="paragraph"><p>All 4-byte numbers are in network order.</p></div>
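<div class="paragraph"><p>To make the table concrete, the sketch below derives it from the pack offsets recorded in the corresponding index and serializes the header and table. It is a hedged illustration: <code>rev_positions</code> and <code>encode_rev_header_and_table</code> are hypothetical helpers, and the trailer checksums are left out.</p></div>

```python
import struct

RIDX_MAGIC = 0x52494458  # "RIDX"


def rev_positions(idx_offsets):
    """idx_offsets[i] is the pack offset of the object at index position i.

    Returns the index positions sorted by their offsets in the packfile.
    """
    return sorted(range(len(idx_offsets)), key=lambda i: idx_offsets[i])


def encode_rev_header_and_table(positions, hash_id=1):
    """Serialize magic, version (1), hash id, and the table; no trailer."""
    header = struct.pack(">III", RIDX_MAGIC, 1, hash_id)
    return header + struct.pack(f">{len(positions)}I", *positions)
```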
1201 </div>
1202 </div>
1203 <div class="sect1">
1204 <h2 id="_pack_mtimes_files_have_the_format">pack-*.mtimes files have the format:</h2>
1205 <div class="sectionbody">
1206 <div class="paragraph"><p>All 4-byte numbers are in network byte order.</p></div>
1207 <div class="ulist"><ul>
1208 <li>
1210 A 4-byte magic number <em>0x4d544d45</em> (<em>MTME</em>).
1211 </p>
1212 </li>
1213 <li>
1215 A 4-byte version identifier (= 1).
1216 </p>
1217 </li>
1218 <li>
1220 A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
1221 </p>
1222 </li>
1223 <li>
1225 A table of 4-byte unsigned integers. The ith value is the
1226 modification time (mtime) of the ith object in the corresponding
1227 pack by lexicographic (index) order. The mtimes count standard
1228 epoch seconds.
1229 </p>
1230 </li>
1231 <li>
1233 A trailer, containing a checksum of the corresponding packfile,
1234 and a checksum of all of the above (each having length according
1235 to the specified hash function).
1236 </p>
1237 </li>
1238 </ul></div>
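<div class="paragraph"><p>A reader for this layout might look like the following Python sketch. The helper names are hypothetical, the object count is inferred from the file size (the format itself stores no count), and the trailer checksums are skipped rather than verified.</p></div>

```python
import struct

MTME_MAGIC = 0x4D544D45  # "MTME"
HASH_LEN = {1: 20, 2: 32}  # hash function identifier -> checksum length


def parse_mtimes(data):
    """Return the mtimes in index (lexicographic) order."""
    magic, version, hash_id = struct.unpack_from(">III", data, 0)
    if magic != MTME_MAGIC or version != 1:
        raise ValueError("not a version 1 .mtimes file")
    trailer_len = 2 * HASH_LEN[hash_id]  # pack checksum + file checksum
    count = (len(data) - 12 - trailer_len) // 4
    return list(struct.unpack_from(f">{count}I", data, 12))


def older_than(oids_in_index_order, mtimes, cutoff):
    """Pair each mtime with its object and keep those before the cutoff."""
    return [oid for oid, t in zip(oids_in_index_order, mtimes) if t < cutoff]
```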
1239 </div>
1240 </div>
1241 <div class="sect1">
1242 <h2 id="_multi_pack_index_midx_files_have_the_following_format">multi-pack-index (MIDX) files have the following format:</h2>
1243 <div class="sectionbody">
1244 <div class="paragraph"><p>The multi-pack-index files refer to multiple pack-files and loose objects.</p></div>
1245 <div class="paragraph"><p>In order to allow extensions that add extra data to the MIDX, we organize
1246 the body into "chunks" and provide a lookup table at the beginning of the
1247 body. The header includes certain length values, such as the number of packs,
1248 the number of base MIDX files, hash lengths and types.</p></div>
1249 <div class="paragraph"><p>All 4-byte numbers are in network order.</p></div>
1250 <div class="paragraph"><p>HEADER:</p></div>
1251 <div class="literalblock">
1252 <div class="content">
1253 <pre><code>4-byte signature:
1254 The signature is: {'M', 'I', 'D', 'X'}</code></pre>
1255 </div></div>
1256 <div class="literalblock">
1257 <div class="content">
1258 <pre><code>1-byte version number:
1259 Git only writes or recognizes version 1.</code></pre>
1260 </div></div>
1261 <div class="literalblock">
1262 <div class="content">
1263 <pre><code>1-byte Object Id Version
1264 We infer the length of object IDs (OIDs) from this value:
1265 1 =&gt; SHA-1
1266 2 =&gt; SHA-256
1267 If the hash type does not match the repository's hash algorithm,
1268 the multi-pack-index file should be ignored with a warning
1269 presented to the user.</code></pre>
1270 </div></div>
1271 <div class="literalblock">
1272 <div class="content">
1273 <pre><code>1-byte number of "chunks"</code></pre>
1274 </div></div>
1275 <div class="literalblock">
1276 <div class="content">
1277 <pre><code>1-byte number of base multi-pack-index files:
1278 This value is currently always zero.</code></pre>
1279 </div></div>
1280 <div class="literalblock">
1281 <div class="content">
1282 <pre><code>4-byte number of pack files</code></pre>
1283 </div></div>
1284 <div class="paragraph"><p>CHUNK LOOKUP:</p></div>
1285 <div class="literalblock">
1286 <div class="content">
1287 <pre><code>(C + 1) * 12 bytes providing the chunk offsets:
1288 First 4 bytes describe chunk id. Value 0 is a terminating label.
1289 Other 8 bytes provide offset in current file for chunk to start.
1290 (Chunks are provided in file-order, so you can infer the length
1291 using the next chunk position if necessary.)</code></pre>
1292 </div></div>
1293 <div class="literalblock">
1294 <div class="content">
1295 <pre><code>The CHUNK LOOKUP matches the table of contents from
1296 the chunk-based file format, see gitformat-chunk(5).</code></pre>
1297 </div></div>
1298 <div class="literalblock">
1299 <div class="content">
1300 <pre><code>The remaining data in the body is described one chunk at a time, and
1301 these chunks may be given in any order. Chunks are required unless
1302 otherwise specified.</code></pre>
1303 </div></div>
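<div class="paragraph"><p>The header and chunk lookup described above can be read as follows. This is a sketch under stated assumptions: <code>parse_midx_header</code> is a hypothetical name, and the terminating row is assumed to hold the offset at which the last chunk ends, so every chunk length can be inferred from the next row.</p></div>

```python
import struct


def parse_midx_header(data):
    """Return (header fields, {chunk_id: (start, length)})."""
    sig, version, oid_version, num_chunks, num_base, num_packs = (
        struct.unpack_from(">4sBBBBI", data, 0)
    )
    if sig != b"MIDX" or version != 1:
        raise ValueError("unsupported multi-pack-index")

    # (C + 1) rows of 12 bytes each: 4-byte chunk id, 8-byte file offset.
    rows = [
        struct.unpack_from(">4sQ", data, 12 + 12 * i)
        for i in range(num_chunks + 1)
    ]
    if rows[-1][0] != b"\0\0\0\0":
        raise ValueError("chunk lookup is not terminated by id 0")

    chunks = {}
    for (chunk_id, start), (_, next_start) in zip(rows, rows[1:]):
        chunks[chunk_id] = (start, next_start - start)

    header = {
        "oid_version": oid_version,
        "num_base": num_base,
        "num_packs": num_packs,
    }
    return header, chunks
```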
1304 <div class="paragraph"><p>CHUNK DATA:</p></div>
1305 <div class="literalblock">
1306 <div class="content">
1307 <pre><code>Packfile Names (ID: {'P', 'N', 'A', 'M'})
1308 Store the names of packfiles as a sequence of NUL-terminated
1309 strings. There is no extra padding between the filenames,
1310 and they are listed in lexicographic order. The chunk itself
1311 is padded at the end with between 0 and 3 NUL bytes to make the
1312 chunk size a multiple of 4 bytes.</code></pre>
1313 </div></div>
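<div class="paragraph"><p>The padding rule for this chunk is easy to get wrong, so here is a small Python sketch of writing and reading it. The function names and the example file names are hypothetical.</p></div>

```python
def encode_pnam(names):
    """NUL-terminate each name, sort lexicographically, pad to 4 bytes."""
    encoded = sorted(name.encode() for name in names)
    body = b"".join(name + b"\0" for name in encoded)
    return body + b"\0" * (-len(body) % 4)  # between 0 and 3 NUL bytes


def decode_pnam(chunk):
    """Split on NUL; padding yields empty trailing fields, which are dropped."""
    return [part.decode() for part in chunk.split(b"\0") if part]
```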
1314 <div class="literalblock">
1315 <div class="content">
1316 <pre><code>[Optional] Bitmapped Packfiles (ID: {'B', 'T', 'M', 'P'})
1317 Stores a table of two 4-byte unsigned integers in network order.
1318 Each table entry corresponds to a single pack (in the order that
1319 they appear above in the `PNAM` chunk). The values for each table
1320 entry are as follows:
1321 - The first bit position (in pseudo-pack order, see below) to
1322 contain an object from that pack.
1323 - The number of bits whose objects are selected from that pack.</code></pre>
1324 </div></div>
1325 <div class="literalblock">
1326 <div class="content">
1327 <pre><code>OID Fanout (ID: {'O', 'I', 'D', 'F'})
1328 The ith entry, F[i], stores the number of OIDs with first
1329 byte at most i. Thus F[255] stores the total
1330 number of objects.</code></pre>
1331 </div></div>
1332 <div class="literalblock">
1333 <div class="content">
1334 <pre><code>OID Lookup (ID: {'O', 'I', 'D', 'L'})
1335 The OIDs for all objects in the MIDX are stored in lexicographic
1336 order in this chunk.</code></pre>
1337 </div></div>
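<div class="paragraph"><p>Together, the fanout and lookup chunks allow a bounded binary search: F[b-1] and F[b] delimit the run of OIDs whose first byte is b. The Python sketch below illustrates the idea on in-memory lists; <code>build_fanout</code> and <code>lookup_oid</code> are hypothetical helpers, not Git functions.</p></div>

```python
from bisect import bisect_left


def build_fanout(sorted_oids):
    """F[i] = number of OIDs whose first byte is at most i."""
    counts = [0] * 256
    for oid in sorted_oids:
        counts[oid[0]] += 1
    fanout, total = [], 0
    for count in counts:
        total += count
        fanout.append(total)
    return fanout


def lookup_oid(sorted_oids, fanout, oid):
    """Return the MIDX position of oid, or None if it is absent."""
    first = oid[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    pos = bisect_left(sorted_oids, oid, lo, hi)
    if pos < hi and sorted_oids[pos] == oid:
        return pos
    return None
```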
1338 <div class="literalblock">
1339 <div class="content">
1340 <pre><code>Object Offsets (ID: {'O', 'O', 'F', 'F'})
1341 Stores two 4-byte values for every object.
1342 1: The pack-int-id for the pack storing this object.
1343 2: The offset within the pack.
1344 If all offsets are less than 2^32, then the large offset chunk
1345 will not exist and offsets are stored as in IDX v1.
1346 If there is at least one offset value larger than 2^32-1, then
1347 the large offset chunk must exist, and offsets larger than
1348 2^31-1 must be stored in it instead. If the large offset chunk
1349 exists and the 31st bit is on, then removing that bit reveals
1350 the row in the large offsets containing the 8-byte offset of
1351 this object.</code></pre>
1352 </div></div>
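<div class="paragraph"><p>The interplay between the two offset chunks can be sketched as below. This Python example is an assumption-laden illustration: <code>resolve_offsets</code> is a hypothetical name, and both arguments are the raw chunk bodies, with an empty <code>loff</code> standing for a missing large offset chunk.</p></div>

```python
import struct


def resolve_offsets(ooff, loff=b""):
    """Return (pack_int_id, offset) for every object in the OOFF chunk."""
    result = []
    for pack_id, raw in struct.iter_unpack(">II", ooff):
        if loff and raw & 0x80000000:
            # High bit set: the remaining 31 bits select a row of 8 bytes.
            row = raw & 0x7FFFFFFF
            (offset,) = struct.unpack_from(">Q", loff, row * 8)
        else:
            # No large offset chunk: the 4-byte value is the offset itself.
            offset = raw
        result.append((pack_id, offset))
    return result
```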
1353 <div class="literalblock">
1354 <div class="content">
1355 <pre><code>[Optional] Object Large Offsets (ID: {'L', 'O', 'F', 'F'})
1356 8-byte offsets into large packfiles.</code></pre>
1357 </div></div>
1358 <div class="literalblock">
1359 <div class="content">
1360 <pre><code>[Optional] Bitmap pack order (ID: {'R', 'I', 'D', 'X'})
1361 A list of MIDX positions (one per object in the MIDX, num_objects in
1362 total, each a 4-byte unsigned integer in network byte order), sorted
1363 according to their relative bitmap/pseudo-pack positions.</code></pre>
1364 </div></div>
1365 <div class="paragraph"><p>TRAILER:</p></div>
1366 <div class="literalblock">
1367 <div class="content">
1368 <pre><code>Index checksum of the above contents.</code></pre>
1369 </div></div>
1370 </div>
1371 </div>
1372 <div class="sect1">
1373 <h2 id="_multi_pack_index_reverse_indexes">multi-pack-index reverse indexes</h2>
1374 <div class="sectionbody">
1375 <div class="paragraph"><p>Similar to the pack-based reverse index, the multi-pack index can also
1376 be used to generate a reverse index.</p></div>
1377 <div class="paragraph"><p>Instead of mapping between offset, pack-, and index position, this
1378 reverse index maps between an object&#8217;s position within the MIDX, and
1379 that object&#8217;s position within a pseudo-pack that the MIDX describes
1380 (i.e., the ith entry of the multi-pack reverse index holds the MIDX
1381 position of the ith object in pseudo-pack order).</p></div>
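<div class="paragraph"><p>As a small illustration of this mapping, the Python sketch below builds the table from two orderings of the same objects and inverts it. The helper names are hypothetical, and the pseudo-pack order is taken as an input rather than computed.</p></div>

```python
def build_midx_ridx(midx_oids, pseudo_pack_order):
    """ridx[i] is the MIDX position of the ith object in pseudo-pack order.

    midx_oids: OIDs in MIDX (lexicographic) order.
    pseudo_pack_order: the same OIDs in pseudo-pack order.
    """
    midx_pos = {oid: i for i, oid in enumerate(midx_oids)}
    return [midx_pos[oid] for oid in pseudo_pack_order]


def invert(ridx):
    """Map a MIDX position back to its pseudo-pack position."""
    inverse = [0] * len(ridx)
    for pseudo_pos, pos in enumerate(ridx):
        inverse[pos] = pseudo_pos
    return inverse
```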
1382 <div class="paragraph"><p>To clarify the difference between these orderings, consider a multi-pack
1383 reachability bitmap. Each bit needs to correspond to an object in the
1385 MIDX, and so we need an efficient mapping from bit position to MIDX
1386 position.</p></div>
1387 <div class="paragraph"><p>One solution is to let bits occupy the same position in the oid-sorted
1388 index stored by the MIDX. But because oids are effectively random, their
1389 resulting reachability bitmaps would have no locality, and thus compress
1390 poorly. (This is the reason that single-pack bitmaps use the pack
1391 ordering, and not the .idx ordering, for the same purpose.)</p></div>
1392 <div class="paragraph"><p>So we&#8217;d like to define an ordering for the whole MIDX based around
1393 pack ordering, which has far better locality (and thus compresses more
1394 efficiently). We can think of a pseudo-pack created by the concatenation
1395 of all of the packs in the MIDX. E.g., if we had a MIDX with three packs
1396 (a, b, c), with 10, 15, and 20 objects respectively, we can imagine an
1397 ordering of the objects like:</p></div>
1398 <div class="literalblock">
1399 <div class="content">
1400 <pre><code>|a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19|</code></pre>
1401 </div></div>
1402 <div class="paragraph"><p>where the ordering of the packs is defined by the MIDX&#8217;s pack list,
1403 and then the ordering of objects within each pack is the same as the
1404 order in the actual packfile.</p></div>
1405 <div class="paragraph"><p>Given the list of packs and their counts of objects, you can
1406 naïvely reconstruct that pseudo-pack ordering (e.g., the 27th object,
1407 at zero-based position 26, must be (c,1) because packs "a" and "b" consumed 25 of the
1408 slots). But there&#8217;s a catch. Objects may be duplicated between packs, in
1409 which case the MIDX only stores one pointer to the object (and thus we&#8217;d
1410 want only one slot in the bitmap).</p></div>
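<div class="paragraph"><p>The naïve reconstruction can be sketched in a few lines of Python. The function <code>locate</code> is hypothetical, positions are counted from zero (so the 27th slot is position 26), and duplicates are deliberately ignored, which is exactly the catch just described.</p></div>

```python
def locate(position, packs):
    """Map a zero-based pseudo-pack position to (pack name, index in pack).

    packs: list of (name, object_count) in MIDX pack order.
    """
    for name, count in packs:
        if position < count:
            return name, position
        position -= count  # skip past this pack's slots
    raise IndexError("position is past the end of the pseudo-pack")
```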
1411 <div class="paragraph"><p>Callers could handle duplicates themselves by reading objects in order
1412 of their bit-position, but that&#8217;s linear in the number of objects, and
1413 much too expensive for ordinary bitmap lookups. Building a reverse index
1414 solves this, since it is the logical inverse of the index, and that
1415 index has already removed duplicates. But, building a reverse index on
1416 the fly can be expensive. Since we already have an on-disk format for
1417 pack-based reverse indexes, let&#8217;s reuse it for the MIDX&#8217;s pseudo-pack,
1418 too.</p></div>
1419 <div class="paragraph"><p>Objects from the MIDX are ordered as follows to string together the
1420 pseudo-pack. Let <code>pack(o)</code> return the pack from which <code>o</code> was selected
1421 by the MIDX, and define an ordering of packs based on their numeric ID
1422 (as stored by the MIDX). Let <code>offset(o)</code> return the object offset of <code>o</code>
1423 within <code>pack(o)</code>. Then, compare <code>o1</code> and <code>o2</code> as follows:</p></div>
1424 <div class="ulist"><ul>
1425 <li>
1427 If one of <code>pack(o1)</code> and <code>pack(o2)</code> is preferred and the other
1428 is not, then the preferred one sorts first.
1429 </p>
1430 <div class="paragraph"><p>(This is a detail that allows the MIDX bitmap to determine which
1431 pack should be used by the pack-reuse mechanism, since it can ask
1432 the MIDX for the pack containing the object at bit position 0).</p></div>
1433 </li>
1434 <li>
1436 If <code>pack(o1) ≠ pack(o2)</code>, then sort the two objects in descending
1437 order based on the pack ID.
1438 </p>
1439 </li>
1440 <li>
1442 Otherwise, <code>pack(o1) = pack(o2)</code>, and the objects are sorted in
1443 pack-order (i.e., <code>o1</code> sorts ahead of <code>o2</code> exactly when <code>offset(o1)
1444 &lt; offset(o2)</code>).
1445 </p>
1446 </li>
1447 </ul></div>
1448 <div class="paragraph"><p>In short, a MIDX&#8217;s pseudo-pack is the de-duplicated concatenation of
1449 objects in packs stored by the MIDX, laid out in pack order, and the
1450 packs arranged in MIDX order (with the preferred pack coming first).</p></div>
1451 <div class="paragraph"><p>The MIDX&#8217;s reverse index is stored in the optional <em>RIDX</em> chunk within
1452 the MIDX itself.</p></div>
1453 <div class="sect2">
1454 <h3 id="_code_btmp_code_chunk"><code>BTMP</code> chunk</h3>
1455 <div class="paragraph"><p>The Bitmapped Packfiles (<code>BTMP</code>) chunk encodes additional information
1456 about the objects in the multi-pack index&#8217;s reachability bitmap. Recall
1457 that objects from the MIDX are arranged in "pseudo-pack" order (see
1458 above) for reachability bitmaps.</p></div>
1459 <div class="paragraph"><p>From the example above, suppose we have packs "a", "b", and "c", with
1460 10, 15, and 20 objects, respectively. In pseudo-pack order, those would
1461 be arranged as follows:</p></div>
1462 <div class="literalblock">
1463 <div class="content">
1464 <pre><code>|a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19|</code></pre>
1465 </div></div>
1466 <div class="paragraph"><p>When working with single-pack bitmaps (or, equivalently, multi-pack
1467 reachability bitmaps with a preferred pack), <a href="git-pack-objects.html">git-pack-objects(1)</a>
1468 performs &#8220;verbatim&#8221; reuse, attempting to reuse chunks of the bitmapped
1469 or preferred packfile instead of adding objects to the packing list.</p></div>
1470 <div class="paragraph"><p>When a chunk of bytes is reused from an existing pack, any objects
1471 contained therein do not need to be added to the packing list, saving
1472 memory and CPU time. But a chunk from an existing packfile can only be
1473 reused when the following conditions are met:</p></div>
1474 <div class="ulist"><ul>
1475 <li>
1477 The chunk contains only objects which were requested by the caller
1478 (i.e. does not contain any objects which the caller didn&#8217;t ask for
1479 explicitly or implicitly).
1480 </p>
1481 </li>
1482 <li>
1484 All objects stored in non-thin packs as offset- or reference-deltas
1485 also include their base object in the resulting pack.
1486 </p>
1487 </li>
1488 </ul></div>
1489 <div class="paragraph"><p>The <code>BTMP</code> chunk encodes the necessary information in order to implement
1490 multi-pack reuse over a set of packfiles as described above.
1491 Specifically, the <code>BTMP</code> chunk encodes two pieces of information (all
1492 32-bit unsigned integers in network byte-order) for each packfile <code>p</code>
1493 that is stored in the MIDX, as follows:</p></div>
1494 <div class="dlist"><dl>
1495 <dt class="hdlist1">
1496 <code>bitmap_pos</code>
1497 </dt>
1498 <dd>
1500 The first bit position (in pseudo-pack order) in the
1501 multi-pack index&#8217;s reachability bitmap occupied by an object from <code>p</code>.
1502 </p>
1503 </dd>
1504 <dt class="hdlist1">
1505 <code>bitmap_nr</code>
1506 </dt>
1507 <dd>
1509 The number of bit positions (including the one at
1510 <code>bitmap_pos</code>) that encode objects from that pack <code>p</code>.
1511 </p>
1512 </dd>
1513 </dl></div>
1514 <div class="paragraph"><p>For example, the <code>BTMP</code> chunk corresponding to the above example (with
1515 packs &#8220;a&#8221;, &#8220;b&#8221;, and &#8220;c&#8221;) would look like:</p></div>
1516 <div class="tableblock">
1517 <table rules="all"
1518 width="100%"
1519 frame="border"
1520 cellspacing="0" cellpadding="4">
1521 <col width="20%" />
1522 <col width="40%" />
1523 <col width="40%" />
1524 <tbody>
1525 <tr>
1526 <td align="left" valign="top"><p class="table"></p></td>
1527 <td align="left" valign="top"><p class="table"><code>bitmap_pos</code></p></td>
1528 <td align="left" valign="top"><p class="table"><code>bitmap_nr</code></p></td>
1529 </tr>
1530 <tr>
1531 <td align="left" valign="top"><p class="table">packfile &#8220;a&#8221;</p></td>
1532 <td align="left" valign="top"><p class="table"><code>0</code></p></td>
1533 <td align="left" valign="top"><p class="table"><code>10</code></p></td>
1534 </tr>
1535 <tr>
1536 <td align="left" valign="top"><p class="table">packfile &#8220;b&#8221;</p></td>
1537 <td align="left" valign="top"><p class="table"><code>10</code></p></td>
1538 <td align="left" valign="top"><p class="table"><code>15</code></p></td>
1539 </tr>
1540 <tr>
1541 <td align="left" valign="top"><p class="table">packfile &#8220;c&#8221;</p></td>
1542 <td align="left" valign="top"><p class="table"><code>25</code></p></td>
1543 <td align="left" valign="top"><p class="table"><code>20</code></p></td>
1544 </tr>
1545 </tbody>
1546 </table>
1547 </div>
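<div class="paragraph"><p>The table above can be reproduced with a short Python sketch. The names <code>build_btmp</code> and <code>parse_btmp</code> are hypothetical, and the counts passed in are assumed to be the number of objects selected from each pack, listed in pseudo-pack order.</p></div>

```python
import struct


def build_btmp(object_counts):
    """Serialize (bitmap_pos, bitmap_nr) pairs as 4-byte network-order ints."""
    chunk, pos = b"", 0
    for count in object_counts:
        chunk += struct.pack(">II", pos, count)
        pos += count  # the next pack starts where this one ends
    return chunk


def parse_btmp(chunk):
    """Return the list of (bitmap_pos, bitmap_nr) pairs stored in the chunk."""
    return list(struct.iter_unpack(">II", chunk))
```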
1548 <div class="paragraph"><p>With this information in place, we can treat each packfile as
1549 individually reusable in the same fashion as verbatim pack reuse is
1550 performed on individual packs prior to the implementation of the <code>BTMP</code>
1551 chunk.</p></div>
1552 </div>
1553 </div>
1554 </div>
1555 <div class="sect1">
1556 <h2 id="_cruft_packs">cruft packs</h2>
1557 <div class="sectionbody">
1558 <div class="paragraph"><p>The cruft packs feature offers an alternative to Git&#8217;s traditional mechanism of
1559 removing unreachable objects. This document provides an overview of Git&#8217;s
1560 pruning mechanism, and how a cruft pack can be used instead to accomplish the
1561 same.</p></div>
1562 <div class="sect2">
1563 <h3 id="_background">Background</h3>
1564 <div class="paragraph"><p>To remove unreachable objects from your repository, Git offers <code>git repack -Ad</code>
1565 (see <a href="git-repack.html">git-repack(1)</a>). Quoting from the documentation:</p></div>
1566 <div class="listingblock">
1567 <div class="content">
1568 <pre><code>[...] unreachable objects in a previous pack become loose, unpacked objects,
1569 instead of being left in the old pack. [...] loose unreachable objects will be
1570 pruned according to normal expiry rules with the next 'git gc' invocation.</code></pre>
1571 </div></div>
1572 <div class="paragraph"><p>Unreachable objects aren&#8217;t removed immediately, since doing so could race with
1573 an incoming push which may reference an object which is about to be deleted.
1574 Instead, those unreachable objects are stored as loose objects and stay that way
1575 until they are older than the expiration window, at which point they are removed
1576 by <a href="git-prune.html">git-prune(1)</a>.</p></div>
1577 <div class="paragraph"><p>Git must store these unreachable objects loose in order to keep track of their
1578 per-object mtimes. If these unreachable objects were written into one big pack,
1579 then either freshening that pack (because an object contained within it was
1580 re-written) or creating a new pack of unreachable objects would cause the pack&#8217;s
1581 mtime to get updated, and the objects within it would never leave the expiration
1582 window. Instead, objects are stored loose in order to keep track of the
1583 individual object mtimes and avoid a situation where all cruft objects are
1584 freshened at once.</p></div>
1585 <div class="paragraph"><p>This can lead to undesirable situations when a repository contains many
1586 unreachable objects which have not yet left the grace period. Having large
1587 directories in the shards of <code>.git/objects</code> can lead to decreased performance in
1588 the repository. Given enough unreachable objects, it can also lead to inode
1589 starvation and degrade the performance of the whole system. Because those
1590 objects can never be packed, they can only be zlib-compressed, not stored in
1591 delta chains, so these repositories often take up a large amount of disk
1592 space.</p></div>
1593 </div>
1594 <div class="sect2">
1595 <h3 id="_cruft_packs_2">Cruft packs</h3>
1596 <div class="paragraph"><p>A cruft pack eliminates the need for storing unreachable objects in a loose
1597 state by including the per-object mtimes in a separate file alongside a single
1598 pack containing all loose objects.</p></div>
1599 <div class="paragraph"><p>A cruft pack is written by <code>git repack --cruft</code> when generating a new pack,
1600 using <a href="git-pack-objects.html">git-pack-objects(1)</a>'s <code>--cruft</code> option. Note that <code>git repack --cruft</code>
1601 is a classic all-into-one repack, meaning that everything in the resulting pack is
1602 reachable, and everything else is unreachable. Once written, the <code>--cruft</code>
1603 option instructs <code>git repack</code> to generate another pack containing only objects
1604 not packed in the previous step (which equates to packing all unreachable
1605 objects together). This progresses as follows:</p></div>
1606 <div class="olist arabic"><ol class="arabic">
1607 <li>
1609 Enumerate every object, marking any object which is (a) not contained in a
1610 kept-pack, and (b) whose mtime is within the grace period as a traversal
1611 tip.
1612 </p>
1613 </li>
1614 <li>
1616 Perform a reachability traversal based on the tips gathered in the previous
1617 step, adding every object along the way to the pack.
1618 </p>
1619 </li>
1620 <li>
1622 Write the pack out, along with a <code>.mtimes</code> file that records the per-object
1623 timestamps.
1624 </p>
1625 </li>
1626 </ol></div>
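<div class="paragraph"><p>The three steps can be modelled on an in-memory object graph. The Python sketch below is a simplified illustration and not how Git implements it: the function name, the dictionary layout, and the choice to skip objects already held in kept packs during the traversal are all assumptions.</p></div>

```python
def cruft_pack_contents(objects, kept, now, grace_period):
    """Return sorted (oid, mtime) pairs for the cruft pack and its .mtimes.

    objects: {oid: (mtime, [oids it references])}
    kept: set of oids that live in kept packs
    """
    # Step 1: recent objects outside kept packs become traversal tips.
    tips = [
        oid
        for oid, (mtime, _) in objects.items()
        if oid not in kept and now - mtime <= grace_period
    ]

    # Step 2: walk from the tips, collecting every object along the way.
    packed, stack = {}, list(tips)
    while stack:
        oid = stack.pop()
        if oid in packed or oid in kept or oid not in objects:
            continue
        mtime, refs = objects[oid]
        packed[oid] = mtime
        stack.extend(refs)

    # Step 3: the pack and its per-object timestamps, in a stable order.
    return sorted(packed.items())
```

<div class="paragraph"><p>In this model an old object survives only if a recent object still references it; an old object that nothing recent points to is left out.</p></div>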
1627 <div class="paragraph"><p>This mode is invoked internally by <a href="git-repack.html">git-repack(1)</a> when instructed to
1628 write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
1629 of packs which will not be deleted by the repack; in other words, they contain
1630 all of the repository&#8217;s reachable objects.</p></div>
1631 <div class="paragraph"><p>When a repository already has a cruft pack, <code>git repack --cruft</code> typically only
1632 adds objects to it. An exception to this is when <code>git repack</code> is given the
1633 <code>--cruft-expiration</code> option, which allows the generated cruft pack to omit
1634 expired objects instead of waiting for <a href="git-gc.html">git-gc(1)</a> to expire those objects
1635 later on.</p></div>
1636 <div class="paragraph"><p>It is <a href="git-gc.html">git-gc(1)</a> that is typically responsible for removing expired
1637 unreachable objects.</p></div>
1638 </div>
1639 <div class="sect2">
1640 <h3 id="_alternatives">Alternatives</h3>
1641 <div class="paragraph"><p>Notable alternatives to this design include:</p></div>
1642 <div class="ulist"><ul>
1643 <li>
1645 The location of the per-object mtime data.
1646 </p>
1647 </li>
1648 </ul></div>
1649 <div class="paragraph"><p>On the location of mtime data, a new auxiliary file tied to the pack was chosen
1650 to avoid complicating the <code>.idx</code> format. If the <code>.idx</code> format were ever to gain
1651 support for optional chunks of data, it may make sense to consolidate the
1652 <code>.mtimes</code> format into the <code>.idx</code> itself.</p></div>
1653 </div>
1654 </div>
1655 </div>
1656 <div class="sect1">
1657 <h2 id="_git">GIT</h2>
1658 <div class="sectionbody">
1659 <div class="paragraph"><p>Part of the <a href="git.html">git(1)</a> suite</p></div>
1660 </div>
1661 </div>
1662 </div>
1663 <div id="footnotes"><hr /></div>
1664 <div id="footer">
1665 <div id="footer-text">
1666 Last updated
1667 2024-01-12 16:26:54 PST
1668 </div>
1669 </div>
1670 </body>
1671 </html>