Autogenerated HTML docs for v2.45.1-204-gd8ab1
1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
3 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
5 <head>
6 <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
7 <meta name="generator" content="AsciiDoc 10.2.0" />
8 <title>git-filter-branch(1)</title>
9 <style type="text/css">
10 /* Shared CSS for AsciiDoc xhtml11 and html5 backends */
12 /* Default font. */
13 body {
14 font-family: Georgia,serif;
17 /* Title font. */
18 h1, h2, h3, h4, h5, h6,
19 div.title, caption.title,
20 thead, p.table.header,
21 #toctitle,
22 #author, #revnumber, #revdate, #revremark,
23 #footer {
24 font-family: Arial,Helvetica,sans-serif;
27 body {
28 margin: 1em 5% 1em 5%;
31 a {
32 color: blue;
33 text-decoration: underline;
35 a:visited {
36 color: fuchsia;
39 em {
40 font-style: italic;
41 color: navy;
44 strong {
45 font-weight: bold;
46 color: #083194;
49 h1, h2, h3, h4, h5, h6 {
50 color: #527bbd;
51 margin-top: 1.2em;
52 margin-bottom: 0.5em;
53 line-height: 1.3;
56 h1, h2, h3 {
57 border-bottom: 2px solid silver;
59 h2 {
60 padding-top: 0.5em;
62 h3 {
63 float: left;
65 h3 + * {
66 clear: left;
68 h5 {
69 font-size: 1.0em;
72 div.sectionbody {
73 margin-left: 0;
76 hr {
77 border: 1px solid silver;
80 p {
81 margin-top: 0.5em;
82 margin-bottom: 0.5em;
85 ul, ol, li > p {
86 margin-top: 0;
88 ul > li { color: #aaa; }
89 ul > li > * { color: black; }
91 .monospaced, code, pre {
92 font-family: "Courier New", Courier, monospace;
93 font-size: inherit;
94 color: navy;
95 padding: 0;
96 margin: 0;
98 pre {
99 white-space: pre-wrap;
102 #author {
103 color: #527bbd;
104 font-weight: bold;
105 font-size: 1.1em;
107 #email {
109 #revnumber, #revdate, #revremark {
112 #footer {
113 font-size: small;
114 border-top: 2px solid silver;
115 padding-top: 0.5em;
116 margin-top: 4.0em;
118 #footer-text {
119 float: left;
120 padding-bottom: 0.5em;
122 #footer-badges {
123 float: right;
124 padding-bottom: 0.5em;
127 #preamble {
128 margin-top: 1.5em;
129 margin-bottom: 1.5em;
131 div.imageblock, div.exampleblock, div.verseblock,
132 div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
133 div.admonitionblock {
134 margin-top: 1.0em;
135 margin-bottom: 1.5em;
137 div.admonitionblock {
138 margin-top: 2.0em;
139 margin-bottom: 2.0em;
140 margin-right: 10%;
141 color: #606060;
144 div.content { /* Block element content. */
145 padding: 0;
148 /* Block element titles. */
149 div.title, caption.title {
150 color: #527bbd;
151 font-weight: bold;
152 text-align: left;
153 margin-top: 1.0em;
154 margin-bottom: 0.5em;
156 div.title + * {
157 margin-top: 0;
160 td div.title:first-child {
161 margin-top: 0.0em;
163 div.content div.title:first-child {
164 margin-top: 0.0em;
166 div.content + div.title {
167 margin-top: 0.0em;
170 div.sidebarblock > div.content {
171 background: #ffffee;
172 border: 1px solid #dddddd;
173 border-left: 4px solid #f0f0f0;
174 padding: 0.5em;
177 div.listingblock > div.content {
178 border: 1px solid #dddddd;
179 border-left: 5px solid #f0f0f0;
180 background: #f8f8f8;
181 padding: 0.5em;
184 div.quoteblock, div.verseblock {
185 padding-left: 1.0em;
186 margin-left: 1.0em;
187 margin-right: 10%;
188 border-left: 5px solid #f0f0f0;
189 color: #888;
192 div.quoteblock > div.attribution {
193 padding-top: 0.5em;
194 text-align: right;
197 div.verseblock > pre.content {
198 font-family: inherit;
199 font-size: inherit;
201 div.verseblock > div.attribution {
202 padding-top: 0.75em;
203 text-align: left;
205 /* DEPRECATED: Pre version 8.2.7 verse style literal block. */
206 div.verseblock + div.attribution {
207 text-align: left;
210 div.admonitionblock .icon {
211 vertical-align: top;
212 font-size: 1.1em;
213 font-weight: bold;
214 text-decoration: underline;
215 color: #527bbd;
216 padding-right: 0.5em;
218 div.admonitionblock td.content {
219 padding-left: 0.5em;
220 border-left: 3px solid #dddddd;
223 div.exampleblock > div.content {
224 border-left: 3px solid #dddddd;
225 padding-left: 0.5em;
228 div.imageblock div.content { padding-left: 0; }
229 span.image img { border-style: none; vertical-align: text-bottom; }
230 a.image:visited { color: white; }
232 dl {
233 margin-top: 0.8em;
234 margin-bottom: 0.8em;
236 dt {
237 margin-top: 0.5em;
238 margin-bottom: 0;
239 font-style: normal;
240 color: navy;
242 dd > *:first-child {
243 margin-top: 0.1em;
246 ul, ol {
247 list-style-position: outside;
249 ol.arabic {
250 list-style-type: decimal;
252 ol.loweralpha {
253 list-style-type: lower-alpha;
255 ol.upperalpha {
256 list-style-type: upper-alpha;
258 ol.lowerroman {
259 list-style-type: lower-roman;
261 ol.upperroman {
262 list-style-type: upper-roman;
265 div.compact ul, div.compact ol,
266 div.compact p, div.compact p,
267 div.compact div, div.compact div {
268 margin-top: 0.1em;
269 margin-bottom: 0.1em;
272 tfoot {
273 font-weight: bold;
275 td > div.verse {
276 white-space: pre;
279 div.hdlist {
280 margin-top: 0.8em;
281 margin-bottom: 0.8em;
283 div.hdlist tr {
284 padding-bottom: 15px;
286 dt.hdlist1.strong, td.hdlist1.strong {
287 font-weight: bold;
289 td.hdlist1 {
290 vertical-align: top;
291 font-style: normal;
292 padding-right: 0.8em;
293 color: navy;
295 td.hdlist2 {
296 vertical-align: top;
298 div.hdlist.compact tr {
299 margin: 0;
300 padding-bottom: 0;
303 .comment {
304 background: yellow;
307 .footnote, .footnoteref {
308 font-size: 0.8em;
311 span.footnote, span.footnoteref {
312 vertical-align: super;
315 #footnotes {
316 margin: 20px 0 20px 0;
317 padding: 7px 0 0 0;
320 #footnotes div.footnote {
321 margin: 0 0 5px 0;
324 #footnotes hr {
325 border: none;
326 border-top: 1px solid silver;
327 height: 1px;
328 text-align: left;
329 margin-left: 0;
330 width: 20%;
331 min-width: 100px;
334 div.colist td {
335 padding-right: 0.5em;
336 padding-bottom: 0.3em;
337 vertical-align: top;
339 div.colist td img {
340 margin-top: 0.3em;
343 @media print {
344 #footer-badges { display: none; }
347 #toc {
348 margin-bottom: 2.5em;
351 #toctitle {
352 color: #527bbd;
353 font-size: 1.1em;
354 font-weight: bold;
355 margin-top: 1.0em;
356 margin-bottom: 0.1em;
359 div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
360 margin-top: 0;
361 margin-bottom: 0;
363 div.toclevel2 {
364 margin-left: 2em;
365 font-size: 0.9em;
367 div.toclevel3 {
368 margin-left: 4em;
369 font-size: 0.9em;
371 div.toclevel4 {
372 margin-left: 6em;
373 font-size: 0.9em;
376 span.aqua { color: aqua; }
377 span.black { color: black; }
378 span.blue { color: blue; }
379 span.fuchsia { color: fuchsia; }
380 span.gray { color: gray; }
381 span.green { color: green; }
382 span.lime { color: lime; }
383 span.maroon { color: maroon; }
384 span.navy { color: navy; }
385 span.olive { color: olive; }
386 span.purple { color: purple; }
387 span.red { color: red; }
388 span.silver { color: silver; }
389 span.teal { color: teal; }
390 span.white { color: white; }
391 span.yellow { color: yellow; }
393 span.aqua-background { background: aqua; }
394 span.black-background { background: black; }
395 span.blue-background { background: blue; }
396 span.fuchsia-background { background: fuchsia; }
397 span.gray-background { background: gray; }
398 span.green-background { background: green; }
399 span.lime-background { background: lime; }
400 span.maroon-background { background: maroon; }
401 span.navy-background { background: navy; }
402 span.olive-background { background: olive; }
403 span.purple-background { background: purple; }
404 span.red-background { background: red; }
405 span.silver-background { background: silver; }
406 span.teal-background { background: teal; }
407 span.white-background { background: white; }
408 span.yellow-background { background: yellow; }
410 span.big { font-size: 2em; }
411 span.small { font-size: 0.6em; }
413 span.underline { text-decoration: underline; }
414 span.overline { text-decoration: overline; }
415 span.line-through { text-decoration: line-through; }
417 div.unbreakable { page-break-inside: avoid; }
421 * xhtml11 specific
423 * */
425 div.tableblock {
426 margin-top: 1.0em;
427 margin-bottom: 1.5em;
429 div.tableblock > table {
430 border: 3px solid #527bbd;
432 thead, p.table.header {
433 font-weight: bold;
434 color: #527bbd;
436 p.table {
437 margin-top: 0;
439 /* Because the table frame attribute is overridden by CSS in most browsers. */
440 div.tableblock > table[frame="void"] {
441 border-style: none;
443 div.tableblock > table[frame="hsides"] {
444 border-left-style: none;
445 border-right-style: none;
447 div.tableblock > table[frame="vsides"] {
448 border-top-style: none;
449 border-bottom-style: none;
454 * html5 specific
456 * */
458 table.tableblock {
459 margin-top: 1.0em;
460 margin-bottom: 1.5em;
462 thead, p.tableblock.header {
463 font-weight: bold;
464 color: #527bbd;
466 p.tableblock {
467 margin-top: 0;
469 table.tableblock {
470 border-width: 3px;
471 border-spacing: 0px;
472 border-style: solid;
473 border-color: #527bbd;
474 border-collapse: collapse;
476 th.tableblock, td.tableblock {
477 border-width: 1px;
478 padding: 4px;
479 border-style: solid;
480 border-color: #527bbd;
483 table.tableblock.frame-topbot {
484 border-left-style: hidden;
485 border-right-style: hidden;
487 table.tableblock.frame-sides {
488 border-top-style: hidden;
489 border-bottom-style: hidden;
491 table.tableblock.frame-none {
492 border-style: hidden;
495 th.tableblock.halign-left, td.tableblock.halign-left {
496 text-align: left;
498 th.tableblock.halign-center, td.tableblock.halign-center {
499 text-align: center;
501 th.tableblock.halign-right, td.tableblock.halign-right {
502 text-align: right;
505 th.tableblock.valign-top, td.tableblock.valign-top {
506 vertical-align: top;
508 th.tableblock.valign-middle, td.tableblock.valign-middle {
509 vertical-align: middle;
511 th.tableblock.valign-bottom, td.tableblock.valign-bottom {
512 vertical-align: bottom;
517 * manpage specific
519 * */
521 body.manpage h1 {
522 padding-top: 0.5em;
523 padding-bottom: 0.5em;
524 border-top: 2px solid silver;
525 border-bottom: 2px solid silver;
527 body.manpage h2 {
528 border-style: none;
530 body.manpage div.sectionbody {
531 margin-left: 3em;
534 @media print {
535 body.manpage div#toc { display: none; }
539 </style>
540 <script type="text/javascript">
541 /*<![CDATA[*/
542 var asciidoc = { // Namespace.
544 /////////////////////////////////////////////////////////////////////
545 // Table Of Contents generator
546 /////////////////////////////////////////////////////////////////////
548 /* Author: Mihai Bazon, September 2002
549 * http://students.infoiasi.ro/~mishoo
551 * Table Of Content generator
552 * Version: 0.4
554 * Feel free to use this script under the terms of the GNU General Public
555 * License, as long as you do not remove or alter this notice.
558 /* modified by Troy D. Hanson, September 2006. License: GPL */
559 /* modified by Stuart Rackham, 2006, 2009. License: GPL */
561 // toclevels = 1..4.
562 toc: function (toclevels) {
564 function getText(el) {
565 var text = "";
566 for (var i = el.firstChild; i != null; i = i.nextSibling) {
567 if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
568 text += i.data;
569 else if (i.firstChild != null)
570 text += getText(i);
572 return text;
575 function TocEntry(el, text, toclevel) {
576 this.element = el;
577 this.text = text;
578 this.toclevel = toclevel;
581 function tocEntries(el, toclevels) {
582 var result = new Array;
583 var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
584 // Function that scans the DOM tree for header elements (the DOM2
585 // nodeIterator API would be a better technique but not supported by all
586 // browsers).
587 var iterate = function (el) {
588 for (var i = el.firstChild; i != null; i = i.nextSibling) {
589 if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
590 var mo = re.exec(i.tagName);
591 if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
592 result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
594 iterate(i);
598 iterate(el);
599 return result;
602 var toc = document.getElementById("toc");
603 if (!toc) {
604 return;
607 // Delete existing TOC entries in case we're reloading the TOC.
608 var tocEntriesToRemove = [];
609 var i;
610 for (i = 0; i < toc.childNodes.length; i++) {
611 var entry = toc.childNodes[i];
612 if (entry.nodeName.toLowerCase() == 'div'
613 && entry.getAttribute("class")
614 && entry.getAttribute("class").match(/^toclevel/))
615 tocEntriesToRemove.push(entry);
617 for (i = 0; i < tocEntriesToRemove.length; i++) {
618 toc.removeChild(tocEntriesToRemove[i]);
621 // Rebuild TOC entries.
622 var entries = tocEntries(document.getElementById("content"), toclevels);
623 for (var i = 0; i < entries.length; ++i) {
624 var entry = entries[i];
625 if (entry.element.id == "")
626 entry.element.id = "_toc_" + i;
627 var a = document.createElement("a");
628 a.href = "#" + entry.element.id;
629 a.appendChild(document.createTextNode(entry.text));
630 var div = document.createElement("div");
631 div.appendChild(a);
632 div.className = "toclevel" + entry.toclevel;
633 toc.appendChild(div);
635 if (entries.length == 0)
636 toc.parentNode.removeChild(toc);
640 /////////////////////////////////////////////////////////////////////
641 // Footnotes generator
642 /////////////////////////////////////////////////////////////////////
644 /* Based on footnote generation code from:
645 * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
648 footnotes: function () {
649 // Delete existing footnote entries in case we're reloading the footnotes.
650 var i;
651 var noteholder = document.getElementById("footnotes");
652 if (!noteholder) {
653 return;
655 var entriesToRemove = [];
656 for (i = 0; i < noteholder.childNodes.length; i++) {
657 var entry = noteholder.childNodes[i];
658 if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
659 entriesToRemove.push(entry);
661 for (i = 0; i < entriesToRemove.length; i++) {
662 noteholder.removeChild(entriesToRemove[i]);
665 // Rebuild footnote entries.
666 var cont = document.getElementById("content");
667 var spans = cont.getElementsByTagName("span");
668 var refs = {};
669 var n = 0;
670 for (i=0; i<spans.length; i++) {
671 if (spans[i].className == "footnote") {
672 n++;
673 var note = spans[i].getAttribute("data-note");
674 if (!note) {
675 // Use [\s\S] in place of . so multi-line matches work.
676 // Because JavaScript has no s (dotall) regex flag.
677 note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
678 spans[i].innerHTML =
679 "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
680 "' title='View footnote' class='footnote'>" + n + "</a>]";
681 spans[i].setAttribute("data-note", note);
683 noteholder.innerHTML +=
684 "<div class='footnote' id='_footnote_" + n + "'>" +
685 "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
686 n + "</a>. " + note + "</div>";
687 var id =spans[i].getAttribute("id");
688 if (id != null) refs["#"+id] = n;
691 if (n == 0)
692 noteholder.parentNode.removeChild(noteholder);
693 else {
694 // Process footnoterefs.
695 for (i=0; i<spans.length; i++) {
696 if (spans[i].className == "footnoteref") {
697 var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
698 href = href.match(/#.*/)[0]; // Because IE return full URL.
699 n = refs[href];
700 spans[i].innerHTML =
701 "[<a href='#_footnote_" + n +
702 "' title='View footnote' class='footnote'>" + n + "</a>]";
708 install: function(toclevels) {
709 var timerId;
711 function reinstall() {
712 asciidoc.footnotes();
713 if (toclevels) {
714 asciidoc.toc(toclevels);
718 function reinstallAndRemoveTimer() {
719 clearInterval(timerId);
720 reinstall();
723 timerId = setInterval(reinstall, 500);
724 if (document.addEventListener)
725 document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
726 else
727 window.onload = reinstallAndRemoveTimer;
731 asciidoc.install();
732 /*]]>*/
733 </script>
734 </head>
735 <body class="manpage">
736 <div id="header">
737 <h1>
738 git-filter-branch(1) Manual Page
739 </h1>
740 <h2>NAME</h2>
741 <div class="sectionbody">
742 <p>git-filter-branch -
743 Rewrite branches
744 </p>
745 </div>
746 </div>
747 <div id="content">
748 <div class="sect1">
749 <h2 id="_synopsis">SYNOPSIS</h2>
750 <div class="sectionbody">
751 <div class="verseblock">
752 <pre class="content"><em>git filter-branch</em> [--setup &lt;command&gt;] [--subdirectory-filter &lt;directory&gt;]
753 [--env-filter &lt;command&gt;] [--tree-filter &lt;command&gt;]
754 [--index-filter &lt;command&gt;] [--parent-filter &lt;command&gt;]
755 [--msg-filter &lt;command&gt;] [--commit-filter &lt;command&gt;]
756 [--tag-name-filter &lt;command&gt;] [--prune-empty]
757 [--original &lt;namespace&gt;] [-d &lt;directory&gt;] [-f | --force]
758 [--state-branch &lt;branch&gt;] [--] [&lt;rev-list-options&gt;&#8230;]</pre>
759 <div class="attribution">
760 </div></div>
761 </div>
762 </div>
763 <div class="sect1">
764 <h2 id="_warning">WARNING</h2>
765 <div class="sectionbody">
766 <div class="paragraph"><p><em>git filter-branch</em> has a plethora of pitfalls that can produce non-obvious
767 manglings of the intended history rewrite (and can leave you with little
768 time to investigate such problems since it has such abysmal performance).
769 These safety and performance issues cannot be fixed in a backward-compatible
770 way, so its use is not recommended. Please use an alternative history
771 filtering tool such as <a href="https://github.com/newren/git-filter-repo/">git
772 filter-repo</a>. If you still need to use <em>git filter-branch</em>, please
773 carefully read <a href="#SAFETY">[SAFETY]</a> (and <a href="#PERFORMANCE">[PERFORMANCE]</a>) to learn about the land
774 mines of filter-branch, and then vigilantly avoid as many of the hazards
775 listed there as reasonably possible.</p></div>
776 </div>
777 </div>
778 <div class="sect1">
779 <h2 id="_description">DESCRIPTION</h2>
780 <div class="sectionbody">
781 <div class="paragraph"><p>Lets you rewrite Git revision history by rewriting the branches mentioned
782 in the &lt;rev-list-options&gt;, applying custom filters on each revision.
783 Those filters can modify each tree (e.g. removing a file or running
784 a Perl rewrite on all files) or information about each commit.
785 Otherwise, all information (including original commit times or merge
786 information) will be preserved.</p></div>
787 <div class="paragraph"><p>The command will only rewrite the <em>positive</em> refs mentioned in the
788 command line (e.g. if you pass <em>a..b</em>, only <em>b</em> will be rewritten).
789 If you specify no filters, the commits will be recommitted without any
790 changes, which would normally have no effect. Nevertheless, this may be
791 useful in the future to compensate for some Git bugs, so such usage
792 is permitted.</p></div>
793 <div class="paragraph"><p><strong>NOTE</strong>: This command honors the <code>.git/info/grafts</code> file and refs in
794 the <code>refs/replace/</code> namespace.
795 If you have any grafts or replacement refs defined, running this command
796 will make them permanent.</p></div>
797 <div class="paragraph"><p><strong>WARNING</strong>! The rewritten history will have different object names for all
798 the objects and will not converge with the original branch. You will not
799 be able to easily push and distribute the rewritten branch on top of the
800 original branch. Please do not use this command if you do not know the
801 full implications, and avoid using it anyway, if a simple single commit
802 would suffice to fix your problem. (See the "RECOVERING FROM UPSTREAM
803 REBASE" section in <a href="git-rebase.html">git-rebase(1)</a> for further information about
804 rewriting published history.)</p></div>
805 <div class="paragraph"><p>Always verify that the rewritten version is correct: The original refs,
806 if different from the rewritten ones, will be stored in the namespace
807 <em>refs/original/</em>.</p></div>
808 <div class="paragraph"><p>Note that since this operation is very I/O intensive, it might
809 be a good idea to place the temporary directory off-disk with the
810 <code>-d</code> option, e.g. on tmpfs. Reportedly the speedup is very noticeable.</p></div>
811 <div class="sect2">
812 <h3 id="_filters">Filters</h3>
813 <div class="paragraph"><p>The filters are applied in the order listed below. The &lt;command&gt;
814 argument is always evaluated in the shell context using the <em>eval</em> command
815 (with the notable exception of the commit filter, for technical reasons).
816 Prior to that, the <code>$GIT_COMMIT</code> environment variable will be set to contain
817 the id of the commit being rewritten. Also, GIT_AUTHOR_NAME,
818 GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL,
819 and GIT_COMMITTER_DATE are taken from the current commit and exported to
820 the environment, in order to affect the author and committer identities of
821 the replacement commit created by <a href="git-commit-tree.html">git-commit-tree(1)</a> after the
822 filters have run.</p></div>
823 <div class="paragraph"><p>If any evaluation of &lt;command&gt; returns a non-zero exit status, the whole
824 operation will be aborted.</p></div>
825 <div class="paragraph"><p>A <em>map</em> function is available that takes an "original sha1 id" argument
826 and outputs a "rewritten sha1 id" if the commit has been already
827 rewritten, and "original sha1 id" otherwise; the <em>map</em> function can
828 return several ids on separate lines if your commit filter emitted
829 multiple commits.</p></div>
830 </div>
831 </div>
832 </div>
833 <div class="sect1">
834 <h2 id="_options">OPTIONS</h2>
835 <div class="sectionbody">
836 <div class="dlist"><dl>
837 <dt class="hdlist1">
838 --setup &lt;command&gt;
839 </dt>
840 <dd>
841 <p>
842 This is not a real filter executed for each commit but a one-time
843 setup run just before the loop. Therefore no commit-specific
844 variables are defined yet. Functions or variables defined here
845 can be used or modified in the following filter steps except
846 the commit filter, for technical reasons.
847 </p>
848 </dd>
849 <dt class="hdlist1">
850 --subdirectory-filter &lt;directory&gt;
851 </dt>
852 <dd>
853 <p>
854 Only look at the history which touches the given subdirectory.
855 The result will contain that directory (and only that) as its
856 project root. Implies <a href="#Remap_to_ancestor">[Remap_to_ancestor]</a>.
857 </p>
858 </dd>
859 <dt class="hdlist1">
860 --env-filter &lt;command&gt;
861 </dt>
862 <dd>
863 <p>
864 This filter may be used if you only need to modify the environment
865 in which the commit will be performed. Specifically, you might
866 want to rewrite the author/committer name/email/time environment
867 variables (see <a href="git-commit-tree.html">git-commit-tree(1)</a> for details).
868 </p>
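<div class="paragraph"><p>As a rough sketch of how such a filter behaves, the snippet below
dry-runs a hypothetical filter string outside of <em>git filter-branch</em>: it sets one
of the exported variables by hand and evaluates the string with <em>eval</em>. The
e-mail addresses and the <code>filter</code> variable name are assumptions made for
illustration only.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Hypothetical filter text, as it might be passed to --env-filter.
filter='
if test "$GIT_AUTHOR_EMAIL" = "old@example.com"
then
    GIT_AUTHOR_EMAIL="new@example.com"
    export GIT_AUTHOR_EMAIL
fi'

# Simulate the environment of one commit, then evaluate the filter.
GIT_AUTHOR_EMAIL="old@example.com"
eval "$filter"

# Show the value the replacement commit would be created with.
echo "$GIT_AUTHOR_EMAIL"</code></pre>
</div></div>
<div class="paragraph"><p>Once the dry run behaves as intended, the same string can be supplied
as the argument of <code>--env-filter</code>.</p></div>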
869 </dd>
870 <dt class="hdlist1">
871 --tree-filter &lt;command&gt;
872 </dt>
873 <dd>
874 <p>
875 This is the filter for rewriting the tree and its contents.
876 The argument is evaluated in the shell with the working
877 directory set to the root of the checked out tree. The new tree
878 is then used as-is (new files are auto-added, disappeared files
879 are auto-removed - neither .gitignore files nor any other ignore
880 rules <strong>HAVE ANY EFFECT</strong>!).
881 </p>
882 </dd>
883 <dt class="hdlist1">
884 --index-filter &lt;command&gt;
885 </dt>
886 <dd>
887 <p>
888 This is the filter for rewriting the index. It is similar to the
889 tree filter but does not check out the tree, which makes it much
890 faster. Frequently used with <code>git rm --cached
891 --ignore-unmatch ...</code>, see EXAMPLES below. For hairy
892 cases, see <a href="git-update-index.html">git-update-index(1)</a>.
893 </p>
894 </dd>
895 <dt class="hdlist1">
896 --parent-filter &lt;command&gt;
897 </dt>
898 <dd>
899 <p>
900 This is the filter for rewriting the commit&#8217;s parent list.
901 It will receive the parent string on stdin and shall output
902 the new parent string on stdout. The parent string is in
903 the format described in <a href="git-commit-tree.html">git-commit-tree(1)</a>: empty for
904 the initial commit, "-p parent" for a normal commit and
905 "-p parent1 -p parent2 -p parent3 &#8230;" for a merge commit.
906 </p>
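<div class="paragraph"><p>The parent string formats above can be explored without rewriting
anything. The sketch below pipes sample parent strings through a hypothetical
filter; <code>1234abcd</code> and <code>5678ef01</code> are placeholders rather than real
commit ids, and <em>echo</em> is assumed to be an adequate stand-in for the way the
string arrives on stdin.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Hypothetical filter text, as it might be passed to --parent-filter.
graft_filter='sed "s/^\$/-p 1234abcd/"'

# Initial commit: the parent string is empty, so a parent is added.
echo "" | eval "$graft_filter"

# Normal commit: the parent string passes through unchanged.
echo "-p 5678ef01" | eval "$graft_filter"</code></pre>
</div></div>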
907 </dd>
908 <dt class="hdlist1">
909 --msg-filter &lt;command&gt;
910 </dt>
911 <dd>
912 <p>
913 This is the filter for rewriting the commit messages.
914 The argument is evaluated in the shell with the original
915 commit message on standard input; its standard output is
916 used as the new commit message.
917 </p>
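<div class="paragraph"><p>Because the filter only reads standard input and writes standard
output, it can be tried on a sample message first. The following sketch assumes
a hypothetical filter that drops <code>git-svn-id:</code> lines; the message text is
invented for the example.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Hypothetical filter text, as it might be passed to --msg-filter.
msg_filter='sed -e "/^git-svn-id:/d"'

# Feed a sample commit message on standard input; the filtered
# message appears on standard output.
printf 'Fix the parser\n\ngit-svn-id: example-trailer\n' | eval "$msg_filter"</code></pre>
</div></div>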
918 </dd>
919 <dt class="hdlist1">
920 --commit-filter &lt;command&gt;
921 </dt>
922 <dd>
923 <p>
924 This is the filter for performing the commit.
925 If this filter is specified, it will be called instead of the
926 <em>git commit-tree</em> command, with arguments of the form
927 "&lt;TREE_ID&gt; [(-p &lt;PARENT_COMMIT_ID&gt;)&#8230;]" and the log message on
928 stdin. The commit id is expected on stdout.
929 </p>
930 <div class="paragraph"><p>As a special extension, the commit filter may emit multiple
931 commit ids; in that case, the rewritten children of the original commit will
932 have all of them as parents.</p></div>
933 <div class="paragraph"><p>You can use the <em>map</em> convenience function in this filter, and other
934 convenience functions, too. For example, calling <em>skip_commit "$@"</em>
935 will leave out the current commit (but not its changes! If you want
936 that, use <em>git rebase</em> instead).</p></div>
937 <div class="paragraph"><p>You can also use <code>git_commit_non_empty_tree "$@"</code> instead of
938 <code>git commit-tree "$@"</code> if you don&#8217;t wish to keep commits that have a single parent
939 and make no change to the tree.</p></div>
940 </dd>
941 <dt class="hdlist1">
942 --tag-name-filter &lt;command&gt;
943 </dt>
944 <dd>
945 <p>
946 This is the filter for rewriting tag names. When passed,
947 it will be called for every tag ref that points to a rewritten
948 object (or to a tag object which points to a rewritten object).
949 The original tag name is passed via standard input, and the new
950 tag name is expected on standard output.
951 </p>
952 <div class="paragraph"><p>The original tags are not deleted, but can be overwritten;
953 use "--tag-name-filter cat" to simply update the tags. In this
954 case, be very careful and make sure you have the old tags
955 backed up in case the conversion goes wrong.</p></div>
956 <div class="paragraph"><p>Nearly proper rewriting of tag objects is supported. If the tag has
957 a message attached, a new tag object will be created with the same message,
958 author, and timestamp. If the tag has a signature attached, the
959 signature will be stripped. It is by definition impossible to preserve
960 signatures. This is only "nearly" proper because, ideally, if
961 the tag did not change (points to the same object, has the same name, etc.)
962 it should retain any signature. That is not the case: signatures will always
963 be removed, so buyer beware. There is also no support for changing the
964 author or timestamp (or the tag message for that matter). Tags which point
965 to other tags will be rewritten to point to the underlying commit.</p></div>
966 </dd>
967 <dt class="hdlist1">
968 --prune-empty
969 </dt>
970 <dd>
971 <p>
972 Some filters will generate empty commits that leave the tree untouched.
973 This option instructs git-filter-branch to remove such commits if they
974 have exactly one or zero non-pruned parents; merge commits will
975 therefore remain intact. This option cannot be used together with
976 <code>--commit-filter</code>, though the same effect can be achieved by using the
977 provided <code>git_commit_non_empty_tree</code> function in a commit filter.
978 </p>
979 </dd>
980 <dt class="hdlist1">
981 --original &lt;namespace&gt;
982 </dt>
983 <dd>
984 <p>
985 Use this option to set the namespace where the original commits
986 will be stored. The default value is <em>refs/original</em>.
987 </p>
988 </dd>
989 <dt class="hdlist1">
990 -d &lt;directory&gt;
991 </dt>
992 <dd>
993 <p>
994 Use this option to set the path to the temporary directory used for
995 rewriting. When applying a tree filter, the command needs to
996 temporarily check out the tree to some directory, which may consume
997 considerable space in case of large projects. By default it
998 does this in the <code>.git-rewrite/</code> directory but you can override
999 that choice with this option.
1000 </p>
1001 </dd>
1002 <dt class="hdlist1">
1003 -f
1004 </dt>
1005 <dt class="hdlist1">
1006 --force
1007 </dt>
1008 <dd>
1009 <p>
1010 <em>git filter-branch</em> refuses to start with an existing temporary
1011 directory or when there are already refs starting with
1012 <em>refs/original/</em>, unless forced.
1013 </p>
1014 </dd>
1015 <dt class="hdlist1">
1016 --state-branch &lt;branch&gt;
1017 </dt>
1018 <dd>
1019 <p>
1020 This option will cause the mapping from old to new objects to
1021 be loaded from the named branch upon startup and saved as a new
1022 commit to that branch upon exit, enabling incremental rewriting of large
1023 trees. If <em>&lt;branch&gt;</em> does not exist it will be created.
1024 </p>
1025 </dd>
1026 <dt class="hdlist1">
1027 &lt;rev-list-options&gt;&#8230;
1028 </dt>
1029 <dd>
1030 <p>
1031 Arguments for <em>git rev-list</em>. All positive refs included by
1032 these options are rewritten. You may also specify options
1033 such as <code>--all</code>, but you must use <code>--</code> to separate them from
1034 the <em>git filter-branch</em> options. Implies <a href="#Remap_to_ancestor">[Remap_to_ancestor]</a>.
1035 </p>
1036 </dd>
1037 </dl></div>
1038 <div class="sect2">
1039 <h3 id="Remap_to_ancestor">Remap to ancestor</h3>
1040 <div class="paragraph"><p>By using <a href="git-rev-list.html">git-rev-list(1)</a> arguments, e.g., path limiters, you can limit the
1041 set of revisions which get rewritten. However, positive refs on the command
1042 line are distinguished: we don&#8217;t let them be excluded by such limiters. For
1043 this purpose, they are instead rewritten to point at the nearest ancestor that
1044 was not excluded.</p></div>
1045 </div>
1046 </div>
1047 </div>
1048 <div class="sect1">
1049 <h2 id="_exit_status">EXIT STATUS</h2>
1050 <div class="sectionbody">
1051 <div class="paragraph"><p>On success, the exit status is <code>0</code>. If the filter can&#8217;t find any commits to
1052 rewrite, the exit status is <code>2</code>. On any other error, the exit status may be
1053 any other non-zero value.</p></div>
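<div class="paragraph"><p>A calling script might distinguish these cases along the following
lines. This is only a sketch: <code>describe_status</code> is a hypothetical helper and
not part of Git, and the messages are arbitrary.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># describe_status is a hypothetical helper, not part of Git.
describe_status () {
    case "$1" in
    0) echo "rewrite succeeded" ;;
    2) echo "no commits to rewrite" ;;
    *) echo "failed with status $1" ;;
    esac
}

# Typical use (commented out so the sketch runs without a repository):
#   git filter-branch --prune-empty HEAD
#   describe_status "$?"
describe_status 2</code></pre>
</div></div>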
1054 </div>
1055 </div>
1056 <div class="sect1">
1057 <h2 id="_examples">EXAMPLES</h2>
1058 <div class="sectionbody">
1059 <div class="paragraph"><p>Suppose you want to remove a file (containing confidential information
1060 or a copyright violation) from all commits:</p></div>
1061 <div class="listingblock">
1062 <div class="content">
1063 <pre><code>git filter-branch --tree-filter 'rm filename' HEAD</code></pre>
1064 </div></div>
1065 <div class="paragraph"><p>However, if the file is absent from the tree of some commit,
1066 a simple <code>rm filename</code> will fail for that tree and commit.
1067 Thus you may instead want to use <code>rm -f filename</code> as the script.</p></div>
1068 <div class="paragraph"><p>Using <code>--index-filter</code> with <em>git rm</em> yields a significantly faster
1069 version. As with <code>rm filename</code>, <code>git rm --cached filename</code>
1070 will fail if the file is absent from the tree of a commit. If you
1071 want to "completely forget" a file, it does not matter when it entered
1072 history, so we also add <code>--ignore-unmatch</code>:</p></div>
1073 <div class="listingblock">
1074 <div class="content">
1075 <pre><code>git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD</code></pre>
1076 </div></div>
1077 <div class="paragraph"><p>Now, you will get the rewritten history saved in HEAD.</p></div>
1078 <div class="paragraph"><p>To rewrite the repository to look as if <code>foodir/</code> had been its project
1079 root, and discard all other history:</p></div>
1080 <div class="listingblock">
1081 <div class="content">
1082 <pre><code>git filter-branch --subdirectory-filter foodir -- --all</code></pre>
1083 </div></div>
1084 <div class="paragraph"><p>Thus you can, e.g., turn a library subdirectory into a repository of
1085 its own. Note the <code>--</code> that separates <em>filter-branch</em> options from
1086 revision options, and the <code>--all</code> to rewrite all branches and tags.</p></div>
1087 <div class="paragraph"><p>To set a commit (which typically is at the tip of another
1088 history) to be the parent of the current initial commit, in
1089 order to paste the other history behind the current history:</p></div>
1090 <div class="listingblock">
1091 <div class="content">
1092 <pre><code>git filter-branch --parent-filter 'sed "s/^\$/-p &lt;graft-id&gt;/"' HEAD</code></pre>
1093 </div></div>
<div class="paragraph"><p>(If the parent string is empty, which happens when we are dealing with
the initial commit, add <code>&lt;graft-id&gt;</code> as a parent.) Note that this assumes
history with a single root (that is, no merge without common ancestors
happened). If this is not the case, use:</p></div>
1098 <div class="listingblock">
1099 <div class="content">
<pre><code>git filter-branch --parent-filter \
        'test $GIT_COMMIT = &lt;commit-id&gt; &amp;&amp; echo "-p &lt;graft-id&gt;" || cat' HEAD</code></pre>
1102 </div></div>
1103 <div class="paragraph"><p>or even simpler:</p></div>
1104 <div class="listingblock">
1105 <div class="content">
<pre><code>git replace --graft &lt;commit-id&gt; &lt;graft-id&gt;
git filter-branch &lt;graft-id&gt;..HEAD</code></pre>
1108 </div></div>
1109 <div class="paragraph"><p>To remove commits authored by "Darl McBribe" from the history:</p></div>
1110 <div class="listingblock">
1111 <div class="content">
<pre><code>git filter-branch --commit-filter '
        if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ];
        then
                skip_commit "$@";
        else
                git commit-tree "$@";
        fi' HEAD</code></pre>
1119 </div></div>
1120 <div class="paragraph"><p>The function <em>skip_commit</em> is defined as follows:</p></div>
1121 <div class="listingblock">
1122 <div class="content">
<pre><code>skip_commit()
{
        shift;
        while [ -n "$1" ];
        do
                shift;
                map "$1";
                shift;
        done;
}</code></pre>
1133 </div></div>
1134 <div class="paragraph"><p>The shift magic first throws away the tree id and then the -p
1135 parameters. Note that this handles merges properly! In case Darl
1136 committed a merge between P1 and P2, it will be propagated properly
1137 and all children of the merge will become merge commits with P1,P2
1138 as their parents instead of the merge commit.</p></div>
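<div class="paragraph"><p>The shifting can be traced outside of git-filter-branch with a stand-in
for <em>map</em>. In this sketch the <code>map</code> stub and the ids <code>TREE</code>, <code>P1</code> and <code>P2</code> are
hypothetical; the real <em>map</em> prints the rewritten id of a commit.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Stand-in for the real map function (assumption: it just tags the id).
map()
{
        echo "new-$1"
}

skip_commit()
{
        shift;                  # drop the tree id
        while [ -n "$1" ];
        do
                shift;          # drop the -p
                map "$1";       # print the mapped parent
                shift;          # drop the parent id
        done;
}

# A merge is handed to the commit filter as a tree id plus -p parent pairs.
skip_commit TREE -p P1 -p P2</code></pre>
</div></div>
<div class="paragraph"><p>The call prints one mapped parent per line, which is what the children of
the skipped commit then receive as their parents.</p></div>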
1139 <div class="paragraph"><p><strong>NOTE</strong> the changes introduced by the commits, and which are not reverted
1140 by subsequent commits, will still be in the rewritten branch. If you want
1141 to throw out <em>changes</em> together with the commits, you should use the
1142 interactive mode of <em>git rebase</em>.</p></div>
1143 <div class="paragraph"><p>You can rewrite the commit log messages using <code>--msg-filter</code>. For
1144 example, <em>git svn-id</em> strings in a repository created by <em>git svn</em> can
1145 be removed this way:</p></div>
1146 <div class="listingblock">
1147 <div class="content">
<pre><code>git filter-branch --msg-filter '
        sed -e "/^git-svn-id:/d"
'</code></pre>
1151 </div></div>
1152 <div class="paragraph"><p>If you need to add <em>Acked-by</em> lines to, say, the last 10 commits (none
1153 of which is a merge), use this command:</p></div>
1154 <div class="listingblock">
1155 <div class="content">
<pre><code>git filter-branch --msg-filter '
        cat &amp;&amp;
        echo "Acked-by: Bugs Bunny &lt;bunny@bugzilla.org&gt;"
' HEAD~10..HEAD</code></pre>
1160 </div></div>
1161 <div class="paragraph"><p>The <code>--env-filter</code> option can be used to modify committer and/or author
1162 identity. For example, if you found out that your commits have the wrong
1163 identity due to a misconfigured user.email, you can make a correction,
1164 before publishing the project, like this:</p></div>
1165 <div class="listingblock">
1166 <div class="content">
<pre><code>git filter-branch --env-filter '
        if test "$GIT_AUTHOR_EMAIL" = "root@localhost"
        then
                GIT_AUTHOR_EMAIL=john@example.com
        fi
        if test "$GIT_COMMITTER_EMAIL" = "root@localhost"
        then
                GIT_COMMITTER_EMAIL=john@example.com
        fi
' -- --all</code></pre>
1177 </div></div>
1178 <div class="paragraph"><p>To restrict rewriting to only part of the history, specify a revision
1179 range in addition to the new branch name. The new branch name will
1180 point to the top-most revision that a <em>git rev-list</em> of this range
1181 will print.</p></div>
1182 <div class="paragraph"><p>Consider this history:</p></div>
1183 <div class="listingblock">
1184 <div class="content">
<pre><code>     D--E--F--G--H
    /     /
A--B-----C</code></pre>
1188 </div></div>
1189 <div class="paragraph"><p>To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:</p></div>
1190 <div class="listingblock">
1191 <div class="content">
1192 <pre><code>git filter-branch ... C..H</code></pre>
1193 </div></div>
1194 <div class="paragraph"><p>To rewrite commits E,F,G,H, use one of these:</p></div>
1195 <div class="listingblock">
1196 <div class="content">
<pre><code>git filter-branch ... C..H --not D
git filter-branch ... D..H --not C</code></pre>
1199 </div></div>
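<div class="paragraph"><p>Since the range is handed to <em>git rev-list</em>, you can preview which commits
would be rewritten before running the filter. The sketch below builds a
throwaway history of empty commits; it assumes that D branches off B and that
C is merged in at F, and the helper <code>mk</code> and the identity values are
hypothetical.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Throwaway repository; the identity values are placeholders.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: make an empty commit and tag it with its own name.
mk()
{
        git commit -q --allow-empty -m "$1"
        git tag "$1"
}

mk A; mk B
git checkout -q -b topic        # D and E grow from B
mk D; mk E
git checkout -q -               # back to the initial branch
mk C
git checkout -q topic
git merge -q --no-ff -m F C     # F merges C into the upper line
git tag F
mk G; mk H

git rev-list --count C..H           # prints 5 (D, E, F, G, H)
git rev-list --count C..H --not D   # prints 4 (E, F, G, H)
git rev-list --count D..H --not C   # prints 4 (E, F, G, H)</code></pre>
</div></div>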
1200 <div class="paragraph"><p>To move the whole tree into a subdirectory, or remove it from there:</p></div>
1201 <div class="listingblock">
1202 <div class="content">
<pre><code>git filter-branch --index-filter \
        'git ls-files -s | sed "s-\t\"*-&amp;newsubdir/-" |
                GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                        git update-index --index-info &amp;&amp;
         mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD</code></pre>
1208 </div></div>
1209 </div>
1210 </div>
1211 <div class="sect1">
1212 <h2 id="_checklist_for_shrinking_a_repository">CHECKLIST FOR SHRINKING A REPOSITORY</h2>
1213 <div class="sectionbody">
1214 <div class="paragraph"><p>git-filter-branch can be used to get rid of a subset of files,
1215 usually with some combination of <code>--index-filter</code> and
1216 <code>--subdirectory-filter</code>. People expect the resulting repository to
1217 be smaller than the original, but you need a few more steps to
1218 actually make it smaller, because Git tries hard not to lose your
1219 objects until you tell it to. First make sure that:</p></div>
1220 <div class="ulist"><ul>
<li>
<p>
1223 You really removed all variants of a filename, if a blob was moved
1224 over its lifetime. <code>git log --name-only --follow --all -- filename</code>
1225 can help you find renames.
1226 </p>
1227 </li>
<li>
<p>
1230 You really filtered all refs: use <code>--tag-name-filter cat -- --all</code>
1231 when calling git-filter-branch.
1232 </p>
1233 </li>
1234 </ul></div>
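<div class="paragraph"><p>The second point can be tried out on a throwaway repository. This is a
minimal sketch: the file names, the tag <code>v1</code> and the identity values are
hypothetical, and <code>FILTER_BRANCH_SQUELCH_WARNING</code> is set only to skip the
start-up warning.</p></div>
<div class="listingblock">
<div class="content">
<pre><code>export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

# One commit with a file to keep and a file to forget, plus an annotated tag.
touch keep.txt secret.txt
git add keep.txt secret.txt
git commit -q -m "add files"
git tag -a -m "release" v1

# Rewrite every ref and carry the tag over under its own name.
git filter-branch \
        --index-filter 'git rm -q --cached --ignore-unmatch secret.txt' \
        --tag-name-filter cat -- --all

git cat-file -t v1               # prints: tag (still annotated)
git ls-tree -r --name-only v1    # prints: keep.txt</code></pre>
</div></div>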
<div class="paragraph"><p>Then there are two ways to get a smaller repository. The safer way is
to clone, which keeps your original intact.</p></div>
1237 <div class="ulist"><ul>
<li>
<p>
1240 Clone it with <code>git clone file:///path/to/repo</code>. The clone
1241 will not have the removed objects. See <a href="git-clone.html">git-clone(1)</a>. (Note
1242 that cloning with a plain path just hardlinks everything!)
1243 </p>
1244 </li>
1245 </ul></div>
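<div class="paragraph"><p>The difference between the two ways of cloning can be observed with an
object that no ref reaches. In this sketch the directory names, the helper
<code>has_blob</code> and the blob contents are hypothetical; the unreferenced blob
stands in for a file that a rewrite removed.</p></div>
<div class="listingblock">
<div class="content">
<pre><code>src=$(mktemp -d)
dst=$(mktemp -d)
git init -q "$src"
git -C "$src" -c user.name=Demo -c user.email=demo@example.com \
        commit -q --allow-empty -m initial

# Store a blob that no commit references.
blob=$(echo "removed content" | git -C "$src" hash-object -w --stdin)

git clone -q "file://$src" "$dst/via-transport"
git clone -q "$src" "$dst/via-path"

# Hypothetical helper: report whether a repository still holds the blob.
has_blob()
{
        if git -C "$1" cat-file -e "$blob"
        then
                echo "$1: blob still present"
        else
                echo "$1: blob gone"
        fi
}

has_blob "$dst/via-transport"
has_blob "$dst/via-path"</code></pre>
</div></div>
<div class="paragraph"><p>Only the <code>file://</code> clone goes through the normal transport, which sends
just the objects reachable from refs.</p></div>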
1246 <div class="paragraph"><p>If you really don&#8217;t want to clone it, for whatever reasons, check the
1247 following points instead (in this order). This is a very destructive
1248 approach, so <strong>make a backup</strong> or go back to cloning it. You have been
1249 warned.</p></div>
1250 <div class="ulist"><ul>
<li>
<p>
Remove the original refs backed up by git-filter-branch: say <code>git
for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git
update-ref -d</code>.
1256 </p>
1257 </li>
<li>
<p>
1260 Expire all reflogs with <code>git reflog expire --expire=now --all</code>.
1261 </p>
1262 </li>
<li>
<p>
1265 Garbage collect all unreferenced objects with <code>git gc --prune=now</code>
1266 (or if your git-gc is not new enough to support arguments to
1267 <code>--prune</code>, use <code>git repack -ad; git prune</code> instead).
1268 </p>
1269 </li>
1270 </ul></div>
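<div class="paragraph"><p>The three steps above, in order, might be wrapped as follows. The function
name <code>shrink_in_place</code> is hypothetical, and a <code>while read</code> loop is used in
place of <code>xargs</code> so that an empty <code>refs/original/</code> is simply a no-op. It is
just as destructive as the individual commands, so run it only on a repository
you have backed up.</p></div>
<div class="listingblock">
<div class="content">
<pre><code># Hypothetical helper; $1 is the path of the repository to shrink.
shrink_in_place()
{
        # 1. Remove the refs backed up by git-filter-branch.
        git -C "$1" for-each-ref --format="%(refname)" refs/original/ |
        while read -r ref
        do
                git -C "$1" update-ref -d "$ref"
        done
        # 2. Expire all reflogs.
        git -C "$1" reflog expire --expire=now --all
        # 3. Garbage collect all unreferenced objects.
        git -C "$1" gc --quiet --prune=now
}

# Usage: shrink_in_place /path/to/repo</code></pre>
</div></div>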
1271 </div>
1272 </div>
1273 <div class="sect1">
1274 <h2 id="PERFORMANCE">PERFORMANCE</h2>
1275 <div class="sectionbody">
1276 <div class="paragraph"><p>The performance of git-filter-branch is glacially slow; its design makes it
1277 impossible for a backward-compatible implementation to ever be fast:</p></div>
1278 <div class="ulist"><ul>
<li>
<p>
1281 In editing files, git-filter-branch by design checks out each and
1282 every commit as it existed in the original repo. If your repo has
1283 <code>10^5</code> files and <code>10^5</code> commits, but each commit only modifies five
1284 files, then git-filter-branch will make you do <code>10^10</code> modifications,
1285 despite only having (at most) <code>5*10^5</code> unique blobs.
1286 </p>
1287 </li>
<li>
<p>
If you try to cheat by making git-filter-branch work only on
files modified in a commit, then two things happen:
1292 </p>
1293 <div class="ulist"><ul>
<li>
<p>
1296 you run into problems with deletions whenever the user is simply
1297 trying to rename files (because attempting to delete files that
1298 don&#8217;t exist looks like a no-op; it takes some chicanery to remap
1299 deletes across file renames when the renames happen via arbitrary
1300 user-provided shell)
1301 </p>
1302 </li>
<li>
<p>
1305 even if you succeed at the map-deletes-for-renames chicanery, you
1306 still technically violate backward compatibility because users
1307 are allowed to filter files in ways that depend upon topology of
1308 commits instead of filtering solely based on file contents or
1309 names (though this has not been observed in the wild).
1310 </p>
1311 </li>
1312 </ul></div>
1313 </li>
<li>
<p>
1316 Even if you don&#8217;t need to edit files but only want to e.g. rename or
1317 remove some and thus can avoid checking out each file (i.e. you can
1318 use --index-filter), you still are passing shell snippets for your
1319 filters. This means that for every commit, you have to have a
1320 prepared git repo where those filters can be run. That&#8217;s a
1321 significant setup.
1322 </p>
1323 </li>
<li>
<p>
1326 Further, several additional files are created or updated per commit
1327 by git-filter-branch. Some of these are for supporting the
1328 convenience functions provided by git-filter-branch (such as map()),
1329 while others are for keeping track of internal state (but could have
1330 also been accessed by user filters; one of git-filter-branch&#8217;s
1331 regression tests does so). This essentially amounts to using the
1332 filesystem as an IPC mechanism between git-filter-branch and the
1333 user-provided filters. Disks tend to be a slow IPC mechanism, and
1334 writing these files also effectively represents a forced
1335 synchronization point between separate processes that we hit with
1336 every commit.
1337 </p>
1338 </li>
<li>
<p>
1341 The user-provided shell commands will likely involve a pipeline of
1342 commands, resulting in the creation of many processes per commit.
1343 Creating and running another process takes a widely varying amount
1344 of time between operating systems, but on any platform it is very
1345 slow relative to invoking a function.
1346 </p>
1347 </li>
<li>
<p>
1350 git-filter-branch itself is written in shell, which is kind of slow.
1351 This is the one performance issue that could be backward-compatibly
1352 fixed, but compared to the above problems that are intrinsic to the
1353 design of git-filter-branch, the language of the tool itself is a
1354 relatively minor issue.
1355 </p>
1356 <div class="ulist"><ul>
<li>
<p>
1359 Side note: Unfortunately, people tend to fixate on the
1360 written-in-shell aspect and periodically ask if git-filter-branch
1361 could be rewritten in another language to fix the performance
1362 issues. Not only does that ignore the bigger intrinsic problems
1363 with the design, it&#8217;d help less than you&#8217;d expect: if
1364 git-filter-branch itself were not shell, then the convenience
1365 functions (map(), skip_commit(), etc) and the <code>--setup</code> argument
1366 could no longer be executed once at the beginning of the program
1367 but would instead need to be prepended to every user filter (and
1368 thus re-executed with every commit).
1369 </p>
1370 </li>
1371 </ul></div>
1372 </li>
1373 </ul></div>
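<div class="paragraph"><p>The arithmetic in the first point of the list above can be checked
directly. The variable names are hypothetical; the figures are the ones from
the text.</p></div>
<div class="listingblock">
<div class="content">
<pre><code>files=100000      # 10^5 files present in every checkout
commits=100000    # 10^5 commits in the history
changed=5         # files actually modified by each commit

echo "files handled across all checkouts: $((files * commits))"
echo "upper bound on unique blobs:        $((changed * commits))"
echo "wasted work factor:                 $((files / changed))"</code></pre>
</div></div>
<div class="paragraph"><p>That is <code>10^10</code> file visits for at most <code>5*10^5</code> distinct blobs, a factor
of 20000.</p></div>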
1374 <div class="paragraph"><p>The <a href="https://github.com/newren/git-filter-repo/">git filter-repo</a> tool is
1375 an alternative to git-filter-branch which does not suffer from these
1376 performance problems or the safety problems (mentioned below). For those
1377 with existing tooling which relies upon git-filter-branch, <em>git
1378 filter-repo</em> also provides
1379 <a href="https://github.com/newren/git-filter-repo/blob/master/contrib/filter-repo-demos/filter-lamely">filter-lamely</a>,
1380 a drop-in git-filter-branch replacement (with a few caveats). While
1381 filter-lamely suffers from all the same safety issues as
1382 git-filter-branch, it at least ameliorates the performance issues a
1383 little.</p></div>
1384 </div>
1385 </div>
1386 <div class="sect1">
1387 <h2 id="SAFETY">SAFETY</h2>
1388 <div class="sectionbody">
1389 <div class="paragraph"><p>git-filter-branch is riddled with gotchas resulting in various ways to
1390 easily corrupt repos or end up with a mess worse than what you started
1391 with:</p></div>
1392 <div class="ulist"><ul>
<li>
<p>
1395 Someone can have a set of "working and tested filters" which they
1396 document or provide to a coworker, who then runs them on a different
1397 OS where the same commands are not working/tested (some examples in
1398 the git-filter-branch manpage are also affected by this).
1399 BSD vs. GNU userland differences can really bite. If lucky, error
1400 messages are spewed. But just as likely, the commands either don&#8217;t
1401 do the filtering requested, or silently corrupt by making some
1402 unwanted change. The unwanted change may only affect a few commits,
1403 so it&#8217;s not necessarily obvious either. (The fact that problems
1404 won&#8217;t necessarily be obvious means they are likely to go unnoticed
1405 until the rewritten history is in use for quite a while, at which
1406 point it&#8217;s really hard to justify another flag-day for another
1407 rewrite.)
1408 </p>
1409 </li>
<li>
<p>
1412 Filenames with spaces are often mishandled by shell snippets since
1413 they cause problems for shell pipelines. Not everyone is familiar
1414 with find -print0, xargs -0, git-ls-files -z, etc. Even people who
1415 are familiar with these may assume such flags are not relevant
1416 because someone else renamed any such files in their repo back
1417 before the person doing the filtering joined the project. And
1418 often, even those familiar with handling arguments with spaces may
1419 not do so just because they aren&#8217;t in the mindset of thinking about
1420 everything that could possibly go wrong.
1421 </p>
1422 </li>
<li>
<p>
Non-ASCII filenames can be silently removed despite being in a
1426 desired directory. Keeping only wanted paths is often done using
1427 pipelines like <code>git ls-files | grep -v ^WANTED_DIR/ | xargs git rm</code>.
1428 ls-files will only quote filenames if needed, so folks may not
1429 notice that one of the files didn&#8217;t match the regex (at least not
1430 until it&#8217;s much too late). Yes, someone who knows about
1431 core.quotePath can avoid this (unless they have other special
1432 characters like \t, \n, or "), and people who use ls-files -z with
1433 something other than grep can avoid this, but that doesn&#8217;t mean they
1434 will.
1435 </p>
1436 </li>
<li>
<p>
Similarly, when moving files around, one can find that filenames
with non-ASCII or special characters end up in a different
1441 directory, one that includes a double quote character. (This is
1442 technically the same issue as above with quoting, but perhaps an
1443 interesting different way that it can and has manifested as a
1444 problem.)
1445 </p>
1446 </li>
<li>
<p>
1449 It&#8217;s far too easy to accidentally mix up old and new history. It&#8217;s
1450 still possible with any tool, but git-filter-branch almost
1451 invites it. If lucky, the only downside is users getting frustrated
1452 that they don&#8217;t know how to shrink their repo and remove the old
1453 stuff. If unlucky, they merge old and new history and end up with
1454 multiple "copies" of each commit, some of which have unwanted or
1455 sensitive files and others which don&#8217;t. This comes about in
1456 multiple different ways:
1457 </p>
1458 <div class="ulist"><ul>
<li>
<p>
1461 the default to only doing a partial history rewrite (<em>--all</em> is not
1462 the default and few examples show it)
1463 </p>
1464 </li>
<li>
<p>
1467 the fact that there&#8217;s no automatic post-run cleanup
1468 </p>
1469 </li>
<li>
<p>
1472 the fact that --tag-name-filter (when used to rename tags) doesn&#8217;t
1473 remove the old tags but just adds new ones with the new name
1474 </p>
1475 </li>
<li>
<p>
1478 the fact that little educational information is provided to inform
1479 users of the ramifications of a rewrite and how to avoid mixing old
1480 and new history. For example, this man page discusses how users
1481 need to understand that they need to rebase their changes for all
1482 their branches on top of new history (or delete and reclone), but
1483 that&#8217;s only one of multiple concerns to consider. See the
1484 "DISCUSSION" section of the git filter-repo manual page for more
1485 details.
1486 </p>
1487 </li>
1488 </ul></div>
1489 </li>
<li>
<p>
1492 Annotated tags can be accidentally converted to lightweight tags,
1493 due to either of two issues:
1494 </p>
1495 <div class="ulist"><ul>
<li>
<p>
1498 Someone can do a history rewrite, realize they messed up, restore
1499 from the backups in refs/original/, and then redo their
1500 git-filter-branch command. (The backup in refs/original/ is not a
1501 real backup; it dereferences tags first.)
1502 </p>
1503 </li>
<li>
<p>
1506 Running git-filter-branch with either --tags or --all in your
1507 &lt;rev-list-options&gt;. In order to retain annotated tags as
1508 annotated, you must use --tag-name-filter (and must not have
1509 restored from refs/original/ in a previously botched rewrite).
1510 </p>
1511 </li>
1512 </ul></div>
1513 </li>
<li>
<p>
Any commit messages that specify an encoding will become corrupted
by the rewrite; git-filter-branch ignores the encoding, takes the
original bytes, and feeds them to commit-tree without telling it the
proper encoding. (This happens whether or not --msg-filter is
used.)
1521 </p>
1522 </li>
<li>
<p>
1525 Commit messages (even if they are all UTF-8) by default become
1526 corrupted due to not being updated&#8201;&#8212;&#8201;any references to other commit
1527 hashes in commit messages will now refer to no-longer-extant
1528 commits.
1529 </p>
1530 </li>
<li>
<p>
1533 There are no facilities for helping users find what unwanted crud
1534 they should delete, which means they are much more likely to have
1535 incomplete or partial cleanups that sometimes result in confusion
1536 and people wasting time trying to understand. (For example, folks
1537 tend to just look for big files to delete instead of big directories
1538 or extensions, and once they do so, then sometime later folks using
1539 the new repository who are going through history will notice a build
1540 artifact directory that has some files but not others, or a cache of
1541 dependencies (node_modules or similar) which couldn&#8217;t have ever been
1542 functional since it&#8217;s missing some files.)
1543 </p>
1544 </li>
<li>
<p>
If --prune-empty isn&#8217;t specified, then the filtering process can
create hordes of confusing empty commits.
1549 </p>
1550 </li>
<li>
<p>
1553 If --prune-empty is specified, then intentionally placed empty
1554 commits from before the filtering operation are also pruned instead
1555 of just pruning commits that became empty due to filtering rules.
1556 </p>
1557 </li>
<li>
<p>
1560 If --prune-empty is specified, sometimes empty commits are missed
1561 and left around anyway (a somewhat rare bug, but it happens&#8230;)
1562 </p>
1563 </li>
<li>
<p>
1566 A minor issue, but users who have a goal to update all names and
1567 emails in a repository may be led to --env-filter which will only
1568 update authors and committers, missing taggers.
1569 </p>
1570 </li>
<li>
<p>
1573 If the user provides a --tag-name-filter that maps multiple tags to
1574 the same name, no warning or error is provided; git-filter-branch
1575 simply overwrites each tag in some undocumented pre-defined order
1576 resulting in only one tag at the end. (A git-filter-branch
1577 regression test requires this surprising behavior.)
1578 </p>
1579 </li>
1580 </ul></div>
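<div class="paragraph"><p>The quoting pitfall with non-ASCII filenames described in the list above
can be reproduced in a throwaway repository. In this sketch the file names are
hypothetical, <code>core.quotePath</code> is set explicitly to its default, and the
<code>tr</code> step assumes that no filename contains a newline.</p></div>
<div class="listingblock">
<div class="content">
<pre><code>repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir WANTED_DIR
name=$(printf 'WANTED_DIR/caf\303\251.txt')   # UTF-8 bytes for an accented e
touch "$name" WANTED_DIR/plain.txt
git add WANTED_DIR

# Default output wraps the non-ASCII name in double quotes, so the line no
# longer starts with WANTED_DIR/ and grep selects it for removal.
git -c core.quotePath=true ls-files | grep -v '^WANTED_DIR/'

# NUL-terminated output is never quoted, so nothing is selected here;
# grep -c prints the number of selected lines.
git ls-files -z | tr '\0' '\n' | grep -c -v '^WANTED_DIR/'</code></pre>
</div></div>
<div class="paragraph"><p>Feeding the first pipeline into <code>xargs git rm</code> would therefore try to
remove a file that was meant to be kept.</p></div>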
1581 <div class="paragraph"><p>Also, the poor performance of git-filter-branch often leads to safety
1582 issues:</p></div>
1583 <div class="ulist"><ul>
<li>
<p>
Coming up with the correct shell snippet to do the filtering you
want is sometimes difficult unless you&#8217;re just doing a trivial
modification such as deleting a couple of files. Unfortunately, people
often learn if the snippet is right or wrong by trying it out, but
the rightness or wrongness can vary depending on special
circumstances (spaces in filenames, non-ASCII filenames, funny
author names or emails, invalid timezones, presence of grafts or
replace objects, etc.), meaning they may have to wait a long time,
hit an error, then restart. The performance of git-filter-branch is
so bad that this cycle is painful, reducing the time available to
carefully re-check (to say nothing about what it does to the
patience of the person doing the rewrite even if they do technically
have more time available). This problem is compounded because
errors from broken filters may not be shown for a long time and/or
get lost in a sea of output. Even worse, broken filters often just
result in silent incorrect rewrites.
1602 </p>
1603 </li>
<li>
<p>
1606 To top it all off, even when users finally find working commands,
1607 they naturally want to share them. But they may be unaware that
1608 their repo didn&#8217;t have some special cases that someone else&#8217;s does.
1609 So, when someone else with a different repository runs the same
1610 commands, they get hit by the problems above. Or, the user just
1611 runs commands that really were vetted for special cases, but they
1612 run it on a different OS where it doesn&#8217;t work, as noted above.
1613 </p>
1614 </li>
1615 </ul></div>
1616 </div>
1617 </div>
1618 <div class="sect1">
1619 <h2 id="_git">GIT</h2>
1620 <div class="sectionbody">
1621 <div class="paragraph"><p>Part of the <a href="git.html">git(1)</a> suite</p></div>
1622 </div>
1623 </div>
1624 </div>
1625 <div id="footnotes"><hr /></div>
1626 <div id="footer">
1627 <div id="footer-text">
1628 Last updated
1629 2024-02-08 15:45:59 PST
1630 </div>
1631 </div>
1632 </body>
1633 </html>