@@ -9,21 +9,21 @@ GIT index format
99 - A 12-byte header consisting of
1010
1111 4-byte signature:
12- The signature is { 'D', 'I', 'R', 'C' }
12+ The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
1313
1414 4-byte version number:
1515 The currently supported versions are 2 and 3.
1616
1717 32-bit number of index entries.
1818
19- - A number of sorted index entries
19+ - A number of sorted index entries (see below).
2020
2121 - Extensions
2222
2323 Extensions are identified by signature. Optional extensions can
2424 be ignored if GIT does not understand them.
2525
26- GIT currently supports tree cache and resolve undo extensions.
26+ GIT currently supports cached tree and resolve undo extensions.
2727
2828 4-byte extension signature. If the first byte is 'A'..'Z' the
2929 extension is optional and can be ignored.
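A minimal sketch of reading the 12-byte header and classifying an extension signature, as described above. The function names are hypothetical, and the sketch assumes the integers are stored in network (big-endian) byte order, which this excerpt does not state.

```python
import struct

def parse_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index file."""
    # Assumption: fields are big-endian; ">4sII" = 4-byte signature + two 32-bit ints.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature: %r" % signature)
    if version not in (2, 3):
        raise ValueError("unsupported version: %d" % version)
    return version, entry_count

def extension_is_optional(signature: bytes) -> bool:
    """An extension is optional if its first signature byte is 'A'..'Z'."""
    return b"A" <= signature[:1] <= b"Z"
```

A reader that meets an unknown extension could call `extension_is_optional` to decide whether to skip it or to refuse the file.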
@@ -38,8 +38,9 @@ GIT index format
3838== Index entry
3939
4040 Index entries are sorted in ascending order on the name field,
41- interpreted as a string of unsigned bytes. Entries with the same
42- name are sorted by their stage field.
41+ interpreted as a string of unsigned bytes (i.e. memcmp() order, no
42+ localization, no special casing of directory separator '/'). Entries
43+ with the same name are sorted by their stage field.
4344
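The ordering rule can be illustrated with a short sketch. Python compares `bytes` objects as sequences of unsigned bytes, which matches the memcmp()-style order described above; the `Entry` type and `sort_entries` helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    name: bytes   # raw path bytes, no locale-aware handling
    stage: int    # 0..3

def sort_entries(entries):
    """Sort by name as unsigned bytes, then by stage for equal names."""
    return sorted(entries, key=lambda e: (e.name, e.stage))
```

Because '/' (0x2F) gets no special treatment, "a/b" sorts after "a-b" (0x2D) and "a.b" (0x2E) but before "a0" (0x30).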
4445 32-bit ctime seconds, the last time a file's metadata changed
4546 this is stat(2) data
@@ -62,12 +63,13 @@ GIT index format
6263 32-bit mode, split into (high to low bits)
6364
6465 4-bit object type
65- valid values in binary are 1000 (blob ), 1010 (symbolic link)
66+ valid values in binary are 1000 (regular file), 1010 (symbolic link)
6667 and 1110 (gitlink)
6768
6869 3-bit unused
6970
70- 9-bit unix permission (only 0755 and 0644 are valid)
71+ 9-bit unix permission. Only 0755 and 0644 are valid for regular files.
72+ Symbolic links and gitlinks have value 0 in this field.
7173
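As a rough sketch of the bit layout above, the mode can be split with shifts and masks. This assumes the 4+3+9 bits occupy the low 16 bits of the 32-bit field; the function names are hypothetical.

```python
REGULAR_FILE = 0b1000
SYMBOLIC_LINK = 0b1010
GITLINK = 0b1110

def split_mode(mode: int):
    """Return (object_type, unused, permission) from a 32-bit mode value."""
    object_type = (mode >> 12) & 0xF   # 4-bit object type
    unused = (mode >> 9) & 0x7         # 3-bit unused
    permission = mode & 0x1FF          # 9-bit unix permission
    return object_type, unused, permission

def mode_is_valid(mode: int) -> bool:
    """Check the type/permission combinations listed in the text."""
    object_type, _, permission = split_mode(mode)
    if object_type == REGULAR_FILE:
        return permission in (0o755, 0o644)
    if object_type in (SYMBOLIC_LINK, GITLINK):
        return permission == 0
    return False
```

For instance, `split_mode(0o100644)` yields the regular-file type with permission 0644.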
7274 32-bit uid
7375 this is stat(2) data
@@ -76,19 +78,20 @@ GIT index format
7678 this is stat(2) data
7779
7880 32-bit file size
79- This is the on-disk size from stat(2)
81+ This is the on-disk size from stat(2), truncated to 32-bit.
8082
8183 160-bit SHA-1 for the represented object
8284
83- A 16-bit field split into (high to low bits)
85+ A 16-bit 'flags' field split into (high to low bits)
8486
8587 1-bit assume-valid flag
8688
8789 1-bit extended flag (must be zero in version 2)
8890
8991 2-bit stage (during merge)
9092
91- 12-bit name length if the length is less than 0x0FFF
93+ 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
94+ is stored in this field.
9295
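The 16-bit 'flags' field can be packed and unpacked along these lines. This is only a sketch with hypothetical helper names; it follows the bit widths listed above and clamps the stored name length to 0xFFF.

```python
def pack_flags(assume_valid: bool, extended: bool, stage: int, name_length: int) -> int:
    """Build the 16-bit flags value (high to low bits)."""
    stored_length = min(name_length, 0xFFF)  # 0xFFF means "0xFFF or longer"
    return (
        (int(assume_valid) << 15)
        | (int(extended) << 14)
        | ((stage & 0x3) << 12)
        | stored_length
    )

def unpack_flags(flags: int):
    """Return (assume_valid, extended, stage, stored_name_length)."""
    return (
        bool((flags >> 15) & 1),
        bool((flags >> 14) & 1),
        (flags >> 12) & 0x3,
        flags & 0xFFF,
    )
```

When the stored length is 0xFFF, a reader cannot rely on the field and would have to find the real length from the NUL terminator of the path name.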
9396 (Version 3) A 16-bit field, only applicable if the "extended flag"
9497 above is 1, split into (high to low bits).
@@ -103,63 +106,80 @@ GIT index format
103106
104107 Entry path name (variable length) relative to top level directory
105108 (without leading slash). '/' is used as path separator. The special
106- paths ".", ".." and ".git" (without quotes) are disallowed.
109+ path components ".", ".." and ".git" (without quotes) are disallowed.
107110 Trailing slash is also disallowed.
108111
109112 The exact encoding is undefined, but the '.' and '/' characters
110- are encoded in 7-bit ASCII and the encoding cannot contain a nul
111- byte. Generally a superset of ASCII .
113+ are encoded in 7-bit ASCII and the encoding cannot contain a NUL
114+ byte (in other words, this is a UNIX pathname).
112115
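The path name rules above could be checked roughly as follows. The function name is hypothetical, and rejecting an empty path is an assumption of this sketch rather than something the text states.

```python
DISALLOWED_COMPONENTS = (b".", b"..", b".git")

def is_valid_entry_path(path: bytes) -> bool:
    """Apply the stated rules to a raw path (bytes, '/' as separator)."""
    if not path:                 # assumption: an empty path is not a valid entry
        return False
    if b"\0" in path:            # the encoding cannot contain a NUL byte
        return False
    if path.startswith(b"/") or path.endswith(b"/"):
        return False             # no leading or trailing slash
    # No component may be ".", ".." or ".git".
    return all(c not in DISALLOWED_COMPONENTS for c in path.split(b"/"))
```

Note that only whole components are rejected: a name such as ".gitignore" passes, while "sub/.git/config" does not.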
113116 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
114117 while keeping the name NUL-terminated.
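The padding rule can be sketched as below. The fixed size of 62 bytes is an assumption of this sketch (ten 32-bit fields, a 160-bit SHA-1 and the 16-bit flags of a version 2 entry); an entry carrying the extra version 3 field would use a larger value. The helper names are hypothetical.

```python
FIXED_ENTRY_SIZE = 62  # assumption: bytes before the path name in a version 2 entry

def padding_length(name_length: int, fixed_size: int = FIXED_ENTRY_SIZE) -> int:
    """Number of NUL bytes (1..8) that rounds the entry up to a multiple of 8."""
    return 8 - (fixed_size + name_length) % 8

def pad_name(name: bytes, fixed_size: int = FIXED_ENTRY_SIZE) -> bytes:
    """Return the name followed by its NUL padding."""
    return name + b"\0" * padding_length(len(name), fixed_size)
```

At least one NUL is always written, so a name that already ends on an 8-byte boundary gets a full eight bytes of padding.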
115118
116119== Extensions
117120
118- === Tree cache
121+ === Cached tree
119122
120- Tree cache extension contains pre-computed hashes for trees that can
123+ Cached tree extension contains pre-computed hashes for trees that can
121124 be derived from the index. It helps speed up tree object generation
122125 from index for a new commit.
123126
124127 When a path is updated in index, the path must be invalidated and
125128 removed from tree cache.
126129
127- - Extension tag { 'T', 'R', 'E', 'E' }
130+ The signature for this extension is { 'T', 'R', 'E', 'E' }.
128131
129- - 32-bit size
132+ A series of entries fills the entire extension; each entry
133+ consists of:
130134
131- - A number of entries
135+ - NUL-terminated path component (relative to its parent directory);
132136
133- NUL-terminated tree name
137+ - ASCII decimal number of entries in the index that are covered by the
138+ tree this entry represents (entry_count);
134139
135- Blank-terminated ASCII decimal number of entries in this tree
140+ - A space (ASCII 32);
136141
137- Newline-terminated position of this tree in the parent tree. 0 for
138- the root tree
142+ - ASCII decimal number that represents the number of subtrees this
143+ tree has;
139144
140- 160-bit SHA-1 for this tree and it's children
145+ - A newline (ASCII 10); and
146+
147+ - 160-bit object name for the object that would result from writing
148+ this span of index as a tree.
149+
150+ An entry can be in an invalidated state, which is represented by -1
151+ in the entry_count field.
152+
153+ The entries are written out in the top-down, depth-first order. The
154+ first entry represents the root level of the repository, followed by the
155+ first subtree---let's call this A---of the root level (with its name
156+ relative to the root level), followed by the first subtree of A (with
157+ its name relative to A), ...
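A hedged sketch of walking the cached tree entries in the order they are written. The function name is hypothetical and the input is assumed to be the extension payload only. One detail is an assumption not spelled out in this text: the sketch treats an invalidated entry (negative entry_count) as carrying no object name, so the next entry starts right after the newline.

```python
def parse_cached_tree(data: bytes):
    """Return a list of (path_component, entry_count, subtree_count, object_name)."""
    entries = []
    pos = 0
    while pos < len(data):
        nul = data.index(b"\0", pos)              # NUL-terminated path component
        name = data[pos:nul]
        space = data.index(b" ", nul + 1)         # entry_count ends at the space
        entry_count = int(data[nul + 1:space])
        newline = data.index(b"\n", space + 1)    # subtree count ends at the newline
        subtree_count = int(data[space + 1:newline])
        pos = newline + 1
        object_name = None
        if entry_count >= 0:                      # assumption: no name when invalidated
            object_name = data[pos:pos + 20]      # 160-bit object name
            pos += 20
        entries.append((name, entry_count, subtree_count, object_name))
    return entries
```

The flat list comes back in top-down, depth-first order; a caller could rebuild the hierarchy by consuming `subtree_count` children after each entry.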
141158
142159=== Resolve undo
143160
144- A conflict is represented in index as a set of higher stage entries.
161+ A conflict is represented in the index as a set of higher stage entries.
145162 When a conflict is resolved (e.g. with "git add path"), these higher
146- stage entries will be removed and a stage-0 entry with proper
147- resoluton is added.
163+ stage entries will be removed and a stage-0 entry with proper resolution
164+ is added.
148165
149- Resolve undo extension saves these higher stage entries so that
150- conflicts can be recreated (e.g. with "git checkout -m"), in case
151- users want to redo a conflict resolution from scratch.
166+ When these higher stage entries are removed, they are saved in the
167+ resolve undo extension, so that conflicts can be recreated (e.g. with
168+ "git checkout -m"), in case users want to redo a conflict resolution
169+ from scratch.
152170
153- - Extension tag { 'R', 'E', 'U', 'C' }
171+ The signature for this extension is { 'R', 'E', 'U', 'C' }.
154172
155- - 32-bit size
173+ A series of entries fills the entire extension; each entry
174+ consists of:
156175
157- - A number of conflict entries
176+ - NUL-terminated pathname the entry describes (relative to the root of
177+ the repository, i.e. full pathname);
158178
159- NUL-terminated conflict path
179+ - Three NUL-terminated ASCII octal numbers, entry mode of entries in
180+ stage 1 to 3 (a missing stage is represented by "0" in this field);
181+ and
160182
161- Three NUL-terminated ASCII octal numbers, entry mode of entries in
162- stage 1 to 3 .
183+ - At most three 160-bit object names of the entry in stages from 1 to 3
184+ (nothing is written for a missing stage).
163185
164- At most three 160-bit SHA-1s of the entry in three stages from 1
165- to 3. SHA-1 is not saved for any stage with entry mode zero.
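The resolve undo layout in the new text can be read with a sketch like the following. The function name is hypothetical, the input is assumed to be the extension payload only, and object names are taken to be 20 bytes (160 bits) each.

```python
def parse_resolve_undo(data: bytes):
    """Return a list of (path, modes, object_names), one item per conflicted path.

    modes is a list of three ints (stages 1 to 3); object_names holds the
    20-byte name for each stage, or None where the stage is missing.
    """
    entries = []
    pos = 0
    while pos < len(data):
        nul = data.index(b"\0", pos)          # NUL-terminated full pathname
        path = data[pos:nul]
        pos = nul + 1
        modes = []
        for _ in range(3):                    # three NUL-terminated octal numbers
            nul = data.index(b"\0", pos)
            modes.append(int(data[pos:nul], 8))
            pos = nul + 1
        object_names = []
        for mode in modes:
            if mode == 0:                     # missing stage: nothing was written
                object_names.append(None)
            else:
                object_names.append(data[pos:pos + 20])
                pos += 20
        entries.append((path, modes, object_names))
    return entries
```

Each returned item holds enough information to recreate the higher stage entries for its path, which is what redoing a conflict resolution needs.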