@@ -9,21 +9,21 @@ GIT index format
  - A 12-byte header consisting of

  4-byte signature:
- The signature is { 'D', 'I', 'R', 'C' }
+ The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")

  4-byte version number:
  The currently supported versions are 2 and 3.

  32-bit number of index entries.

- - A number of sorted index entries
+ - A number of sorted index entries (see below).

  - Extensions

  Extensions are identified by signature. Optional extensions can
  be ignored if GIT does not understand them.

- GIT currently supports tree cache and resolve undo extensions.
+ GIT currently supports cached tree and resolve undo extensions.

  4-byte extension signature. If the first byte is 'A'..'Z' the
  extension is optional and can be ignored.
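
Reviewer note: the header layout in this hunk can be sketched in a few lines of Python. This is a minimal sketch, not Git's implementation. The function names `parse_index_header` and `extension_is_optional` are hypothetical, and the sketch assumes the binary numbers are stored in network byte order (big-endian), which this hunk does not state.

```python
import struct


def parse_index_header(data: bytes):
    """Return (version, entry_count) from the 12-byte index header."""
    if len(data) < 12:
        raise ValueError("index header needs 12 bytes")
    # Assumption: big-endian fields (4-byte signature, 32-bit version,
    # 32-bit number of index entries).
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature: %r" % signature)
    if version not in (2, 3):
        raise ValueError("unsupported version: %d" % version)
    return version, entry_count


def extension_is_optional(signature: bytes) -> bool:
    """An extension whose first signature byte is 'A'..'Z' may be ignored."""
    return len(signature) == 4 and b"A" <= signature[:1] <= b"Z"
```

For example, `extension_is_optional(b"TREE")` is true, so a reader that does not understand the cached tree extension can skip it.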
@@ -38,8 +38,9 @@ GIT index format
  == Index entry

  Index entries are sorted in ascending order on the name field,
- interpreted as a string of unsigned bytes. Entries with the same
- name are sorted by their stage field.
+ interpreted as a string of unsigned bytes (i.e. memcmp() order, no
+ localization, no special casing of directory separator '/'). Entries
+ with the same name are sorted by their stage field.

  32-bit ctime seconds, the last time a file's metadata changed
  this is stat(2) data
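
Reviewer note: the ordering rule added here can be illustrated with Python `bytes`, which compare byte by byte as unsigned values, with a shorter string sorting before any longer string it prefixes. The helper `entry_sort_key` and the sample entries are hypothetical; this is a sketch of the rule as worded in the hunk, not Git's comparison routine.

```python
def entry_sort_key(name: bytes, stage: int):
    # bytes compare as unsigned byte strings: no locale, and '/' (0x2F)
    # is treated like any other byte. The stage breaks ties on name.
    return (name, stage)


entries = [
    (b"a/b", 0),
    (b"a.c", 0),
    (b"a", 2),
    (b"a", 1),
    (b"\xc3\xa9", 0),  # non-ASCII bytes sort after all ASCII names
    (b"z", 0),
]
sorted_entries = sorted(entries, key=lambda e: entry_sort_key(*e))
```

Note that `a.c` sorts before `a/b` because '.' (0x2E) is smaller than '/' (0x2F), which is what "no special casing of directory separator" implies.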
@@ -62,12 +63,13 @@ GIT index format
  32-bit mode, split into (high to low bits)

  4-bit object type
- valid values in binary are 1000 (blob), 1010 (symbolic link)
+ valid values in binary are 1000 (regular file), 1010 (symbolic link)
  and 1110 (gitlink)

  3-bit unused

- 9-bit unix permission (only 0755 and 0644 are valid)
+ 9-bit unix permission. Only 0755 and 0644 are valid for regular files.
+ Symbolic links and gitlinks have value 0 in this field.

  32-bit uid
  this is stat(2) data
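
Reviewer note: the mode split described in this hunk amounts to a few shifts and masks. The sketch below uses hypothetical names (`split_mode`, `is_valid_mode`). It assumes the upper 16 bits of the 32-bit field and the 3 unused bits are zero, which the text implies but does not state outright.

```python
REGULAR_FILE = 0b1000
SYMLINK = 0b1010
GITLINK = 0b1110


def split_mode(mode: int):
    """Split the low 16 bits into (object_type, unused, permission)."""
    object_type = (mode >> 12) & 0xF   # 4-bit object type
    unused = (mode >> 9) & 0x7         # 3-bit unused
    permission = mode & 0x1FF          # 9-bit unix permission
    return object_type, unused, permission


def is_valid_mode(mode: int) -> bool:
    if mode >> 16:                     # assumption: high 16 bits are zero
        return False
    object_type, unused, permission = split_mode(mode)
    if unused != 0:                    # assumption: unused bits are zero
        return False
    if object_type == REGULAR_FILE:
        return permission in (0o755, 0o644)
    if object_type in (SYMLINK, GITLINK):
        return permission == 0
    return False
```

With this split, the familiar octal modes 100644, 120000 and 160000 map to a regular file, a symbolic link and a gitlink respectively.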
@@ -76,19 +78,20 @@ GIT index format
  this is stat(2) data

  32-bit file size
- This is the on-disk size from stat(2)
+ This is the on-disk size from stat(2), truncated to 32-bit.

  160-bit SHA-1 for the represented object

- A 16-bit field split into (high to low bits)
+ A 16-bit 'flags' field split into (high to low bits)

  1-bit assume-valid flag

  1-bit extended flag (must be zero in version 2)

  2-bit stage (during merge)

- 12-bit name length if the length is less than 0x0FFF
+ 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
+ is stored in this field.

  (Version 3) A 16-bit field, only applicable if the "extended flag"
  above is 1, split into (high to low bits).
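
Reviewer note: the 16-bit 'flags' layout and the 32-bit size truncation in this hunk can be sketched as follows. The names `pack_flags`, `unpack_flags` and `truncate_size` are hypothetical helpers for illustration, not functions from Git.

```python
NAME_MASK = 0xFFF


def pack_flags(name_length, stage=0, assume_valid=False, extended=False):
    """Build the 16-bit flags field (high to low bits)."""
    if not 0 <= stage <= 3:
        raise ValueError("stage must fit in 2 bits")
    # Names of 0xFFF bytes or longer store 0xFFF in the length bits.
    stored_length = min(name_length, NAME_MASK)
    return (
        (int(assume_valid) << 15)
        | (int(extended) << 14)
        | (stage << 12)
        | stored_length
    )


def unpack_flags(flags):
    """Return (assume_valid, extended, stage, stored_name_length)."""
    return (
        bool(flags & 0x8000),
        bool(flags & 0x4000),
        (flags >> 12) & 0x3,
        flags & NAME_MASK,
    )


def truncate_size(size):
    """Keep only the low 32 bits of the stat(2) file size."""
    return size & 0xFFFFFFFF
```

A stored length of 0xFFF therefore means "0xFFF or longer", so a reader has to find the real end of the name by its NUL terminator.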
@@ -103,63 +106,80 @@ GIT index format

  Entry path name (variable length) relative to top level directory
  (without leading slash). '/' is used as path separator. The special
- paths ".", ".." and ".git" (without quotes) are disallowed.
+ path components ".", ".." and ".git" (without quotes) are disallowed.
  Trailing slash is also disallowed.

  The exact encoding is undefined, but the '.' and '/' characters
- are encoded in 7-bit ASCII and the encoding cannot contain a nul
- byte. Generally a superset of ASCII.
+ are encoded in 7-bit ASCII and the encoding cannot contain a NUL
+ byte (i.e., this is a UNIX pathname).

  1-8 NUL bytes as necessary to pad the entry to a multiple of eight bytes
  while keeping the name NUL-terminated.

  == Extensions

- === Tree cache
+ === Cached tree

- Tree cache extension contains pre-computed hashes for trees that can
+ Cached tree extension contains pre-computed hashes for trees that can
  be derived from the index. It helps speed up tree object generation
  from index for a new commit.

  When a path is updated in index, the path must be invalidated and
  removed from tree cache.

- - Extension tag { 'T', 'R', 'E', 'E' }
+ The signature for this extension is { 'T', 'R', 'E', 'E' }.

- - 32-bit size
+ A series of entries fills the entire extension, each of which
+ consists of:

- - A number of entries
+ - NUL-terminated path component (relative to its parent directory);

- NUL-terminated tree name
+ - ASCII decimal number of entries in the index that are covered by the
+ tree this entry represents (entry_count);

- Blank-terminated ASCII decimal number of entries in this tree
+ - A space (ASCII 32);

- Newline-terminated position of this tree in the parent tree. 0 for
- the root tree
+ - ASCII decimal number that represents the number of subtrees this
+ tree has;

- 160-bit SHA-1 for this tree and its children
+ - A newline (ASCII 10); and
+
+ - 160-bit object name for the object that would result from writing
+ this span of index as a tree.
+
+ An entry can be in an invalidated state and is represented by having -1
+ in the entry_count field.
+
+ The entries are written out in the top-down, depth-first order. The
+ first entry represents the root level of the repository, followed by the
+ first subtree---let's call this A---of the root level (with its name
+ relative to the root level), followed by the first subtree of A (with
+ its name relative to A), ...

  === Resolve undo

- A conflict is represented in index as a set of higher stage entries.
+ A conflict is represented in the index as a set of higher stage entries.
  When a conflict is resolved (e.g. with "git add path"), these higher
- stage entries will be removed and a stage-0 entry with proper
- resolution is added.
+ stage entries will be removed and a stage-0 entry with proper resolution
+ is added.

- Resolve undo extension saves these higher stage entries so that
- conflicts can be recreated (e.g. with "git checkout -m"), in case
- users want to redo a conflict resolution from scratch.
+ When these higher stage entries are removed, they are saved in the
+ resolve undo extension, so that conflicts can be recreated (e.g. with
+ "git checkout -m"), in case users want to redo a conflict resolution
+ from scratch.

- - Extension tag { 'R', 'E', 'U', 'C' }
+ The signature for this extension is { 'R', 'E', 'U', 'C' }.

- - 32-bit size
+ A series of entries fills the entire extension, each of which
+ consists of:

- - A number of conflict entries
+ - NUL-terminated pathname the entry describes (relative to the root of
+ the repository, i.e. full pathname);

- NUL-terminated conflict path
+ - Three NUL-terminated ASCII octal numbers, entry mode of entries in
+ stage 1 to 3 (a missing stage is represented by "0" in this field);
+ and

- Three NUL-terminated ASCII octal numbers, entry mode of entries in
- stage 1 to 3.
+ - At most three 160-bit object names of the entry in stages from 1 to 3
+ (nothing is written for a missing stage).

- At most three 160-bit SHA-1s of the entry in three stages from 1
- to 3. SHA-1 is not saved for any stage with entry mode zero.
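
Reviewer note: the new cached tree entry layout can be read with a short loop. The sketch below is hypothetical (`parse_cached_tree` is not a Git function) and takes the extension payload only, without the signature or size. It also assumes that an invalidated entry (entry_count of -1) carries no object name; that matches Git's behaviour as far as I know, but it is not spelled out in this hunk.

```python
def parse_cached_tree(payload: bytes):
    """Return a list of (path, entry_count, subtree_count, object_name)."""
    entries = []
    pos = 0
    while pos < len(payload):
        # NUL-terminated path component (empty for the root entry).
        nul = payload.index(b"\0", pos)
        path = payload[pos:nul]
        # "<entry_count> <subtree_count>\n" in ASCII decimal.
        space = payload.index(b" ", nul + 1)
        newline = payload.index(b"\n", space + 1)
        entry_count = int(payload[nul + 1:space])
        subtree_count = int(payload[space + 1:newline])
        pos = newline + 1
        object_name = None
        if entry_count >= 0:
            # Assumption: only valid entries are followed by 20 bytes.
            object_name = payload[pos:pos + 20]
            if len(object_name) != 20:
                raise ValueError("truncated object name")
            pos += 20
        entries.append((path, entry_count, subtree_count, object_name))
    return entries
```

Because entries are written top-down and depth-first, a caller can rebuild the hierarchy by consuming `subtree_count` following entries as the children of each entry.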
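
Reviewer note: the resolve undo entry layout introduced by this change can be parsed in the same style. This is a sketch under stated assumptions: `parse_resolve_undo` is a hypothetical name, the input is the extension payload without its signature or size, and object names are taken to be 20 raw bytes each, written only for stages whose mode is non-zero.

```python
def parse_resolve_undo(payload: bytes):
    """Return a list of (path, modes, object_names) for stages 1 to 3."""
    entries = []
    pos = 0
    while pos < len(payload):
        # NUL-terminated full pathname.
        nul = payload.index(b"\0", pos)
        path = payload[pos:nul]
        pos = nul + 1
        # Three NUL-terminated ASCII octal modes; "0" marks a missing stage.
        modes = []
        for _ in range(3):
            nul = payload.index(b"\0", pos)
            modes.append(int(payload[pos:nul], 8))
            pos = nul + 1
        # One 160-bit object name per stage that is present.
        names = []
        for mode in modes:
            if mode == 0:
                names.append(None)
                continue
            name = payload[pos:pos + 20]
            if len(name) != 20:
                raise ValueError("truncated object name")
            names.append(name)
            pos += 20
        entries.append((path, modes, names))
    return entries
```

Keeping `None` in the slot of a missing stage preserves the stage numbering, so index 0 of each list always refers to stage 1.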