Commit 8c7d051

pclouds authored and gitster committed
doc: technical details about the index file format
This is based on the original work by Robin Rosenberg.

Signed-off-by: Robin Rosenberg <[email protected]>
Signed-off-by: Nguyễn Thái Ngọc Duy <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 154adcf commit 8c7d051

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
GIT index format
================

= The git index file has the following format

  All binary numbers are in network byte order. Version 2 is described
  here unless stated otherwise.

   - A 12-byte header consisting of

       4-byte signature:
         The signature is { 'D', 'I', 'R', 'C' }

       4-byte version number:
         The currently supported versions are 2 and 3.

       32-bit number of index entries.

   - A number of sorted index entries

   - Extensions

       Extensions are identified by signature. Optional extensions can
       be ignored if GIT does not understand them.

       GIT currently supports tree cache and resolve undo extensions.

       4-byte extension signature. If the first byte is 'A'..'Z' the
       extension is optional and can be ignored.

       32-bit size of the extension

       Extension data

   - 160-bit SHA-1 over the content of the index file before this
     checksum.

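The overall layout above can be checked with a few lines of Python. This is a minimal sketch, not git's own code: the function names (`parse_header`, `checksum_ok`, `extension_is_optional`) are hypothetical, and it assumes the whole index file has already been read into memory as `data`.

```python
import hashlib
import struct
from typing import Tuple


def parse_header(data: bytes) -> Tuple[int, int]:
    """Return (version, entry_count) from the 12-byte header."""
    if len(data) < 12:
        raise ValueError("index data is shorter than the 12-byte header")
    # ">4sII": network byte order; signature, version, number of entries.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature: %r" % signature)
    if version not in (2, 3):
        raise ValueError("unsupported version: %d" % version)
    return version, count


def checksum_ok(data: bytes) -> bool:
    """Compare the trailing 160-bit SHA-1 with a hash of everything before it."""
    if len(data) < 12 + 20:
        return False
    return hashlib.sha1(data[:-20]).digest() == data[-20:]


def extension_is_optional(signature: bytes) -> bool:
    """An extension can be ignored if its first byte is 'A'..'Z'."""
    return ord("A") <= signature[0] <= ord("Z")
```

A reader that meets an unknown extension would, under this rule, skip its data when `extension_is_optional` is true and refuse the file otherwise.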
== Index entry

  Index entries are sorted in ascending order on the name field,
  interpreted as a string of unsigned bytes. Entries with the same
  name are sorted by their stage field.

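The ordering rule can be illustrated in Python, where `bytes` objects already compare as sequences of unsigned byte values. This is only a sketch: the `(name, stage)` tuples and the `sort_key` helper are made up for the example.

```python
def sort_key(name: bytes, stage: int):
    # Python compares bytes element-wise as integers 0..255, which matches
    # "a string of unsigned bytes"; stage breaks ties between equal names.
    return (name, stage)


entries = [
    (b"b.txt", 0),
    (b"\xc3\xa9", 0),   # non-ASCII name: sorts after every ASCII name
    (b"a/z", 0),
    (b"a.txt", 2),
    (b"a.txt", 1),
    (b"a-b", 0),
]
ordered = sorted(entries, key=lambda e: sort_key(*e))
```

Note that "a-b" sorts before "a.txt", which sorts before "a/z", because '-' (0x2d) < '.' (0x2e) < '/' (0x2f) as raw bytes.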
  32-bit ctime seconds, the last time a file's metadata changed
    this is stat(2) data

  32-bit ctime nanosecond fractions
    this is stat(2) data

  32-bit mtime seconds, the last time a file's data changed
    this is stat(2) data

  32-bit mtime nanosecond fractions
    this is stat(2) data

  32-bit dev
    this is stat(2) data

  32-bit ino
    this is stat(2) data

  32-bit mode, split into (high to low bits)

    16-bit unused, must be zero

    4-bit object type
      valid values in binary are 1000 (blob), 1010 (symbolic link)
      and 1110 (gitlink)

    3-bit unused

    9-bit unix permission (only 0755 and 0644 are valid for regular
    files; symbolic links and gitlinks have 0 in this field)

  32-bit uid
    this is stat(2) data

  32-bit gid
    this is stat(2) data

  32-bit file size
    This is the on-disk size from stat(2)

  160-bit SHA-1 for the represented object

  A 16-bit field split into (high to low bits)

    1-bit assume-valid flag

    1-bit extended flag (must be zero in version 2)

    2-bit stage (during merge)

    12-bit name length if the length is less than 0x0FFF; otherwise
    0x0FFF is stored in this field

  (Version 3) A 16-bit field, only applicable if the "extended flag"
  above is 1, split into (high to low bits).

    1-bit reserved for future use

    1-bit skip-worktree flag (used by sparse checkout)

    1-bit intent-to-add flag (used by "git add -N")

    13-bit unused, must be zero

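The bit-packed fields above can be unpacked with shifts and masks. The following is a rough sketch; the helper names and the dictionary keys are illustrative choices, not names used by git.

```python
def split_mode(mode: int):
    """Return (object_type, permission) from the 32-bit mode field."""
    object_type = (mode >> 12) & 0xF   # 4-bit object type
    permission = mode & 0x1FF          # low 9 bits: unix permission
    return object_type, permission


def split_flags(flags: int):
    """Decode the 16-bit flags field, high to low bits."""
    return {
        "assume_valid": bool(flags & 0x8000),   # 1 bit
        "extended": bool(flags & 0x4000),       # 1 bit
        "stage": (flags >> 12) & 0x3,           # 2 bits
        "name_length": flags & 0x0FFF,          # 12 bits
    }


def split_extended_flags(xflags: int):
    """Decode the version 3 extended 16-bit field, high to low bits."""
    return {
        # the top bit (0x8000) is reserved
        "skip_worktree": bool(xflags & 0x4000),
        "intent_to_add": bool(xflags & 0x2000),
        # the low 13 bits are unused and must be zero
    }
```

For example, a regular file stored with mode 0100644 (octal) decodes to object type 1000 (binary) and permission 0644.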
  Entry path name (variable length) relative to top level directory
    (without leading slash). '/' is used as path separator. The special
    paths ".", ".." and ".git" (without quotes) are disallowed.
    Trailing slash is also disallowed.

    The exact encoding is undefined, but the '.' and '/' characters
    are encoded in 7-bit ASCII and the encoding cannot contain a NUL
    byte. In practice the encoding is generally a superset of ASCII.

  1-8 NUL bytes as necessary to pad the entry to a multiple of eight bytes
  while keeping the name NUL-terminated.

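Putting the entry layout together, one way to read a single entry and find the start of the next one is sketched below. The names `read_entry` and `check_path` are hypothetical. The sketch assumes the path rules apply to each '/'-separated component, which is an interpretation of the text above, and it locates the end of the name by scanning for the terminating NUL rather than trusting the 12-bit length.

```python
import struct

# Ten 32-bit fields (ctime .. file size), 20-byte SHA-1, 16-bit flags: 62 bytes.
FIXED = struct.Struct(">10I20sH")


def check_path(name: bytes) -> None:
    """Reject names that break the path rules (assumed to apply per component)."""
    if not name or name.startswith(b"/") or name.endswith(b"/"):
        raise ValueError("bad path: %r" % name)
    for part in name.split(b"/"):
        if part in (b".", b"..", b".git"):
            raise ValueError("disallowed path component in %r" % name)


def read_entry(data: bytes, offset: int):
    """Return (name, sha1, stage, next_offset) for the entry at offset."""
    fields = FIXED.unpack_from(data, offset)
    sha1, flags = fields[10], fields[11]
    pos = offset + FIXED.size
    if flags & 0x4000:                # extended flag set (version 3 only)
        pos += 2                      # skip the extra 16-bit field
    end = data.index(b"\0", pos)      # the name is NUL-terminated
    name = data[pos:end]
    check_path(name)
    unpadded = end - offset           # entry size without any padding
    # Round up past the terminator to a multiple of 8: 1 to 8 NUL bytes.
    next_offset = offset + ((unpadded + 8) & ~7)
    return name, sha1, (flags >> 12) & 0x3, next_offset
```

With a 5-byte name the unpadded entry is 67 bytes, so 5 NUL bytes follow and the next entry starts 72 bytes later.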
== Extensions

=== Tree cache

  Tree cache extension contains pre-computed hashes for trees that can
  be derived from the index. It helps speed up tree object generation
  from the index for a new commit.

  When a path is updated in the index, the path must be invalidated and
  removed from tree cache.

  - Extension tag { 'T', 'R', 'E', 'E' }

  - 32-bit size

  - A number of entries

     NUL-terminated tree name

     Blank-terminated (ASCII space) ASCII decimal number of entries in
     this tree

     Newline-terminated position of this tree in the parent tree. 0 for
     the root tree

     160-bit SHA-1 for this tree and its children

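As an illustration, the entry layout listed above could be walked as follows. This sketch follows the description literally and assumes every entry carries a 160-bit SHA-1; `parse_tree_cache` is a made-up name and `payload` stands for the extension data after the tag and size. Index files written by git may contain details this description does not cover.

```python
def parse_tree_cache(payload: bytes):
    """Return a list of (name, entry_count, position, sha1) tuples."""
    entries = []
    pos = 0
    while pos < len(payload):
        end = payload.index(b"\0", pos)       # NUL-terminated tree name
        name = payload[pos:end]
        pos = end + 1
        end = payload.index(b" ", pos)        # blank-terminated decimal
        entry_count = int(payload[pos:end])
        pos = end + 1
        end = payload.index(b"\n", pos)       # newline-terminated decimal
        position = int(payload[pos:end])
        pos = end + 1
        sha1 = payload[pos:pos + 20]          # 160-bit SHA-1
        if len(sha1) != 20:
            raise ValueError("truncated tree cache entry")
        pos += 20
        entries.append((name, entry_count, position, sha1))
    return entries
```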
=== Resolve undo

  A conflict is represented in the index as a set of higher stage entries.
  When a conflict is resolved (e.g. with "git add path"), these higher
  stage entries will be removed and a stage-0 entry with proper
  resolution is added.

  Resolve undo extension saves these higher stage entries so that
  conflicts can be recreated (e.g. with "git checkout -m"), in case
  users want to redo a conflict resolution from scratch.

  - Extension tag { 'R', 'E', 'U', 'C' }

  - 32-bit size

  - A number of conflict entries

     NUL-terminated conflict path

     Three NUL-terminated ASCII octal numbers, entry mode of entries in
     stage 1 to 3.

     At most three 160-bit SHA-1s of the entry in three stages from 1
     to 3. SHA-1 is not saved for any stage with entry mode zero.
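
The conflict entry layout can be decoded along these lines. This is a sketch under the description above, not git's implementation: `parse_resolve_undo` is a hypothetical name and `payload` stands for the extension data after the tag and size.

```python
def parse_resolve_undo(payload: bytes):
    """Return a list of (path, {stage: (mode, sha1)}) tuples."""
    conflicts = []
    pos = 0
    while pos < len(payload):
        end = payload.index(b"\0", pos)       # NUL-terminated conflict path
        path = payload[pos:end]
        pos = end + 1
        modes = []
        for _ in range(3):                    # octal modes for stages 1..3
            end = payload.index(b"\0", pos)
            modes.append(int(payload[pos:end], 8))
            pos = end + 1
        stages = {}
        for stage, mode in enumerate(modes, start=1):
            if mode == 0:
                continue                      # no SHA-1 saved for this stage
            sha1 = payload[pos:pos + 20]
            if len(sha1) != 20:
                raise ValueError("truncated resolve undo entry")
            stages[stage] = (mode, sha1)
            pos += 20
        conflicts.append((path, stages))
    return conflicts
```

A path that was added on both sides with no common ancestor would have mode zero for stage 1, so only two SHA-1s follow its three modes.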
