Skip to content

Commit 1ebec8d

Browse files
peffgitster
authored andcommitted
fast-import: duplicate into history rather than passing ownership
Fast-import's read_next_command() has somewhat odd memory ownership semantics for the command_buf strbuf. After reading a command, we copy the strbuf's pointer (without duplicating the string) into our cmd_hist array of recent commands. And then when we're about to read a new command, we clear the strbuf by calling strbuf_detach(), dropping ownership from the strbuf (leaving the cmd_hist reference as the remaining owner). This has a few surprising implications: - if the strbuf hasn't been copied into cmd_hist (e.g., because we haven't ready any commands yet), then the strbuf_detach() will leak the resulting string - any modification to command_buf risks invalidating the pointer held by cmd_hist. There doesn't seem to be any way to trigger this currently (since we tend to modify it only by detaching and reading in a new value), but it's subtly dangerous. - any pointers into an input string will remain valid as long as cmd_hist points to them. So in general, you can point into command_buf.buf and call read_next_command() up to 100 times before your string is cycled out and freed, leaving you with a dangling pointer. This makes it easy to miss bugs during testing, as they might trigger only for a sufficiently large commit (e.g., the bug fixed in the previous commit). Instead, let's make a new string to copy the command into the history array, rather than having dual ownership with the old. Then we can drop the strbuf_detach() calls entirely, and just reuse the same buffer within command_buf over and over. We'd normally have to strbuf_reset() it before using it again, but in both cases here we're using strbuf_getline(), which does it automatically for us. This fixes the leak, and it means that even a single call to read_next_command() will invalidate any held pointers, making it easier to find bugs. In fact, we can drop the extra input lines added to the test case by the previous commit, as the unfixed bug would now trigger just from reading the commit message, even without any modified files in the commit. Reported-by: Mike Hommey <[email protected]> Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 9756082 commit 1ebec8d

File tree

2 files changed

+1
-8
lines changed

2 files changed

+1
-8
lines changed

fast-import.c

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1763,7 +1763,6 @@ static int read_next_command(void)
17631763
} else {
17641764
struct recent_command *rc;
17651765

1766-
strbuf_detach(&command_buf, NULL);
17671766
stdin_eof = strbuf_getline_lf(&command_buf, stdin);
17681767
if (stdin_eof)
17691768
return EOF;
@@ -1784,7 +1783,7 @@ static int read_next_command(void)
17841783
free(rc->buf);
17851784
}
17861785

1787-
rc->buf = command_buf.buf;
1786+
rc->buf = xstrdup(command_buf.buf);
17881787
rc->prev = cmd_tail;
17891788
rc->next = cmd_hist.prev;
17901789
rc->prev->next = rc;
@@ -1833,7 +1832,6 @@ static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res)
18331832
char *term = xstrdup(data);
18341833
size_t term_len = command_buf.len - (data - command_buf.buf);
18351834

1836-
strbuf_detach(&command_buf, NULL);
18371835
for (;;) {
18381836
if (strbuf_getline_lf(&command_buf, stdin) == EOF)
18391837
die("EOF in data (terminator '%s' not found)", term);

t/t9300-fast-import.sh

Lines changed: 0 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3314,11 +3314,6 @@ test_expect_success 'X: handling encoding' '
33143314
33153315
printf "Pi: \360\nCOMMIT\n" >>input &&
33163316
3317-
for i in $(test_seq 100)
3318-
do
3319-
echo "M 644 $EMPTY_BLOB file-$i"
3320-
done >>input &&
3321-
33223317
git fast-import <input &&
33233318
git cat-file -p encoding | grep $(printf "\360") &&
33243319
git log -1 --format=%B encoding | grep $(printf "\317\200")

0 commit comments

Comments
 (0)