Commit f3b166b

manmita and ben-schwen authored
moved check for eof in fread to avoid segfault
* merge from master
* fix(7407): added eof instead of x1A
* fix(7407): refined the test and news
* Update NEWS.md for version v1.18.0
  Updated NEWS.md with fixes and enhancements for fread and sum functions.
* Update NEWS.md for fread segfault issue
  Fixed formatting and clarified the segfault issue for fread.
* fix(7407): Remove comment at eof check in fread.c
* fix(7407): Changed the test to single line

---------

Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>
1 parent 5bc7750 commit f3b166b

3 files changed

Lines changed: 8 additions & 2 deletions

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -28,6 +28,8 @@
 
 4. `sum(<int64 column>)` by group is correct with missing entries and GForce activated ([#7571](https://github.com/Rdatatable/data.table/issues/7571)). Thanks to @rweberc for the report and @manmita for the fix. The issue was caused by a faulty early `break` that spilled between groups, and resulted in silently incorrect results!
 
+5. `fread(text=)` could segfault when reading text input ending with a `\x1a` (ASCII SUB) character after a long line, [#7407](https://github.com/Rdatatable/data.table/issues/7407) which is solved by adding check for eof. Thanks @aitap for the report and @manmita for the fix.
+
 ### Notes
 
 1. {data.table} now depends on R 3.5.0 (2018).

inst/tests/tests.Rraw

Lines changed: 4 additions & 0 deletions
@@ -22031,3 +22031,7 @@ if (test_bit64) local({
 merged$gforce_mean, merged$true_mean
 )
 })
+
+# 7407 Test for fread() handling \x1A (ASCII SUB) at end of input
+txt = paste0("foo\n", strrep("a", 4096 * 100), "\x1A")
+test(2359.1, nchar(fread(txt)$foo), 409600L)

src/fread.c

Lines changed: 2 additions & 2 deletions
@@ -349,9 +349,9 @@ static inline bool end_of_field(const char *ch)
 // We use eol() because that looks at eol_one_r inside it w.r.t. \r
 // \0 (maybe more than one) before eof are part of field and do not end it; eol() returns false for \0 but the ch==eof will return true for the \0 at eof.
 // Comment characters terminate a field immediately and take precedence over separators.
-return *ch == sep || ((uint8_t)*ch <= 13 && (ch == eof || eol(&ch))) || (commentChar && *ch == commentChar);
 if (*ch == sep) return true;
-if ((uint8_t)*ch <= 13 && (ch == eof || eol(&ch))) return true;
+if (ch == eof) return true;
+if ((uint8_t)*ch <= 13 && eol(&ch)) return true;
 if (!commentChar) return false;
 return *ch == commentChar;
 }
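
The effect of hoisting `ch == eof` out of the `<= 13` guard can be seen in a minimal, self-contained sketch. This is not data.table's code: `eol_simple`, `end_of_field_old`, `end_of_field_new` and `demo` are hypothetical names, the comment-character branch is omitted, and the sketch assumes that the byte located at `eof` can be `0x1A` (26) instead of `\0`, which is what the NEWS entry and the new test suggest.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical stand-ins for the file-level state used by end_of_field().
static const char *eof;   // position that marks the end of the input
static char sep = ',';    // field separator

// Simplified replacement for eol(): only recognises '\n' and '\r'.
static bool eol_simple(const char *ch) { return *ch == '\n' || *ch == '\r'; }

// Shape before the commit: ch == eof is only evaluated when the byte is <= 13.
static bool end_of_field_old(const char *ch) {
  if (*ch == sep) return true;
  if ((uint8_t)*ch <= 13 && (ch == eof || eol_simple(ch))) return true;
  return false;
}

// Shape after the commit: ch == eof is evaluated regardless of the byte value.
static bool end_of_field_new(const char *ch) {
  if (*ch == sep) return true;
  if (ch == eof) return true;
  if ((uint8_t)*ch <= 13 && eol_simple(ch)) return true;
  return false;
}

// Assumption for illustration: the byte sitting at eof is 0x1A, not '\0'.
void demo(void) {
  static const char buf[] = { 'a', 'a', 'a', 0x1A, '\0' };
  eof = buf + 3;
  printf("old at eof: %d\n", end_of_field_old(eof));
  printf("new at eof: %d\n", end_of_field_new(eof));
}
```

Under that assumption the old predicate does not report the end of the field at `eof`, because 26 is greater than 13 and the `ch == eof` comparison is never reached, so a caller advancing `ch` until `end_of_field()` returns true could step past the end of the buffer. The new predicate stops at `eof` whatever byte is stored there.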
