Skip to content

Commit 3384596

Browse files
authored
GH-48673: [C++] Fix ToStringWithoutContextLines to check for :\d+ pattern before removing lines (#48674)
### Rationale for this change This PR proposes to fix the todo https://github.com/apache/arrow/blob/7ebc88c8fae62ed97bc30865c845c8061132af7e/cpp/src/arrow/status.cc#L131-L134 which would allows a better parsing for line numbers. I could not find the relevant example to demonstrate within this project but assume that we have a test such as: (Generated by ChatGPT) ```cpp TEST(BlockParser, ErrorMessageWithColonsPreserved) { Status st(StatusCode::Invalid, "CSV parse error: Row #2: Expected 2 columns, got 3: 12:34:56,key:value,data\n" "Error details: Time format: 12:34:56, Key: value\n" "parser_test.cc:940 Parse(parser, csv, &out_size)"); std::string expected_msg = "Invalid: CSV parse error: Row #2: Expected 2 columns, got 3: 12:34:56,key:value,data\n" "Error details: Time format: 12:34:56, Key: value"; ASSERT_RAISES_WITH_MESSAGE(Invalid, expected_msg, st); } // Test with URL-like data (another common case with colons) TEST(BlockParser, ErrorMessageWithURLPreserved) { Status st(StatusCode::Invalid, "CSV parse error: Row #2: Expected 1 columns, got 2: http://arrow.apache.org:8080/api,data\n" "URL: http://arrow.apache.org:8080/api\n" "parser_test.cc:974 Parse(parser, csv, &out_size)"); std::string expected_msg = "Invalid: CSV parse error: Row #2: Expected 1 columns, got 2: http://arrow.apache.org:8080/api,data\n" "URL: http://arrow.apache.org:8080/api"; ASSERT_RAISES_WITH_MESSAGE(Invalid, expected_msg, st); } ``` then it fails. ### What changes are included in this PR? Fixed `Status::ToStringWithoutContextLines()` to only remove context lines matching the `filename:line` pattern (`:\d+`), preventing legitimate error messages containing colons from being incorrectly stripped. ### Are these changes tested? Manually tested, and unittests were added, with `cmake .. --preset ninja-debug -DARROW_EXTRA_ERROR_CONTEXT=ON`. ### Are there any user-facing changes? No, test-only. * GitHub Issue: #48673 Authored-by: Hyukjin Kwon <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
1 parent 08175e5 commit 3384596

File tree

2 files changed

+37
-2
lines changed

2 files changed

+37
-2
lines changed

cpp/src/arrow/status.cc

Lines changed: 20 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
#include "arrow/status.h"
1414

1515
#include <cassert>
16+
#include <cctype>
1617
#include <cstdlib>
1718
#include <iostream>
1819
#ifdef ARROW_EXTRA_ERROR_CONTEXT
@@ -131,8 +132,25 @@ std::string Status::ToStringWithoutContextLines() const {
131132
if (last_new_line_position == std::string::npos) {
132133
break;
133134
}
134-
// TODO: We may want to check /:\d+ /
135-
if (message.find(":", last_new_line_position) == std::string::npos) {
135+
// Check for the pattern ":\d+ " (colon followed by one or more digits and a space)
136+
// to identify context lines in the format "filename:line expr"
137+
auto colon_position = message.find(":", last_new_line_position);
138+
if (colon_position == std::string::npos) {
139+
break;
140+
}
141+
// Verify that the colon is followed by one or more digits and then a space
142+
size_t pos = colon_position + 1;
143+
if (pos >= message.size() ||
144+
!std::isdigit(static_cast<unsigned char>(message[pos]))) {
145+
break;
146+
}
147+
// Skip all digits
148+
while (pos < message.size() &&
149+
std::isdigit(static_cast<unsigned char>(message[pos]))) {
150+
pos++;
151+
}
152+
// Check if followed by a space
153+
if (pos >= message.size() || message[pos] != ' ') {
136154
break;
137155
}
138156
message = message.substr(0, last_new_line_position);

cpp/src/arrow/status_test.cc

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -342,4 +342,21 @@ TEST(StatusTest, ReturnIfNotOk) {
342342
ASSERT_EQ(StripContext(st.message()), "StatusLike: 43");
343343
}
344344

345+
#ifdef ARROW_EXTRA_ERROR_CONTEXT
346+
TEST(StatusTest, ToStringWithoutContextLines) {
347+
Status status = Status::IOError("base error");
348+
status.AddContextLine("file1.cc", 42, "expr");
349+
status.AddContextLine("file2.cc", 100, "expr");
350+
351+
ASSERT_EQ(status.ToStringWithoutContextLines(), "IOError: base error");
352+
353+
Status status2(StatusCode::Invalid,
354+
"Error message\nThis line has: a colon but no digits");
355+
status2.AddContextLine("file.cc", 20, "expr");
356+
357+
ASSERT_EQ(status2.ToStringWithoutContextLines(),
358+
"Invalid: Error message\nThis line has: a colon but no digits");
359+
}
360+
#endif
361+
345362
} // namespace arrow

0 commit comments

Comments
 (0)