feat(c/driver/postgresql): implement copy writer for time types#4057
Gawaboumga wants to merge 4 commits into apache:main
c/driver/postgresql/copy/reader.h
Outdated
// Ensure the target type can hold the converted value (TIME32 -> int32).
if constexpr (std::is_same<OutT, int32_t>::value) {
  if (out64 < (std::numeric_limits<int32_t>::min)() ||
      out64 > (std::numeric_limits<int32_t>::max)()) {
    ArrowErrorSet(error,
                  "[libpq] TIME value %" PRId64
                  " usec converts to %" PRId64
                  " which overflows int32 for Arrow TIME32",
                  time_usec, out64);
    return EOVERFLOW;
  }
Is this possible given we're already validating that the value fits in 24 hours?
Indeed, this should never happen.
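To illustrate why the int32 check is unreachable once the 24-hour bound holds, here is a minimal sketch. The value assigned to `kUsecsPerDay` and the helper `ToCoarserUnit` are assumptions made for illustration; they are not taken from the driver.

```cpp
#include <cstdint>
#include <limits>

// Assumed value: microseconds in one day (24 * 60 * 60 * 1'000'000).
constexpr int64_t kUsecsPerDay = 86400LL * 1000 * 1000;

// Hypothetical helper: scale a validated microsecond-of-day value down to a
// coarser TIME32 unit. `divisor` is 1000 for milliseconds, 1000000 for seconds.
constexpr int64_t ToCoarserUnit(int64_t time_usec, int64_t divisor) {
  return time_usec / divisor;
}

// The largest value that passes the 24-hour validation is kUsecsPerDay itself.
// Even that maximum fits comfortably in int32 after conversion.
static_assert(ToCoarserUnit(kUsecsPerDay, 1000) == 86400000,
              "one day is 86,400,000 ms");
static_assert(ToCoarserUnit(kUsecsPerDay, 1000) <=
                  (std::numeric_limits<int32_t>::max)(),
              "milliseconds-of-day fits in int32");
static_assert(ToCoarserUnit(kUsecsPerDay, 1000000) == 86400,
              "one day is 86,400 s");
```

Because the checks are `static_assert`s, the argument is verified at compile time and adds no runtime cost.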
c/driver/postgresql/copy/reader.h
Outdated
  const int32_t out32 = static_cast<int32_t>(out64);
  NANOARROW_RETURN_NOT_OK(ArrowBufferAppend(data_, &out32, sizeof(out32)));
} else {
  const int64_t out = static_cast<int64_t>(out64);
I removed the symmetry and avoided the redundant cast.
    break;
}

if (micros < 0 || micros > kUsecsPerDay) {
Hmm. If we assume the Arrow data isn't necessarily valid, don't we have to watch for overflow when we do the multiplication above? Or if we do assume the data is valid, then this can't happen, right?
I reused the overflow validation logic already used for Duration and Timestamp.
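As a rough sketch of the concern raised here: if the Arrow data is not trusted, the multiplication has to be guarded before the range check can mean anything. The function `ScaleTimeToUsec` and the value of `kUsecsPerDay` below are hypothetical and only illustrate the idea; the driver's actual Duration and Timestamp logic may be structured differently.

```cpp
#include <cstdint>
#include <limits>

// Assumed value: microseconds in one day.
constexpr int64_t kUsecsPerDay = 86400LL * 1000 * 1000;

// Hypothetical helper: convert `value` (in seconds or milliseconds) to
// microseconds by multiplying with `multiplier` (1000000 or 1000).
// Returns false if the product would overflow int64_t or if the result falls
// outside [0, kUsecsPerDay]; otherwise stores the result in *out_usec.
inline bool ScaleTimeToUsec(int64_t value, int64_t multiplier,
                            int64_t* out_usec) {
  if (multiplier <= 0) return false;
  // Check against the safe bounds before multiplying, so the multiplication
  // itself can never trigger signed overflow (undefined behavior).
  const int64_t max_safe = (std::numeric_limits<int64_t>::max)() / multiplier;
  const int64_t min_safe = (std::numeric_limits<int64_t>::min)() / multiplier;
  if (value > max_safe || value < min_safe) return false;
  const int64_t micros = value * multiplier;
  // Same bounds as the check in the diff above.
  if (micros < 0 || micros > kUsecsPerDay) return false;
  *out_usec = micros;
  return true;
}
```

Checking before multiplying means the range test never runs on a wrapped value.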
- Since we already ensure that the time is within a day, the int32 overflow check is not mandatory
- const int64_t out = static_cast<int64_t>(out64); is a redundant cast
- Fix ODR of constants
- Reuse overflow validation logic
lidavidm left a comment
Can you format the code? https://github.com/apache/arrow-adbc/actions/runs/23013027066/job/66911167472?pr=4057
},
{
  "name": "value",
  "format": "ttm",
- "format": "ttm",
+ "format": "ttu",
We always read microseconds, so let's expect microseconds (also the values below need to be adjusted)
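For context on the suggested change: in the Arrow C data interface, the time format strings are "tts" (time32, seconds), "ttm" (time32, milliseconds), "ttu" (time64, microseconds) and "ttn" (time64, nanoseconds). The small lookup below is only an illustrative sketch; the enum and function names are hypothetical and are not part of nanoarrow or the driver.

```cpp
#include <cstring>

// Hypothetical enum for illustration only.
enum class TimeUnitSketch { kSecond, kMilli, kMicro, kNano, kUnknown };

// Map an Arrow C data interface time format string to its unit.
inline TimeUnitSketch TimeUnitFromFormat(const char* format) {
  if (std::strcmp(format, "tts") == 0) return TimeUnitSketch::kSecond;  // time32
  if (std::strcmp(format, "ttm") == 0) return TimeUnitSketch::kMilli;   // time32
  if (std::strcmp(format, "ttu") == 0) return TimeUnitSketch::kMicro;   // time64
  if (std::strcmp(format, "ttn") == 0) return TimeUnitSketch::kNano;    // time64
  return TimeUnitSketch::kUnknown;
}
```

Since the reader always produces microseconds, the expected schema would use "ttu" and the expected values would be expressed in microseconds.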
"format": "+s",
"children": [
  {
    "name": "value",
    "format": "ttm",
I expect you will need microseconds here and below
// part: expected

{"idx": 0, "value": null}
{"idx": 1, "value": 0}
{"idx": 2, "value": 1}
{"idx": 3, "value": 3723123456}
{"idx": 4, "value": 86399999999}
{"idx": 5, "value": 86400000000}
It appears these values do not line up with what's expected (https://github.com/adbc-drivers/validation/blob/main/adbc_drivers_validation/queries/ingest/time_us.txtcase)
Frankly there should be no need to override this case?
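To make the values in the diff above easier to compare against a reference file, this sketch decodes a microsecond-of-day count into a clock time. The helper `FormatTimeUsec` is hypothetical and assumes a non-negative input; it says nothing about what the linked .txtcase file actually contains.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: render microseconds since midnight as HH:MM:SS.ffffff.
// Assumes usec >= 0.
inline std::string FormatTimeUsec(int64_t usec) {
  const int64_t total_seconds = usec / 1000000;
  const long long frac = static_cast<long long>(usec % 1000000);
  const long long hours = static_cast<long long>(total_seconds / 3600);
  const long long minutes = static_cast<long long>((total_seconds % 3600) / 60);
  const long long seconds = static_cast<long long>(total_seconds % 60);
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld.%06lld", hours,
                minutes, seconds, frac);
  return std::string(buf);
}
```

Read as microseconds, 3723123456 is 01:02:03.123456, 86399999999 is 23:59:59.999999, and 86400000000 is 24:00:00.000000.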
Hello,
I tried to implement the COPY writer for PostgreSQL for the time32 (s, ms) and time64 (us, ns) data types. I may be one of the few using this ^^
I am not sure I fully grasp how the ".txtcase" files work.
Kind regards,
Closes #3841