#851 Make '_corrupt_records' a nullable field (#823)
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
spark-cobol/src/main/scala/za/co/absa/cobrix/spark/cobol/schema/CobolSchema.scala (1)
132-141: ⚠️ Potential issue | 🟠 Major

Update the CobolSchemaSpec.scala test expectation at line 309: change `_corrupt_fields: array (nullable = false)` to `nullable = true`.

The schema change is correct, and the data serialization behavior (empty arrays `[]` rather than `null`) matches the intentional distinction between `nullable = true` (the array itself can be null) and `containsNull = false` (array elements are never null). However, CobolSchemaSpec.scala line 309 still asserts the old `nullable = false` for the array field. That test will fail with the current code changes and must be updated. Test41CorruptFieldsSpec.scala already has the correct expectation with `nullable = true`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spark-cobol/src/main/scala/za/co/absa/cobrix/spark/cobol/schema/CobolSchema.scala` around lines 132 - 141, Update the CobolSchemaSpec.scala test expectation for the corrupt fields array to match the schema change: change the assertion that `_corrupt_fields: array (nullable = false)` to `_corrupt_fields: array (nullable = true)` so it reflects that the array field (Constants.corruptFieldsField) is nullable while its elements remain non-null (containsNull = false); ensure the test text or expected schema string used in the spec (the assertion referencing Constants.corruptFieldsField / `_corrupt_fields`) is updated accordingly to avoid the mismatch with CobolSchema.generateCorruptFields behavior.
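For illustration, here is a minimal Spark-only sketch (not the actual `CobolSchema.generateCorruptFields` code) of the field shape this comment describes. The string element type and the `SOME_FIELD` record column are assumptions made for the example:

```scala
import org.apache.spark.sql.types._

// Sketch of the corrupt-fields column after this change: the array column
// itself is nullable, while its elements stay non-null (containsNull = false),
// so data is serialized as empty arrays [] rather than per-element nulls.
val corruptFieldsField = StructField(
  name = "_corrupt_fields",                               // Constants.corruptFieldsField
  dataType = ArrayType(StringType, containsNull = false), // elements are never null
  nullable = true                                         // the array itself can be null
)

// Append it to a placeholder record schema (SOME_FIELD is made up).
val recordSchema = StructType(Seq(StructField("SOME_FIELD", StringType)))
val sparkSchema  = StructType(recordSchema.fields :+ corruptFieldsField)

// sparkSchema.treeString now renders:
// root
//  |-- SOME_FIELD: string (nullable = true)
//  |-- _corrupt_fields: array (nullable = true)
//  |    |-- element: string (containsNull = false)
```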
c217345 to 01b76d1
JaCoCo code coverage report - 'cobol-parser'
JaCoCo code coverage report - 'spark-cobol'
🧹 Nitpick comments (1)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/CobolSchemaSpec.scala (1)
872-887: Optional: add a nullability assertion to the builder test.

The test verifies the field name and that its type `isInstanceOf[ArrayType]`, but doesn't assert `nullable == true`. Adding this assertion would close the coverage gap introduced by this PR.

✏️ Suggested addition

  assert(sparkSchema.fields(1).name == "_corrupt_fields")
  assert(sparkSchema.fields(1).dataType.isInstanceOf[ArrayType])
+ assert(sparkSchema.fields(1).nullable == true)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/CobolSchemaSpec.scala` around lines 872 - 887, The test "create schema with corrupt fields using builder" lacks an assertion that the generated _corrupt_fields column is nullable; update the test to assert nullability by calling CobolSchema.builder(parsedCopybook).withGenerateCorruptFields(true).build().getSparkSchema and verify that the field named "_corrupt_fields" (sparkSchema.fields(1)) has nullable == true (e.g., add an assertion on sparkSchema.fields(1).nullable). This uses the existing symbols CobolSchema.builder, withGenerateCorruptFields, getSparkSchema and the "_corrupt_fields" field.
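As a rough, self-contained sketch of how the builder-based test could assert this, using the symbols named in the prompt above (`CobolSchema.builder`, `withGenerateCorruptFields`, `getSparkSchema`). The copybook text and the `CopybookParser.parseTree` setup are assumptions about the spec's wiring, not the actual test code:

```scala
import org.apache.spark.sql.types.ArrayType
import za.co.absa.cobrix.cobol.parser.CopybookParser
import za.co.absa.cobrix.spark.cobol.schema.CobolSchema

// Placeholder copybook; the real spec parses its own copybook elsewhere.
val copybook =
  """       01  RECORD.
    |          05  A    PIC X(2).
    |""".stripMargin
val parsedCopybook = CopybookParser.parseTree(copybook)

// Build the Spark schema with corrupt-field generation enabled,
// following the builder chain named in the prompt above.
val sparkSchema = CobolSchema.builder(parsedCopybook)
  .withGenerateCorruptFields(true)
  .build()
  .getSparkSchema

assert(sparkSchema.fields(1).name == "_corrupt_fields")
assert(sparkSchema.fields(1).dataType.isInstanceOf[ArrayType])
assert(sparkSchema.fields(1).nullable)  // the assertion the nitpick suggests adding
```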
Summary by CodeRabbit

Schema Updates
- The generated `_corrupt_fields` array column is now nullable (its elements remain non-null).

Tests
- Test expectations updated to reflect the nullable `_corrupt_fields` column.