HDDS-14043. Fix ls -e UnsupportedOperationException on ofs/o3fs #10209
Draft
chihsuan wants to merge 7 commits into apache:master from
Adds a nullable erasureCodingPolicy field to FileStatusAdapter and populates it from each key's ReplicationConfig (canonical EC scheme name when EC, "Replicated" otherwise) in BasicOzoneClientAdapterImpl and BasicRootedOzoneClientAdapterImpl. Synthetic adapters for buckets and bucket snapshots derive the policy from the bucket's own ReplicationConfig instead of hardcoding "Replicated", which previously contradicted the existing isErasureCoded flag for EC buckets. The 15-arg FileStatusAdapter constructor is preserved as a back-compat overload that delegates with a null policy. No callers read the new field yet; that change follows.
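The back-compat overload described above can be sketched roughly as follows. This is a reduced, hypothetical stand-in (`FileStatusAdapterSketch` with three leading arguments instead of the real 15), intended only to show the delegation pattern, not the actual class.

```java
// Hypothetical, reduced stand-in for FileStatusAdapter. The real class takes
// many more constructor arguments; only the overload pattern is shown here.
final class FileStatusAdapterSketch {
  private final String path;
  private final long length;
  private final boolean erasureCoded;
  // Nullable: null means the producer did not populate a policy.
  private final String erasureCodingPolicy;

  // Older constructor shape, kept so existing callers still compile.
  // It delegates to the new constructor with a null policy.
  FileStatusAdapterSketch(String path, long length, boolean erasureCoded) {
    this(path, length, erasureCoded, null);
  }

  // New constructor shape that carries the policy name.
  FileStatusAdapterSketch(String path, long length, boolean erasureCoded,
      String erasureCodingPolicy) {
    this.path = path;
    this.length = length;
    this.erasureCoded = erasureCoded;
    this.erasureCodingPolicy = erasureCodingPolicy;
  }

  String getPath() {
    return path;
  }

  long getLength() {
    return length;
  }

  boolean isErasureCoded() {
    return erasureCoded;
  }

  String getErasureCodingPolicy() {
    return erasureCodingPolicy;
  }

  public static void main(String[] args) {
    FileStatusAdapterSketch legacy =
        new FileStatusAdapterSketch("/vol/bucket/key", 1L, false);
    FileStatusAdapterSketch ec =
        new FileStatusAdapterSketch("/vol/bucket/ec-key", 1L, true, "rs-3-2-1024k");
    System.out.println(legacy.getErasureCodingPolicy());
    System.out.println(ec.getErasureCodingPolicy());
  }
}
```

Callers using the older constructor shape see a null policy, which is why downstream code has to tolerate null until every producer is updated.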
Hadoop's "fs -ls -e" reads ContentSummary.getErasureCodingPolicy() and throws UnsupportedOperationException when it is null. ofs and o3fs were returning a ContentSummary without the field set, producing the misleading "FileSystem ofs://om does not support Erasure Coding" error on every "-ls -e" against an Ozone cluster. BasicOzoneFileSystem and BasicRootedOzoneFileSystem now set the policy on the builder using the path's own FileStatusAdapter, matching HDFS's "policy of the nearest ancestor" semantic rather than aggregating descendants. BasicOzoneFileSystem also gains a getContentSummary override so o3fs no longer falls through to the FileSystem default (which left the field null). The Builder.erasureCodingPolicy(String) call does not exist on Hadoop 2.10.2's ContentSummary.Builder, so it is isolated behind a protected applyEcPolicy hook overridden only in the Hadoop 3 subclasses (ozonefs and ozonefs-hadoop3). ozonefs-hadoop2 inherits a no-op default and is unaffected; Hadoop 2.10.2 has no "-ls -e" flag anyway.
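The hook arrangement described above might look something like the sketch below. All class names here are hypothetical stand-ins (`SummaryBuilderSketch` plays the role of `ContentSummary.Builder`); in the real modules the point is that only the Hadoop 3 subclass references the newer builder method, which this single-file sketch cannot demonstrate at compile level.

```java
// Hypothetical stand-in for ContentSummary.Builder. In Hadoop 2.10.2 the
// erasureCodingPolicy(String) setter is absent; here it exists only so the
// sketch compiles in one file.
final class SummaryBuilderSketch {
  private long length;
  private String erasureCodingPolicy; // stays null unless a hook sets it

  SummaryBuilderSketch length(long value) {
    this.length = value;
    return this;
  }

  SummaryBuilderSketch erasureCodingPolicy(String value) {
    this.erasureCodingPolicy = value;
    return this;
  }

  long getLength() {
    return length;
  }

  String getErasureCodingPolicy() {
    return erasureCodingPolicy;
  }
}

// Stand-in for the shared (common) filesystem code.
class CommonFileSystemSketch {
  // Default hook does nothing, so the shared code never touches the
  // newer builder method.
  protected void applyEcPolicy(SummaryBuilderSketch builder, String ecPolicy) {
  }

  SummaryBuilderSketch getContentSummary(long length, String ecPolicy) {
    SummaryBuilderSketch builder = new SummaryBuilderSketch().length(length);
    applyEcPolicy(builder, ecPolicy);
    return builder;
  }
}

// Stand-in for a Hadoop 3 subclass that overrides the hook.
class Hadoop3FileSystemSketch extends CommonFileSystemSketch {
  @Override
  protected void applyEcPolicy(SummaryBuilderSketch builder, String ecPolicy) {
    // Null guard: leave the field unset rather than writing null.
    if (ecPolicy != null) {
      builder.erasureCodingPolicy(ecPolicy);
    }
  }

  public static void main(String[] args) {
    System.out.println(new CommonFileSystemSketch()
        .getContentSummary(10L, "rs-3-2-1024k").getErasureCodingPolicy());
    System.out.println(new Hadoop3FileSystemSketch()
        .getContentSummary(10L, "rs-3-2-1024k").getErasureCodingPolicy());
  }
}
```

With the default hook the policy stays unset, which corresponds to the Hadoop 2 behaviour described above; with the override it is populated.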
Adds two tests to each of AbstractOzoneFileSystemTest and AbstractRootedOzoneFileSystemTest:
- testContentSummaryErasureCodingPolicy verifies a Ratis file reports "Replicated" and an EC file reports the canonical scheme name (e.g. rs-3-2-1024k); on rooted ofs the parent directory of a mixed listing also reports the bucket's policy.
- testLsDashEDoesNotThrow runs Hadoop FsShell with "-ls -e" against the bucket and asserts return code 0, the literal regression guard for the original UnsupportedOperationException.
Coverage runs through TestO3FS, TestO3FSWithFSO, TestOFS and TestOFSWithFSO.
- Drop redundant inline `no-op` comment in default applyEcPolicy;
the Javadoc above already explains the Hadoop 2/3 split.
- Drop dead `ecPolicy == null` coercion in getContentSummary; every
FileStatusAdapter producer this PR touches sets a non-null string
and the Hadoop 3 override already null-guards.
- Replace `RandomUtils.secure().randomBytes(1)` test fillers with
`new byte[]{0}`; remove now-unused import in the o3fs test.
- Add `testContentSummaryErasureCodingPolicyOnEcBucket` exercising
the synthetic-bucket-adapter EC branch (previously unasserted).
What changes were proposed in this pull request?
Why
`ozone fs -ls -e <path>` throws `UnsupportedOperationException: FileSystem ofs://om does not support Erasure Coding` against any Ozone cluster, on both `ofs://` and `o3fs://`. The message is misleading: Ozone supports EC. The real cause: Hadoop's upstream `Ls -e` reads `ContentSummary.getErasureCodingPolicy()` and rejects `null`, and Ozone's filesystem implementations were never setting the field.
What
Populate the EC-policy field on the `ContentSummary` returned by both `ofs://` and `o3fs://`, mirroring HDFS's `ContentSummaryComputationContext.getErasureCodingPolicyName(INode)`:
- `Replicated` for non-EC, the canonical EC scheme name (e.g. `rs-3-2-1024k`) for EC.
- `""` otherwise.
- `OmKeyInfo`'s replication config: FSO bucket directories carry their own config (so an EC-configured intermediate dir reports its scheme); OBS/LEGACY synthetic directories report `""` because they have no real key entry.
- `""`.

The reported policy is always for the queried path itself, not aggregated from descendants.
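As a rough illustration of the mapping above, the following sketch derives a policy name from a replication config. `ReplicationConfigSketch` and `EcPolicyNameSketch` are hypothetical stand-ins, not Ozone classes, and treating a missing config as `""` is an assumption based on the synthetic-directory case described above.

```java
// Hypothetical stand-in for a key's or bucket's replication config.
final class ReplicationConfigSketch {
  final boolean erasureCoded;
  // Canonical scheme name such as "rs-3-2-1024k"; only used when erasureCoded.
  final String ecSchemeName;

  private ReplicationConfigSketch(boolean erasureCoded, String ecSchemeName) {
    this.erasureCoded = erasureCoded;
    this.ecSchemeName = ecSchemeName;
  }

  static ReplicationConfigSketch replicated() {
    return new ReplicationConfigSketch(false, null);
  }

  static ReplicationConfigSketch ec(String schemeName) {
    return new ReplicationConfigSketch(true, schemeName);
  }
}

final class EcPolicyNameSketch {
  static final String REPLICATED = "Replicated";

  // Assumption: a path with no backing config (for example a synthetic
  // directory with no real key entry) reports the empty string.
  static String policyName(ReplicationConfigSketch config) {
    if (config == null) {
      return "";
    }
    return config.erasureCoded ? config.ecSchemeName : REPLICATED;
  }

  public static void main(String[] args) {
    System.out.println(policyName(ReplicationConfigSketch.replicated()));
    System.out.println(policyName(ReplicationConfigSketch.ec("rs-3-2-1024k")));
    System.out.println("[" + policyName(null) + "]");
  }
}
```

None of the three cases returns null, which is the property `-ls -e` depends on.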
How
The policy is plumbed through `FileStatusAdapter`: per-key `ReplicationConfig` for keys, the bucket's own `ReplicationConfig` for synthetic bucket / bucket-snapshot adapters (which previously hardcoded `"Replicated"` and contradicted the existing `isErasureCoded` flag for EC buckets), and `""` for synthetic root / volume / snapshot-indicator adapters. `getContentSummary` then sets it on the `ContentSummary.Builder` using the path's own `FileStatusAdapter`. `BasicOzoneFileSystem` also gains a `getContentSummary` override so `o3fs` no longer falls through to the `FileSystem` default, which left the field `null`.

`ContentSummary.Builder.erasureCodingPolicy(String)` does not exist in Hadoop 2.10.2, so the new builder call is isolated behind a protected `applyEcPolicy` hook, overridden only in the Hadoop 3 subclasses (`ozonefs/`, `ozonefs-hadoop3/`). `ozonefs-hadoop2` inherits a no-op default, and Hadoop 2.10.2's `Ls` has no `-e` flag anyway, so there is no functional regression.

The PR is split into three commits (plumbing → fix → tests) for review; squash on merge as usual.
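To make the "path's own policy, not aggregated" rule concrete, here is a minimal sketch with hypothetical stand-in types (`PathNodeSketch`, `SummarySketch`, `ContentSummarySketch`). Lengths are summed over the subtree, while the policy is read from the queried node only; this is an illustration of the rule, not the PR's actual `getContentSummary` code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a path with its own length and policy.
final class PathNodeSketch {
  final long length;
  final String ecPolicy;
  final List<PathNodeSketch> children = new ArrayList<>();

  PathNodeSketch(long length, String ecPolicy) {
    this.length = length;
    this.ecPolicy = ecPolicy;
  }

  PathNodeSketch add(PathNodeSketch child) {
    children.add(child);
    return this;
  }
}

// Hypothetical stand-in for the resulting summary.
final class SummarySketch {
  final long totalLength;
  final String ecPolicy;

  SummarySketch(long totalLength, String ecPolicy) {
    this.totalLength = totalLength;
    this.ecPolicy = ecPolicy;
  }
}

final class ContentSummarySketch {
  // Length is aggregated recursively over all descendants.
  static long totalLength(PathNodeSketch node) {
    long sum = node.length;
    for (PathNodeSketch child : node.children) {
      sum += totalLength(child);
    }
    return sum;
  }

  // The policy comes from the queried node alone; children are ignored.
  static SummarySketch summarize(PathNodeSketch queried) {
    return new SummarySketch(totalLength(queried), queried.ecPolicy);
  }

  public static void main(String[] args) {
    PathNodeSketch dir = new PathNodeSketch(0L, "rs-3-2-1024k")
        .add(new PathNodeSketch(5L, "Replicated"))
        .add(new PathNodeSketch(7L, "rs-3-2-1024k"));
    SummarySketch summary = summarize(dir);
    System.out.println(summary.totalLength + " " + summary.ecPolicy);
  }
}
```

A directory holding a mix of replicated and EC files therefore reports whatever policy the directory itself carries.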
Notes for reviewers
There is some repeated shape between the two `getContentSummary` methods and across the four `toFileStatusAdapter` call sites (single-file `ContentSummary.Builder` block + the `ecPolicy` ternary). I have kept it as-is to keep this PR scoped to the `-ls -e` fix. I am happy to extract helpers in this PR if reviewers prefer, or to address it as a follow-up ticket.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14043
How was this patch tested?
- `AbstractOzoneFileSystemTest` and `AbstractRootedOzoneFileSystemTest` exercise the fix end-to-end through `TestO3FS`, `TestO3FSWithFSO`, `TestOFS`, and `TestOFSWithFSO`.
- `testLsDashEDoesNotThrow` runs Hadoop `FsShell` directly with `-ls -e` and asserts return code `0`, the literal regression guard for the original exception.
- `testContentSummaryErasureCodingPolicy` asserts the file/directory values described above.
- `ozonefs*` modules compile cleanly. `ozonefs-hadoop2` compiles `ozonefs-common` against the Hadoop 2.10.2 classpath, which is the load-bearing check that the new code path doesn't reference any Hadoop 3-only API.
- compose/ozone: