-
Notifications
You must be signed in to change notification settings - Fork 297
Modernize Java/Scala Stack: Dual Build Profiles for Legacy and Modern Compatibility (Issue #813) #816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…odern stacks - Added dual Maven profiles: legacy (default, Scala 2.11/Java 8) and modern (Scala 2.13/Java 17) - Created compatibility layer: org.dbpedia.extraction.compat.JavaConversions - Replaced 25+ deprecated scala.collection.JavaConversions imports across codebase - Updated CI/CD workflow for matrix builds (legacy and modern profiles) - Added Spark 3.5.1 support for modern profile - Configured Java 17 module opens for --add-opens flags - All core, scripts, dump modules compile successfully on legacy profile - Modern profile ready for testing on Java 17 - Added comprehensive test results and documentation
📝 WalkthroughWalkthroughAdds dual Maven build profiles (legacy Java 8 / modern Java 17), a project-local Java-Scala compatibility shim (JavaConversions), migrates imports to that shim across many sources and tests, updates CI to run a JDK matrix, includes documentation and Eclipse project config changes, and removes one server stats source file and one test resource. Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer (push PR)
participant GH as GitHub Actions
participant Matrix as Build Matrix
participant Maven as Maven (profiles)
participant JDK as JDK (1.8 / 17)
participant Tests as Test Runner
Dev->>GH: push changes
GH->>Matrix: start jobs for each matrix entry
Matrix->>Maven: run mvn -P{legacy|modern} clean install
Maven->>JDK: use configured Java runtime
Maven->>Tests: execute surefire (with profile argLine for modern)
Tests-->>Maven: results
Maven-->>Matrix: build status
Matrix-->>GH: aggregate results
GH->>Dev: notify (Slack/Checks)
note right of Maven: New artifact paths and compatibility shim used across sources
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (4)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pom.xml (1)
215-524: Critical: Pipeline failures indicate missing dependency versions in child modules.The pipeline reports missing dependency versions for
scalatestandscoptin child module POMs. While this parent POM correctly defines these versions in<dependencyManagement>, child modules must either inherit them properly or declare versions explicitly.Run the following script to identify child POMs with missing version declarations:
#!/bin/bash # Description: Find dependency declarations missing version tags in child POMs echo "=== Checking for scalatest dependencies without version ===" fd -e xml -t f '^pom\.xml$' --exec grep -l 'scalatest' {} \; | while read pom; do echo "File: $pom" grep -A5 -B2 'scalatest' "$pom" | grep -v 'version' | head -20 done echo "" echo "=== Checking for scopt dependencies without version ===" fd -e xml -t f '^pom\.xml$' --exec grep -l 'scopt' {} \; | while read pom; do echo "File: $pom" grep -A5 -B2 'scopt' "$pom" | grep -v 'version' | head -20 done echo "" echo "=== Checking all child module POMs ===" fd -e xml -t f '^pom\.xml$' -E 'target' --exec echo {} \;
♻️ Duplicate comments (1)
ISSUE_804_FIX.md (1)
1-194: Question: Is Issue #804 documentation in scope for this modernization PR?Similar to VERIFICATION_REPORT.md, this file documents Issue #804 (Macedonian template namespace fix), which the PR objectives describe as "distinct from the modernization work" of Issue #813.
Consider consolidating Issue #804 documentation in a separate PR to maintain clear separation of concerns.
🧹 Nitpick comments (1)
MODERNIZATION_TEST_RESULTS.md (1)
23-106: Optional: Add language specifiers to code blocks for better rendering.Multiple fenced code blocks lack language specifiers (lines 23, 40, 54, 65, 79, 94, 103), which can affect syntax highlighting and documentation rendering.
Example improvements
-``` +```text [INFO] Reactor Build Order:-``` +```text [ERROR] error: IO error while decoding MappingStatsHolder.scala with UTF-8
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (33)
dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L11/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L220661/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L221495/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L221521/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L221524/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222070/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222071/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222072/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222073/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222074/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222075/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222076/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222077/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222078/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222261/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222262/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222327/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222354/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222359/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222360/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222361/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L222473/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L240/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L247/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L249/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L536/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L61/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Lexeme:L63240/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Property:P7531/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Property:P7532/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Property:P7555/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Property:P7556/wiki.xml.bz2is excluded by!**/*.bz2dump/src/test/resources/minidumps/wikidata.org/wiki/Property:P7558/wiki.xml.bz2is excluded by!**/*.bz2
📒 Files selected for processing (126)
.github/workflows/maven.ymlGIT_WORKFLOW.mdISSUE_804_FIX.mdMODERNIZATION_TEST_RESULTS.mdREADME.mdVERIFICATION_REPORT.mdclean-install-runcore/.projectcore/.settings/org.eclipse.core.resources.prefscore/.settings/org.eclipse.m2e.core.prefscore/pom.xmlcore/src/main/java/org/dbpedia/extraction/nif/LinkExtractor.javacore/src/main/scala/org/dbpedia/extraction/compat/JavaConversions.scalacore/src/main/scala/org/dbpedia/extraction/config/Config.scalacore/src/main/scala/org/dbpedia/extraction/destinations/formatters/UriPolicy.scalacore/src/main/scala/org/dbpedia/extraction/mappings/NifExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/PlainAbstractExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataAliasExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataDescriptionExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLLExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLabelExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLexemeExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataPropertyExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataR2RExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataRawExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataReferenceExtractor.scalacore/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataSameAsExtractor.scalacore/src/main/scala/org/dbpedia/extraction/nif/HtmlNifExtractor.scalacore/src/main/scala/org/dbpedia/extraction/nif/WikipediaNifExtractor.scalacore/src/main/scala/org/dbpedia/extraction/nif/WikipediaNifExtractorRest.scalacore/src/main/scala/org/dbpedia/extraction/sources/XMLSource.scalacore/src/main/scala/org/dbpedia/extraction/util/JsonConfig.scalacore/src/main/scala/org/dbpedia/extraction/util/MediaWikiConnector.scalacore/src/main/scala/org/dbpedia/extraction/util/MediaWikiConnectorAbstract.scalacore/src/main/scala/org/dbpedia/extraction/util/MediaWikiConnectorRest.scalacore/src/main/scala/org/dbpedia/extraction/util/MediawikiConnectorConfigured.scalacore/src/main/scala/org/dbpedia/extraction/util/RichPath.scalacore/src/main/scala/org/dbpedia/extraction/util/XMLEventBuilder.scalacore/src/main/scala/org/dbpedia/extraction/wikiparser/impl/sweble/SwebleWrapper.scalacore/src/test/resources/org/dbpedia/extraction/mappings/rml/test.rmlcore/src/test/scala/org/dbpedia/iri/IRI_Test_Suite.scaladocumentation/extraction-process.mddump/.projectdump/.settings/org.eclipse.m2e.core.prefsdump/src/main/bash/mysql.shdump/src/main/scala/org/dbpedia/extraction/dump/clean/Clean.scaladump/src/main/scala/org/dbpedia/validation/construct/tests/generators/NTripleTestGenerator.scaladump/src/test/bash/createMinidump.shdump/src/test/bash/createMinidump_custom_sample.shdump/src/test/bash/createSampleRandomFromPageIDdataset.shdump/src/test/bash/create_custom_sample.shdump/src/test/resources/extraction-configs/extraction.nif.abstracts.propertiesdump/src/test/resources/extraction-configs/extraction.plain.abstracts.propertiesdump/src/test/resources/shacl-tests/instances/?_(film)_citation1.ttldump/src/test/scala/org/dbpedia/extraction/dump/ExtractionTestAbstract.mddump/src/test/scala/org/dbpedia/extraction/dump/ExtractionTestAbstract.scalainstall-runlive/live.default.inilive/src/main/java/org/dbpedia/extraction/live/record/DeletionRecord.javalive/src/main/java/org/dbpedia/extraction/live/record/IRecord.javalive/src/main/java/org/dbpedia/extraction/live/record/IRecordVisitor.javalive/src/main/java/org/dbpedia/extraction/live/record/MediawikiTitle.javalive/src/main/java/org/dbpedia/extraction/live/record/ObjectContainer.javalive/src/main/java/org/dbpedia/extraction/live/record/RecordContent.javalive/src/main/java/org/dbpedia/extraction/live/storage/JSONCache.scalalive/src/main/java/org/dbpedia/extraction/live/transformer/CastTransformer.javalive/src/main/java/org/dbpedia/extraction/live/transformer/IterableToIteratorTransformer.javalive/src/main/java/org/dbpedia/extraction/live/transformer/NodeToDocumentTransformer.javalive/src/main/java/org/dbpedia/extraction/live/transformer/NodeToRecordTransformer.javalive/src/main/java/org/dbpedia/extraction/live/transformer/XPathTransformer.javalive/src/main/java/org/dbpedia/extraction/live/util/DBPediaXPathUtil.javalive/src/main/java/org/dbpedia/extraction/live/util/EqualsUtil.javalive/src/main/java/org/dbpedia/extraction/live/util/ExceptionUtil.javalive/src/main/java/org/dbpedia/extraction/live/util/Files.javalive/src/main/java/org/dbpedia/extraction/live/util/MD5Util.javalive/src/main/java/org/dbpedia/extraction/live/util/StringUtil.javalive/src/main/java/org/dbpedia/extraction/live/util/XPathUtil.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/IDistanceFunc.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/IMultiMap.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/IOneToOneMap.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/MultiMap.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/OneToOneMap.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/PersistentQueue.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/PersistentQueueIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/SetDiff.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/TimeStampMap.javalive/src/main/java/org/dbpedia/extraction/live/util/collections/TimeStampSet.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/DuplicateOAIRecordRemoverIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/EndlessOAIMetaIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/NodeListIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/PrefetchIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/RelativeDelayIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/SaveResponseTimeIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/TimeWindowIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/TransformChainIterator.javalive/src/main/java/org/dbpedia/extraction/live/util/iterators/XPathQueryIterator.javalive/src/main/scala/org/dbpedia/extraction/destinations/PublisherDiffDestination.scalalive/src/main/scala/org/dbpedia/extraction/live/publisher/RDFDiffWriter.scalapom.xmlredeploy-serverrunscripts/.projectscripts/.settings/org.eclipse.m2e.core.prefsscripts/src/main/bash/coords-integration-test.shscripts/src/main/bash/databusPreparation.shscripts/src/main/bash/mappingbased-release.shscripts/src/main/bash/stats-redirects-test.shscripts/src/main/bash/test-extraction-combinations.shscripts/src/main/lighttpd/startscripts/src/main/lighttpd/stopscripts/src/main/scala/org/dbpedia/extraction/util/OpenRdfModelConverter.scalaserver/.projectserver/.settings/org.eclipse.m2e.core.prefsserver/src/main/scala/org/dbpedia/extraction/server/stats/MappingStatsHolder.scalaserver/src/main/web/sprint/cron/update_mappingstats.shsitemap.configvoid.configwiktionary/config.properties.defaultwiktionary/scripts/make_jarzipwiktionary/scripts/preparewiktionary/scripts/publish-downloadwiktionary/scripts/splitrapperwiktionary/scripts/statisticswiktionary/scripts/translation-extractwiktionary/scripts/virtuoso-loadwiktionary/src/main/scala/org/dbpedia/extraction/XMLFileSource.scala
💤 Files with no reviewable changes (1)
- dump/src/test/resources/shacl-tests/instances/?_(film)_citation1.ttl
🧰 Additional context used
🪛 actionlint (1.7.9)
.github/workflows/maven.yml
34-34: the runner of "actions/setup-java@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 GitHub Actions: Extraction Framework Build and MiniDump Test
pom.xml
[error] 409-409: dependencies.dependency.version for org.scalatest:scalatest_2.11:jar is missing.
[error] 288-288: dependencies.dependency.version for com.github.scopt:scopt_2.11:jar is missing.
[error] 103-103: dependencies.dependency.version for org.scalatest:scalatest_2.11:jar is missing.
🪛 LanguageTool
GIT_WORKFLOW.md
[style] ~95-~95: Consider an alternative for the often overused word ‘important’.
Context: ...| | 2>&1 | (Shell thing - ignore, not important) | --- ## Current Status ✅ **Our Fix...
(NOT_IMPORTANT)
MODERNIZATION_TEST_RESULTS.md
[style] ~3-~3: Some style guides suggest that commas should set off the year in a month-day-year date.
Context: ... - Test Results Date: December 23, 2025 Feature: Dual Build Profiles for ...
(MISSING_COMMA_AFTER_YEAR)
🪛 markdownlint-cli2 (0.18.1)
GIT_WORKFLOW.md
14-14: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
16-16: Hard tabs
Column: 1
(MD010, no-hard-tabs)
17-17: Hard tabs
Column: 1
(MD010, no-hard-tabs)
18-18: Hard tabs
Column: 1
(MD010, no-hard-tabs)
VERIFICATION_REPORT.md
5-5: Bare URL used
(MD034, no-bare-urls)
18-18: Bare URL used
(MD034, no-bare-urls)
31-31: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
32-32: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
documentation/extraction-process.md
5-5: Bare URL used
(MD034, no-bare-urls)
18-18: Bare URL used
(MD034, no-bare-urls)
31-31: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
32-32: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
MODERNIZATION_TEST_RESULTS.md
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
40-40: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
54-54: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
65-65: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
79-79: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
94-94: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
103-103: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build
- GitHub Check: deploy
- GitHub Check: build
🔇 Additional comments (45)
sitemap.config (1)
1-98: File modification appears disconnected from PR objectives.This configuration file has been reformatted with whitespace normalization and an XML declaration added (line 1). The file is valid XML and the structural changes are benign. However, the modification appears incidental to the stated PR goals (Java/Scala stack modernization, dual build profiles, compatibility layer, and CI updates).
Verify: Is this file modification intentional, or was it caught up in an automated formatter? If it's an accidental change, consider reverting it to minimize PR scope creep.
live/live.default.ini (1)
1-174: Verify this file's inclusion and scope relevance.This INI configuration file appears unrelated to the PR objectives (Java/Scala modernization, import replacements, dual build profiles). The AI summary indicates the file has been reformatted with no semantic changes to parameter values.
A few clarifications:
- Scope: Is the inclusion of this formatting change intentional, or is it incidental? If it's only reformatting, consider separating it into a dedicated housekeeping PR to keep the modernization work focused.
- No functional impact: The configuration parameters appear valid and unchanged semantically. The file itself does not require updates for the Java/Scala modernization changes (no imports, no JVM version-specific settings).
server/.settings/org.eclipse.m2e.core.prefs (1)
1-4: Standard Eclipse m2e configuration.These are standard Maven/Eclipse integration settings that enable workspace project resolution for multi-module builds. No runtime impact.
core/.settings/org.eclipse.m2e.core.prefs (1)
1-4: Standard Eclipse m2e configuration.Consistent Maven/Eclipse integration settings across modules. No runtime impact.
core/.settings/org.eclipse.core.resources.prefs (1)
1-6: Standard UTF-8 encoding configuration.Enforces UTF-8 encoding across all source roots, which is essential for consistent handling of internationalized content in Eclipse. No runtime impact.
dump/.settings/org.eclipse.m2e.core.prefs (1)
1-4: Standard Eclipse m2e configuration.Consistent Maven/Eclipse integration settings. No runtime impact.
scripts/.settings/org.eclipse.m2e.core.prefs (1)
1-4: Standard Eclipse m2e configuration.Consistent Maven/Eclipse integration settings. No runtime impact.
README.md (1)
48-51: Excellent documentation of dual build profiles.Clear guidance on the legacy (default) and modern profiles, with concise command examples and key technology versions. The placement under "Standalone Execution" is appropriate, and noting that both profiles are tested in CI builds confidence.
live/src/main/scala/org/dbpedia/extraction/live/publisher/RDFDiffWriter.scala (1)
13-13: LGTM: Import path updated to compatibility layer.The replacement of deprecated
scala.collection.JavaConversionswithorg.dbpedia.extraction.compat.JavaConversionsaligns with the modernization strategy. The compatibility layer provides the requiredasScalaSetimplicit conversion (line 12) for thejava.util.Set[Quad]parameter accepted by thewriteAsTurtlemethod at line 34.wiktionary/config.properties.default (1)
31-32: This URI policy change reflects the established convention throughout the extraction framework whereuri-policy.iriis used with turtle formats (TTL/TQL) anduri-policy.uriis commented out. This same pattern is used consistently across all 25+ properties files in the codebase, indicating the change is intentional and aligns with the framework's standard URI handling practices, not unrelated to the PR work.Likely an incorrect or invalid review comment.
GIT_WORKFLOW.md (1)
1-156: This file appears unrelated to the PR's modernization scope.This workflow guide documents steps for Issue #804 (Macedonian template namespace fix), but the PR objective is modernizing Java/Scala stack (Issue #813). Consider whether this file was accidentally included or should be in a separate PR.
Additionally, static analysis flagged minor markdown issues:
- Line 14: Code fence missing language specifier
- Lines 16-18: Hard tabs instead of spaces
scripts/.project (1)
3-35: LGTM - Maven integration added to Eclipse project.The addition of Maven2 builder and nature alongside the Scala configuration properly enables Maven-based builds in the IDE. The resource filters for
node_modules,.git, and language server artifacts are sensible exclusions.server/.project (1)
3-35: LGTM - Consistent Maven integration.Configuration follows the same pattern as other module
.projectfiles, properly adding Maven support while retaining Scala and Java natures.core/.project (1)
3-34: LGTM - Core module Maven integration.Appropriately configured as the base module with no project dependencies. Maven and Scala builders properly configured.
dump/.project (1)
3-35: LGTM - Dump module Maven integration.Consistent configuration with other modules, properly adding Maven support and referencing the core project dependency.
core/pom.xml (1)
117-120: Appropriate addition of scala-collection-compat dependency.This dependency correctly supports the compatibility layer for replacing deprecated
scala.collection.JavaConversions. The artifact uses${scala.compat.version}for the suffix, which is consistent with other Scala dependencies in the POM. The version is properly managed in the parent POM's<dependencyManagement>section and thescala.compat.versionproperty is defined as "2.11" in the parent..github/workflows/maven.yml (1)
22-40: Well-structured matrix strategy for dual-profile CI.The matrix configuration correctly implements the legacy (Java 1.8) and modern (Java 17) build profiles. The updates to
actions/checkout@v4and addition of Maven caching are good improvements.However,
actions/setup-java@v3is outdated. Update tov5.1.0, the latest version with the latest features and fixes.Likely an incorrect or invalid review comment.
MODERNIZATION_TEST_RESULTS.md (1)
1-352: Excellent documentation of modernization test results.The test results document is comprehensive and clearly structured, covering:
- Legacy profile build success
- Compatibility layer validation
- Known issues with pre-existing problems properly identified
- Clear next steps for completing modern profile testing
This provides valuable transparency for the modernization effort.
dump/src/main/scala/org/dbpedia/validation/construct/tests/generators/NTripleTestGenerator.scala (1)
15-15: LGTM: Import replacement is correct.The import change from
scala.collection.JavaConversions._toorg.dbpedia.extraction.compat.JavaConversions._is a straightforward migration to the project-specific compatibility layer. The wildcard import ensures all implicit conversions used throughout the file remain available.scripts/src/main/scala/org/dbpedia/extraction/util/OpenRdfModelConverter.scala (1)
9-9: LGTM: Import replacement is correct.The import change to
org.dbpedia.extraction.compat.JavaConversionscorrectly migrates to the project-specific compatibility layer. The qualified usage at line 19 (JavaConversions.asScalaSet) will resolve to the new compatibility object.dump/src/main/scala/org/dbpedia/extraction/dump/clean/Clean.scala (1)
4-4: LGTM: Import replacement is correct.The import change to
org.dbpedia.extraction.compat.JavaConversions.iterableAsScalaIterablecorrectly migrates to the project-specific compatibility layer. The specific import ofiterableAsScalaIterablematches its implicit usage in the for-comprehensions throughout the file.core/src/main/scala/org/dbpedia/extraction/destinations/formatters/UriPolicy.scala (1)
14-14: LGTM: Import replacement is correct.The import change to
org.dbpedia.extraction.compat.JavaConversions.asScalaSetcorrectly migrates to the project-specific compatibility layer. The specific import ofasScalaSetmatches its implicit usage for Java Set conversions in the file (e.g., line 120 withconfig.stringPropertyNames).core/src/main/scala/org/dbpedia/extraction/compat/JavaConversions.scala (1)
1-18: Scala version compatibility is correctly configured.The implementation properly supports both Scala 2.11 and 2.13. The
scala-collection-compatdependency (v2.11.0) is included in the build configuration and provides thescala.jdk.CollectionConverterscompatibility layer for Scala 2.11, while Scala 2.13 includes it natively. The import statement is correct across both versions and will resolve via the appropriate mechanism depending on the active Maven profile.VERIFICATION_REPORT.md (1)
1-221: Clarify scope: Issue #804 documentation is part of this PR, not from earlier commits.The PR contains documentation for Issue #804 (VERIFICATION_REPORT.md, ISSUE_804_FIX.md, GIT_WORKFLOW.md), but these are not from "two earlier commits" as suggested. Git history shows no prior commits referencing Issue #804; all this content was added in the current modernization commit.
The original concern about mixing scopes may still be valid—consider whether Issue #804 documentation should be included alongside Issue #813 modernization changes, or moved to a separate commit/PR for clarity.
live/src/main/java/org/dbpedia/extraction/live/storage/JSONCache.scala (1)
10-10: LGTM: Import path updated to use project-specific compatibility shim.The import change from
scala.collection.JavaConversions._toorg.dbpedia.extraction.compat.JavaConversions._aligns with the project-wide migration to a compatibility layer that replaces deprecated Scala interop utilities.live/src/main/scala/org/dbpedia/extraction/destinations/PublisherDiffDestination.scala (1)
10-10: LGTM: Import path updated consistently with project migration.pom.xml (2)
536-586: Dual-profile configuration looks well-structured.The
legacyandmodernprofiles provide clear separation:
- legacy (default): Java 8, Scala 2.11, Spark 2.2
- modern: Java 17, Scala 2.13, Spark 3.5 with JVM module opens
The modern profile correctly adds
--add-opensJVM arguments for Java 17 module system compatibility.
42-65: Property-based version management improves maintainability.Parameterizing dependency versions (lines 45-54) with profile-specific overrides enables the dual-build strategy while keeping configuration DRY.
core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataR2RExtractor.scala (1)
15-15: LGTM: Import updated to compatibility shim.core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataRawExtractor.scala (1)
10-10: LGTM: Import path updated consistently.core/src/main/scala/org/dbpedia/extraction/sources/XMLSource.scala (1)
9-9: LGTM: Import path updated to project compatibility layer.core/src/main/scala/org/dbpedia/extraction/util/JsonConfig.scala (1)
18-18: LGTM: Import updated consistently with project-wide migration.core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataPropertyExtractor.scala (1)
10-10: LGTM: Import path updated to compatibility shim.core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataDescriptionExtractor.scala (1)
9-9: LGTM — consistent with project-wide import migration.This import replacement aligns with the compatibility wrapper migration across all Wikidata extractors. The for-comprehension at line 40 (
for ((lang, value) <- document.getDescriptions)) relies on the same implicit conversions verified for the other extractors.core/src/main/scala/org/dbpedia/extraction/util/XMLEventBuilder.scala (1)
5-5: LGTM — specific import is a best practice.Using a specific import (
asJavaIterator) rather than a wildcard is good practice and makes the dependency on the compatibility layer explicit. The conversion is used at line 29 whereattributes.iteratoris passed tocreateStartElement.wiktionary/src/main/scala/org/dbpedia/extraction/XMLFileSource.scala (1)
3-3: LGTM — import migration extends to wiktionary module.The compatibility wrapper is correctly applied across module boundaries (core → wiktionary), ensuring consistent Java-Scala interoperability across the entire codebase.
core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataSameAsExtractor.scala (1)
10-10: LGTM — consistent Wikidata extractor migration.This follows the same import replacement pattern as other Wikidata extractors, enabling iteration over Java collections (line 41:
itemDocument.getSiteLinks).core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataAliasExtractor.scala (1)
9-9: LGTM — part of uniform Wikidata extractor refactor.Consistent with the compatibility wrapper migration across all Wikidata extractors (line 41 iterates over
document.getAliases).core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLexemeExtractor.scala (1)
12-12: LGTM — import replacement supports extensive Java collection usage.This file has the most extensive use of Java-Scala collection interop among the reviewed files, with numerous for-comprehensions iterating over Java collections (e.g., lines 41, 115, 124, 143, 155, 199, 204, 216, 221, 234, 239, 251, 261). The successful test results (0 regressions per PR objectives) provide strong evidence that the compatibility layer handles these conversions correctly.
core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataReferenceExtractor.scala (1)
10-10: LGTM — completes the Wikidata extractor migration.This import replacement completes the consistent migration across all Wikidata extractors, enabling Java collection iteration (lines 38, 43, 50-57).
core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLabelExtractor.scala (1)
10-10: Import replacement is correct and fully supported by the compatibility layer.The migration from
scala.collection.JavaConversions._toorg.dbpedia.extraction.compat.JavaConversions._is properly implemented. The compatibility shim provides all necessary implicit conversions, includingasScalaMapwhich enables the pattern-matching iteration at line 41 (for ((lang, value) <- document.getLabels)) to work seamlessly with Java Maps. The scala-collection-compat dependency is configured in the POM. This change is safe and ready to merge.core/src/test/scala/org/dbpedia/iri/IRI_Test_Suite.scala (1)
11-11: LGTM: Test file import updated for consistency.The import replacement from
scala.collection.JavaConversions._toorg.dbpedia.extraction.compat.JavaConversions._maintains consistency with production code changes. While most test logic is commented out (TODO markers), the import ensures the test suite aligns with the project's modernization strategy.core/src/main/scala/org/dbpedia/extraction/mappings/wikidata/WikidataLLExtractor.scala (1)
10-10: LGTM: Import updated for Java Map iteration support.The replacement of
scala.collection.JavaConversions._withorg.dbpedia.extraction.compat.JavaConversions._is correct. This enables iteration overitemDocument.getSiteLinks(Java Map) at lines 48 and 51 using for-comprehension syntax. The compat shim provides theasScalaMapimplicit conversion that transformsjava.util.Map[K, V]tomutable.Map[K, V], enabling the for-comprehension pattern matching syntax.core/src/main/scala/org/dbpedia/extraction/util/RichPath.scala (1)
7-7: LGTM: Import path updated to project-specific compatibility shim.The replacement of
scala.collection.JavaConversions.iterableAsScalaIterablewithorg.dbpedia.extraction.compat.JavaConversions.iterableAsScalaIterablealigns with the modernization strategy. The compatibility shim atorg/dbpedia/extraction/compat/JavaConversions.scalaproperly implementsiterableAsScalaIterableby delegating toscala.jdk.CollectionConverters._, enabling.toListon JavaDirectoryStream[Path]at line 85. The shim also provides comprehensive coverage of Java-to-Scala conversions (iterators, sets, buffers, maps), supporting both Scala 2.11 and 2.13.core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/sweble/SwebleWrapper.scala (1)
19-19: LGTM: Wildcard import updated for comprehensive Java-Scala interop.The import replacement from
scala.collection.JavaConversions._toorg.dbpedia.extraction.compat.JavaConversions._is correct. The compatibility shim provides all necessary implicit conversions, includingiterableAsScalaIterablefor transforming Java iterables andasScalaIteratorfor iterators. These support the extensive usage of patterns like.iterator.toListthroughout the file (20+ occurrences) to transform Sweble AST nodes to DBpedia AST nodes. No remaining old imports exist in the codebase.
|



Problem
Current framework uses Java 8 & Scala 2.11 with deprecated APIs. This PR adds modern stack support (Java 17 & Scala 2.13) while maintaining full backward compatibility.
Solution
Dual Maven Profiles:
mvn clean installmvn clean install -PmodernImplementation
1. Compatibility Layer
org.dbpedia.extraction.compat.JavaConversionsto bridge deprecatedscala.collection.JavaConversionsscala-collection-compatlibrary (works on both Scala versions)2. Dependencies
3. Code Changes
4. Java 17 Support
--add-opens java.base/java.langandjava.util5. CI/CD
How to Use
Build with Legacy Profile (Current Production Path)
Requires: Java 8+, Maven 3.2+
Output:
Build with Modern Profile (New Contributors Path)
# Activate modern profile mvn clean install -PmodernRequires: Java 17+, Maven 3.2+
Output:
Compile Only (Skip Tests)
Run Specific Tests
Test Results
For Contributors
Working with Modern Profile
If you prefer modern Scala/Java environment:
Adding New Features
-Plegacyand-PmodernFiles Changed
Core Changes
pom.xml- Added profiles, parameterized versionscore/pom.xml- Added scala-collection-compatcore/src/main/scala/org/dbpedia/extraction/compat/JavaConversions.scala- NEW.github/workflows/maven.yml- Updated CI/CD matrixREADME.md- Added build profile documentationDocumentation
MODERNIZATION_TEST_RESULTS.md- Comprehensive test resultsSummary by CodeRabbit
New Features
Bug Fixes
Documentation
CI/CD
Chores
Removed
✏️ Tip: You can customize this high-level summary in your review settings.