diff --git a/lucene_10_migration.md b/lucene_10_migration.md
new file mode 100644
index 000000000000..1dc4229f92e8
--- /dev/null
+++ b/lucene_10_migration.md
@@ -0,0 +1,146 @@
+# Solr Migration Tasks for Lucene 10.2.2
+
+## Critical Breaking Changes
+- [ ] **CRITICAL** Search Queries: Replace the removed `DocValuesFieldExistsQuery` and `NormsFieldExistsQuery` with `FieldExistsQuery`.
+  - **Impact:** Compilation errors wherever these classes are used.
+  - **Location:** `solr/core/src/java/org/apache/solr/schema/FieldType.java`, `solr/core/src/java/org/apache/solr/schema/CurrencyFieldType.java`, `solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java`, `solr/core/src/java/org/apache/solr/search/facet/MissingAgg.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 112‑115.
+  - **Action Required:** Replace instantiations with `new FieldExistsQuery(fieldName)` and update imports.
+  - **Dependencies:** None
+- [ ] **CRITICAL** Backup code: Update `Directory#openChecksumInput` calls to drop the `IOContext` parameter.
+  - **Impact:** The signature change causes compilation failures.
+  - **Location:** `solr/modules/s3-repository/.../S3BackupRepository.java`, `solr/modules/gcs-repository/.../GCSBackupRepository.java`, `solr/core/.../BackupRepository.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 137‑141.
+  - **Action Required:** Remove the `IOContext` argument when calling `openChecksumInput`.
+  - **Dependencies:** None
+- [ ] **CRITICAL** Automaton builders: Replace usage of `DaciukMihovAutomatonBuilder` with `Automata.makeStringUnion`.
+  - **Impact:** The class is now package-private and can no longer be referenced from Solr code.
+  - **Location:** `solr/core/src/java/org/apache/solr/search/join/GraphEdgeCollector.java`, `solr/core/src/java/org/apache/solr/search/join/GraphQuery.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 144‑147.
+  - **Action Required:** Import `org.apache.lucene.util.automaton.Automata` and call the static `Automata.makeStringUnion` method instead.
+  - **Dependencies:** None
+- [ ] **CRITICAL** Query timeout: Remove `TimeLimitingCollector` usages and replace them with `IndexSearcher#setTimeout(QueryTimeout)`.
+  - **Impact:** `TimeLimitingCollector` has been removed; code that references it fails to compile.
+  - **Location:** `solr/core/src/java/org/apache/solr/search/Grouping.java`, `solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java`, `solr/test-framework/src/java/org/apache/solr/SolrIgnoredThreadsFilter.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 225‑227.
+  - **Action Required:** Configure the query timeout via `IndexSearcher#setTimeout` and adjust error handling.
+  - **Dependencies:** None
+- [ ] **CRITICAL** Search API: Replace calls to `IndexSearcher.search(Query, Collector)` with the `CollectorManager`-based API.
+  - **Impact:** The method is deprecated and will be removed; staying on the old API also prevents concurrency improvements.
+  - **Location:** `solr/core/src/java/org/apache/solr/search/MultiThreadedSearcher.java`, `solr/core/src/java/org/apache/solr/search/DocSetUtil.java`, `solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java`, `solr/core/src/java/org/apache/solr/search/grouping/CommandHandler.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 229‑255.
+  - **Action Required:** Wrap collectors in a `CollectorManager` implementation and pass it to the new method.
+  - **Dependencies:** None
+- [ ] **CRITICAL** Clause limit API: Replace `BooleanQuery.getMaxClauseCount()` and related calls with `IndexSearcher.getMaxClauseCount()`.
+  - **Impact:** The deprecated methods have been removed.
+  - **Location:** `solr/core/src/java/org/apache/solr/core/SolrConfig.java`, `solr/core/src/java/org/apache/solr/core/CoreContainer.java`, tests referencing the old API.
+  - **Lucene Change:** `MIGRATE.md` line 270.
+  - **Action Required:** Update imports and calls to use the `IndexSearcher` static methods.
+  - **Dependencies:** None
+
+## API Updates
+- [ ] **HIGH** Analysis factories: Add `NAME` constants and public no-arg constructors to custom `TokenizerFactory`, `TokenFilterFactory`, and `CharFilterFactory` implementations.
+  - **Impact:** Factories without these will fail to load via ServiceLoader.
+  - **Location:** Custom analysis factories within Solr modules (search for implementations).
+  - **Lucene Change:** `MIGRATE.md` lines 494‑514.
+  - **Action Required:** Define `public static final String NAME` and a public no-arg constructor that throws `defaultCtorException()`, as recommended.
+  - **Dependencies:** None
+- [ ] **HIGH** Term vector flags: Rename uses of `FieldInfo.hasVectors()` and `FieldInfos.hasVectors()` to `hasTermVectors()`.
+  - **Impact:** Compilation errors due to the renamed methods.
+  - **Location:** `solr/core/src/java/org/apache/solr/handler/admin/api/GetSegmentData.java`, `solr/core/src/java/org/apache/solr/search/NumericHidingLeafReader.java`, `solr/core/src/java/org/apache/solr/search/SolrDocumentFetcher.java`, and others.
+  - **Lucene Change:** `MIGRATE.md` lines 280‑283.
+  - **Action Required:** Update method calls and adjust any logic that expects the old method names.
+  - **Dependencies:** None
+- [ ] **HIGH** Expressions module changes: Adjust custom expression functions to use `MethodHandle` and handle `Expression#evaluate()` throwing `IOException`.
+  - **Impact:** Features based on custom expressions may fail at runtime.
+  - **Location:** `solr/core/src/java/org/apache/solr/search/ExpressionValueSourceParser.java`, `solr/core/src/java/org/apache/solr/spelling/suggest/DocumentExpressionDictionaryFactory.java`, etc.
+  - **Lucene Change:** `MIGRATE.md` lines 162‑176.
+  - **Action Required:** Convert existing `Map` usage with `JavascriptCompiler#convertLegacyFunctions` and update call sites for the new checked exception.
+  - **Dependencies:** None
+- [ ] **HIGH** Remove uses of `Scorable.docID()` and adapt any subclass that relies on this method to obtain doc IDs from `LeafCollector#collect`.
+  - **Impact:** The method has been removed; existing overrides fail to compile.
+  - **Location:** `solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java` and related tests.
+  - **Lucene Change:** `MIGRATE.md` lines 188‑198.
+  - **Action Required:** Refactor scorers and collectors to track document IDs externally.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Update `IOContext` usages to the new `ReadAdvice` enum and remove references to the `IOContext.LOAD`/`IOContext.READ` constants.
+  - **Impact:** Compilation errors wherever these constants are referenced.
+  - **Location:** Backup and restoration utilities and any directory interactions.
+  - **Lucene Change:** `MIGRATE.md` lines 213‑223.
+  - **Action Required:** Replace with `ioContext.withReadAdvice(...)` or `IOContext.DEFAULT` as appropriate.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Ensure `Field` subclasses do not set a custom `TokenStream` without a value.
+  - **Impact:** `IllegalArgumentException` at runtime.
+  - **Location:** `solr/core/src/java/org/apache/solr/legacy/LegacyField.java`, `solr/core/src/java/org/apache/solr/search/SolrDocumentFetcher.java`.
+  - **Lucene Change:** `MIGRATE.md` lines 79‑86.
+  - **Action Required:** Provide a field value, or use a custom subclass that overrides `tokenStream`, as described.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Replace `IndexSearcher#getExecutor` usage with the new `getTaskExecutor` API if needed.
+  - **Impact:** The removed method may cause compilation errors.
+  - **Location:** Comments in `SolrIndexSearcher` and any custom code relying on it.
+  - **Lucene Change:** `MIGRATE.md` lines 149‑153.
+  - **Action Required:** Use `IndexSearcher#getTaskExecutor` when access to the executor is required.
+  - **Dependencies:** None
+
+## Behavioral Changes
+- [ ] **HIGH** Faceting: `LongRangeFacetCounts#getTopChildren` and `DoubleRangeFacetCounts#getTopChildren` now return only the top‑n ranges.
+  - **Impact:** Range facet output may differ from 9.x behaviour.
+  - **Location:** Any Solr components relying on full range order counts.
+  - **Lucene Change:** `MIGRATE.md` lines 122‑127.
+  - **Action Required:** Use `Facets#getAllChildren` if previous behaviour is required.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Path hierarchy tokenization no longer produces overlapping tokens.
+  - **Impact:** Query parsing or highlighting against path fields may change.
+  - **Location:** Schemas using `PathHierarchyTokenizerFactory` or `ReversePathHierarchyTokenizerFactory`.
+  - **Lucene Change:** `MIGRATE.md` lines 183‑186.
+  - **Action Required:** Re-evaluate analysis and update tests expecting overlapping tokens.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Auto I/O throttling disabled by default in `ConcurrentMergeScheduler`.
+  - **Impact:** Merge behavior may change; performance tuning might be needed.
+  - **Location:** Config files using ``.
+  - **Lucene Change:** `MIGRATE.md` lines 274‑278.
+  - **Action Required:** Explicitly enable throttling if required via configuration.
+  - **Dependencies:** None
+- [ ] **LOW** Romanian analysis normalization changed; reindex Romanian text if relying on old behavior.
+  - **Impact:** Search results for Romanian text may differ.
+  - **Location:** Any cores using `RomanianAnalyzer`.
+  - **Lucene Change:** `MIGRATE.md` lines 43‑46.
+  - **Action Required:** Reindex content or use a custom analyzer replicating old behavior.
+  - **Dependencies:** None
+
+## Configuration Updates
+- [ ] **HIGH** Upgrade OpenNLP models and API usage to OpenNLP 2.x.
+  - **Impact:** Compilation issues if deprecated APIs are used; new model formats may be required.
+  - **Location:** `solr/modules/langid` update processors and related tests.
+  - **Lucene Change:** `MIGRATE.md` lines 35‑37.
+  - **Action Required:** Update dependencies, adjust code to new API, and verify language detection models.
+  - **Dependencies:** None
+- [ ] **MEDIUM** Update Snowball stemmer references from "German2" to "German".
+  - **Impact:** Configuration references may be invalid in future releases.
+  - **Location:** Comments in sample schemas under `server/solr/configsets`.
+  - **Lucene Change:** `MIGRATE.md` lines 39‑41.
+  - **Action Required:** Replace "German2" with "German" in configs and documentation.
+  - **Dependencies:** None
+- [ ] **LOW** Update any scripts or documentation using `CheckIndex -fast` or `-slow` to the new `-level` parameter.
+  - **Impact:** Deprecated flags will be removed.
+  - **Location:** Admin scripts or docs referencing CheckIndex.
+  - **Lucene Change:** `MIGRATE.md` lines 155‑160.
+  - **Action Required:** Replace with `-level 1/2/3` as appropriate.
+  - **Dependencies:** None
+
+## Optional Enhancements
+- [ ] **LOW** Evaluate new `SeededKnnVectorQuery` and quantized vector codecs for potential performance improvements.
+  - **Impact:** Opportunity to improve vector search features.
+  - **Location:** Modules implementing KNN search.
+  - **Lucene Change:** `CHANGES.txt` lines 72‑79.
+  - **Action Required:** Investigate adoption and benchmark.
+  - **Dependencies:** None
+
+## Cleanup Tasks
+- [ ] **LOW** Remove legacy mentions of `RAMDirectory` from example configurations.
+  - **Impact:** Keeps documentation current with removal of `RAMDirectory` in Lucene.
+  - **Location:** Sample `solrconfig.xml` files under `server/solr/configsets` and others.
+  - **Lucene Change:** `MIGRATE.md` lines 535‑539.
+  - **Action Required:** Update comments to suggest `ByteBuffersDirectory` or the default directory instead.
+  - **Dependencies:** None
+
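Note on the `CheckIndex` task in the checklist: the flag rewrite it asks for can be sketched as a small argument translator, which may help when auditing admin scripts. This is only an illustration using plain Java collections. The `CheckIndexArgs` class and its `migrate` method are hypothetical names, and the mapping of `-fast` to `-level 1` and `-slow` to `-level 3` is an assumption that should be confirmed against the cited `MIGRATE.md` lines before it is relied on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Solr or Lucene): rewrites legacy CheckIndex flags. */
final class CheckIndexArgs {
    private CheckIndexArgs() {}

    /**
     * Returns a copy of args with legacy flags replaced by the -level form.
     * Assumed mapping: -fast becomes "-level 1", -slow becomes "-level 3".
     * All other arguments are passed through unchanged and in order.
     */
    static List<String> migrate(List<String> args) {
        List<String> out = new ArrayList<>(args.size() + 1);
        for (String arg : args) {
            switch (arg) {
                case "-fast":
                    out.add("-level");
                    out.add("1");
                    break;
                case "-slow":
                    out.add("-level");
                    out.add("3");
                    break;
                default:
                    out.add(arg);
            }
        }
        return out;
    }
}
```

Under those assumptions, `CheckIndexArgs.migrate(List.of("-slow", "/path/to/index"))` yields the arguments `-level`, `3`, `/path/to/index`; invocations that use neither legacy flag are returned unchanged.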