We exclude stop words with a stop word filter that runs right after the stemming analyzer.
As a result, the items in the index no longer contain any stop words.
That can be problematic, as the following search illustrates:
- We're trying to find the document "K6 Standardization and Whitelist".
- This document contains the sentence "Question: If you want to know where someone is in the process of onboarding/whitelisting?", and we want to find that phrase.
- The first approach is to search for a subset of the phrase, such as where someone is (without quotes). This search returns nothing. The reason: the stop word "is" was removed from that item in the search index, and an unquoted search for where someone is asks for documents that contain "where" and "someone" and "is". Since no indexed item contains "is", nothing matches.
- The second approach (which works) is to put the phrase in quotes: "where someone is". That returns the document as expected.
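The failure of the unquoted search can be reproduced with a toy inverted index. This is only a sketch under assumptions: `STOP_WORDS`, `analyze`, `build_index`, and `and_search` are hypothetical names rather than the project's real analyzer or search API, the stop word list is made up, and stemming is left out for brevity.

```python
# Toy reproduction of the problem. All names here are hypothetical
# illustrations, not the project's actual analyzer or search API.
STOP_WORDS = {"is", "the", "in", "of", "to", "if", "you"}  # assumed list


def analyze(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (t.strip('.,:;?!"').lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def build_index(docs):
    """Map each surviving term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def and_search(index, raw_terms):
    """Return ids of documents containing every term (no query-side filtering)."""
    postings = [index.get(t.lower(), set()) for t in raw_terms]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "Question: If you want to know where someone is "
       "in the process of onboarding/whitelisting?"
}
index = build_index(docs)

# "is" was never indexed, so requiring it makes the whole AND query fail.
print(and_search(index, ["where", "someone", "is"]))
# Without the stop word, the same document is found.
print(and_search(index, ["where", "someone"]))
```

The first call comes back empty only because one of the required terms can never appear in the index; the other two terms do match.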
What we need to do: if we filter the indexed content for stop words, we must also filter the search queries for stop words (unless the query is an exact phrase).
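A minimal sketch of that query-side filtering follows, assuming quoted phrases are marked with double quotes and that the query uses the same stop word list as the index. `STOP_WORDS` and `prepare_query` are hypothetical names for illustration, not part of the existing code.

```python
import re

# Assumption: this must be the same stop word list used at index time.
STOP_WORDS = {"is", "the", "in", "of", "to", "if", "you"}


def prepare_query(query):
    """Split a raw query into exact phrases and loose terms.

    Quoted phrases are returned untouched so they can be run as phrase
    queries; stop words are removed only from the unquoted terms.
    """
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    terms = [t for t in remainder.lower().split() if t not in STOP_WORDS]
    return phrases, terms


# Unquoted search: the stop word "is" is dropped before searching.
print(prepare_query("where someone is"))
# Quoted search: the phrase is passed through unchanged.
print(prepare_query('"where someone is" onboarding'))
```

One case this sketch leaves open is a query made up entirely of stop words, which ends up with no terms at all; how to handle that (return nothing, or fall back to the unfiltered query) is a separate decision.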