Skip to content

Add a new highlighter based on lucene's match highlighter #92068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

romseygeek
Copy link
Contributor

This is considerably simpler and more accurate than our existing highlighters,
and should cover the vast majority of required functionality.

Still to do:

  • better test coverage including some REST tests
  • support for term vectors or 'requireFieldMatch=false', which should be doable via some faked index readers
  • support for nested docs - should be doable via some query rewriting in inner hits phase to pull out the relevant parts of the query
  • docs

I'd like to go through the :Search/Highlighting label and see if any of the reported bugs there become easier to fix with this implementation.

@romseygeek romseygeek self-assigned this Dec 2, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added v8.7.0 Team:Search Meta label for search team labels Dec 2, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @romseygeek, I've created a changelog YAML for you.

@@ -106,7 +107,7 @@

public class HighlighterSearchIT extends ESIntegTestCase {
// TODO as we move analyzers out of the core we need to move some of these into HighlighterWithAnalyzersTests
private static final String[] ALL_TYPES = new String[] { "plain", "fvh", "unified" };
private static final String[] ALL_TYPES = new String[] { "plain", "fvh", "unified", "matches" };

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point we should rewrite at least some of this as an abstract highlighter test case which each highlighter implementation can then test itself against.

@@ -3508,6 +3509,9 @@ public void testWithNestedQuery() throws Exception {
// but we highlight the root text field since nested documents cannot be highlighted with postings nor term vectors
// directly.
for (String type : ALL_TYPES) {
if (Objects.equals("matches", type)) {
continue; // matches highlighter doesn't support nested fields
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... yet

@romseygeek
Copy link
Contributor Author

Note that the majority of the lines in this PR are copies of Apache Lucene classes that will be removed after the next lucene release, as they contain usability fixes I discovered while working on this project.

@javanna javanna added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 16, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :Search Relevance/Highlighting How a query matched a document Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.6.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants