Adding semantic search workload that includes vector and bm25 search #342
@VijayanB can I get initial feedback on this PR while we're waiting for feedback on opensearch-project/opensearch-benchmark#591 from other folks?
Signed-off-by: Martin Gaievski <[email protected]>
I'm going to resume work on this PR, and I want to summarize the feedback I got last time:
I've put the main points from this list together in a private branch: https://github.com/martin-gaievski/opensearch-benchmark-workloads/tree/adding_vectordataset_for_semantic_search/treccovid_semantic_search @VijayanB please let me know if I've missed any important points from your comments.
Description
Adding a workload for semantic search that is based on a customized trec-covid dataset and includes vector, text, and integer fields. This makes it possible to run queries such as neural search and hybrid queries in which neural is one of the sub-queries.
The modified version of the trec-covid dataset has ~1M documents: 6 copies of each document from the original dataset.
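For reviewers less familiar with the query types mentioned above, here is a rough sketch of what a hybrid query body combining a BM25 `match` sub-query with a `neural` sub-query could look like. The helper name `build_hybrid_query`, the field names `text` and `passage_embedding`, and the placeholder model id are assumptions made for illustration only; they are not taken from the workload in this PR. A hybrid query also typically needs a search pipeline with a normalization processor configured on the cluster, which is not shown here.

```python
import json


def build_hybrid_query(query_text, model_id, k=100,
                       text_field="text", vector_field="passage_embedding"):
    """Build an OpenSearch hybrid query body (illustrative sketch).

    text_field and vector_field are hypothetical names, not the
    field names used by the workload in this PR.
    """
    return {
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical (BM25) sub-query on the text field.
                    {"match": {text_field: {"query": query_text}}},
                    # Neural sub-query: the query text is embedded by the
                    # model and matched against the vector field.
                    {
                        "neural": {
                            vector_field: {
                                "query_text": query_text,
                                "model_id": model_id,
                                "k": k,
                            }
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    # "<model-id>" is a placeholder for a deployed embedding model id.
    body = build_hybrid_query("coronavirus origin", "<model-id>", k=10)
    print(json.dumps(body, indent=2))
```

Dropping the `match` entry and sending the `neural` clause on its own as the query would give the plain neural search case.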
Issues Resolved
#341
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.