Commit e5c7579

Sam Partee authored
Range query support (#55)
Implement range query support. PR includes:
- [x] new `RangeQuery` class
- [x] updated tests
- [x] updated docs, readme, and doc strings

---------

authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
co-authored-by: Sam Partee <sam.partee@redis.com>
1 parent 57e5c89 commit e5c7579

4 files changed: +385 -65 lines changed


docs/user_guide/hybrid_queries_02.ipynb

Lines changed: 163 additions & 14 deletions
@@ -5,7 +5,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Complex Queries\n",
+"# Query\n",
 "\n",
 "In this notebook, we will explore more complex queries that can be performed with ``redisvl``\n",
 "\n",
@@ -95,8 +95,8 @@
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"\u001b[32m19:55:11\u001b[0m \u001b[34m[RedisVL]\u001b[0m \u001b[1;30mINFO\u001b[0m Indices:\n",
-"\u001b[32m19:55:11\u001b[0m \u001b[34m[RedisVL]\u001b[0m \u001b[1;30mINFO\u001b[0m 1. user_index\n"
+"\u001b[32m17:09:16\u001b[0m \u001b[34m[RedisVL]\u001b[0m \u001b[1;30mINFO\u001b[0m Indices:\n",
+"\u001b[32m17:09:16\u001b[0m \u001b[34m[RedisVL]\u001b[0m \u001b[1;30mINFO\u001b[0m 1. user_index\n"
 ]
 }
 ],
@@ -120,7 +120,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Executing Hybrid Queries\n",
+"## Hybrid Queries\n",
 "\n",
 "Hybrid queries are queries that combine multiple types of filters. For example, you may want to search for a user that is a certain age, has a certain job, and is within a certain distance of a location. This is a hybrid query that combines numeric, tag, and geographic filters."
 ]
@@ -544,6 +544,155 @@
 "result_print(index.query(v))"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Filter Queries\n",
+"\n",
+"In some cases, you may not want to run a vector query, but just use a ``FilterExpression`` similar to a SQL query. The ``FilterQuery`` class enables this functionality. It is similar to the ``VectorQuery`` class but solely takes a ``FilterExpression``."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 19,
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/html": [
+"<table><tr><th>user</th><th>credit_score</th><th>age</th><th>job</th></tr><tr><td>derrick</td><td>low</td><td>14</td><td>doctor</td></tr><tr><td>taimur</td><td>low</td><td>15</td><td>CEO</td></tr></table>"
+],
+"text/plain": [
+"<IPython.core.display.HTML object>"
+]
+},
+"metadata": {},
+"output_type": "display_data"
+}
+],
+"source": [
+"from redisvl.query import FilterQuery\n",
+"\n",
+"has_low_credit = Tag(\"credit_score\") == \"low\"\n",
+"\n",
+"filter_query = FilterQuery(\n",
+"    return_fields=[\"user\", \"credit_score\", \"age\", \"job\", \"location\"],\n",
+"    filter_expression=has_low_credit\n",
+")\n",
+"\n",
+"results = index.query(filter_query)\n",
+"\n",
+"result_print(results)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Range Queries\n",
+"\n",
+"Range queries perform a vector search that returns only the results within a given vector ``distance_threshold``. This lets the user find all records in their dataset that are similar to a query vector, where \"similar\" is defined by a quantitative value."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 20,
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/html": [
+"<table><tr><th>vector_distance</th><th>user</th><th>credit_score</th><th>age</th><th>job</th></tr><tr><td>0</td><td>john</td><td>high</td><td>18</td><td>engineer</td></tr><tr><td>0</td><td>derrick</td><td>low</td><td>14</td><td>doctor</td></tr><tr><td>0.109129190445</td><td>tyler</td><td>high</td><td>100</td><td>engineer</td></tr><tr><td>0.158809006214</td><td>tim</td><td>high</td><td>12</td><td>dermatologist</td></tr></table>"
+],
+"text/plain": [
+"<IPython.core.display.HTML object>"
+]
+},
+"metadata": {},
+"output_type": "display_data"
+}
+],
+"source": [
+"from redisvl.query import RangeQuery\n",
+"\n",
+"range_query = RangeQuery(\n",
+"    vector=[0.1, 0.1, 0.5],\n",
+"    vector_field_name=\"user_embedding\",\n",
+"    return_fields=[\"user\", \"credit_score\", \"age\", \"job\", \"location\"],\n",
+"    distance_threshold=0.2\n",
+")\n",
+"\n",
+"# same as the vector query or filter query\n",
+"results = index.query(range_query)\n",
+"\n",
+"result_print(results)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We can also change the distance threshold of the query object between uses. Here we set ``distance_threshold=0.1``, so the query returns all matches within a distance of 0.1 of the query vector. This is a small distance, so we expect fewer matches than before."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 21,
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/html": [
+"<table><tr><th>vector_distance</th><th>user</th><th>credit_score</th><th>age</th><th>job</th></tr><tr><td>0</td><td>john</td><td>high</td><td>18</td><td>engineer</td></tr><tr><td>0</td><td>derrick</td><td>low</td><td>14</td><td>doctor</td></tr></table>"
+],
+"text/plain": [
+"<IPython.core.display.HTML object>"
+]
+},
+"metadata": {},
+"output_type": "display_data"
+}
+],
+"source": [
+"range_query.set_distance_threshold(0.1)\n",
+"\n",
+"result_print(index.query(range_query))"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Range queries can also be used with filters like any other query type. The following limits the results to records with a ``job`` of ``engineer`` that are also within the vector range (i.e., the distance threshold)."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 22,
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/html": [
+"<table><tr><th>vector_distance</th><th>user</th><th>credit_score</th><th>age</th><th>job</th></tr><tr><td>0</td><td>john</td><td>high</td><td>18</td><td>engineer</td></tr></table>"
+],
+"text/plain": [
+"<IPython.core.display.HTML object>"
+]
+},
+"metadata": {},
+"output_type": "display_data"
+}
+],
+"source": [
+"is_engineer = Text(\"job\") == \"engineer\"\n",
+"\n",
+"range_query.set_filter(is_engineer)\n",
+"\n",
+"result_print(index.query(range_query))"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -559,7 +708,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 19,
+"execution_count": 23,
 "metadata": {},
 "outputs": [
 {
@@ -598,7 +747,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": 24,
 "metadata": {},
 "outputs": [
 {
@@ -607,7 +756,7 @@
 "'@credit_score:{high}'"
 ]
 },
-"execution_count": 20,
+"execution_count": 24,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -620,17 +769,17 @@
 },
 {
 "cell_type": "code",
-"execution_count": 21,
+"execution_count": 25,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"{'id': 'v1:dc45946a8bc74f47858617c91d593b43', 'payload': None, 'user': 'john', 'age': '18', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '==\\x00\\x00\\x00?'}\n",
-"{'id': 'v1:5c628fdfbba247c6843955de04e3a00c', 'payload': None, 'user': 'nancy', 'age': '94', 'job': 'doctor', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '333?=\\x00\\x00\\x00?'}\n",
-"{'id': 'v1:4f1cb6dd167149d59c9c108e09407fc9', 'payload': None, 'user': 'tyler', 'age': '100', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '=>\\x00\\x00\\x00?'}\n",
-"{'id': 'v1:f1720dbeb81c4316bedf21ca60357fdf', 'payload': None, 'user': 'tim', 'age': '12', 'job': 'dermatologist', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '>>\\x00\\x00\\x00?'}\n"
+"{'id': 'v1:d78adb45342c4404a9c40afd4e65f51b', 'payload': None, 'user': 'john', 'age': '18', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '==\\x00\\x00\\x00?'}\n",
+"{'id': 'v1:a0a202b6398840c5ab2263b1fd4e704a', 'payload': None, 'user': 'nancy', 'age': '94', 'job': 'doctor', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '333?=\\x00\\x00\\x00?'}\n",
+"{'id': 'v1:1f3b15dfb4ed490186859c1b2cb3df82', 'payload': None, 'user': 'tyler', 'age': '100', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '=>\\x00\\x00\\x00?'}\n",
+"{'id': 'v1:465de540d9d54501b09b8e47a0116620', 'payload': None, 'user': 'tim', 'age': '12', 'job': 'dermatologist', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '>>\\x00\\x00\\x00?'}\n"
 ]
 }
 ],
@@ -653,7 +802,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": 26,
 "metadata": {},
 "outputs": [
 {
@@ -662,7 +811,7 @@
 "'((@credit_score:{high} @age:[18 +inf]) @age:[-inf 100])=>[KNN 10 @user_embedding $vector AS vector_distance] RETURN 6 user credit_score age job office_location vector_distance SORTBY vector_distance ASC DIALECT 2 LIMIT 0 10'"
 ]
 },
-"execution_count": 22,
+"execution_count": 26,
 "metadata": {},
 "output_type": "execute_result"
 }
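
The range-query behavior described in the notebook's "Range Queries" cells can be sketched without a Redis server. The following is a minimal pure-Python illustration, not the `redisvl` implementation. It assumes cosine distance as the metric and an inclusive threshold comparison; `cosine_distance`, `range_search`, the `predicate` argument, and the sample embeddings are all hypothetical and do not appear in this diff.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors: 1 - cos(angle)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def range_search(records, query_vector, distance_threshold, predicate=None):
    """Return (distance, record) pairs within the threshold, nearest first.

    `predicate` is a hypothetical stand-in for a filter expression: records
    for which it returns False are skipped before the distance check.
    """
    hits = []
    for record in records:
        if predicate is not None and not predicate(record):
            continue
        distance = cosine_distance(query_vector, record["embedding"])
        # Assumption: the threshold is inclusive.
        if distance <= distance_threshold:
            hits.append((distance, record))
    hits.sort(key=lambda hit: hit[0])  # stable sort keeps ties in input order
    return hits


# Assumed sample data for illustration; the diff does not show the embeddings.
records = [
    {"user": "john", "job": "engineer", "embedding": [0.1, 0.1, 0.5]},
    {"user": "derrick", "job": "doctor", "embedding": [0.1, 0.1, 0.5]},
    {"user": "nancy", "job": "doctor", "embedding": [0.7, 0.1, 0.5]},
    {"user": "tyler", "job": "engineer", "embedding": [0.1, 0.4, 0.5]},
    {"user": "tim", "job": "dermatologist", "embedding": [0.4, 0.4, 0.5]},
]
query = [0.1, 0.1, 0.5]

# Wide threshold, narrow threshold, then narrow threshold plus a filter.
wide = [r["user"] for _, r in range_search(records, query, 0.2)]
narrow = [r["user"] for _, r in range_search(records, query, 0.1)]
engineers = [
    r["user"]
    for _, r in range_search(
        records, query, 0.1, predicate=lambda r: r["job"] == "engineer"
    )
]

print(wide)
print(narrow)
print(engineers)
```

Lowering the threshold from 0.2 to 0.1 shrinks the result set, and adding the predicate narrows it further, which mirrors the three outputs recorded in the notebook cells above.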

redisvl/query/__init__.py

Lines changed: 2 additions & 5 deletions
@@ -1,6 +1,3 @@
-from redisvl.query.query import FilterQuery, VectorQuery
+from redisvl.query.query import FilterQuery, VectorQuery, RangeQuery
 
-__all__ = [
-    "VectorQuery",
-    "FilterQuery",
-]
+__all__ = ["VectorQuery", "FilterQuery", "RangeQuery"]
