issue: #29892
This PR:
1. Move the process of gathering materialized search info to when the
search plan is created, before it goes to each segment, to avoid
repeated work and access the plan node under multi-threaded
circumstances.
2. Enforce the supported MV type to `VARCHAR`
3. Add integration test
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
issue: #29892
This PR
1. Pass Materialized View (MV) search information obtained from the
expression parsing planning procedure to Knowhere. It only performs when
MV is enabled and the partition key is involved in the expression. The
search information includes:
1. Touched field_id and the count of related categories in the
expression. E.g., `color == red && color == blue` yields `field_id ->
2`.
2. Whether the expression only includes AND (&&) logical operator,
default `true`.
3. Whether the expression has NOT (!) operator, default `false`.
4. Store if turning on MV on the proxy to eliminate reading from
paramtable for every search request.
5. Renames to MV.
## Rebuttals
1. Did not write in `ExtractInfoPlanNodeVisitor` since the new scalar
framework was introduced and this part might be removed in the future.
2. Currently only interested in `==` and `in` expression, `string` data
type, anything else is a bonus.
3. Leave handling expressions like `F == A || F == A` for future works
of the optimizer.
## Detailed MV Info
![image](https://github.com/milvus-io/milvus/assets/6563846/b27c08a0-9fd3-4474-8897-30a3d6d6b36f)
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
issue: #31351
This PR fixed that search/hybrid_search doesn't expire shard leader
cache when send request to query node failed, which make every request
keep trying to connect a offline query node
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
During requery, segments may change (e.g., due to compaction), so we
need to return specific error codes when encountering incomplete requery
results. Clients can then retry to avoid this issue.
issue: https://github.com/milvus-io/milvus/issues/29656
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
if check in Segcore, will not do the it when not insert data.
so, check "radius" and "range_filter" in proxy.
related with #30365
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
See also #30250
This PR add requery flag in query task. When reQuery flag is true, query
task shall skip partition name conversion and use pre-calculated
partitionIDs passed from search task.
TODO: hybrid search does not have partition id information. we shall
apply same logic for hybrid search later.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also: #29113
Add a new utitliy function in `pkg/util/typetuil` to pre-allocate field
data slice capacity acoording to search limit. This shall avoid copying
the data during `AppendFieldData` when previous slice is out of space.
And shall also save CPU time during high paylog.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:
1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.
issue: #29656
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
related: #25324
Search GroupBy function, used to aggregate result entities based on a
specific scalar column.
several points to mention:
1. Temporarliy, the whole groupby is implemented separated from
iterative expr framework **for the first period**
2. In the long term, the groupBy operation will be incorporated into the
iterative expr framework:https://github.com/milvus-io/milvus/pull/28166
3. This pr includes some unrelated mocked interface regarding alterIndex
due to some unworth-to-mention reasons. All these un-associated content
will be removed before the final pr is merged. This version of pr is
only for review
4. All other related details were commented in the files comparison
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
See also #29113
The collection schema is crucial when performing search/query but some
of the information is calculated for every request.
This PR change schema field of cached collection info into a utility
`schemaInfo` type to store some stable result, say pk field,
partitionKeyEnabled, etc. And provided field name to id map for
search/query services.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #29177
Add a config item for partition name as regexp feature and disable it by
default
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #29113
Using zap.Stringer log field will evaluate log field value only when log
level meets the configuration, which could save some CPU time in search
route
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
return the last error but not combining all errors, to improve
readability and erorr handling
resolve: #28572
---------
Signed-off-by: yah01 <yah2er0ne@outlook.com>