issue: #37289
cause pr #37116 introduce retry on get shard leader, which make search
won't fail during query node down.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #37289
those test case use search to verify replica's status, but if the search
gap is 1s, the node down's effect may be fixed up by balance.
This PR remove the 1 second gap between search operation.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue:https://github.com/milvus-io/milvus/issues/27576
# Main Goals
1. Create and describe collections with geospatial fields, enabling both
client and server to recognize and process geo fields.
2. Insert geospatial data as payload values in the insert binlog, and
print the values for verification.
3. Load segments containing geospatial data into memory.
4. Ensure query outputs can display geospatial data.
5. Support filtering on GIS functions for geospatial columns.
# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.
---------
Signed-off-by: tasty-gumi <1021989072@qq.com>
issue: #35821
After collection loaded, if we need to increase/decrease collection's
replica, we need to release and load it again.
milvus offers 4 solution to update loaded collection's replica, this PR
aims to dynamic change the replica number without release, and after
replica number changed, milvus will execute load replica or release
replica in async, and the replica loaded status can be checked by
getReplicas API.
Notice that if set too much replicas than querynode can afford,the new
replica won't be loaded successfully until enough querynode joins.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
#36252
remove no need type check. if users use null type writer to write
parquet, hope it successfully.
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
This PR introduce stats task for import:
1. Define new `Stats` and `IndexBuilding` states for importJob
2. Add new stats step to the import process: trigger the stats task and
wait for its completion
3. Abort stats task if import job failed
issue: https://github.com/milvus-io/milvus/issues/33744
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
fix not append valid data when transfer to insert record and add a tiny
check when in groupBy field.
#35924
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
This issue only occurs for a short time right after a table is created.
To avoid this, we simply reduce the frequency of cleaning up invalid
limiter nodes.
issue: https://github.com/milvus-io/milvus/issues/35933
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #33285
- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration tst when enabling streaming service
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #34985#35493
after querynode has been suspended, it's not allow to load
segment/channel on it, which means the node is read only. to be
compatible with resource group design, after query node has been
suspend, we remove it from it's original resource group, make it a read
only query node in replica. then two things will happens:
1. it's original resource group will be lacking of query nodes, query
coord will assign new node to it.
2. querycoord will try to move out all segments/channels after querynode
has been suspended
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #29419
* If a sparse vector with 0 non-zero value is inserted, no ANN search on
this sparse vector field will return it as a result. User may retrieve
this row via scalar query or ANN search on another vector field though.
* If the user uses an empty sparse vector as the query vector for a ANN
search, no neighbor will be returned.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #35170
This PR enable to set load configs in cluster level, such as replicas
and resource groups. then when load collections will use the load
config.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Seals the largest growing segment if the total size of growing segments
of each shard exceeds the size threshold(default 4GB). Introducing this
policy can help keep the size of growing segments within a suitable
level, alleviating the pressure on the delegator.
issue: https://github.com/milvus-io/milvus/issues/34554
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #34595
When consuming insert data on the delegator node, QueryCoord will move
out some sealed segments to manage its memory usage. After the growing
segment gets flushed, some sealed segments from other workers will be
moved back to the delegator node. To avoid the frequent movement of
segments, we estimate the maximum growing row count and preserve a
fixed-size memory in the delegator node.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>