Commit Graph

185 Commits

Author SHA1 Message Date
SimFG
03a78ecc3d
enhance: gc in the snapshot kv (#36792)
issue: #36770

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-12 04:03:20 +08:00
Buqian Zheng
f7b811450d
feat: add enable_tokenizer params to VarChar field (#36480)
issue: #35922

add an enable_tokenizer param to varchar field: must be set to true so
that a varchar field can enable_match or used as input of BM25 function

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-10 20:33:21 +08:00
wei liu
3cd0b26285
enhance: Enable dynamic update loaded collection's replica (#35822)
issue: #35821
After collection loaded, if we need to increase/decrease collection's
replica, we need to release and load it again.

milvus offers 4 solution to update loaded collection's replica, this PR
aims to dynamic change the replica number without release, and after
replica number changed, milvus will execute load replica or release
replica in async, and the replica loaded status can be checked by
getReplicas API.

Notice that if set too much replicas than querynode can afford,the new
replica won't be loaded successfully until enough querynode joins.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-25 10:13:18 +08:00
cai.zhang
4b077e1bd2
fix: Fix the compatibility bug between stats task and segment (#36359)
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-20 14:33:11 +08:00
aoiasd
139787371e
feat: support embedding bm25 sparse vector and flush bm25 stats log (#36036)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-19 10:57:12 +08:00
cai.zhang
8395c8a8db
enhance: Update stats task to optional (#35947)
issue: #33744

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-12 20:37:08 +08:00
aoiasd
da227ff9a1
feat: Support create collection with functions (#35973)
relate: https://github.com/milvus-io/milvus/issues/35853
Support create collection with functions. Prepare for support bm25
function.

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-12 10:43:06 +08:00
cai.zhang
2c9bb4dfa3
feat: Support stats task to sort segment by PK (#35054)
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
SimFG
311f860676
enhance: support to drop the role which is related the privilege list (#35727)
- issue: #35545

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-30 15:17:00 +08:00
wei liu
5c245d51c4
enhance: Refresh proxy cache after restore rbac meta (#35635)
issue: #35443

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-22 19:09:01 +08:00
Zhen Ye
4d69898cb2
enhance: support single pchannel level transaction (#35289)
issue: #33285

- support transaction on single wal.
- last confirmed message id can still be used when enable transaction.
- add fence operation for segment allocation interceptor.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-19 21:22:56 +08:00
wei liu
1d49358f82
enhance: Add BackupRBAC/RestoreRBAC API to enable rbac backup (#35444)
issue: #35443

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-16 10:10:53 +08:00
chyezh
c725416288
enhance: move streaming proto into pkg (#35284)
issue: #33285

- move streaming related proto into pkg.
- add v2 message type and change flush message into v2 message.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-07 10:34:16 +08:00
chyezh
9871966415
enhance: segment alloc interceptor (#34996)
#33285

- add segment alloc interceptor for streamingnode.
- add add manual alloc segment rpc for datacoord.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-04 07:40:15 +08:00
congqixia
de8a266d8a
enhance: Enable linux code checker (#35084)
See also #34483

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
wei liu
c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
chyezh
cc8f7aa110
fix: streaming service related fix patch (#34696)
issue: #33285

- add idAlloc interface
- fix binary unsafe bug for message
- fix service discovery lost when repeated address with different server
id

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-16 15:49:38 +08:00
chyezh
fda720b880
enhance: streaming service grpc utilities (#34436)
issue: #33285

- add two grpc resolver (by session and by streaming coord assignment
service)
- add one grpc balancer (by serverID and roundrobin)
- add lazy conn to avoid block by first service discovery
- add some utility function for streaming service

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-15 20:49:38 +08:00
XuanYang-cn
d7a3697fb5
enhance: Add back compactionTaskNum metrics (#34583)
Fix L0 compaction task recover unable to set segment not isCompacting

See also: #34460

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-11 17:45:37 +08:00
chyezh
1bc3c0b925
enhance: implement balancer at streaming coord (#34435)
issue: #33285

- add balancer implementation
- add channel count fair balance policy
- add channel assignment discover grpc service

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-11 09:58:48 +08:00
jaime
4365308241
enhance: support setting properties in create database request (#34510)
issue: #34493

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-09 18:16:18 +08:00
jaime
60be454db0
enhance: add disk quota and max collections into db properties (#34368)
issue: #34385

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-05 18:22:17 +08:00
jaime
9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
yihao.dai
fb870d2426
fix: Do compressBinlog to fix logID 0 (#34060)
issue: https://github.com/milvus-io/milvus/issues/34059

Do compressBinlog to ensure that reloadFromKV will fill binlogs' logID
after datacoord restarts.

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-22 16:14:01 +08:00
wayblink
08fcf3f62b
fix: Fix meta prefix overlap bug (#33830)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-13 19:27:57 +08:00
wayblink
a1232fafda
feat: Major compaction (#33620)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-06-10 21:34:08 +08:00
smellthemoon
c61fb1eff5
enhance: do check when add not empty logpath (#33640)
meta only store logid

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-06-07 10:19:51 +08:00
cai.zhang
27cc9f2630
enhance: Support analyze data (#33651)
issue: #30633

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: chasingegg <chao.gao@zilliz.com>
2024-06-06 17:37:51 +08:00
congqixia
f6e251514f
fix: Write back dbid modification for nonDB id collection (#33641)
See also #33608

Make `fixDefaultDBIDConsistency` also write back collection dbid
modification when nonDB id collection is found.

This fix shall prevent dropped collections of this kind show up again
after dropping and restart.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-06 14:29:53 +08:00
yihao.dai
35532a3e7d
fix: Fill stats log id and check validity (#33477)
1. Fill log ID of stats log from import
2. Add a check to validate the log ID before writing to meta

issue: https://github.com/milvus-io/milvus/issues/33476

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
zhenshan.cao
ac4f3997ce
enhance: Reconstructing Compaction to possess persistence capability (#33265)
issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-06-05 10:17:50 +08:00
smellthemoon
2c7bb0b8ac
fix: replace removeWithPrefix with remove to avoid delete redundantly (#33328)
#33288

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-31 18:05:45 +08:00
XuanYang-cn
22bddde5ff
enhance: Tidy compactor and remove dup codes (#32198)
See also: #32451

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-23 09:53:40 +08:00
smellthemoon
225f4a6134
enhance: use the only MaxEtcdTxnNum (#33070)
#33071

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-17 14:27:42 +08:00
smellthemoon
b45798107a
enhance: add nullable in Field, check valid_data and fill data (#32086)
1. add nullable in model.Field
   help to read nullable accurately.
2. check valid_data
a. if user pass default_value or the field is nullable, the length of
valid_data must be num_rows.
b. if passed valid_data, the length of passed field data must equal to
the number of 'true' in valid_data.
c. after fill default_value, only nullable field will still has
valid_data.
3. fill data in two situation
    a. has no default_value, if nullable,
will fill nullValue when passed num_rows not equal to expected num_rows.
    b. has default_value,
will fill default_value when passed num_rows not equal to expected
num_rows.
c. after fill data, the length of all field will equal to passed
num_rows.
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-16 11:57:35 +08:00
cai.zhang
6ea7633bd5
enhance: Add memory size for binlog (#33025)
issue: #33005
1. add `MemorySize` field for insert binlog.
2. `LogSize` means the file size in the storage object.
3. `MemorySize` means the size of the data in the memory.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-05-15 12:59:34 +08:00
jaime
f48a7ff8ff
enhance: use Delete instead of DeletePartialMatch to remove metrics (#33029)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-05-14 18:49:33 +08:00
smellthemoon
89a7c34c7a
fix: exceed etcd limit (#33041)
#32974

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-14 18:45:32 +08:00
Jiquan Long
3d85e6e028
fix: etcd txn exceeds limit due to too many fields (#33040)
fix: #33038

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-14 17:05:33 +08:00
chyezh
2586c2f1b3
enhance: use WalkWithPrefix api for oss, enable piplined file gc (#31740)
issue: #19095,#29655,#31718

- Change `ListWithPrefix` to `WalkWithPrefix` of OOS into a pipeline
mode.

- File garbage collection is performed in other goroutine.

- Segment Index Recycle clean index file too.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-25 20:41:27 +08:00
chyezh
a2502bde75
enhance: replica manager enhancement (#31496)
issue: #30647 

- ReplicaManager manage read only node now, and always do persistent of
node distribution of replica.

- All segment/channel checker using ReplicaManager to get read-only node
or read-write node, but not ResourceManager.

- ReplicaManager promise that only apply unique querynode to one replica
in same collection now (replicas in same collection never hold same
querynode at same time).

- ReplicaManager promise that fairly node count assignment policy if
multi replicas of collection is assigned to one resource group.

- Move some parameters check into ReplicaManager to avoid data race.

- Allow transfer replica to resource group that already load replica of
same collection

- Allow transfer node between resource groups that load replica of same
collection

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-05 04:57:16 +08:00
SimFG
8f3e0b6b41
enhance: the return result of list db api (#31544)
issue: #31543

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 07:13:10 +08:00
SimFG
b1a1cca10b
feat: add more operation detail info for better allocation (#30438)
issue: #30436

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00
congqixia
8e5865f630
enhance: Save collection targets by batches (#31616)
See also #28491 #31240

When colleciton number is large, querycoord saves collection target one
by one, which is slow and may block querycoord exits.

In local run, 500 collections scenario may lead to about 40 seconds
saving collection targets.

This PR changes the `SaveCollectionTarget` interface into batch one and
organizes the collection in 16 per bundle batches to accelerate this
procedure.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-27 00:09:08 +08:00
Bingyi Sun
8e661f791a
fix: lazy load index data in cache (#31094)
issue: https://github.com/milvus-io/milvus/issues/31571

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-25 15:43:07 +08:00
wei liu
e6d50def4f
enhance: Enable properites in database meta (#31394)
issue: #30040

This PR add properties in database meta, so we can store some database
level info.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-19 19:21:06 +08:00
wei liu
d79aa58b37
enhance: Speed up target recovery after query coord restart (#31240)
issue: #28491

after querycoord restart, it will pull a new target, which include
channel and segment list. when segments loaded on querynode has reached
the target, the collection could provide search/query. but if segment
list changes by time, ater querycoord pull a new target, it will takes a
few minutes to catch up the target's segment distribution. and before
that, query/search will fail due to lack of segments.

This PR save the current loaded target to meta storein querycoord's stop
progress, and recover it when query coord starts, to speed up the target
recovery time.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-15 14:19:03 +08:00
Bingyi Sun
5c0bb40549
fix: merge index params when creating index (#31127)
issue: https://github.com/milvus-io/milvus/issues/31102

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-11 17:31:03 +08:00
chyezh
6a9418d200
fix: lost dbname when only passing collection id to describeCollection (#31167)
issue: #30931

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-11 17:19:02 +08:00
yihao.dai
c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00