milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-02 11:59:00 +08:00

Author	SHA1	Message	Date
congqixia	4f8c540c77	enhance: cache collection schema attributes to reduce proxy cpu (#29668 ) See also #29113 The collection schema is crucial when performing search/query, but some of its information is recalculated for every request. This PR changes the schema field of the cached collection info into a utility `schemaInfo` type that stores stable results such as the pk field and partitionKeyEnabled. It also provides a field-name-to-ID map for the search/query services. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 17:28:46 +08:00
smellthemoon	a988daf143	fix: unstable ut and fix goroutine leak (#29624 ) fix unstable ut and fix goroutine leak related: #27801 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-04 17:24:47 +08:00
XuanYang-cn	a3aff37f73	fix: Correct flush buffer size metrics (#29571 ) See also: #29204 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-04 17:22:46 +08:00
smellthemoon	e09fc040aa	fix: the config value of DataCoordTimeTick become longer and longer (#29659 ) #29658 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-04 17:06:47 +08:00
congqixia	da7c3cbd88	enhance: make delegator delete buffer holding all delete from cp (#29626 ) See also #29625 This PR: - Add a new implementation of `DeleteBuffer`: listDeleteBuffer - holds cacheBlock slice - `Put` method appends new delete data to the last block - when a block is full, append a new block to the list - Add `TryDiscard` method for `DeleteBuffer` interface - For doubleCacheBuffer, do nothing - For listDeleteBuffer, try to evict "old" blocks, which are blocks before the first block whose start ts is behind provided ts - Add checkpoint field for `UpdateVersion` sync action, which shall be used to discard old cache delete block --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 17:02:46 +08:00
congqixia	79c06c5e73	fix: serializer shall bypass L0 segment merge stats step (#29636 ) See also #27675 Fix a logic problem introduced by #29413: the serializer tries to merge the statslog list even though level zero segments do not have statslog, which results in an error being returned. `writeBufferBase` ignores this error, but it shall only ignore `ErrSegmentNotFound`. This PR adds a segment level check before merging the statslog list, and adds an error type check for getSyncTask failure. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:52:45 +08:00
congqixia	aa967de0a8	enhance: Explicitly pass LevelZero segment ids in vchan info (#29612 ) See also #27675 For `GetRecoveryInfo` & `GetRecoveryInfoV2`, Level zero segment ids shall be specified in vchan info so that querycoord could re-fetch current segment info during watch procedure without having all segment info Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:46:45 +08:00
yah01	99e0f1e65a	enhance: unable to compile C++ tests (#29616 ) The tests need to call a private method. Milvus uses `#define` to replace private with public; the hack works but breaks if the include order changes. This PR uses friend to make everything work. Signed-off-by: yah01 <yang.cen@zilliz.com> Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-04 13:20:46 +08:00
wei liu	336fce0582	enhance: Rewrite gen segment plan based on assign segment (#29574 ) issue: #29582 This PR rewrites the gen segment plan logic based on assign segment in `score_based_balancer` Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-04 11:10:44 +08:00
SimFG	d23f87a393	enhance: Add concurrency for datacoord segment GC (#29561 ) issue: https://github.com/milvus-io/milvus/issues/29553 /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-01-03 13:16:57 +08:00
smellthemoon	b12c90af72	enhance: Add upsert vector metrics (#29226 ) add some metrics. Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-03 10:06:48 +08:00
XuanYang-cn	f1b6ccf305	enhance: compaction use ChannelManager interface (#29530 ) Rewrite compaction_test.go See also: #29447 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-02 18:08:49 +08:00
PowderLi	5f00bad4b8	fix: link with install path's libblob-chunk-manager (#29496 ) issue: #29494 1. link with install path's libblob-chunk-manager 2. performance of `ShouldBindWith` is better than `ShouldBindBodyWith` 3. the middleware shouldn't read the unrefreshed parameter repeatedly Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-31 20:02:48 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 )
issue: https://github.com/milvus-io/milvus/issues/27704
Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM, and it speeds up term queries and range queries.
Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`.
Not supported: `ARRAY` and `JSON`.
Note:
- The inverted index for `VARCHAR` is not designed to serve full-text search now. We will treat every row as a whole keyword instead of tokenizing it into multiple terms.
- The inverted index doesn't support retrieval well, so if you create an inverted index for a field, operations that depend on the raw data will fall back to chunk storage, which brings some performance loss. Examples are comparisons between two columns and retrieval of output fields.
The inverted index is easy to use. Taking the collection below as an example:
```python
fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```
Then we can simply create an inverted index for each field via:
```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```
Then, term queries and range queries on the field are sped up automatically by the inverted index:
```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
zhagnlu	79c417b14e	fix: pass active count to query context instead of timestamp (#29541 ) #29319 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-31 16:08:48 +08:00
smellthemoon	ae640e7c80	fix: pass in undefined params (#29591 ) fix pass in undefined params issue: #29594 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-12-30 00:32:47 +08:00
sre-ci-robot	c2345daf3a	[automated] Update Knowhere Commit (#29578 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-29 18:56:46 +08:00
xige-16	02673914a0	feat: Support multiple vector indexes in a collection (#27700 ) issue: #25639 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> --------- Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-12-29 11:44:45 +08:00
congqixia	55af8f611f	fix: always sync level zero segments as flushed (#29569 ) See also #27675 For now, Level zero segments shall always be synced as `Flushed` ones. This PR fixes the case where level zero segments selected by policies other than the flush ts policy are synced in growing state. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-29 10:34:47 +08:00
congqixia	a3cb8e2625	fix: Add atomic method to get collection target (#29577 ) Related to #29575 Add `getCollectionTarget` method which is atomic when scope is `CurrentTargetFirst` or `NextTargetFirst` Also return error when executor finds no channel in target manager --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-29 09:04:46 +08:00
congqixia	a8b7629315	fix: exclude insertData before growing checkpoint (#29558 ) Resolves: #29556 Refine exclude segment function signature Add exclude growing before checkpoint logic Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-28 18:18:54 +08:00
MrPresent-Han	ed644983e2	enhance: add param for bloomfilter(#29388 ) (#29490 ) related: #29388 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-28 18:10:46 +08:00
wei liu	514e279f3a	enhance: Remove useless log in collection observer (#29554 ) This PR removed the useless log in collection observer Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-28 17:16:47 +08:00
xige-16	0a70e8b601	enhance: Remove multiple vector field limit (#27827 ) issue: https://github.com/milvus-io/milvus/issues/25639 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-12-28 16:40:46 +08:00
XuanYang-cn	4b406e5973	enhance: Add CompactionTaskNum metrics (#29518 ) See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-28 15:46:55 +08:00
yah01	a8a0aa9357	fix: missing to support compact for Array type (#29505 ) the array type can't be compacted, the system could continue with the inserted segments, but these segments can never be compacted fix #29503 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-28 15:42:48 +08:00
wei liu	5474bce9d2	fix: Choose wrong shard leader during balance channel (#29529 ) issue: #29523 The readable shard leader should still be the old one during channel balance if the new shard leader is not ready. This PR fixes query coord choosing the wrong shard leader during channel balance Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-28 15:22:51 +08:00
congqixia	6597c72992	fix: compose exclude info from flushed segment id (#29548 ) See also #29526 Previous PR removed flushed segment info from request, which causes pipeline failing to exclude flushed segment info Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-28 14:02:54 +08:00
XuanYang-cn	623939c9f5	enhance: Remove not in use policies (#29448 ) The results don't meet our requirements, and the code hasn't been maintained for a long time. See also: #29447 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-28 10:38:46 +08:00
Gao	8a630f733a	enhance: add new optimize param for queryhook (#29495 ) add a flag to indicate if we use search param optimizations, default is on --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2023-12-28 10:04:46 +08:00
congqixia	aa279db44c	enhance: remove flushed segmentInfo in WatchChannelRequest (#29526 ) `WatchDmChannel` only needs growing segment info; this PR removes fetching segmentInfos when filling the watch dml channel request. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-28 00:40:47 +08:00
congqixia	b251c3a682	enhance: add ctx for HandleCStatus and callers (#29517 ) See also #29516 Make `HandleCStatus` print trace id for better logging Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-27 16:10:47 +08:00
XuanYang-cn	632d8b3743	enhance: Change DN channelmanger into interface (#29307 ) See also: #28854 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 16:00:48 +08:00
XuanYang-cn	fe04598900	enhance: Add compaction type label to metrics (#29485 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 15:56:48 +08:00
cai.zhang	c45f8a2946	fix: Import data from parquet file in streaming way (#29514 ) issue: #29292 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-12-27 15:30:46 +08:00
wei liu	839a72129e	fix: Auto balance param can't be updated by dynamic (#29501 ) This PR fixes the issue that the auto balance param can't be updated dynamically Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-27 14:30:53 +08:00
congqixia	f6cff25712	enhance: fix serialization record span & flushed buffer size metrics (#29482 ) See also #27675 #29413 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-27 10:20:48 +08:00
aoiasd	033456ea2c	enhance: make sure stream closed (#29456 ) relate: https://github.com/milvus-io/milvus/issues/28367 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2023-12-26 19:56:47 +08:00
aoiasd	a76e3b2813	Refine delete by expression for forbid proxy dml task scheduler hang (#29340 ) relate: https://github.com/milvus-io/milvus/issues/29146 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2023-12-26 19:52:48 +08:00
Jiquan Long	6f4791da0b	fix: panic in concurrent insert/query scenario (#29408 ) issue: https://github.com/milvus-io/milvus/issues/29405 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-26 15:10:48 +08:00
congqixia	02bc0d0dd5	fix: Add scope limit for querynode DeleteRequest (#29474 ) See also #27515 When the Delegator processes delete data, it forwards the delete data with only the segment id specified. When two segments have the same segment id but one is growing and the other is sealed, the delete will be applied to both segments, which causes delete data to be out of order when a concurrent segment load occurs. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 14:28:47 +08:00
wei liu	6cbf9c489d	enhance: Rewrite gen stopping segment plan based on assign segment (#29473 ) The `AssignSegment` method defines how to assign segments to nodes, but score_based_balance implements another assign logic in `genStoppingSegmentPlan`. This PR rewrites gen stopping segment plan based on assign segment. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-26 14:26:56 +08:00
wei liu	2ffde52f8a	fix: Upgrade from 2.2 should update CollectionLoadInfo (#29443 ) Milvus branch 2.3 adds `loadType` to CollectionLoadInfo, so for collection meta upgraded from 2.2, we should add `loadType` to CollectionLoadInfo. This PR updates CollectionLoadInfo with `loadType` when it meets an old-version CollectionLoadInfo Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-26 14:18:47 +08:00
yah01	b8318fcd7d	enhance: improve the handling for segcore error (#29471 ) - fix lost exception details in segcore - improve the logs of handling errors from segcore Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-26 14:06:46 +08:00
cqy123456	4c979538a4	enhance: update cagra index params in config and add params check (#29045 ) issue: https://github.com/milvus-io/milvus/issues/29230 This PR does two things for the cagra index: a. the milvus yaml config supports gpu memory settings b. adds a cagra params check Signed-off-by: cqy123456 <qianya.cheng@zilliz.com> Co-authored-by: yusheng.ma <yusheng.ma@zilliz.com>	2023-12-26 11:04:47 +08:00
congqixia	277849a915	enhance: separate serializer logic from sync task (#29413 ) See also #27675 Since serializing the segment buffer is not related to the sync manager, it can be done before submitting to the sync manager, so that the pk statistic file could be more accurate and the complex logic inside the sync manager is reduced. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 10:40:47 +08:00
congqixia	13aa174b8a	enhance: add log when release segment created for load failure (#29464 ) Add log for releasing segment created during load process when load error happens Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 00:46:46 +08:00
MrPresent-Han	bd3bde82f0	fix iterator lose data for duplicated result (#29406 ) (#29451 ) related: #29406 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-25 23:28:47 +08:00
congqixia	ac95c52171	enhance: change protection to RLock for loadStreamDelete (#29450 ) See also: #29332 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-25 23:27:02 +08:00
yah01	1d6bcd1ded	enhance: speed up loading with many deletions (#29455 ) the executor always fetches the latest segment info, so we could consume from the latest checkpoint, which could save much time when many entities have been deleted Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-25 20:58:45 +08:00
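
The `listDeleteBuffer` behaviour described in commit da7c3cbd88 above (a list of blocks, `Put` appending to the last block, `TryDiscard` evicting old blocks) can be illustrated with a small sketch. This is not the Milvus implementation, which is written in Go: the class names, the `(pk, ts)` entry layout, and the entry-count capacity are all assumptions made for illustration. The eviction rule here is one reading of the commit message: keep the last block whose start ts is at or before the provided ts, plus everything after it, so that no delete newer than the checkpoint is dropped.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CacheBlock:
    """A bounded block of delete records; the field layout is an assumption."""
    start_ts: int
    capacity: int
    entries: List[Tuple[int, int]] = field(default_factory=list)  # (pk, ts)

    def is_full(self) -> bool:
        return len(self.entries) >= self.capacity


class ListDeleteBuffer:
    """Illustrative list-of-blocks delete buffer (not the Milvus code)."""

    def __init__(self, start_ts: int, block_capacity: int = 2) -> None:
        self.block_capacity = block_capacity
        self.blocks: List[CacheBlock] = [CacheBlock(start_ts, block_capacity)]

    def put(self, pk: int, ts: int) -> None:
        """Append to the last block, opening a new block when it is full."""
        last = self.blocks[-1]
        if last.is_full():
            last = CacheBlock(ts, self.block_capacity)
            self.blocks.append(last)
        last.entries.append((pk, ts))

    def try_discard(self, ts: int) -> None:
        """Evict blocks that precede the last block starting at or before ts."""
        keep_from = 0
        for idx in range(len(self.blocks) - 1, -1, -1):
            if self.blocks[idx].start_ts <= ts:
                keep_from = idx
                break
        self.blocks = self.blocks[keep_from:]


buf = ListDeleteBuffer(start_ts=0, block_capacity=2)
for pk, ts in [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]:
    buf.put(pk, ts)
buf.try_discard(35)  # the block starting at ts=0 is evicted
print([b.start_ts for b in buf.blocks])
```

The block that straddles the checkpoint is kept whole, which trades a little memory for never having to split a block during eviction.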

8006 Commits