milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-02 20:09:57 +08:00

Author	SHA1	Message	Date
zhenshan.cao	7e6f73a12d	feat: Authorize users to query grant info of their roles (#29747 ) Once a role is granted to a user, the user should automatically possess the privilege information associated with that role. issue: #29710 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-01-08 15:10:49 +08:00
congqixia	fe47deebf3	fix: Set & Return correct SegmentLevel in querynode segment manager (#29740 ) See also #27349 The segment level label in querynode used `Legacy` before the segment level was correctly passed in the Load request. The attribute still uses `Legacy`, so the metrics do not look right. This PR adds a parameter to `NewSegment` and passes the correct value for each invocation. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-08 14:16:48 +08:00
Jiquan Long	e9f3df3626	fix: inverted index file not found (#29695 ) issue: https://github.com/milvus-io/milvus/issues/29654 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-01-07 20:26:49 +08:00
Jiquan Long	20fb847521	enhance: load delta logs concurrently (#29623 ) This pr will make milvus load delta logs concurrently, which should decrease the latency of loading a segment. /kind improvement --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-01-07 20:22:48 +08:00
zhagnlu	d07197ab1a	enhance: add compare simd function (#29432 ) #26137 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-07 20:20:57 +08:00
foxspy	271edc6669	fix: throw exception when upload file failed for DiskIndex (#29627 ) related to : #29417 cardinal indexes upload index files in `Serialize` interface, and throw exception when the `Serialize` failed. Signed-off-by: xianliang <xianliang.li@zilliz.com>	2024-01-07 20:03:13 +08:00
wayblink	635a7f777c	feat: add clustering key in create/describe collection (#29506 ) #28410 /kind feature Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-01-07 19:56:48 +08:00
yihao.dai	156a0dd450	feat: Add import reader for Parquet (#29618 ) This PR implements a Parquet reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-07 19:38:49 +08:00
cai.zhang	5dc300c4a9	fix: Fix bug for pk index doesn't have raw data (#29711 ) issue: #29697 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-07 19:36:48 +08:00
congqixia	b5f039a221	fix: Assertion all async invocations in test case (#29737 ) Resolves: #29736 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-07 15:54:47 +08:00
yah01	a0cec4047a	fix: make the entity num metric accurate (#29643 ) fix #29642 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-05 18:24:47 +08:00
yihao.dai	23183ffb0f	feat: Add import reader for json (#29252 ) This PR implements a new json reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-05 18:12:48 +08:00
aoiasd	70ec00cd5d	enhance: support access log print cluster prefix (#29646 ) relate: https://github.com/milvus-io/milvus/issues/29645 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-01-05 16:34:47 +08:00
smellthemoon	1c1f2a1371	enhance:change some logs (#29579 ) related #29588 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-05 16:12:48 +08:00
wei liu	e98c62abbb	enhance: refactor leader_observer to leader_checker (#29454 ) issue: #29453 Syncing distribution by RPC also calls loadSegment/releaseSegment, which may cause concurrency problems on the same segment, such as a concurrent load and release of one segment. This PR adds leader_checker, which generates load/release tasks to correct the leader view instead of calling sync distribution by RPC. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-05 15:54:55 +08:00
MrPresent-Han	9e2e7157e9	feat: support search_group_by for milvus(#25324 ) (#28983 ) related: #25324 Search GroupBy function, used to aggregate result entities based on a specific scalar column. Several points to mention: 1. Temporarily, for the first phase, the whole groupBy is implemented separately from the iterative expr framework. 2. In the long term, the groupBy operation will be incorporated into the iterative expr framework: https://github.com/milvus-io/milvus/pull/28166 3. This PR includes some unrelated mocked interfaces for alterIndex. All of this unrelated content will be removed before the final PR is merged. This version of the PR is only for review. 4. All other related details are commented in the file comparison. Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-05 15:50:47 +08:00
cqy123456	22bb84fa9d	feat:add new gpu index:GPU_BRUTE_FORCE and limit gpu index metric type (#29590 ) issue: https://github.com/milvus-io/milvus/issues/29230 This PR does two things: 1. adds GPU brute force; 2. limits GPU indexes to the L2 / IP metric types. Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-01-05 15:24:48 +08:00
PowderLi	c8db36a63a	enhance: get a blob to check object storage config (#29703 ) issue: #29672 The storage account needs at least the privileges for the actions `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*`. Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-05 14:50:46 +08:00
wei liu	b45d08b47b	enhance: Add ctx for load index logs (#29686 ) This PR add ctx for load index logs Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-05 14:24:49 +08:00
yihao.dai	3561586edf	feat: Add import reader for binlog (#28910 ) This PR defines the new import reader interfaces and implement a binlog reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-05 11:48:47 +08:00
congqixia	3626f49025	fix: make sure balance candidate is alway pushed back (#29702 ) See also #29699 Querycoord panicked when it tried to pop from an empty heap. The code assumes the heap is never empty, but in some branches the candidate was never pushed back. This PR puts the pop & push in a closure and adds a deferred call that pushes the item back. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 10:08:47 +08:00
congqixia	dc6a6a50fa	enhance: reduce SyncTask AllocID call and refine code (#29701 ) See also #27675 `Allocator.Alloc` and `Allocator.AllocOne` might be invoked multiple times if multiple blobs were set in one sync task. This PR adds pre-fetch logic for all blobs and caches the logIDs in the sync task so that at most one allocator call is needed. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 10:04:46 +08:00
cai.zhang	dc8b5c1130	enhance: Read azure file without ReadAll (#29602 ) issue: #29292 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-04 20:50:46 +08:00
wayblink	05d735c322	enhance: Rename SearchV2 to HybridSearch (#29592 ) related: https://github.com/milvus-io/milvus-proto/pull/233 issue: #29593 /kind enhancement Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-01-04 19:22:46 +08:00
yah01	0ae90443ba	enhance: fill missed info for segcore error (#29610 ) - fill missed error info - format the error message directly Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-04 17:54:46 +08:00
yah01	9e0163e12f	enhance: use GPU pool for gpu tasks (#29678 ) - this greatly improves the performance of GPU indexes Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-04 17:50:46 +08:00
congqixia	4f8c540c77	enhance: cache collection schema attributes to reduce proxy cpu (#29668 ) See also #29113 The collection schema is crucial when performing search/query, but some of its information is recalculated for every request. This PR changes the schema field of the cached collection info into a utility `schemaInfo` type that stores stable results such as the pk field and partitionKeyEnabled. It also provides a field-name-to-id map for the search/query services. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 17:28:46 +08:00
smellthemoon	a988daf143	fix: unstable ut and fix goroutine leak (#29624 ) fix unstable ut and fix goroutine leak related: #27801 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-04 17:24:47 +08:00
XuanYang-cn	a3aff37f73	fix: Correct flush buffer size metrics (#29571 ) See also: #29204 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-04 17:22:46 +08:00
smellthemoon	e09fc040aa	fix: the config value of DataCoordTimeTick become longer and longer (#29659 ) #29658 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-04 17:06:47 +08:00
congqixia	da7c3cbd88	enhance: make delegator delete buffer holding all delete from cp (#29626 ) See also #29625 This PR: - Adds a new implementation of `DeleteBuffer`: listDeleteBuffer - holds a cacheBlock slice - the `Put` method appends new delete data to the last block - when a block is full, a new block is appended to the list - Adds a `TryDiscard` method to the `DeleteBuffer` interface - For doubleCacheBuffer, it does nothing - For listDeleteBuffer, it tries to evict "old" blocks, which are blocks before the first block whose start ts is behind the provided ts - Adds a checkpoint field to the `UpdateVersion` sync action, which shall be used to discard old cached delete blocks --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 17:02:46 +08:00
congqixia	79c06c5e73	fix: serializer shall bypass L0 segment merge stats step (#29636 ) See also #27675 Fix a logic problem introduced by #29413: the serializer tries to merge the statslog list although level zero segments do not have statslogs, which results in an error being returned. `writeBufferBase` ignores this error, but it shall only ignore `ErrSegmentNotFound`. This PR adds logic that checks the segment level before merging the statslog list, and adds an error type check for getSyncTask failures. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:52:45 +08:00
congqixia	aa967de0a8	enhance: Explicitly pass LevelZero segment ids in vchan info (#29612 ) See also #27675 For `GetRecoveryInfo` & `GetRecoveryInfoV2`, Level zero segment ids shall be specified in vchan info so that querycoord could re-fetch current segment info during watch procedure without having all segment info Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:46:45 +08:00
yah01	99e0f1e65a	enhance: unable to compile C++ tests (#29616 ) The tests need to call a private method. Milvus uses `#define` to replace private with public; the hack works but breaks if the include order changes. This PR uses friend declarations instead. Signed-off-by: yah01 <yang.cen@zilliz.com> Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-04 13:20:46 +08:00
wei liu	336fce0582	enhance: Rewrite gen segment plan based on assign segment (#29574 ) issue: #29582 This PR rewrite gen segment plan logic based on assign segment in `score_based_balancer` Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-04 11:10:44 +08:00
SimFG	d23f87a393	enhance: Add concurrency for datacoord segment GC (#29561 ) issue: https://github.com/milvus-io/milvus/issues/29553 /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-01-03 13:16:57 +08:00
smellthemoon	b12c90af72	enhance:Add upsert vector metrics (#29226 ) add some metrics. Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-03 10:06:48 +08:00
XuanYang-cn	f1b6ccf305	enhance: compaction use ChannelManager interface (#29530 ) Rewrite compaction_test.go See also: #29447 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-02 18:08:49 +08:00
PowderLi	5f00bad4b8	fix: link with install path's libblob-chunk-manager (#29496 ) issue: #29494 1. link with install path's libblob-chunk-manager 2. performance of `ShouldBindWith` is better than `ShouldBindBodyWith` 3. the middleware shouldn't read the unrefreshed parameter repeatedly Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-31 20:02:48 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704

Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM, and it speeds up term queries and range queries.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text search now. Every row is treated as a single keyword instead of being tokenized into multiple terms.
- The inverted index doesn't support retrieval well, so if you create an inverted index on a field, operations that depend on the raw data fall back to chunk storage, which brings some performance loss. Examples are comparisons between two columns and retrieval of output fields.

The inverted index is easy to use. Take the collection below as an example:

```python
fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then we can simply create an inverted index on a field via:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Then term queries and range queries on the field are sped up automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

--------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
zhagnlu	79c417b14e	fix: pass active count to query context instead of timestamp (#29541 ) #29319 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-31 16:08:48 +08:00
smellthemoon	ae640e7c80	fix: pass in undefined params (#29591 ) fix pass in undefined params issue: #29594 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-12-30 00:32:47 +08:00
sre-ci-robot	c2345daf3a	[automated] Update Knowhere Commit (#29578 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-29 18:56:46 +08:00
xige-16	02673914a0	feat: Support multiple vector indexes in a collection (#27700 ) issue: #25639 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> --------- Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-12-29 11:44:45 +08:00
congqixia	55af8f611f	fix: always sync level zero segments as flushed (#29569 ) See also #27675 For now, Level zero segments shall always be synced as `Flushed` ones. This PR fixes when level zero segments selected by policies other than flush ts policy will be synced as growing state. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-29 10:34:47 +08:00
congqixia	a3cb8e2625	fix: Add atomic method to get collection target (#29577 ) Related to #29575 Add `getCollectionTarget` method which is atomic when scope is `CurrentTargetFirst` or `NextTargetFirst` Also return error when executor finds no channel in target manager --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-29 09:04:46 +08:00
congqixia	a8b7629315	fix: exclude insertData before growing checkpoint (#29558 ) Resolves: #29556 Refine exclude segment function signature Add exclude growing before checkpoint logic Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-28 18:18:54 +08:00
MrPresent-Han	ed644983e2	enhance: add param for bloomfilter(#29388 ) (#29490 ) related: #29388 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-28 18:10:46 +08:00
wei liu	514e279f3a	enhance: Remove useless log in collection observer (#29554 ) This PR removed the useless log in collection observer Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-28 17:16:47 +08:00
xige-16	0a70e8b601	enhance: Remove multiple vector field limit (#27827 ) issue: https://github.com/milvus-io/milvus/issues/25639 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-12-28 16:40:46 +08:00
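
The message of commit 3626f49025 above describes a general pattern: a candidate popped from a heap must be pushed back on every code path, including early returns and errors. The following is a minimal Python sketch of that idea, not the Milvus implementation (which is in Go and uses a deferred call). The function name `with_top_candidate` and the `(score, node_id)` tuples are hypothetical and exist only for illustration; `try`/`finally` stands in for Go's `defer`.

```python
import heapq


def with_top_candidate(heap, action):
    """Pop the smallest candidate, run `action` on it, and always push it back.

    `heap` is a list maintained with heapq; `action` is any callable that
    inspects the candidate. Both names are hypothetical, for illustration only.
    """
    if not heap:
        # Guard against popping from an empty heap.
        return None
    candidate = heapq.heappop(heap)
    try:
        return action(candidate)
    finally:
        # Runs on normal return, early return, and exceptions alike,
        # so the candidate can never be lost.
        heapq.heappush(heap, candidate)


# Hypothetical candidates: (score, node_id) tuples, lowest score first.
candidates = [(3, "node-a"), (1, "node-b"), (2, "node-c")]
heapq.heapify(candidates)
chosen = with_top_candidate(candidates, lambda c: c[1])
```

Keeping the pop and the push inside one function means no individual branch of `action` has to remember to restore the heap, which is the property the commit message is after.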


8032 Commits