milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-04 21:09:06 +08:00

Author	SHA1	Message	Date
cqy123456	74cfba0249	enhance:limit binlog index rows num (#30173 ) issue: https://github.com/milvus-io/milvus/issues/27678 also relate issue: https://github.com/milvus-io/milvus/issues/30065 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-01-29 19:49:02 +08:00
sre-ci-robot	0542a0e7dc	[automated] Update Knowhere Commit (#30332 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-01-29 01:05:01 +08:00
zhagnlu	aeb1e36f00	enhance: change plan desc log from info to debug (#30304 ) #30172 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-28 16:04:38 +08:00
xige-16	e9fdd2475d	fix: fix searchPlan metricType modified concurrently (#30227 ) issue: #30225 /kind bug Signed-off-by: xige-16 <xi.ge@zilliz.com> --------- Signed-off-by: xige-16 <xi.ge@zilliz.com>	2024-01-26 14:03:09 +08:00
MrPresent-Han	116d0f20b8	fix: groupby bug for ut (#30272 ) related: #29965 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-25 20:57:00 +08:00
yihao.dai	c02fb64ad6	enhance: Allows proactive warming up of chunk cache (#30182 ) Allows proactive warming up of chunk cache. Original vector data will be asynchronously loaded into the chunk cache during the load process. It has the potential to significantly reduce query/search latency for a certain duration after the load, albeit with a concurrent increase in disk usage. issue: https://github.com/milvus-io/milvus/issues/30181 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-25 19:55:39 +08:00
yah01	a27c0e86fd	enhance: reduce many I/O operations while loading disk index (#30189 ) Before this change, every write of index chunk data to disk took 4 I/O operations: - open the file - seek to the offset - write the data - close the file. This change opens the file only once and writes all the data continuously. It also makes loading the files from object storage concurrent. Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-25 15:23:02 +08:00
zhagnlu	8c58d9af67	enhance: optimize marisa trie range search for performance (#30079 ) #30078 #29986 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-25 10:07:00 +08:00
Patrick Weizhi Xu	0907d76253	enhance: pass partition key scalar info if enabled when build vector index (#29931 ) issue: #29892 Pass optional scalar IVF offsets to Cardinal Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-01-24 00:04:55 +08:00
cqy123456	42bb4e37e5	fix:diskann search crash when search list = 9999999999 (#30185 ) issue: https://github.com/milvus-io/milvus/issues/29020 JSON cannot pass a value beyond the int32 range into an int32_t, so let Knowhere check the value range itself. After this fix, pymilvus will report: pymilvus.exceptions.MilvusException: <MilvusException: (code=65535, message=fail to search on QueryNode 6: worker(6) query failed: => failed to search: arithmetic overflow: param search_list_size should be at most 2147483647)> Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-01-23 14:46:55 +08:00
cai.zhang	6cf2f09b60	feat: Support tencent cloud object storage for milvus (#30163 ) issue: #30162 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-23 11:28:56 +08:00
yah01	a77693aa19	enhance: convert the `GetObject` util to async (#30166 ) This makes it much easier to use Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-22 19:20:57 +08:00
sre-ci-robot	e967949cc5	[automated] Update Knowhere Commit (#30120 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-01-22 18:40:54 +08:00
MrPresent-Han	4436effdc3	enhance: support groupby based on scalar-index(#29965 ) (#30091 ) related: #29965 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-22 10:50:54 +08:00
xige-16	aee19dcd6b	enhance: Opt vector dimension mismatch error message (#29928 ) issue: https://github.com/milvus-io/milvus/issues/29791 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> --------- Signed-off-by: xige-16 <xi.ge@zilliz.com>	2024-01-19 17:52:54 +08:00
yah01	f542bdbf3c	enhance: calc the accurate mem size of segment (#30093 ) this stats the real memory size of segment, also reduces the memory usage in mmap mode resolve #30095 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-19 12:32:53 +08:00
xige-16	fa7cf587b0	enhance: Opt metric type does not match error message (#29927 ) issue: #29791 /kind improvement Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2024-01-17 20:25:03 +08:00
yah01	1185e4dcd5	fix: written file size is over the int32 range and raises error (#30057 ) we sum the total data size in int32, which could lead to an overflow error related #30056 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-17 16:42:54 +08:00
Bingyi Sun	8030b90891	fix: correct file name when loading index (#29985 ) issue: #29973 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-16 10:24:52 +08:00
MrPresent-Han	c31e68446e	enhance: refine groupby-performance (#29933 ) related: #29844 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-15 14:12:52 +08:00
chyezh	def717af55	fix: SealedIndexingEntry in SealedIndexingRecord may leak without smart pointer protect. (#29932 ) may related issue: #29828 Signed-off-by: chyezh <ye.zhen@zilliz.com>	2024-01-14 10:28:51 +08:00
Bingyi Sun	e1258b8cad	feat: integrate storagev2 into loading segment (#29336 ) issue: #29335 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-12 18:10:51 +08:00
yah01	f2e36db488	enhance: optimize the loading index performance (#29894 ) this utilizes concurrent loading Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-12 17:44:51 +08:00
yah01	6c477ce3a7	enhance: optimize the loading strategy (#29910 ) as we have the pool size limit so we don't need to limit the concurrency manually Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-12 14:26:50 +08:00
yah01	aba2656e68	fix: missing field data after appending scalar index to loaded segment (#29912 ) related #29843 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-12 14:04:54 +08:00
sre-ci-robot	4d11525f55	[automated] Update Knowhere Commit (#29904 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-01-12 14:00:50 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue: https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
Jiquan Long	67ab5be15a	enhance: optimize search performance of inverted index (#29794 ) issue: #29793 Use `DocSetCollector` instead of `TopDocsCollector`, which will avoid scoring and sorting. --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-01-11 11:12:49 +08:00
zhagnlu	5164d30287	fix: increase expr recursion depth to avoid parse failed (#29860 ) #29759 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-11 10:26:50 +08:00
yah01	031243fee7	feat: support mmap for marisa trie (#29613 ) this supports mmap for marisa trie index related https://github.com/milvus-io/milvus/issues/21866 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-11 10:22:50 +08:00
congqixia	d6429933a7	enhance: make Load process traceable in querynode & segcore (#29858 ) See also #29803 This PR: - Add trace span for `LoadIndex` & `LoadFieldData` in segment loader - Add `TraceCtx` parameter for `Index.Load` in segcore - Add span for ReadFiles & Engine Load for Memory/Disk Vector index --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-10 21:58:51 +08:00
Cai Yudong	cb9d9ec0f0	enhance: Correct sampleFraction's type to float (#29810 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2024-01-10 13:18:50 +08:00
Cai Yudong	600f6eff06	enhance: Upgrade gtest to 1.13.0 (#29805 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2024-01-10 13:16:57 +08:00
zhagnlu	601a8b801b	fix: add move cursor function to physical expr (#29603 ) #29570 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-09 17:08:48 +08:00
zhenshan.cao	60e88fb833	fix: Restore the MVCC functionality. (#29749 ) When the TimeTravel functionality was previously removed, it inadvertently affected the MVCC functionality within the system. This PR aims to reintroduce the internal MVCC functionality as follows: 1. Add MvccTimestamp to the requests of Search/Query and the results of Search internally. 2. When the delegator receives a Query/Search request and there is no MVCC timestamp set in the request, set the delegator's current tsafe as the MVCC timestamp of the request. If the request already has an MVCC timestamp, do not modify it. 3. When the Proxy handles Search and triggers the second phase ReQuery, divide the ReQuery into different shards and pass the MVCC timestamp to the corresponding Query requests. issue: #29656 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-01-09 11:38:48 +08:00
xige-16	9702cef2b5	feat: Support multiple vector search (#29433 ) issue #25639 Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2024-01-08 15:34:48 +08:00
Jiquan Long	e9f3df3626	fix: inverted index file not found (#29695 ) issue: https://github.com/milvus-io/milvus/issues/29654 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-01-07 20:26:49 +08:00
zhagnlu	d07197ab1a	enhance: add compare simd function (#29432 ) #26137 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-07 20:20:57 +08:00
foxspy	271edc6669	fix: throw exception when upload file failed for DiskIndex (#29627 ) related to : #29417 cardinal indexes upload index files in `Serialize` interface, and throw exception when the `Serialize` failed. Signed-off-by: xianliang <xianliang.li@zilliz.com>	2024-01-07 20:03:13 +08:00
cai.zhang	5dc300c4a9	fix: Fix bug for pk index doesn't have raw data (#29711 ) issue: #29697 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-07 19:36:48 +08:00
MrPresent-Han	9e2e7157e9	feat: support search_group_by for milvus(#25324 ) (#28983 ) related: #25324 Search GroupBy function, used to aggregate result entities based on a specific scalar column. several points to mention: 1. Temporarily, the whole groupby is implemented separately from the iterative expr framework for the first period 2. In the long term, the groupBy operation will be incorporated into the iterative expr framework:https://github.com/milvus-io/milvus/pull/28166 3. This pr includes some unrelated mocked interface regarding alterIndex due to some unworth-to-mention reasons. All these un-associated content will be removed before the final pr is merged. This version of pr is only for review 4. All other related details were commented in the files comparison Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-05 15:50:47 +08:00
cqy123456	22bb84fa9d	feat:add new gpu index:GPU_BRUTE_FORCE and limit gpu index metric type (#29590 ) issue: https://github.com/milvus-io/milvus/issues/29230 This PR does the following: 1. add GPU brute force; 2. limit GPU indexes to support only L2 / IP; Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-01-05 15:24:48 +08:00
PowderLi	c8db36a63a	enhance: get a blob to check object storage config (#29703 ) issue: #29672 the storage account needs at least the privileges for the actions `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*` Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-05 14:50:46 +08:00
yah01	0ae90443ba	enhance: fill missed info for segcore error (#29610 ) - fill missed error info - format the error message directly Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-04 17:54:46 +08:00
yah01	99e0f1e65a	enhance: unable to compile C++ tests (#29616 ) The tests need to call a private method, Milvus uses `#define` to replace private with public, the hack trick works but would be broken if the including order changed. This uses friend to make all things work well Signed-off-by: yah01 <yang.cen@zilliz.com> Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-04 13:20:46 +08:00
PowderLi	5f00bad4b8	fix: link with install path's libblob-chunk-manager (#29496 ) issue: #29494 1. link with install path's libblob-chunk-manager 2. performance of `ShouldBindWith` is better than `ShouldBindBodyWith` 3. the middleware shouldn't read the unrefreshed parameter repeatedly Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-31 20:02:48 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704 Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM and speed up the term query and range query. Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`. Not supported: `ARRAY` and `JSON`. Note: - The inverted index for `VARCHAR` is not designed to serve full-text search now. We will treat every row as a whole keyword instead of tokenizing it into multiple terms. - The inverted index doesn't support retrieval well, so if you create an inverted index for a field, those operations which depend on the raw data will fall back to chunk storage, which will bring some performance loss. For example, comparisons between two columns and retrieval of output fields. The inverted index is very easy to use. Taking the collection below as an example: ```python fields = [ FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100), FieldSchema(name="int8", dtype=DataType.INT8), FieldSchema(name="int16", dtype=DataType.INT16), FieldSchema(name="int32", dtype=DataType.INT32), FieldSchema(name="int64", dtype=DataType.INT64), FieldSchema(name="float", dtype=DataType.FLOAT), FieldSchema(name="double", dtype=DataType.DOUBLE), FieldSchema(name="bool", dtype=DataType.BOOL), FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000), FieldSchema(name="random", dtype=DataType.DOUBLE), FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim), ] schema = CollectionSchema(fields) collection = Collection("demo", schema) ``` Then we can simply create an inverted index for each field via: ```python index_type = "INVERTED" collection.create_index("int8", {"index_type": index_type}) collection.create_index("int16", {"index_type": index_type}) collection.create_index("int32", {"index_type": index_type}) collection.create_index("int64", {"index_type": index_type}) collection.create_index("float", {"index_type": index_type}) collection.create_index("double", {"index_type": index_type}) collection.create_index("bool", {"index_type": index_type}) collection.create_index("varchar", {"index_type": index_type}) ``` Then, term queries and range queries on the field can be sped up automatically by the inverted index: ```python result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"]) result = collection.query(expr='int64 < 5', output_fields=["pk"]) result = collection.query(expr='int64 > 2997', output_fields=["pk"]) result = collection.query(expr='1 < int64 < 5', output_fields=["pk"]) ``` --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
zhagnlu	79c417b14e	fix: pass active count to query context instead of timestamp (#29541 ) #29319 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-31 16:08:48 +08:00
sre-ci-robot	c2345daf3a	[automated] Update Knowhere Commit (#29578 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-29 18:56:46 +08:00
Jiquan Long	6f4791da0b	fix: panic in concurrent insert/query scenario (#29408 ) issue: https://github.com/milvus-io/milvus/issues/29405 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-26 15:10:48 +08:00

1351 Commits