milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-05 05:18:52 +08:00

Author	SHA1	Message	Date
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue: https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
congqixia	f18a7191f2	enhance: make `ColumnBasedInsertMsgToInsertData` check field missing (#29758 ) fix: #29757 In previous code, `ColumnBasedInsertMsgToInsertData` adds empty field if the insertMsg parameter does not have the column schema defined. This may lead to unexpected behavior of caller functions. This PR: - Add column missing check - Add column length check - Generate BlobInfo for ColumnBasedInsertMsgToInsertData result --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-09 11:50:48 +08:00
yihao.dai	3d07b6682c	feat: Add import reader for numpy (#29253 ) This PR implements a new numpy reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-08 19:42:49 +08:00
yah01	97e4ec5a69	enhance: use random root path for minio unit tests (#29753 ) this avoids the conflicts while running multiple unit tests Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-08 15:58:48 +08:00
yihao.dai	23183ffb0f	feat: Add import reader for json (#29252 ) This PR implements a new json reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-05 18:12:48 +08:00
smellthemoon	1c1f2a1371	enhance: change some logs (#29579 ) related #29588 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-05 16:12:48 +08:00
yihao.dai	3561586edf	feat: Add import reader for binlog (#28910 ) This PR defines the new import reader interfaces and implements a binlog reader for import. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-05 11:48:47 +08:00
cai.zhang	dc8b5c1130	enhance: Read azure file without ReadAll (#29602 ) issue: #29292 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-04 20:50:46 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704 Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM, and it speeds up term queries and range queries. Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`. Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text search yet. Every row is treated as a whole keyword instead of being tokenized into multiple terms.
- The inverted index does not support retrieval well, so if you create an inverted index on a field, operations that depend on the raw data fall back to chunk storage, which costs some performance. Examples are comparisons between two columns and retrieval of output fields.

The inverted index is easy to use. Taking the collection below as an example:

```python
# Import and `dim` added so the snippet is self-contained; neither appears in the commit message.
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

dim = 8  # assumed example dimension

fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then we can create an inverted index on each field:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Term queries and range queries on these fields are then sped up automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

--------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
MrPresent-Han	ed644983e2	enhance: add param for bloomfilter(#29388 ) (#29490 ) related: #29388 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-28 18:10:46 +08:00
congqixia	6a86ac0ac6	fix: Align minio object storage ut to new minio server behavior (#29014 ) See also #29013 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-06 15:42:43 +08:00
yihao.dai	b4353ca4ce	enhance: Remove vector chunk manager (#28569 ) We have implemented the chunkcache (in cpp) to retrieve vectors, hence rendering the vectorchunkcache (in golang) obsolete. issue: https://github.com/milvus-io/milvus/issues/28568 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-11-30 18:00:33 +08:00
XuanYang-cn	aae7e62729	feat: Add levelzero compaction in DN (#28470 ) See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-30 14:30:28 +08:00
cai.zhang	f5f4f0872e	enhance: Support importing data with parquet file (#28608 ) issue: #28272 Numpy does not support array type import. Array type data is imported through parquet. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-11-29 20:52:27 +08:00
yihao.dai	4bd426dbe7	fix: Fix minio latency monitoring for get operation (#28510 ) see also: https://github.com/milvus-io/milvus/issues/28509 Currently, MinIO latency monitoring for the get operation only collects the duration of getting the object (which just returns an io.Reader and does not actually read from MinIO). This PR corrects that behavior. Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-11-28 10:00:27 +08:00
congqixia	8a9ab69369	fix: Skip statslog generation flushing empty L0 segment (#28733 ) See also #27675 When L0 segment contains only delta data, merged statslog shall be skipped when performing sync task --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-25 15:10:25 +08:00
yah01	cc952e0486	enhance: optimize forwarding level0 deletions by respecting partition (#28456 ) - Cache the level 0 deletions after loading level0 segments - Divide the level 0 deletions by partition related: #27349 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-21 18:24:22 +08:00
congqixia	2b3fa8f67b	fix: Add length check for `storage.NewPrimaryKeyStats` (#28576 ) See also #28575 Add zero-length check for `storage.NewPrimaryKeyStats`. This function shall return error when non-positive rowNum passed. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-21 10:28:21 +08:00
Bingyi Sun	59355cb3dc	Update arrow version to v12 (#28425 ) issue: https://github.com/milvus-io/milvus/issues/28423 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-11-15 10:36:19 +08:00
congqixia	e576271a24	Fix buffer FieldData has no `ElementType` and array logsize always zero (#28295 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-09 14:16:20 +08:00
yah01	ece592a42f	Deliver L0 segments delete records (#27722 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-07 01:44:18 +08:00
PowderLi	0252871d30	fix azure ListObjects (#27931 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-01 11:34:14 +08:00
Enwei Jiao	8ae9c947ae	Use OpenDAL to access object store (#25642 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-11-01 09:00:14 +08:00
yah01	9658367a3c	Refine chunk manager errors (#27590 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-31 12:18:15 +08:00
zhenshan.cao	6c3f29d003	Identify service providers based on addresses (#27907 ) Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2023-10-25 17:28:10 +08:00
zhagnlu	6060dd7ea8	Add chunk manager request timeout (#27692 ) Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-10-23 20:08:08 +08:00
XuanYang-cn	7358c3527b	Add iterators (#27643 ) See also: #27606 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-10-18 19:34:08 +08:00
congqixia	2f201c25e2	Remove deprecated io/ioutil usage (#27747 ) The `io/ioutil` package is deprecated; use the `io` and `os` packages as replacements. Also add a golangci-lint rule to block future references. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: guoguangwu <guoguangwu@magic-shield.com>	2023-10-17 20:32:09 +08:00
XuanYang-cn	2f16339aac	Enhance InsertData and FieldData (#27436 ) 1. Add NewInsertData 2. Add GetRowNum(), GetMemorySize(), and Append() for InsertData 3. Add AppendRow() for FieldData for compaction Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-10-17 17:36:11 +08:00
congqixia	670cb386e7	Add back `gocritic` linter and fix related issues (#27289 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-09-22 10:05:26 +08:00
SimFG	26f06dd732	Format the code (#27275 ) Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-09-21 09:45:27 +08:00
congqixia	cc9974979f	Add staticcheck linter and fix existing problems (#27174 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-09-19 10:05:22 +08:00
PowderLi	4feb3fa7c6	support azure (#26398 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-09-19 10:01:23 +08:00
Xu Tong	9166011c4a	Add float16 vector (#25852 ) Signed-off-by: Writer-X <1256866856@qq.com>	2023-09-08 10:03:16 +08:00
bjzhjing	548c82eca5	Refactor storage.MergeInsertData() to optimize the merging process (#26839 ) Benchmark Milvus with https://github.com/qdrant/vector-db-benchmark and specify the dataset as 'deep-image-96-angular'. Meanwhile, run perf profiling during the 'upload + index' stage of vector-db-benchmark and observe the following hot spots. 39.59%--github.com/milvus-io/milvus/internal/storage.MergeInsertData | |--21.43%--github.com/milvus-io/milvus/internal/storage.MergeFieldData | | | |--17.22%--runtime.memmove | | | |--1.53%--asm_exc_page_fault | ...... | |--18.16%--runtime.memmove | |--1.66%--asm_exc_page_fault ...... The hot code path is in storage.MergeInsertData(), which updates buffer.buffer by creating a new 'InsertData' instance and merging both the old buffer.buffer and addedBuffer into it. The hot spots appear when it calls the Go runtime.memmove to move buffer.buffer, which is large (>1M). To avoid this overhead, update storage.MergeInsertData() to append addedBuffer to buffer.buffer instead of moving both buffer.buffer and addedBuffer into a new 'InsertData'. This change removes the 'runtime.memmove' hot spots from the perf profiling output. Additionally, the 'upload + index' time, one of the performance metrics of vector-db-benchmark, is reduced by around 60% with this change. Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>	2023-09-05 21:41:48 +08:00
Enwei Jiao	fb0705df1b	Decouple basetable and componentparam (#26725 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-05 10:31:48 +08:00
zhagnlu	411f9ac823	Upgrade minio-go and add region and virtual host config for segcore chunk manager (#26194 ) Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-08-11 10:37:36 +08:00
congqixia	2770ac4df5	Fix nilness linter errors (#26218 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-08-09 11:31:15 +08:00
zhenshan.cao	2c6c7749e2	Enable print_log support json data type (#26118 ) Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2023-08-04 11:27:05 +08:00
xige-16	f33451b3d8	Write the cache file to the cacheStorage.rootpath dir (#25715 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-07-28 10:59:02 +08:00
xige-16	94d6cbb238	Fix querynode panic when binlog ts wrong (#25635 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-07-18 10:41:20 +08:00
xige-16	33c2012675	Add more metrics (#25081 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-06-26 17:52:44 +08:00
Xiaofan	e8911ebda7	Add retry time when lazy load BF (#25096 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2023-06-25 11:32:43 +08:00
PowderLi	3f4356df10	fix the spelling of `field` (#25008 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-06-21 14:00:42 +08:00
yah01	8bc5282eb3	Fix datanode always retries to load stats even file corrupted (#25012 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-06-20 16:40:42 +08:00
Enwei Jiao	1ef8f0fceb	Remove cgo PayloadWriter (#24892 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-06-14 18:04:38 +08:00
yah01	a9dccec03a	Add go payload writer (#24656 ) (#24762 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-06-09 13:52:39 +08:00
congqixia	41af0a98fa	Use go-api/v2 for milvus-proto (#24770 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-06-09 01:28:37 +08:00
yah01	ebd0279d3f	Check error by Error() and NoError() for better report message (#24736 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-06-08 15:36:36 +08:00
Enwei Jiao	d3af451d92	Upgrade golangci-lint (#24707 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-06-07 19:34:36 +08:00
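
The inverted-index commit (3f46c6d459) describes term and range queries served by an index over field values. The following is a toy sketch of that idea in plain Python, not Milvus's implementation: it maps each distinct value to the rows that hold it. The `ToyInvertedIndex` class and its method names are hypothetical, introduced only for illustration. The last lines mirror the commit's note that a `VARCHAR` value is indexed as one whole keyword rather than tokenized.

```python
import bisect
from collections import defaultdict


class ToyInvertedIndex:
    """Maps each distinct value to the list of row ids that hold it."""

    def __init__(self, values):
        postings = defaultdict(list)
        for row_id, value in enumerate(values):
            postings[value].append(row_id)
        self._postings = dict(postings)
        # Sorted distinct values allow range queries via binary search.
        self._sorted_values = sorted(self._postings)

    def term_query(self, terms):
        """Rows whose value equals any of `terms` (like `field in [...]`)."""
        rows = []
        for term in terms:
            rows.extend(self._postings.get(term, []))
        return sorted(rows)

    def range_query(self, low=None, high=None):
        """Rows with low < value < high; a bound of None is open-ended."""
        start = 0 if low is None else bisect.bisect_right(self._sorted_values, low)
        end = (len(self._sorted_values) if high is None
               else bisect.bisect_left(self._sorted_values, high))
        rows = []
        for value in self._sorted_values[start:end]:
            rows.extend(self._postings[value])
        return sorted(rows)


int64_index = ToyInvertedIndex([7, 2, 3, 2, 9, 1])
print(int64_index.term_query([1, 2, 3]))        # rows where int64 in [1, 2, 3]
print(int64_index.range_query(low=1, high=5))   # rows where 1 < int64 < 5

varchar_index = ToyInvertedIndex(["red apple", "apple", "red apple"])
print(varchar_index.term_query(["apple"]))      # matches only the exact keyword
```

Because the index stores only values and row ids, anything that needs the full row (such as output fields) still has to come from the raw data, which is consistent with the fallback the commit message mentions.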

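Commit 4bd426dbe7 notes that timing only the get call under-reports latency, because the call returns a reader before any bytes are read. A minimal sketch of the alternative, accumulating the time spent inside `read()`, might look like the following. It is written in Python to match the snippet in the log, whereas the Milvus change itself is in Go; `TimedReader` and its `clock` parameter are hypothetical names, not Milvus APIs.

```python
import io
import time


class TimedReader:
    """Wraps a file-like object and accumulates time spent in read()."""

    def __init__(self, raw, clock=time.perf_counter):
        self._raw = raw
        self._clock = clock  # injectable so tests can use a fake clock
        self.elapsed = 0.0

    def read(self, size=-1):
        start = self._clock()
        try:
            return self._raw.read(size)
        finally:
            # Count the time even if the underlying read raises.
            self.elapsed += self._clock() - start


reader = TimedReader(io.BytesIO(b"binlog bytes"))
# Obtaining the reader records nothing: no data has been read yet.
print(reader.elapsed)
payload = reader.read()
# Only after read() does `elapsed` cover the actual data transfer.
print(len(payload), reader.elapsed >= 0.0)
```

A metric reported from `reader.elapsed` after the stream is consumed reflects the read itself rather than just the cost of opening it.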

394 Commits