milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-02 20:09:57 +08:00

Author	SHA1	Message	Date
congqixia	e04f1f9748	enhance: Add unittest for `storage.DeleteLog` (#34190 ) See also #33787 Backport unit test part in #34188 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-26 17:14:04 +08:00
congqixia	fd922d921a	enhance: Add nilness linter and fix some small issues (#34049 ) Add `nilness` for govet linter and fixed some detected issues Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-24 14:52:03 +08:00
Chun Han	ca7ef26e4b	fix: sync part stats task cannot be finished(#30376 ) (#34027 ) related: #30376 also: refine log output for query_coord task by rephrasing action string Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-06-24 10:16:02 +08:00
Ted Xu	78885a44c4	fix: turn on compression on stream writers (#34067 ) See #31679 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-06-24 10:08:02 +08:00
wayblink	380d3f4469	fix: Fix memory buffer error & some renaming (#33850 ) #30633 --------- Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-06-21 17:30:01 +08:00
congqixia	2f691f1e67	enhance: Unify DeleteLog parsing code (#34009 ) See also #33787 The parsing delete log is distributed in lots of places, which is not recommended and hard to maintain. This PR abstract common parsing logic into `DeleteLog.Parse` method to unify implementation and make it easier to replace json parsing lib. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-21 16:54:01 +08:00
shaoting-huang	5f02e52561	enhance: Refactor data codec deserialize (#33923 ) #33922 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-20 11:17:59 +08:00
smellthemoon	2a1356985d	enhance: support null in go payload (#32296 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-06-19 17:08:00 +08:00
Ted Xu	6d5747cb3e	feat: adding deltalog stream reader and writer (#33844 ) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-06-19 14:42:01 +08:00
shaoting-huang	8cdc0e6233	fix: fix data codec writer close (#33818 ) issue:#33813 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-18 13:59:57 +08:00
congqixia	f993b2913b	enhance: Reserve space of payload writer when serialize data (#33817 ) See also #33561 #33562 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-17 12:06:04 +08:00
XuanYang-cn	f67b6dc2b0	fix: DeleteData merge wrong data causing data loss (#33820 ) See also: #33819 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-14 17:57:56 +08:00
shaoting-huang	0ecd694305	enhance: legacy code clean up (#33838 ) issue: #33839 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-14 14:25:56 +08:00
wei liu	ab93d9c23d	enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611 ) issue:#33610 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-13 17:57:56 +08:00
congqixia	512ea6be5f	enhance: Avoid merging insert data when buffering insert msgs (#33562 ) See also #33561 This PR: - Use zero copy when buffering insert messages - Make `storage.InsertCodec` support serialize multiple insert data chunk into same batch binlog files Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-13 11:15:56 +08:00
congqixia	b39dfc25dc	enhance: Use fastjson lib for unmarshal delete log (#33787 ) ``` goos: linux goarch: amd64 GOMAXPROC=1 cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz BenchmarkJsonSerdeStd 343872 3568 ns/op 1335 B/op 25 allocs/op BenchmarkJsonSerdeFastjson 5124177 234.9 ns/op 16 B/op 1 allocs/op ``` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-12 20:41:57 +08:00
wayblink	a1232fafda	feat: Major compaction (#33620 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com>	2024-06-10 21:34:08 +08:00
wei liu	c6a1c49e02	enhance: Use Blocked Bloom Filter instead of basic bloom filter impl. (#33405 ) issue: #32995 To speed up the construction and querying of Bloom filters, we chose a blocked Bloom filter instead of a basic Bloom filter implementation. WARN: This PR is compatible with the old version of the bf impl, but after falling back to an older milvus version, bloom filter deserialization may fail. In single Bloom filter test cases with a capacity of 1,000,000 and a false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times faster than the basic Bloom filter in both querying and construction, at the cost of a 30% increase in memory usage. - Block BF construct time {"time": "54.128131ms"} - Block BF size {"size": 3021578} - Block BF Test cost {"time": "55.407352ms"} - Basic BF construct time {"time": "210.262183ms"} - Basic BF size {"size": 2396308} - Basic BF Test cost {"time": "192.596229ms"} In multi Bloom filter test cases with a capacity of 100,000, an FPR of 0.001, and 100 Bloom filters, we reuse the primary key locations for all Bloom filters to avoid repeated hash computations. As a result, the blocked Bloom filter is also 5 times faster than the basic Bloom filter in querying. - Block BF TestLocation cost {"time": "529.97183ms"} - Basic BF TestLocation cost {"time": "3.197430181s"} --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-31 17:49:45 +08:00
wei liu	322a4c5b8c	enhance: Remove StringPrimaryKey to reduce unnecessary copy and function call cost (#33486 ) issue: #33497 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-31 15:41:45 +08:00
congqixia	73c9b80a7d	enhance: Store locations for largest K in `LocationCache` (#33429 ) See also #32642 `LocationCache` used map to store different locations for different K which may cause lots of CPU time when get locations many times. This PR change the implementation of LocationCache to store only the location for the largest K used to totally remove the map access operation. See pprof from test of @XuanYang-cn ![image](https://github.com/milvus-io/milvus/assets/84113973/ad17cff8-62ad-4d78-9bb0-f6df0512f4ea) --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-29 10:05:42 +08:00
Ted Xu	066c8ea175	feat: stream reader/writer to support nulls (#33080 ) See: #31728 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-27 16:27:42 +08:00
congqixia	970bf18a49	fix: Allocate new slice for each batch in streaming reader (#33359 ) Related to #33268 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-24 18:07:41 +08:00
Ted Xu	a8bd9bea39	fix: adding blob memory size in binlog serde (#33324 ) See: #33280 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-24 10:33:40 +08:00
Cai Yudong	4004e4c545	enhance: Optimize bulk insert unittest (#33224 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-24 10:23:41 +08:00
Ted Xu	a9c7ce72b8	enhance: enable stream writer in compactions (#32612 ) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-17 15:05:37 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
Cai Yudong	4fc7915c70	enhance: unify data generation test APIs (#32955 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-14 14:33:33 +08:00
congqixia	0e5765b116	enhance: Utilize `TestLocations` ability to accelerate write & compaction (#32948 ) See also #32642 This PR reuses hash locations for bloom filter prediction utilizing `storage.Location`, like enhancement #32642. Also adds a utility struct in storage: `LocationCache` to store locations for variable K (number of hash functions) --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-13 10:15:32 +08:00
wei liu	5038036ece	enhance: Reuse hash locations when accessing bloom filter (#32642 ) issue: #32530 When matching a pk against segment bloom filters, we can reuse the hash locations. This PR maintains the max hash Func and computes the hash locations once for all segments; reusing hash locations speeds up bf access --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-07 06:13:47 -07:00
Cai Yudong	bcdbd1966e	feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-07 18:43:30 +08:00
aoiasd	31dca3249e	enhance: add type info for payload writer error message and add log when querynode find new collection (#32522 ) relate: https://github.com/milvus-io/milvus/issues/32668 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-07 14:45:29 +08:00
Aldrin	cb8dbc3c83	fix: Removed minio bucket after use in test (#32624 ) issue: https://github.com/milvus-io/milvus/issues/32616 - Forcefully deleted the non empty minio bucket with dummy data. Signed-off-by: Aldrin <imagesai32@gmail.com>	2024-04-28 13:51:26 +08:00
chyezh	2586c2f1b3	enhance: use WalkWithPrefix api for oss, enable pipelined file gc (#31740 ) issue: #19095,#29655,#31718 - Change `ListWithPrefix` of OSS to `WalkWithPrefix`, which works in a pipeline mode. - File garbage collection is performed in another goroutine. - Segment Index Recycle cleans index files too. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 20:41:27 +08:00
Buqian Zheng	8a1017a152	enhance: add helpers to parse sparse float vector in JSON (#32543 ) issue: #29419 added helper functions to parse JSON representation of sparse float vectors, will be used by both the restful server and the import utils. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-04-25 14:47:24 +08:00
Cai Yudong	5fc439c600	feat: Bulk insert support fp16/bf16 (#32157 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-22 10:05:22 +08:00
Ted Xu	dc5ea6f17c	feat: adding binlog streaming writer (#31537 ) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-04-11 10:33:20 +08:00
aoiasd	5b693c466d	fix: delegator filter out all partition's delete msg when loading segment (#31585 ) May cause deleted data to remain queryable for a period of time. relate: https://github.com/milvus-io/milvus/issues/31484 https://github.com/milvus-io/milvus/issues/31548 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-04-09 15:21:24 +08:00
Cai Yudong	00438f408f	enhance: Unify data type check APIs for go (#31887 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-07 14:27:22 +08:00
cqy123456	976928ecd1	fix: fix fp16/bf16 some code missing and add more fp16/bf16 test (#31612 ) issue: #31534 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-03-28 14:11:10 +08:00
SimFG	b1a1cca10b	feat: add more operation detail info for better allocation (#30438 ) issue: #30436 --------- Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-03-28 06:33:11 +08:00
groot	5be395354c	fix: minio ssl compatible issue (#31607 ) issue: https://github.com/milvus-io/milvus/issues/30709 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2024-03-27 14:41:20 +08:00
yihao.dai	31cf849f68	enhance: Support retrieving file size from importutilv2.Reader (#31533 ) To reduce the overhead caused by listing the S3 objects, add an interface to importutil.Reader to retrieve file sizes. issue: https://github.com/milvus-io/milvus/issues/31532, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-25 20:29:07 +08:00
Chun Han	c3264ca3e3	feat: support segment pruner (#31003 ) related: #30376	2024-03-22 13:57:06 +08:00
groot	c81909bfab	enhance: Support MinIO TLS connection (#31311 ) issue: https://github.com/milvus-io/milvus/issues/30709 pr: #31292 Signed-off-by: yhmo <yihua.mo@zilliz.com> Co-authored-by: Chen Rao <chenrao317328@163.com>	2024-03-21 11:15:20 +08:00
Buqian Zheng	d7dbc3c9d8	fix: [sparse float vector] support the new streaming deserialize reader (#31325 ) issue: https://github.com/milvus-io/milvus/issues/31324 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-17 13:59:04 +08:00
Buqian Zheng	3c80083f51	feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630 ) add sparse float vector support to different milvus components, including proxy, data node to receive and write sparse float vectors to binlog, query node to handle search requests, index node to build index for sparse float column, etc. https://github.com/milvus-io/milvus/issues/29419 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-13 14:32:54 -07:00
Ted Xu	987d9023a5	enhance: Enable binlog deserialize reader in datanode compaction (#31036 ) See #30863 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-08 18:25:02 +08:00
wayblink	875036b81b	feat: Define FieldValue, FieldStats and PartitionStats (#30286 ) Define FieldValue, FieldStats, PartitionStats FieldValue is largely copied from PrimaryKey FieldStats is largely copied from PrimaryKeyStats PartitionStats is map[segmentid][]FieldStats Each partition can have a PartitionStats file /kind feature related: #30287 related: #30633 --------- Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-03-06 20:42:37 -08:00
Ted Xu	71adafa933	enhance: adding a streaming deserialize reader for binlogs (#30860 ) See #30863 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-04 19:31:09 +08:00
yihao.dai	a434d33e75	feat: Add import scheduler and manager (#29367 ) This PR introduces novel managerial roles for importv2: 1. ImportMeta: To manage all the import tasks; 2. ImportScheduler: To process tasks and modify their states; 3. ImportChecker: To ascertain the completion of all tasks and instigate relevant operations. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-01 18:31:02 +08:00
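
Several of the commits above (#32642, #32948, #33429) describe computing the hash locations of a primary key once, for the largest K in use, and reusing them across many segment Bloom filters. The Go sketch below illustrates that idea only. It is not the Milvus `storage` code: the `filter` type, `locations`, `testLocations`, and `matchAll` are hypothetical names, and the FNV-based double hashing is an assumption chosen so that the locations for a smaller K are a prefix of those for a larger K.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// filter is a toy Bloom filter (hypothetical, for illustration only):
// a bit array probed k times per key.
type filter struct {
	bits []bool
	k    int
}

func newFilter(m, k int) *filter {
	return &filter{bits: make([]bool, m), k: k}
}

// locations derives k probe values from a single 64-bit FNV-1a hash via
// double hashing. The i-th value depends only on the key and i, so the
// first k values of a longer list equal the list computed for k.
func locations(key []byte, k int) []uint64 {
	h := fnv.New64a()
	h.Write(key)
	sum := h.Sum64()
	h1, h2 := sum&0xffffffff, sum>>32
	locs := make([]uint64, k)
	for i := range locs {
		locs[i] = h1 + uint64(i)*h2
	}
	return locs
}

// add sets the k bits selected by the key's locations.
func (f *filter) add(key []byte) {
	for _, l := range locations(key, f.k) {
		f.bits[l%uint64(len(f.bits))] = true
	}
}

// testLocations checks membership from precomputed locations.
// locs must hold at least f.k entries; only the first f.k are used.
func (f *filter) testLocations(locs []uint64) bool {
	for _, l := range locs[:f.k] {
		if !f.bits[l%uint64(len(f.bits))] {
			return false
		}
	}
	return true
}

// matchAll hashes the key once for the largest k among the filters and
// reuses those locations for every filter.
func matchAll(filters []*filter, key []byte) []bool {
	maxK := 0
	for _, f := range filters {
		if f.k > maxK {
			maxK = f.k
		}
	}
	locs := locations(key, maxK)
	hits := make([]bool, len(filters))
	for i, f := range filters {
		hits[i] = f.testLocations(locs)
	}
	return hits
}

func main() {
	filters := []*filter{newFilter(1024, 3), newFilter(2048, 5), newFilter(512, 7)}
	filters[0].add([]byte("pk-1"))
	filters[2].add([]byte("pk-1"))
	// The second filter is empty, so it cannot report a match.
	fmt.Println(matchAll(filters, []byte("pk-1"))) // [true false true]
}
```

In this sketch the key is hashed once per lookup no matter how many filters are checked, which is the saving the commit messages attribute to location reuse.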

454 Commits