milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-11-29 18:38:44 +08:00

Author	SHA1	Message	Date
chyezh	2b7ee1968f	enhance: new message interface for log service (#33286 ) issue: #33285 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-11 10:38:01 +08:00
chyezh	f53ab54c5d	enhance: async cgo utility (#33133 ) issue: #30926, #33132 - implement future-based cgo utility. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-09 22:55:53 +08:00
cai.zhang	27cc9f2630	enhance: Support analyze data (#33651 ) issue: #30633 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: chasingegg <chao.gao@zilliz.com>	2024-06-06 17:37:51 +08:00
zhenshan.cao	ac4f3997ce	enhance: Reconstructing Compaction to possess persistence capability (#33265 ) issue #33586 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-06-05 10:17:50 +08:00
XuanYang-cn	22bddde5ff	enhance: Tidy compactor and remove dup codes (#32198 ) See also: #32451 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-05-23 09:53:40 +08:00
shaoting-huang	de7901121f	Upgrade go from 1.20 to 1.21 (#33047 ) Signed-off-by: shaoting-huang [shaoting-huang@zilliz.com] issue: https://github.com/milvus-io/milvus/issues/32982

# Background

Go 1.21 introduces several improvements and changes over Go 1.20, and is quite stable now. According to the [Go 1.21 Release Notes](https://tip.golang.org/doc/go1.21), the big difference in Go 1.21 is that Profile-Guided Optimization (PGO) is enabled by default, which can improve performance by around 2-14%. The PGO workflow is, in summary:

1. Build the initial binary (without PGO)
2. Deploy it to the production environment
3. Run the program and collect performance analysis data (CPU pprof)
4. Analyze the collected data and select a performance profile for PGO
5. Place the profile in the main package directory and name it default.pgo
6. go build detects the default.pgo file and enables PGO
7. Build and release the updated binary (with PGO)
8. Iterate and repeat the above steps

<img width="657" alt="Screenshot 2024-05-14 at 15 57 01" src="https://github.com/milvus-io/milvus/assets/167743503/b08d4300-0be1-44dc-801f-ce681dabc581">

# What does this PR do

There are three experiments: a search benchmark on the Zilliz test platform, a search benchmark with the open-source [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file), and a search benchmark with PGO. We run the search benchmark on both the Zilliz test platform and VectorDBBench to reduce reliance on a single experimental result. In addition, we validate the performance enhancement with PGO.

## Search Benchmark Report by Zilliz Test Platform

The upgrade to Go 1.21 was performed on a Milvus Standalone server equipped with 16 CPUs and 64GB of memory. Search performance was evaluated using a local dataset of 1 million entries with an L2 metric type in a 768-dimensional space. The system was tested for concurrent searches with 50 concurrent tasks for 1 hour, each with a 20-second interval.

One server is used for the comparison, rather than two, to guarantee the same data source and the same segment state after compaction.

Test Sequence:

1. Go 1.20 Initial Run: Insert data, build index, load index, and search.
2. Go 1.20 Rebuild: Rebuild the index with the same dataset, load index, and search.
3. Go 1.21 Load: Upgrade to Go 1.21 on the same server. Then load the index from the second run, and search.
4. Go 1.21 Rebuild: Rebuild the index with the same dataset, load index, and search.

Search Metrics:

| Metric | Go 1.20 | Go 1.20 Rebuild Index | Go 1.21 | Go 1.21 Rebuild Index |
|----------------------------|------------------|-----------------|------------------|-----------------|
| `search requests` | 10,942,683 | 16,131,726 | 16,200,887 | 16,331,052 |
| `search fails` | 0 | 0 | 0 | 0 |
| `search RT_avg` (ms) | 16.44 | 11.15 | 11.11 | 11.02 |
| `search RT_min` (ms) | 1.30 | 1.28 | 1.31 | 1.26 |
| `search RT_max` (ms) | 446.61 | 233.22 | 235.90 | 147.93 |
| `search TP50` (ms) | 11.74 | 10.46 | 10.43 | 10.35 |
| `search TP99` (ms) | 92.30 | 25.76 | 25.36 | 25.23 |
| `search RPS` | 3,039 | 4,481 | 4,500 | 4,536 |

### Key Findings

The index build time was 340.39 ms with Go 1.20 and 337.60 ms with Go 1.21, a negligible difference in index construction. However, Go 1.21 offers slightly better performance in search operations than Go 1.20, with improvements in handling concurrent tasks and reducing response times.

## Search Benchmark Report by VectorDBBench

Follow [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file) to create a VectorDBBench test for Go 1.20 and Go 1.21. We test the search performance with Go 1.20 and Go 1.21 (without PGO) on the Milvus Standalone system. The tests were conducted using the Cohere dataset with 1 million entries in a 768-dimensional space, using the COSINE metric type.

Search Metrics:

| Metric | Go 1.20 | Go 1.21 without PGO |
| -- | -- | -- |
| Load Duration (seconds) | 1195.95 | 976.37 |
| Queries Per Second (QPS) | 841.62 | 875.89 |
| 99th Percentile Serial Latency (seconds) | 0.0047 | 0.0076 |
| Recall | 0.9487 | 0.9489 |

### Key Findings

With Go 1.21, the load duration is shorter and the search QPS is higher.

## PGO Performance Test

Milvus has already added [net/http/pprof](https://pkg.go.dev/net/http/pprof) in the metrics. So we can fetch the CPU profile directly by running `curl -o default.pgo "http://${MILVUS_SERVER_IP}:${MILVUS_SERVER_PORT}/debug/pprof/profile?seconds=${TIME_SECOND}"` to collect the profile as default.pgo during the first search. Then I build Milvus with PGO and use the same index to run the search again. The result is as below:

Search Metrics

| Metric | Go 1.21 Without PGO | Go 1.21 With PGO | Change (%) |
|---------------------------------------------|------------------|-----------------|------------|
| `search Requests` | 2,644,583 | 2,837,726 | +7.30% |
| `search Fails` | 0 | 0 | N/A |
| `search RT_avg` (ms) | 11.34 | 10.57 | -6.78% |
| `search RT_min` (ms) | 1.39 | 1.32 | -5.18% |
| `search RT_max` (ms) | 349.72 | 143.72 | -58.91% |
| `search TP50` (ms) | 10.57 | 9.93 | -6.05% |
| `search TP99` (ms) | 26.14 | 24.16 | -7.56% |
| `search RPS` | 4,407 | 4,729 | +7.30% |

### Key Findings

PGO led to a notable enhancement in search performance, particularly in reducing the maximum response time by about 59% and increasing the search RPS by 7.3%.

### Further Analysis

Generate a diff flame graph between two CPU profiles by running `go tool pprof -http=:8000 -diff_base nopgo.pgo pgo.pgo -normalize`

<img width="1894" alt="goprofiling" src="https://github.com/milvus-io/milvus/assets/167743503/ab9e91eb-95c7-4963-acd9-d1c3c73ee010">

Further insight into HnswIndexNode and the Milvus Search Handler

<img width="1906" alt="hnsw" src="https://github.com/milvus-io/milvus/assets/167743503/a04cf4a0-7c97-4451-b3cf-98afc20a0b05">
<img width="1873" alt="search_handler" src="https://github.com/milvus-io/milvus/assets/167743503/5f4d3982-18dd-4115-8e76-460f7f534c7f">

After applying PGO to the Milvus server, the CPU utilization of the faiss::fvec_L2 function has decreased. This optimization significantly enhances the performance of the [HnswIndexNode::Search::searchKnn](`e0c9c41aa2/src/index/hnsw/hnsw.cc (L203)`) method, which is frequently invoked by Knowhere during high-concurrency searches. As the Go release notes explain, functions may be inlined more aggressively by the Go compiler during the second build, using the CPU profile collected from the first run. As a result, the search handler efficiency within Milvus DataNode has improved, allowing the server to process a higher number of search queries per second (QPS).

# Conclusion

The combination of Go 1.21 and PGO has led to substantial enhancements in search performance for the Milvus server, particularly in terms of search QPS and response times, making it more efficient at handling high-concurrency search operations.

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-05-22 13:21:39 +08:00
congqixia	7eeb120aab	enhance: Add lint rules for client pkg and fix problems (#33180 ) See also #31293 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-20 20:47:38 +08:00
wei liu	f1c9986974	enhance: Skip return data distribution if no change happen (#32814 ) issue: #32813 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-17 10:11:37 +08:00
aoiasd	cbbfb5b6d6	enhance: Add update milvus yaml command to makefile (#32857 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-10 14:31:37 +08:00
yiwangdr	b1eacb2ae8	feat: datacoord/node watch based on rpc (#32036 ) issue: https://github.com/milvus-io/milvus/issues/25309 Signed-off-by: yiwangdr <yiwangdr@gmail.com>	2024-05-07 15:49:30 +08:00
SimFG	2944971507	enhance: fix the dev docker yaml and use `go install` to install the gotestsum (#32197 ) /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-04-12 19:17:18 +08:00
SimFG	c012e6786f	feat: support rate limiter based on db and partition levels (#31070 ) issue: https://github.com/milvus-io/milvus/issues/30577 co-author: @jaime0815 --------- Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com> Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-04-12 16:01:19 +08:00
congqixia	f399416b92	enhance: Use gotestsum to run go unit test (#31622 ) See also #31490 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-27 15:29:10 +08:00
Bingyi Sun	0ac9bb4a9c	enhance: add mmap migration tool (#30909 ) issue: #30908 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-25 15:51:09 +08:00
congqixia	4da8b6607d	enhance: Add scripts to use `gotestsum` to execute integration test (#31490 ) See also #31489 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-22 10:29:06 +08:00
Jiquan Long	375190e76e	fix: cpp format check not work (#30767 ) fix: #30765 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-02-22 19:40:53 +08:00
zhagnlu	976b6fc0e4	enhance: change opendal as compile configurable (#30384 ) #30373 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-02-20 19:16:52 +08:00
congqixia	a68b32134a	fix: Verify sync task target segment and retry if not match (#30500 ) See also #27675 #30469 For a sync task, the segment could be compacted while the sync task is running. In the previous implementation, the sync task held only the old segment ID as its KeyLock, so compaction on the compacted-to segment could run in parallel with the delta sync of this sync task. This PR introduces sync target segment verification logic. The task checks the target segment lock it is holding before running the actual sync logic. If this check fails, the sync task returns an `errTargetSegementNotMatch` error and makes the manager re-fetch the current target segment ID. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-05 11:33:43 +08:00
congqixia	5be909982d	enhance: add MockSerializer generation command into Makefile (#29713 ) See also #27675 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 21:42:47 +08:00
XuanYang-cn	632d8b3743	enhance: Change DN channelmanger into interface (#29307 ) See also: #28854 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 16:00:48 +08:00
congqixia	277849a915	enhance: separate serializer logic from sync task (#29413 ) See also #27675 Since serializing the segment buffer is not related to the sync manager, it can be done before the task is submitted to the sync manager. This makes the pk statistic file more accurate and reduces complex logic inside the sync manager. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 10:40:47 +08:00
XuanYang-cn	ae180d1628	enhance: Change ChannelManager to interface (#29300 ) Rewrite cluster test issue: #28854 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-25 19:24:46 +08:00
SimFG	67ab0e424b	fix: Clean the compaction plan info to avoid the object leak (#29365 ) issue: #29296 Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-12-22 12:00:43 +08:00
wei liu	e41fd6fbde	enhance: Move proxy client manager to util package (#28955 ) issue: #28898 This PR moves the `ProxyClientManager` to the util package so that its implementation can be reused in querycoord Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-20 19:22:42 +08:00
zhagnlu	a602171d06	enhance: Refactor runtime and expr framework (#28166 ) #28165 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-18 12:04:42 +08:00
wei liu	fe1eeae2aa	enhance: Use mockery to replace manual mock code (#29074 ) issue: #29043 This PR removes manual mock code for proxy and data coord --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-12-13 10:46:44 +08:00
aoiasd	3c32ba2407	enhance: pack datacoord Cluster and SessionManager with interface and mock them (#28869 ) relate: https://github.com/milvus-io/milvus/issues/28861 https://github.com/milvus-io/milvus/issues/28854 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2023-12-11 17:52:37 +08:00
XuanYang-cn	5bac7f7897	fix: Fix L0 compaction in datacoord (#28814 ) See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-05 18:44:37 +08:00
XuanYang-cn	e62edb991a	enhance: Add FlowgraphManager interface (#28852 ) - Change flowgraphManager to fgManagerImpl - Change close to stop - change execute to controlMemWaterLevel - Change method name of fgManager for readability - Add mockery for fgmanager Issue: #28853 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-30 18:42:32 +08:00
XuanYang-cn	aae7e62729	feat: Add levelzero compaction in DN (#28470 ) See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-30 14:30:28 +08:00
cai.zhang	f5f4f0872e	enhance: Support importing data with parquet file (#28608 ) issue: #28272 Numpy does not support array type import. Array type data is imported through parquet. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-11-29 20:52:27 +08:00
XuanYang-cn	9b371067d2	feat: Add Compaction views and triggers (#27906 ) - Add Compaction l0 views - Add Compaction scheduler - Add Compaction triggerv2 - Add Compaction view manager See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-23 17:30:25 +08:00
Bingyi Sun	d7145e2c06	enhance: Update golangci_lint version (#28535 ) Update golangci lint and fix some warnings Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-11-21 10:04:21 +08:00
Enwei Jiao	7445d3711c	feat: trigger compaction to handle index version (#28442 ) issue: https://github.com/milvus-io/milvus/issues/28441 --------- Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-11-21 09:26:22 +08:00
congqixia	bed7467f20	enhance: Remove commented code and fix naming issue (#28450 ) This PR removes all the commented code and files from PR #28320 For naming issue: - Renaming `MinCheckpoint` to `EarliestPosition`, see #28320 comment - Renaming `writebuffer.Mananger` to `BufferMananger`, see #27874 comment Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-16 00:22:20 +08:00
congqixia	0b905078e7	Use writebuffer, sync manager refactory in datanode (#28320 ) See also #27675 This PR brings the previously merged datanode refactoring online - Use write node to replace insert/delete node - Use write buffer manager to control all buffers - Use sync manager to control sync tasks instead of flush manager Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-15 15:24:18 +08:00
wei liu	5b45a138b1	disable auto balance when old node exists (#28191 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-11-07 14:02:20 +08:00
congqixia	bf2f62c1e7	Add `WriteBuffer` to provide abstraction for delta policy (#27874 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-04 12:10:17 +08:00
wei liu	ecec5dfcfd	fix retry on offline node (#28079 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-11-03 10:14:16 +08:00
congqixia	233bf90c55	Add SyncManager to replace flush manager (#27873 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-10-31 02:30:16 +08:00
xige-16	bf46ffd6c4	Fix build with diskann failed (#27672 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-10-12 19:25:35 +08:00
Enwei Jiao	0f2f4a0a75	Remove useless parameters for Makefile (#27622 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-10-11 20:45:35 +08:00
congqixia	cbb350c552	Add broker for datanode grpc operations (#27631 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-10-11 17:03:34 +08:00
foxspy	2b46bd1f08	fix diskann searchCache param compatibility (#27557 ) Signed-off-by: xianliang <xianliang.li@zilliz.com>	2023-10-11 16:57:34 +08:00
PowderLi	8d3069b1db	update openssl to 3.1.2 (#27399 ) deal with root path's normalization Signed-off-by: PowderLi <min.li@zilliz.com>	2023-10-08 19:17:31 +08:00
congqixia	8c59dba329	Refine queryHook mockery (#27394 ) This PR move `QueryHook` interface to `optimizers` pkg Update all mockery generated files to latest Add makefile entry for `QueryHook` Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-09-28 10:01:26 +08:00
congqixia	a3dd2756cf	Add predicates for TxnKV operations (#27365 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-09-27 10:21:26 +08:00
jaime	7119cb29ca	Fix kafka consumer connection leak (#27224 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2023-09-26 10:31:27 +08:00
congqixia	7bbeebc0d1	Fix lint-fix command may fail when .git has new ref (#27354 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-09-26 10:05:25 +08:00
jaime	7f7c71ea7d	Decoupling client and server API in types interface (#27186 ) Co-authored-by: aoiasd <zhicheng.yue@zilliz.com> Signed-off-by: jaime <yun.zhang@zilliz.com>	2023-09-26 09:57:25 +08:00

284 Commits