Commit Graph

318 Commits

Author SHA1 Message Date
Xiaofan
92361c4efc
Upgrade Minio and Pulsar dependency (#36982)
fix #34910
upgrade the dependency version of minio and pulsar client

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-10-23 11:26:19 -07:00
Chun Han
903450f5c6
enhance: add ts support for iterator(#22718) (#36572)
related: #22718

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-10-16 18:51:23 +08:00
SimFG
bb3ef5349f
enhance: update the expr version to support automatic conversion of variable types (#36832)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-15 10:53:22 +08:00
cai.zhang
d1060c0e05
enhance: Update antlr version and refine parsing not in (#36745)
issue: #36672

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-11 14:03:21 +08:00
smellthemoon
2055df81aa
enhance: upgrade pulsar-client-go to 0.12.1 (#36615)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-09-30 17:13:16 +08:00
Rijin-N
a05a37a583
enhance: GCS native support (GCS implemented using Google Cloud Storage libraries) (#36214)
Native support for Google cloud storage using the Google Cloud Storage
libraries. Authentication is performed using GCS service account
credentials JSON.

Currently, Milvus supports Google Cloud Storage using S3-compatible APIs
via the AWS SDK. This approach has the following limitations:

1. Overhead: Translating requests between S3-compatible APIs and GCS can
introduce additional overhead.
2. Compatibility Limitations: Some features of the original S3 API may
not fully translate or work as expected with GCS.

To address these limitations, This enhancement is needed.

Related Issue: #36212
2024-09-30 13:23:32 +08:00
cai.zhang
ecb2b242e2
enhance: Add sorted for segment info (#36469)
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-30 10:01:16 +08:00
SimFG
c94b69c2f6
enhance: update the expr version and format the expr http response (#36406)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-26 14:27:20 +08:00
smellthemoon
a8c80abe36
enhance: upgrade pulsar-client-go to 0.11.1 (#36435)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-09-26 10:09:13 +08:00
Ted Xu
b9c037f558
feat: adding cache to expression parse (#36185)
See #36122

This PR improves the proxy node performance by adding cache to
expression parse.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-09-13 10:03:09 +08:00
aoiasd
da227ff9a1
feat: Support create collection with functions (#35973)
relate: https://github.com/milvus-io/milvus/issues/35853
Support create collection with functions. Prepare for support bm25
function.

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-12 10:43:06 +08:00
Ted Xu
e7ea1d7a04
enhance: improve log encoding performance on proxy nodes (#36123)
See #36122

This PR is designed to enhance log performance through two improvements:

1. Optimize JSON encoding by switching JSON serializer to
`json-iterator`.
2. Adding support of lazy initialization `WithLazy`.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-09-11 14:51:07 +08:00
Yinzuo Jiang
407fc933e7
fix: bump bytedance/sonic to v1.12.2 to fix compilation error with go 1.23.0 (#35879)
fixes: #35878

Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-09-01 13:43:01 +08:00
SimFG
311f860676
enhance: support to drop the role which is related the privilege list (#35727)
- issue: #35545

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-30 15:17:00 +08:00
yihao.dai
69265978bf
fix: Fix arrow go client (#35819)
issue: https://github.com/milvus-io/milvus/issues/35662

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 17:53:01 +08:00
Zhen Ye
99dff06391
enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406)
issue: #33285

- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration tst when enabling streaming service

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 10:03:08 +08:00
SimFG
731d45abbe
enhance: provide more general configuration to control mmap behavior (#35359)
- issue: #35273

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-21 00:22:54 +08:00
wei liu
a570567644
enhance: Enable ReadOnly/ReadWrite/Admin Privilege Group to simplify RBAC grant progress (#35472)
issue: #35471

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-16 14:18:54 +08:00
wei liu
1d49358f82
enhance: Add BackupRBAC/RestoreRBAC API to enable rbac backup (#35444)
issue: #35443

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-16 10:10:53 +08:00
congqixia
db06b86594
enhance: Sync otlp dependency version to fix security issue (#35192)
Related to #34434

otelgrpc DoS vulnerability due to unbound cardinality metrics

https://github.com/milvus-io/milvus/security/dependabot/91

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-02 16:22:20 +08:00
zhenshan.cao
aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
congqixia
972752258a
enhance: Support otlp http exporter (#35053)
See also #35052

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-29 17:43:49 +08:00
wei liu
c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
congqixia
4ee6c69217
enhance: Add Segment Level in milvus segment info APIs (#34763)
See also #34746

This PR add segment level field in response of
`GetPersistentSegmentInfo` and `GetQuerySegmentInfo`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-26 10:01:46 +08:00
chyezh
f6c6e98ec5
enhance: recover stack info when non-cgo thread crash (#34865)
issue: #34864

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-23 14:25:44 +08:00
Xiaofan
be7760a9ab
fix: CVE by upgrading some dependencies. (#34462)
fix #34434 and #34456
upgrade otelgrpc to fix CVE

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-07-16 11:55:36 +08:00
Patrick Weizhi Xu
df5ba3fae3
enhance: update milvus proto (#34490)
issue: #34336

Revert CollectionSchema changes

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-07-09 10:04:11 +08:00
Aldrin
686a212d8b
fix: Upgraded Azidentity Package to v1.6.0 (#34464)
issue : https://github.com/milvus-io/milvus/issues/34456

Signed-off-by: Ald392 <imagesai32@gmail.com>
2024-07-08 17:51:32 +08:00
chyezh
7611128e57
enhance: wal adaptor implementation (#34122)
issue: #33285

- add adaptor to implement walimpls into wal interface.
- implement timetick sorted and filtering scanner.
- add test for wal.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-04 15:23:08 +08:00
zhenshan.cao
d18c49013b
enhance: Refine compaction (#33982)
issue : https://github.com/milvus-io/milvus/issues/32939

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-06-25 10:08:03 +08:00
chyezh
b9237280c2
enhance: wal interface definition (#33745)
issue: #33285

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-24 10:34:12 +08:00
congqixia
b39dfc25dc
enhance: Use fastjson lib for unmarshal delete log (#33787)
```
goos: linux
goarch: amd64
GOMAXPROC=1
cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
BenchmarkJsonSerdeStd             343872              3568 ns/op            1335 B/op         25 allocs/op
BenchmarkJsonSerdeFastjson       5124177               234.9 ns/op            16 B/op          1 allocs/op
```

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-12 20:41:57 +08:00
wei liu
22a059d4de
enhance: update dependency for blobloom (#33565)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-03 21:59:46 +08:00
wei liu
c6a1c49e02
enhance: Use Blocked Bloom Filter instead of basic bloom fitler impl. (#33405)
issue: #32995
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with old version bf impl, but if fall back
to old milvus version, it may causes bloom filter deserialize failed.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

- Block BF construct time	{"time": "54.128131ms"}
- Block BF size	                {"size": 3021578}
- Block BF Test cost	        {"time": "55.407352ms"}
- Basic BF construct time	{"time": "210.262183ms"}
- Basic BF size	                {"size": 2396308}
- Basic BF Test cost	        {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

- Block BF TestLocation cost    {"time": "529.97183ms"}
- Basic BF TestLocation cost	{"time": "3.197430181s"}

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 17:49:45 +08:00
shaoting-huang
de7901121f
Upgrade go from 1.20 to 1.21 (#33047)
Signed-off-by: shaoting-huang [shaoting-huang@zilliz.com]

issue: https://github.com/milvus-io/milvus/issues/32982

# Background
Go 1.21 introduces several improvements and changes over Go 1.20, which
is quite stable now. According to
[Go 1.21 Release Notes](https://tip.golang.org/doc/go1.21), the big
difference of Go 1.21 is enabling Profile-Guided Optimization by
default, which can improve performance by around 2-14%. Here are the
summary steps of PGO:
1. Build Initial Binary (Without PGO)
2. Deploying the Production Environment
3. Run the program and collect Performance Analysis Data (CPU pprof)
4. Analyze the Collected Data and Select a Performance Profile for PGO
5. Place the Performance Analysis File in the Main Package Directory and
Name It default.pgo
6. go build Detects the default.pgo File and Enables PGO
7. Build and Release the Updated Binary (With PGO)
8. Iterate and Repeat the Above Steps
<img width="657" alt="Screenshot 2024-05-14 at 15 57 01"
src="https://github.com/milvus-io/milvus/assets/167743503/b08d4300-0be1-44dc-801f-ce681dabc581">

# What does this PR do
There are three experiments, search benchmark by Zilliz test platform,
search benchmark by open-source
[VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file),
and search benchmark with PGO. We do both search benchmarks by Zilliz
test platform and by VectorDBBench to reduce reliance on a single
experimental result. Besides, we validate the performance enhancement
with PGO.

## Search Benchmark Report by Zilliz Test Platform
An upgrade to Go 1.21 was conducted on a Milvus Standalone server,
equipped with 16 CPUs and 64GB of memory. The search performance was
evaluated using a 1 million entry local dataset with an L2 metric type
in a 768-dimensional space. The system was tested for concurrent
searches with 50 concurrent tasks for 1 hour, each with a 20-second
interval. The reason for using one server rather than two servers to
compare is to guarantee the same data source and same segment state
after compaction.

Test Sequence:
1. Go 1.20 Initial Run: Insert data, build index, load index, and
search.
2. Go 1.20 Rebuild: Rebuild the index with the same dataset, load index,
and search.
3. Go 1.21 Load: Upload to Go 1.21 within the server. Then load the
index from the second run, and search.
4. Go 1.21 Rebuild: Rebuild the index with the same dataset, load index,
and search.

Search Metrics: 
| Metric | Go 1.20 | Go 1.20 Rebuild Index | Go 1.21 | Go 1.21 Rebuild
Index |

|----------------------------|------------------|-----------------|------------------|-----------------|
| `search requests` | 10,942,683 | 16,131,726 | 16,200,887 | 16,331,052
|
| `search fails` | 0 | 0 | 0 | 0 |
| `search RT_avg` (ms) | 16.44 | 11.15 | 11.11 | 11.02 |
| `search RT_min` (ms) | 1.30 | 1.28 | 1.31 | 1.26 |
| `search RT_max` (ms) | 446.61 | 233.22 | 235.90 | 147.93 |
| `search TP50` (ms) | 11.74 | 10.46 | 10.43 | 10.35 |
| `search TP99` (ms) | 92.30 | 25.76 | 25.36 | 25.23 |
| `search RPS` | 3,039 | 4,481 | 4,500 | 4,536 |

### Key Findings
The benchmark tests reveal that the index build time with Go 1.20 at
340.39 ms and Go 1.21 at 337.60 ms demonstrated negligible performance
variance in index construction. However, Go 1.21 offers slightly better
performance in search operations compared to Go 1.20, with improvements
in handling concurrent tasks and reducing response times.

## Search Benchmark Report By VectorDb Bench
Follow
[VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file)
to create a VectorDb Bench test for Go 1.20 and Go 1.21. We test the
search performance with Go 1.20 and Go 1.21 (without PGO) on the Milvus
Standalone system. The tests were conducted using the Cohere dataset
with 1 million entries in a 768-dimensional space, utilizing the COSINE
metric type.

Search Metrics: 
Metric | Go 1.20 | Go 1.21 without PGO
-- | -- | --
Load Duration (seconds) | 1195.95 | 976.37
Queries Per Second (QPS) | 841.62 | 875.89
99th Percentile Serial Latency (seconds) | 0.0047 | 0.0076
Recall | 0.9487 | 0.9489

### Key Findings
Go 1.21 indicates faster index loading times and larger search QPS
handling.

## PGO Performance Test
Milvus has already added
[net/http/pprof](https://pkg.go.dev/net/http/pprof) in the metrics. So
we can curl the CPU profile directly by running
`curl -o default.pgo
"http://${MILVUS_SERVER_IP}:${MILVUS_SERVER_PORT}/debug/pprof/profile?seconds=${TIME_SECOND}"`
to collect the profile as the default.pgo during the first search. Then
I build Milvus with PGO and use the same index to run the search again.
The result is as below:

Search Metrics
| Metric | Go 1.21 Without PGO | Go 1.21 With PGO | Change (%) |

|---------------------------------------------|------------------|-----------------|------------|
| `search Requests` | 2,644,583 | 2,837,726 | +7.30% |
| `search Fails` | 0 | 0 | N/A |
| `search RT_avg` (ms) | 11.34 | 10.57 | -6.78% |
| `search RT_min` (ms) | 1.39 | 1.32 | -5.18% |
| `search RT_max` (ms) | 349.72 | 143.72 | -58.91% |
| `search TP50` (ms) | 10.57 | 9.93 | -6.05% |
| `search TP99` (ms) | 26.14 | 24.16 | -7.56% |
| `search RPS`                 | 4,407       | 4,729       | +7.30%    |

### Key Findings
PGO led to a notable enhancement in search performance, particularly in
reducing the maximum response time by 58% and increasing the search QPS
by 7.3%.

### Further Analysis
Generate a diff flame graphs between two CPU profiles by running `go
tool pprof -http=:8000 -diff_base nopgo.pgo pgo.pgo -normalize`

<img width="1894" alt="goprofiling"
src="https://github.com/milvus-io/milvus/assets/167743503/ab9e91eb-95c7-4963-acd9-d1c3c73ee010">
Further insight of HnswIndexNode and Milvus Search Handler
<img width="1906" alt="hnsw"
src="https://github.com/milvus-io/milvus/assets/167743503/a04cf4a0-7c97-4451-b3cf-98afc20a0b05">
<img width="1873" alt="search_handler"
src="https://github.com/milvus-io/milvus/assets/167743503/5f4d3982-18dd-4115-8e76-460f7f534c7f">

After applying PGO to the Milvus server, the CPU utilization of the
faiss::fvec_L2 function has decreased. This optimization significantly
enhances the performance of the
[HnswIndexNode::Search::searchKnn](e0c9c41aa2/src/index/hnsw/hnsw.cc (L203))
method, which is frequently invoked by Knowhere during high-concurrency
searches. As the explanation from Go release notes, the function might
be more aggressively inlined by Go compiler during the second build with
the CPU profiling collected from the first run. As a result, the search
handler efficiency within Milvus DataNode has improved, allowing the
server to process a higher number of search queries per second (QPS).



# Conclusion
The combination of Go 1.21 and PGO has led to substantial enhancements
in search performance for Milvus server, particularly in terms of search
QPS and response times, making it more efficient for handling
high-concurrency search operations.

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-05-22 13:21:39 +08:00
smellthemoon
1671c7898a
enhance: sync milvus proto version (#33094)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-16 18:51:34 +08:00
congqixia
8cf2cf5c94
enhance: Add go-deadlock as unittest only dependency (#33063)
See also #33062

This PR:

- Add `lock.RWMutex` & `lock.Mutex` alias to switch implementation based
  on build flags
- When build flags has `test` in it, use `go-deadlock` to detect
  possible deadlocks
- Replace all `sync.RWMutex` & `sync.Mutex` in datacoord pkg

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-15 16:33:34 +08:00
congqixia
b2d83d3354
enhance: Bump milvus version to v2.4.2 (#33048)
Bumping version to v2.4.2. Also bump milvus-proto version to v2.4.3.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-15 13:57:35 +08:00
Xiaofan
36f1ea93a5
enhance: optimize plan parser pool to avoid unnessary recycle (#32869)
fix #32868
plan parser takes too much cpu on high qps,this pr try to avoid create
lexer and parser too freequent

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-11 10:51:31 +08:00
wei liu
c35797c399
enhance: expose DescribeDatabase api in proxy (#32732)
issue: #32707

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-09 22:51:30 +08:00
wayblink
42d0412e93
enhance: Add channelCPs in FlushResponse (#32044)
#32609

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-04-30 09:45:27 +08:00
congqixia
f07e78ec91
enhance: Bump Milvus & proto version to v2.4.1 (#32693)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-29 14:37:26 +08:00
wei liu
07720f1a95
enhance: expose alter database api in proxy (#32639)
issue: #30040

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-28 18:15:31 +08:00
SimFG
bed6363feb
enhance: update the go-api version for the list api (#32605)
issue: https://github.com/milvus-io/milvus/issues/32550
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-26 17:49:33 +08:00
jaime
2c63f848bf
fix: upgrade nats server to fix security vulnerabilities (#32021)
issue: #32022

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-04-11 16:07:18 +08:00
SimFG
789e014c74
enhance: add the db id for the describe collection response (#32114)
/kind improvement
issue: #32110

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-10 21:23:20 +08:00
congqixia
9d16aa0bd3
enhance: Bump jose2go for security alerts (#32040)
See also #31986

- jose2go vulnerable to denial of service via large p2c value

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-10 10:05:19 +08:00
congqixia
a860bf66f9
enhance: Bump google.golang.org/grpc to 1.57.1 (#31985)
See also #31986
See dependency alert for "gRPC-Go HTTP/2 Rapid Reset vulnerability"

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-09 10:53:17 +08:00
cqy123456
976928ecd1
fix: fix fp16/bf16 some code missing and add more fp16/bf16 test (#31612)
issue: #31534

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-03-28 14:11:10 +08:00
SimFG
b1a1cca10b
feat: add more operation detail info for better allocation (#30438)
issue: #30436

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00