Commit Graph

64 Commits

Author SHA1 Message Date
Gao
5fc1370f6f
enhance: [2.4] autoindex for multi data type (#33867)
issue: #22837 
pr: https://github.com/milvus-io/milvus/pull/33868

- opensource autoindex support
- metric type check for different data types
- autoindex data type for search param

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-14 23:26:00 +08:00
cqy123456
562369627d
enhance: [cherry-pick]check index with data type in knowhere api (#33878)
issue: https://github.com/milvus-io/milvus/issues/22837
related: https://github.com/milvus-io/milvus/pull/33880

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-14 19:45:58 +08:00
Buqian Zheng
39e341e83a
fix: [2.4] update check for sparse hnsw index (#33714)
issue: #29419
pr: #33713

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-06-07 18:36:07 +08:00
foxspy
58a7111599
enhance: [cherry-pick] add autoindex mapping for binary/sparse datatype (#33625)
issue: #22837 
pr: #33624

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-06-06 10:33:52 +08:00
Cai Yudong
00438f408f
enhance: Unify data type check APIs for go (#31887)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-07 14:27:22 +08:00
cai.zhang
1f43be4a3c
enhance: Support auto index for scalar index (#31255)
issue: #29309 
reopen pr : #29310

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-21 10:37:08 +08:00
Buqian Zheng
3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
cai.zhang
40ca98f57f
enhance: Skip timestamp allocation when search/query consistency level is eventually (#29773)
issue: #29772 
1. Skip timestamp allocation when search/query consistency level is
eventually.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-21 09:52:59 +08:00
yah01
1c8ce33eea
fix: report error if the altering index doesn't support mmap (#29832)
this also checks the param value
fix #29909

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2024-01-17 16:40:54 +08:00
cai.zhang
8c89ad694e
fix: Fix error message for indexing (#29898)
issue: #29897

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-12 19:44:51 +08:00
Xu Tong
e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
congqixia
4f8c540c77
enhance: cache collection schema attributes to reduce proxy cpu (#29668)
See also #29113

The collection schema is crucial when performing search/query but some
of the information is calculated for every request.

This PR change schema field of cached collection info into a utility
`schemaInfo` type to store some stable result, say pk field,
partitionKeyEnabled, etc. And provided field name to id map for
search/query services.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-04 17:28:46 +08:00
Jiquan Long
3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add inverted index for some data types in Milvus. This index type can
save a lot of memory compared to loading all data into RAM and speed up
the term query and range query.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text
search now. We will treat every row as a whole keyword instead of
tokenizing it into multiple terms.
- The inverted index don't support retrieval well, so if you create
inverted index for field, those operations which depend on the raw data
will fallback to use chunk storage, which will bring some performance
loss. For example, comparisons between two columns and retrieval of
output fields.

The inverted index is very easy to be used.

Taking below collection as an example:

```python
fields = [
		FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
		FieldSchema(name="int8", dtype=DataType.INT8),
		FieldSchema(name="int16", dtype=DataType.INT16),
		FieldSchema(name="int32", dtype=DataType.INT32),
		FieldSchema(name="int64", dtype=DataType.INT64),
		FieldSchema(name="float", dtype=DataType.FLOAT),
		FieldSchema(name="double", dtype=DataType.DOUBLE),
		FieldSchema(name="bool", dtype=DataType.BOOL),
		FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
		FieldSchema(name="random", dtype=DataType.DOUBLE),
		FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then we can simply create inverted index for field via:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Then, term query and range query on the field can be speed up
automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
yah01
a0e1a1eb31
feat: support enable/disable mmap for index (#29005)
support enable/disable mmap for index, the user could alter the index's
mode by `AlterIndex` method
related: https://github.com/milvus-io/milvus/issues/21866

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-21 18:07:24 +08:00
cai.zhang
1b4d2674b3
fix: Set the default index name to the name of the existing index (#29275)
issue: #29269

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-20 17:22:41 +08:00
cai.zhang
2f7252b44e
enhance: Set default index name as field name (#29218)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-15 12:06:39 +08:00
SimFG
9b0ecbdca7
Support to replicate the mq message (#27240)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-10-20 14:26:09 +08:00
xige-16
6cbb67832f
Compatible with scalar index types marisa-trie and Ascending (#27638)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-10-15 13:52:06 +08:00
yah01
be980fbc38
Refine state check (#27541)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-11 21:01:35 +08:00
yah01
41495ed266
Improve the error message for getting all indexes of collection (#27389)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-08 21:23:32 +08:00
yah01
8394b3a1ec
Block creating new error from status reason (#27426)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-07 11:29:32 +08:00
cai.zhang
dedb90f85f
Fix error message for creating scalar index (#27382)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-09-27 10:33:27 +08:00
jaime
7f7c71ea7d
Decoupling client and server API in types interface (#27186)
Co-authored-by:: aoiasd <zhicheng.yue@zilliz.com>

Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-09-26 09:57:25 +08:00
SimFG
26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
yah01
168e82ee10
Fix panic while handling with the nil status (#27040)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-15 10:09:21 +08:00
yah01
00c65fa0d7
Refine QueryNode errors (#27013)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-12 16:07:18 +08:00
Xu Tong
9166011c4a
Add float16 vector (#25852)
Signed-off-by: Writer-X <1256866856@qq.com>
2023-09-08 10:03:16 +08:00
yah01
3349db4aa7
Refine errors to remove changes breaking design (#26521)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-04 09:57:09 +08:00
cai.zhang
760a2d9aa7
Support AllocTimestamp api for Milvus (#25784)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-07-25 10:05:00 +08:00
Enwei Jiao
66fdc71479
Refactor logs in DataCoord & DataNode (#25574)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-07-14 15:56:31 +08:00
smellthemoon
948d04cdd8
Check field type when create index (#25248)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-07-03 15:26:26 +08:00
jaime
18df2ba6fd
[Cherry-Pick] Support Database (#24769)
Support Database(#23742)
Fix db nonexists error for FlushAll (#24222)
Fix check collection limits fails (#24235)
backward compatibility with empty DB name (#24317)
Fix GetFlushAllState with DB (#24347)
Remove db from global meta cache after drop database (#24474)
Fix db name is empty for describe collection response (#24603)
Add RBAC for Database API (#24653)
Fix miss load the same name collection during recover stage (#24941)

RBAC supports Database validation (#23609)
Fix to list grant with db return empty (#23922)
Optimize PrivilegeAll permission check (#23972)
Add the default db value for the rbac request (#24307)

Signed-off-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-06-25 17:20:43 +08:00
cai.zhang
c81bdda55a
Hide autoindex params details (#24851) (#24935)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2023-06-16 14:30:40 +08:00
congqixia
41af0a98fa
Use go-api/v2 for milvus-proto (#24770)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-09 01:28:37 +08:00
cai.zhang
93ea9c4925
Support return pending index rows when describe index (#24588)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-06-01 18:14:32 +08:00
Jiquan Long
29ae1229b6
Support AutoIndex (#24387)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-05-29 20:35:28 +08:00
cai.zhang
6209d5d717
Remove unused code for creating index (#24442)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-05-26 16:49:26 +08:00
cai.zhang
008285f849
Support dynamic schema for create collection (#24176)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-05-18 09:33:24 +08:00
congqixia
73a181d226
Fix get vector it timeout and improve some string const usage (#24141)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-05-16 17:41:22 +08:00
wayblink
e13d900398
Add some log in getIndexStatisticsTask (#23897)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-05-08 10:02:39 +08:00
Jiquan Long
7be7e6f360
Refactor check logic of index parameters (#23856)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-05-06 10:40:39 +08:00
wayblink
899702f13c
Implement GetIndexStatistics interface (#23603)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-05-06 10:34:39 +08:00
Gao
c1e8406c44
Override index type as AUTOINDEX when autoindex is enabled (#23744)
Signed-off-by: chasingegg <gaoc96@qq.com>
2023-04-30 11:50:40 +08:00
cai.zhang
bd65ef8e95
Fix bug for type params when indexing (#23242) (#23250)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-04-09 16:18:29 +08:00
jaime
c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
cai.zhang
977943463e
Check index params and type params (#22988)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-03-26 22:15:59 +08:00
Enwei Jiao
697dedac7e
Use cockroachdb/errors to replace other error pkg (#22390)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-02-26 11:31:49 +08:00
cai.zhang
e127cf7b99
Reset indexpb for upgrade (#21620)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-01-11 14:35:40 +08:00
cai.zhang
e5f408dceb
Merge IndexCoord and DataCoord (#21267)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-01-04 19:37:36 +08:00
Enwei Jiao
89b810a4db
Refactor all params into ParamItem (#20987)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>

Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-12-07 18:01:19 +08:00