wayblink
f2bd910df5
[skip e2e]Fix log mistake: WatchDmChannels -> WatchDeltaChannels ( #17643 )
...
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2022-06-20 15:10:12 +08:00
congqixia
50cecc65ed
Fix mock querynode server session not revoked ( #17229 )
...
Revoke mock querynode server session when it's stopped
This PR reduces the running time of TestLoadBalanceIndexedSegmentsAfterNodeDown from over 60 seconds to about 1 second
Also related to #17212 #17215
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-06-18 18:22:12 +08:00
wayblink
074ec3060a
Support returning dropped segments' info in GetSegmentInfo RPC ( #17617 )
...
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2022-06-17 18:24:12 +08:00
congqixia
785a5a757f
Use segment version instead of ref cnt ( #17609 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-06-17 17:38:12 +08:00
congqixia
cc3ecc4bd5
Make querycoord channel allocator respect context ( #17552 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-06-17 16:02:12 +08:00
yah01
f5fa93aa0b
Return err if failed to assign segments/channels to nodes ( #17616 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-17 12:42:11 +08:00
yah01
0f87763682
Fix LoadBalance may not update the shard leader ( #17608 )
...
This probably happens when replicas are updated concurrently:
some goroutines modify the replicas' node lists,
while others modify the replicas' shard leaders
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-17 08:58:10 +08:00
yah01
3f42f5f345
Set the task state to TaskFailed if error occurs ( #17598 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-16 19:18:11 +08:00
Xiaofan
1f6fbf91b2
Fix pulsar unsubscribe issue ( #17562 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-06-16 17:28:11 +08:00
yah01
7d5c8c5f38
Fix bug where offline node is not removed ( #17560 )
...
The LoadBalance task won't remove the offline node if the node never loaded/watched any segment/dmchannel
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-16 16:00:10 +08:00
wayblink
eb5b0b7fc8
Move SegmentInfo out of VchannelInfo, keeping only the ID to reduce KV size. Get complete SegmentInfo through RPC ( #17441 )
...
Resolves: #17233 #16047
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2022-06-16 12:00:10 +08:00
cai.zhang
ea5041aec2
Acquire the segment reference lock at task level ( #17544 )
...
Signed-off-by: Cai.Zhang <cai.zhang@zilliz.com>
2022-06-15 21:38:10 +08:00
congqixia
f9553970f9
Add BindContext function for querycoord task scheduler ( #17531 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-06-14 17:12:09 +08:00
Enwei Jiao
a5b008acec
Ignore getReplica's error when handling rebalanceTask ( #17469 )
...
Signed-off-by: Enwei Jiao <jiaoew2011@gmail.com>
2022-06-10 14:50:08 +08:00
Xiaofan
66f26943f8
Add error code when task rolls back ( #17472 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-06-09 21:08:07 +08:00
congqixia
97a871cc82
Make querycoord segment allocator respect context ( #17452 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-06-09 20:14:06 +08:00
Letian Jiang
dfaed5acdd
Add QueryServiceAvailable field in ShowCollections ( #17456 )
...
Signed-off-by: Letian Jiang <letian.jiang@zilliz.com>
2022-06-09 18:20:07 +08:00
yah01
a2d2ad88bd
Make assigning segments faster ( #17377 )
...
This improves Load performance
and lets LoadBalance fail fast, which allows us to retry it promptly
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-06 16:52:05 +08:00
xige-16
8c69790383
Fix lost delete msg caused by loadSegment after watchDeltaChannel ( #17308 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-06-02 16:56:04 +08:00
yah01
cc69c5cdd3
Make the Cluster interface's externally called methods public ( #17315 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-02 13:16:05 +08:00
yah01
f5bd519e49
Add retry mechanism for NodeDown LoadBalance ( #17306 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-06-01 20:00:03 +08:00
cai.zhang
bcf3b7426a
Add distributed lock for segment reference ( #16782 )
...
Signed-off-by: Cai.Zhang <cai.zhang@zilliz.com>
2022-05-31 16:36:03 +08:00
congqixia
c88514bc49
Remove unused QueryChannel in Proxy and Query Cluster ( #16856 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-30 19:50:04 +08:00
yah01
b09359b12f
Remove useless collection ID in error message ( #17269 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-30 14:40:01 +08:00
xige-16
3a63d6c98e
Fix load timeout in chaos ( #17241 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-05-26 22:28:03 +08:00
Letian Jiang
f2a27e0e64
Retry GetShardLeaders until service available or timeout ( #17183 )
...
Signed-off-by: Letian Jiang <letian.jiang@zilliz.com>
2022-05-26 20:28:02 +08:00
yah01
5872c5afb6
Fix updating shard leaders may lose some modifications ( #17218 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-26 17:14:02 +08:00
congqixia
7409bfc56d
Make allocateNode run async to avoid blocking offline events ( #17185 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-25 18:53:59 +08:00
yah01
de0ba6d495
Fix GetQuerySegmentInfo() returns incorrect result after LoadBalance ( #17190 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-25 15:17:59 +08:00
congqixia
37d7d7baf8
Fix node state data race in querycoord ( #17198 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-25 10:43:59 +08:00
Jiquan Long
75ca64f8c7
Refine task type logs to string ( #17196 )
...
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2022-05-25 03:08:00 +08:00
Bingyi Sun
ffaead6ad9
Add load meta in hand off task. ( #17179 )
...
issue: #16842
Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
2022-05-24 18:24:00 +08:00
Bingyi Sun
86728490a2
Fix partition not found ( #17132 )
...
issue: #16842
Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
2022-05-20 19:37:57 +08:00
yah01
7746a5b742
Add NodeIds field for QuerySegmentInfo ( #17121 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-20 18:03:58 +08:00
congqixia
599763d9bf
Fix replicas info is not removed after release ( #17111 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-19 20:05:57 +08:00
yah01
dcfe472586
Fix LoadBalance doesn't save the modification to replicas' shards ( #17064 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-19 16:51:57 +08:00
bigsheeper
9eeec4a2d5
Add collection load cache and InvalidateCollMetaCache by collID ( #16882 )
...
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-05-19 10:13:56 +08:00
yah01
33c855dcd2
Fix LoadBalance doesn't remove the source nodes from segment ( #17051 )
...
If the triggerCondition isn't NodeDown, the removal won't happen.
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-18 11:55:56 +08:00
yah01
960d35e517
Add MockCluster, make unit tests reliable ( #17032 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-17 12:57:56 +08:00
Bingyi Sun
59bc0a7000
Add some log. ( #16978 )
...
Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
2022-05-16 17:15:55 +08:00
yah01
e38c6f6c44
Fix loading the same segments multiple times for manual LoadBalance ( #16921 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-16 15:41:56 +08:00
yah01
a382133a8a
Add new node to the replica that has the most offline nodes ( #16907 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-13 18:31:54 +08:00
congqixia
ae717bf991
Fix channelUnsubscribe data race and logic ( #16946 )
...
- Add a RWMutex for container/list, which is not goroutine-safe
- Fix the list element never being removed
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-12 18:09:53 +08:00
congqixia
5c98329f7c
Return error when no replica available ( #16886 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2022-05-10 19:47:53 +08:00
Xiaofan
000c5ff3de
Fix msgstream unsubscription ( #16883 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-05-10 19:43:52 +08:00
yah01
2d0f908dba
Fix segments' NodeIds not being updated correctly after LoadBalance ( #16854 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-10 15:47:53 +08:00
Xiaofan
62658dcda6
Fix pulsar unsubscribe failure because of consumer not found ( #16839 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-05-09 12:07:52 +08:00
Xiaofan
92b6293be4
Fix QueryNode log level ( #16604 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-05-07 10:27:51 +08:00
Ten Thousand Leaves
1acd256481
Add DataQueryable and DataIndexed states for bulk load tasks ( #16725 )
...
issue: #16607
/kind enhancement
Signed-off-by: Yuchen Gao <yuchen.gao@zilliz.com>
2022-05-05 21:17:50 +08:00
yah01
c82e2453eb
Modify the replicas' shard info after load balance ( #16785 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-05-05 21:15:50 +08:00