Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Drop Collection
Milvus 2.0 uses `Collection` to represent a set of data, like a `Table` in a traditional database. Users can create or drop a `Collection`. Altering the `Schema` of a `Collection` is not supported yet. This article introduces the execution path of `Drop Collection`. By the end of this article, you should know which components are involved in `Drop Collection`.
The execution flow of `Drop Collection` is shown in the following figure:
- First, the `SDK` sends a `DropCollection` request to `Proxy` via gRPC. The `proto` is defined as follows:
service MilvusService {
    ...
    rpc DropCollection(DropCollectionRequest) returns (common.Status) {}
    ...
}

message DropCollectionRequest {
    common.MsgBase base = 1; // must
    string db_name = 2;
    string collection_name = 3; // must
}
- Once the `DropCollection` request is received, the `Proxy` wraps this request into a `DropCollectionTask` and pushes the task into the `DdTaskQueue` queue. After that, `Proxy` calls the `WaitToFinish` method to wait until the task is finished.
type task interface {
    TraceCtx() context.Context
    ID() UniqueID       // return ReqID
    SetID(uid UniqueID) // set ReqID
    Name() string
    Type() commonpb.MsgType
    BeginTs() Timestamp
    EndTs() Timestamp
    SetTs(ts Timestamp)
    OnEnqueue() error
    PreExecute(ctx context.Context) error
    Execute(ctx context.Context) error
    PostExecute(ctx context.Context) error
    WaitToFinish() error
    Notify(err error)
}

type DropCollectionTask struct {
    Condition
    *milvuspb.DropCollectionRequest
    ctx       context.Context
    rootCoord types.RootCoord
    result    *commonpb.Status
    chMgr     channelsMgr
    chTicker  channelsTimeTicker
}
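The enqueue-then-wait pattern described above can be sketched as follows. This is a minimal illustration, not the Proxy implementation: `condition`, `dropTask`, `runWorker`, and `submit` are hypothetical names, and a plain Go channel stands in for `DdTaskQueue`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// condition is a hypothetical stand-in for the Condition embedded in
// DropCollectionTask: WaitToFinish blocks until Notify delivers a result.
type condition struct {
	ctx  context.Context
	done chan error
}

func newCondition(ctx context.Context) *condition {
	// Buffered so a single Notify never blocks, even if the waiter gave up.
	return &condition{ctx: ctx, done: make(chan error, 1)}
}

// WaitToFinish returns the task result, or the context error if the
// context is cancelled first.
func (c *condition) WaitToFinish() error {
	select {
	case <-c.ctx.Done():
		return c.ctx.Err()
	case err := <-c.done:
		return err
	}
}

// Notify reports the task result to whoever is waiting.
func (c *condition) Notify(err error) {
	c.done <- err
}

// dropTask is a simplified task that carries only a collection name.
type dropTask struct {
	*condition
	collectionName string
}

// runWorker drains the queue and notifies each task once it is processed.
func runWorker(queue <-chan *dropTask) {
	for t := range queue {
		if t.collectionName == "" {
			t.Notify(errors.New("collection name is empty"))
			continue
		}
		t.Notify(nil)
	}
}

// submit enqueues a task and blocks until the worker has processed it.
func submit(queue chan<- *dropTask, name string) error {
	t := &dropTask{condition: newCondition(context.Background()), collectionName: name}
	queue <- t
	return t.WaitToFinish()
}

func main() {
	queue := make(chan *dropTask, 8)
	go runWorker(queue)
	fmt.Println(submit(queue, "demo"))
	fmt.Println(submit(queue, ""))
}
```

The caller only sees a blocking call, while the actual work happens on the worker goroutine that owns the queue.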
- There is a background service in `Proxy`. This service gets the `DropCollectionTask` from `DdTaskQueue` and executes it in three phases:
  - `PreExecute`: performs static checks, such as checking whether the `Collection Name` is legal.
  - `Execute`: `Proxy` sends a `DropCollection` request to `RootCoord` via gRPC and waits for the response. The `proto` is defined as below:

service RootCoord {
    ...
    rpc DropCollection(milvus.DropCollectionRequest) returns (common.Status) {}
    ...
}

  - `PostExecute`: `Proxy` deletes the `Collection`'s meta from the global meta table.
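The three phases can be sketched as a small driver that stops at the first failing phase. This is an illustrative sketch under assumptions: `phasedTask`, `runPhases`, and `fakeDropTask` are hypothetical, the name check is a placeholder rather than the real validation rule, and two in-memory maps stand in for `RootCoord` and the Proxy's global meta table.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// phasedTask is a reduced, hypothetical version of the task interface.
type phasedTask interface {
	PreExecute() error
	Execute() error
	PostExecute() error
}

// runPhases executes the three phases in order and stops at the first error.
func runPhases(t phasedTask) error {
	if err := t.PreExecute(); err != nil {
		return fmt.Errorf("PreExecute: %w", err)
	}
	if err := t.Execute(); err != nil {
		return fmt.Errorf("Execute: %w", err)
	}
	if err := t.PostExecute(); err != nil {
		return fmt.Errorf("PostExecute: %w", err)
	}
	return nil
}

// fakeDropTask mimics the Proxy-side task with in-memory state.
type fakeDropTask struct {
	name      string
	remote    map[string]bool // stands in for RootCoord
	metaCache map[string]bool // stands in for the Proxy's global meta table
}

// PreExecute performs a static check only (placeholder rule, an assumption).
func (t *fakeDropTask) PreExecute() error {
	if t.name == "" || strings.ContainsAny(t.name, " /") {
		return errors.New("illegal collection name")
	}
	return nil
}

// Execute simulates the DropCollection call to RootCoord.
func (t *fakeDropTask) Execute() error {
	if !t.remote[t.name] {
		return errors.New("collection not found")
	}
	delete(t.remote, t.name)
	return nil
}

// PostExecute removes the collection's meta from the local table.
func (t *fakeDropTask) PostExecute() error {
	delete(t.metaCache, t.name)
	return nil
}

func main() {
	task := &fakeDropTask{
		name:      "demo",
		remote:    map[string]bool{"demo": true},
		metaCache: map[string]bool{"demo": true},
	}
	fmt.Println(runPhases(task), len(task.remote), len(task.metaCache))
}
```

Because `PostExecute` runs only after `Execute` succeeds, the local meta is kept whenever the remote drop fails.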
- `RootCoord` wraps the `DropCollection` request into a `DropCollectionReqTask` and then calls the function `executeTask`. `executeTask` does not return until the `context` is done or `DropCollectionReqTask.Execute` returns.
type reqTask interface {
    Ctx() context.Context
    Type() commonpb.MsgType
    Execute(ctx context.Context) error
    Core() *Core
}

type DropCollectionReqTask struct {
    baseReqTask
    Req *milvuspb.DropCollectionRequest
}
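The "return when the context is done or `Execute` returns" behavior can be sketched with a `select` over two channels. This is a sketch, not the RootCoord source: the `reqTask` interface is reduced to two methods, and `sleepTask`, `runWithTimeout`, and `timedOut` are hypothetical helpers added for the demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// reqTask is a reduced version of the interface shown above.
type reqTask interface {
	Ctx() context.Context
	Execute(ctx context.Context) error
}

// executeTask returns as soon as the task's context is done or
// Execute returns, whichever happens first.
func executeTask(t reqTask) error {
	// Buffered so the goroutine can finish even if nobody reads the result.
	errChan := make(chan error, 1)
	go func() {
		errChan <- t.Execute(t.Ctx())
	}()
	select {
	case <-t.Ctx().Done():
		return t.Ctx().Err()
	case err := <-errChan:
		return err
	}
}

// sleepTask is a hypothetical task that simply works for a fixed duration.
type sleepTask struct {
	ctx  context.Context
	work time.Duration
}

func (s *sleepTask) Ctx() context.Context { return s.ctx }

func (s *sleepTask) Execute(ctx context.Context) error {
	select {
	case <-time.After(s.work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout runs a sleepTask under a deadline; both values are milliseconds.
func runWithTimeout(workMs, timeoutMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return executeTask(&sleepTask{ctx: ctx, work: time.Duration(workMs) * time.Millisecond})
}

// timedOut reports whether err was caused by the deadline.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(runWithTimeout(1, 1000))            // task finishes well before the deadline
	fmt.Println(timedOut(runWithTimeout(1000, 10))) // deadline fires first
}
```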
- First, `RootCoord` deletes the `Collection`'s meta from `metaTable`, including `schema`, `partition`, `segment`, and `index`. All of these delete operations are committed in one transaction.
- After the `Collection`'s meta has been deleted from `metaTable`, `Milvus` considers the collection successfully deleted.
- `RootCoord` allocates a timestamp from `TSO` before deleting the `Collection`'s meta from `metaTable`. This timestamp is considered the point at which the collection was deleted.
- `RootCoord` sends a `DropCollectionRequest` message into `MsgStream`, so other components that have subscribed to the `MsgStream` are notified. The `proto` of `DropCollectionRequest` is defined as below:
message DropCollectionRequest {
    common.MsgBase base = 1;
    string db_name = 2;
    string collectionName = 3;
    int64 dbID = 4;
    int64 collectionID = 5;
}
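The ordering of the RootCoord steps above (allocate a timestamp, delete all meta atomically, then publish to subscribers) can be sketched in memory. Everything here is an assumption made for illustration: the `collection/<id>/...` key layout, the counter-based timestamp, and the types `rootCoord`, `msgStream`, and `dropMsg` are hypothetical and do not reflect the real `metaTable`, `TSO`, or `MsgStream` APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// dropMsg is a reduced stand-in for the DropCollectionRequest message.
type dropMsg struct {
	collectionID int64
	ts           uint64
}

// msgStream fans a message out to every subscriber.
type msgStream struct {
	subscribers []chan dropMsg
}

func (s *msgStream) subscribe() <-chan dropMsg {
	ch := make(chan dropMsg, 16)
	s.subscribers = append(s.subscribers, ch)
	return ch
}

func (s *msgStream) broadcast(m dropMsg) {
	for _, ch := range s.subscribers {
		ch <- m
	}
}

// rootCoord holds a counter-based timestamp source and a flat meta store.
type rootCoord struct {
	lastTs uint64
	meta   map[string]string
	stream *msgStream
}

func (rc *rootCoord) allocTimestamp() uint64 {
	rc.lastTs++
	return rc.lastTs
}

// dropCollection allocates a timestamp first, then removes every key of the
// collection in one step, and only then notifies the subscribers.
func (rc *rootCoord) dropCollection(id int64) (uint64, error) {
	ts := rc.allocTimestamp()
	prefix := fmt.Sprintf("collection/%d/", id)

	// Stage the new state; nothing is visible until the swap below.
	staged := make(map[string]string, len(rc.meta))
	removed := 0
	for k, v := range rc.meta {
		if strings.HasPrefix(k, prefix) {
			removed++
			continue
		}
		staged[k] = v
	}
	if removed == 0 {
		return 0, fmt.Errorf("collection %d not found", id)
	}
	rc.meta = staged // commit: all deletions become visible together

	rc.stream.broadcast(dropMsg{collectionID: id, ts: ts})
	return ts, nil
}

func main() {
	stream := &msgStream{}
	sub := stream.subscribe()
	rc := &rootCoord{
		meta: map[string]string{
			"collection/1/schema":      "...",
			"collection/1/partition/0": "...",
			"collection/2/schema":      "...",
		},
		stream: stream,
	}
	ts, err := rc.dropCollection(1)
	fmt.Println(ts, err, len(rc.meta))
	fmt.Println(<-sub)
}
```

Publishing only after the commit means a subscriber never sees a drop message for a collection whose meta is still present.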
- After these operations, `RootCoord` updates its internal timestamp.
- Then `RootCoord` sends a `ReleaseCollection` request to `QueryCoord` via gRPC, notifying `QueryCoord` to release all resources related to this `Collection`. This gRPC request is made in another `goroutine`, so it does not block the main thread. The `proto` is defined as follows:
service QueryCoord {
    ...
    rpc ReleaseCollection(ReleaseCollectionRequest) returns (common.Status) {}
    ...
}

message ReleaseCollectionRequest {
    common.MsgBase base = 1;
    int64 dbID = 2;
    int64 collectionID = 3;
    int64 nodeID = 4;
}
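Issuing the release call from a separate goroutine, so the drop path does not wait for it, might look like the sketch below. `releaseAsync` is a hypothetical helper, and the `release` callback stands in for the gRPC call to `QueryCoord`.

```go
package main

import "fmt"

// releaseAsync starts release in its own goroutine and returns immediately.
// The returned channel delivers the result once the call finishes.
func releaseAsync(release func(collectionID int64) error, collectionID int64) <-chan error {
	done := make(chan error, 1)
	go func() {
		done <- release(collectionID)
	}()
	return done
}

func main() {
	gate := make(chan struct{})
	slowRelease := func(collectionID int64) error {
		<-gate // simulates a slow remote call
		return nil
	}

	done := releaseAsync(slowRelease, 1)
	fmt.Println("drop path continues without waiting")

	close(gate) // let the simulated remote call finish
	fmt.Println("release result:", <-done)
}
```

The buffered result channel lets the goroutine exit even if the caller never reads the outcome.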
- Finally, `RootCoord` sends an `InvalidateCollectionMetaCache` request to each `Proxy`, notifying the `Proxy` to remove the `Collection`'s meta. The `proto` is defined as follows:
service Proxy {
    ...
    rpc InvalidateCollectionMetaCache(InvalidateCollMetaCacheRequest) returns (common.Status) {}
    ...
}

message InvalidateCollMetaCacheRequest {
    common.MsgBase base = 1;
    string db_name = 2;
    string collection_name = 3;
}
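Cache invalidation across every `Proxy` can be sketched as a loop over proxies that each drop one cache entry. The `proxy` type, its map-based cache, and `invalidateOnAllProxies` are hypothetical; a direct method call stands in for the gRPC request.

```go
package main

import (
	"fmt"
	"sync"
)

// proxy keeps a local cache of collection meta, keyed by collection name.
type proxy struct {
	mu    sync.Mutex
	cache map[string]int64 // collection name -> collection ID
}

func newProxy(entries map[string]int64) *proxy {
	cache := make(map[string]int64, len(entries))
	for name, id := range entries {
		cache[name] = id
	}
	return &proxy{cache: cache}
}

// InvalidateCollectionMetaCache removes one collection from the local cache.
// Removing a missing entry is a no-op, so the call is safe to repeat.
func (p *proxy) InvalidateCollectionMetaCache(collectionName string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.cache, collectionName)
}

// cached reports whether the proxy still holds meta for the collection.
func (p *proxy) cached(collectionName string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	_, ok := p.cache[collectionName]
	return ok
}

// invalidateOnAllProxies notifies every proxy in turn.
func invalidateOnAllProxies(proxies []*proxy, collectionName string) {
	for _, p := range proxies {
		p.InvalidateCollectionMetaCache(collectionName)
	}
}

func main() {
	proxies := []*proxy{
		newProxy(map[string]int64{"demo": 1, "other": 2}),
		newProxy(map[string]int64{"demo": 1}),
	}
	invalidateOnAllProxies(proxies, "demo")
	fmt.Println(proxies[0].cached("demo"), proxies[0].cached("other"), proxies[1].cached("demo"))
}
```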
- The execution flow of `QueryCoord.ReleaseCollection` is shown in the following figure:
- `QueryCoord` wraps `ReleaseCollection` into a `ReleaseCollectionTask` and pushes the task into `TaskScheduler`.
- There is a background service in `QueryCoord`. This service gets the `ReleaseCollectionTask` from `TaskScheduler` and executes it in three phases:
  - `PreExecute`: `ReleaseCollectionTask` only prints a debug log at this phase.
  - `Execute`: there are two jobs at this phase:
    - Send a `ReleaseDQLMessageStream` request to `RootCoord` via gRPC. `RootCoord` redirects the `ReleaseDQLMessageStream` request to each `Proxy`, notifying the `Proxy` to stop processing any message of this `Collection`. The `proto` is defined as follows:

message ReleaseDQLMessageStreamRequest {
    common.MsgBase base = 1;
    int64 dbID = 2;
    int64 collectionID = 3;
}

    - Send a `ReleaseCollection` request to each `QueryNode` via gRPC, notifying the `QueryNode` to release all the resources related to this `Collection`, including `Index`, `Segment`, `FlowGraph`, etc. `QueryNode` no longer reads any message from this `Collection`'s `MsgStream`.

service QueryNode {
    ...
    rpc ReleaseCollection(ReleaseCollectionRequest) returns (common.Status) {}
    ...
}

message ReleaseCollectionRequest {
    common.MsgBase base = 1;
    int64 dbID = 2;
    int64 collectionID = 3;
    int64 nodeID = 4;
}

  - `PostExecute`: `ReleaseCollectionTask` only prints a debug log at this phase.
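The fan-out of `ReleaseCollection` to each `QueryNode` can be sketched with a `sync.WaitGroup`. Note the assumptions: the article does not say whether `QueryCoord` contacts the nodes in parallel or one by one, so the concurrency here is an illustrative choice, and `queryNode` and `releaseOnAllNodes` are hypothetical with a method call standing in for gRPC.

```go
package main

import (
	"fmt"
	"sync"
)

// queryNode holds the resources it has loaded, grouped by collection ID.
type queryNode struct {
	id        int64
	resources map[int64][]string
}

// ReleaseCollection frees everything the node holds for the collection.
func (n *queryNode) ReleaseCollection(collectionID int64) error {
	if _, ok := n.resources[collectionID]; !ok {
		return fmt.Errorf("node %d: collection %d is not loaded", n.id, collectionID)
	}
	delete(n.resources, collectionID)
	return nil
}

// releaseOnAllNodes sends the request to every node concurrently, waits for
// all of them, and returns the first error in node order (nil if none).
func releaseOnAllNodes(nodes []*queryNode, collectionID int64) error {
	errs := make([]error, len(nodes)) // one slot per node, so no locking is needed
	var wg sync.WaitGroup
	for i, n := range nodes {
		wg.Add(1)
		go func(i int, n *queryNode) {
			defer wg.Done()
			errs[i] = n.ReleaseCollection(collectionID)
		}(i, n)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	nodes := []*queryNode{
		{id: 1, resources: map[int64][]string{42: {"Index", "Segment"}}},
		{id: 2, resources: map[int64][]string{42: {"FlowGraph"}, 43: {"Segment"}}},
	}
	fmt.Println(releaseOnAllNodes(nodes, 42), len(nodes[0].resources), len(nodes[1].resources))
}
```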
- After these operations, `QueryCoord` sends `ReleaseCollection`'s response to `RootCoord`.
- At `Step 8`, `RootCoord` sent a `DropCollectionRequest` message into `MsgStream`. `DataNode` subscribes to this `MsgStream`, so it is notified to release the related resources. The execution flow is shown in the following figure.
- In `DataNode`, each `MsgStream` has a `FlowGraph`, which processes all messages. When the `DataNode` receives the `DropCollectionRequest` message, it notifies `BackGroundGC`, a background service on `DataNode`, to release resources.
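The hand-off from the flow graph to the background garbage collector can be sketched with two goroutines connected by a channel. This is a simplified model under assumptions: `streamMsg`, `runFlowGraph`, `runBackgroundGC`, and `processAll` are hypothetical names, and a map stands in for the resources held by `DataNode`.

```go
package main

import "fmt"

// streamMsg is a reduced stand-in for a message read from MsgStream.
type streamMsg struct {
	kind         string
	collectionID int64
}

// runFlowGraph processes every message and forwards drop notifications
// to the background GC channel.
func runFlowGraph(in <-chan streamMsg, gc chan<- int64) {
	for m := range in {
		if m.kind == "DropCollection" {
			gc <- m.collectionID
		}
	}
	close(gc)
}

// runBackgroundGC releases the resources of each dropped collection.
func runBackgroundGC(gc <-chan int64, resources map[int64][]string, done chan<- struct{}) {
	for id := range gc {
		delete(resources, id)
	}
	close(done)
}

// processAll wires both services together, feeds the messages in order,
// and waits until the GC has handled every notification.
func processAll(msgs []streamMsg, resources map[int64][]string) {
	in := make(chan streamMsg)
	gc := make(chan int64)
	done := make(chan struct{})
	go runFlowGraph(in, gc)
	go runBackgroundGC(gc, resources, done)
	for _, m := range msgs {
		in <- m
	}
	close(in)
	<-done
}

func main() {
	resources := map[int64][]string{
		1: {"segment-a", "segment-b"},
		2: {"segment-c"},
	}
	processAll([]streamMsg{
		{kind: "Insert", collectionID: 1},
		{kind: "DropCollection", collectionID: 1},
	}, resources)
	fmt.Println(len(resources))
}
```

Keeping the release work on its own goroutine means the flow graph only pays the cost of a channel send when a drop message arrives.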
Notes:
- Currently, `DataCoord` does not respond to `DropCollection`, so the `Collection`'s `segment meta` still exists in the `DataCoord`'s `metaTable`, and the `Binlog` files belonging to this `Collection` still exist in the persistent storage.
- Currently, `IndexCoord` does not respond to `DropCollection`, so the `Collection`'s `index file` still exists in the persistent storage.