Signed-off-by: yefu.chen <yefu.chen@zilliz.com>
Drop Collection
Milvus 2.0 uses Collection to represent a set of data, like a Table in a traditional database. Users can create or drop a Collection. Altering the Schema of a Collection is not supported yet. This article introduces the execution path of Drop Collection. By the end of this article, you should know which components are involved in Drop Collection.
The execution flow of Drop Collection is shown in the following figure:
- First, the SDK sends a DropCollection request to Proxy via gRPC. The proto is defined as follows:
    service MilvusService {
        ...
        rpc DropCollection(DropCollectionRequest) returns (common.Status) {}
        ...
    }

    message DropCollectionRequest {
        common.MsgBase base = 1; // must
        string db_name = 2;
        string collection_name = 3; // must
    }
- On receiving the DropCollection request, Proxy wraps it into a DropCollectionTask and pushes the task into the DdTaskQueue queue. Proxy then calls the WaitToFinish method, which blocks until the task has finished.
    type task interface {
        TraceCtx() context.Context
        ID() UniqueID       // return ReqID
        SetID(uid UniqueID) // set ReqID
        Name() string
        Type() commonpb.MsgType
        BeginTs() Timestamp
        EndTs() Timestamp
        SetTs(ts Timestamp)
        OnEnqueue() error
        PreExecute(ctx context.Context) error
        Execute(ctx context.Context) error
        PostExecute(ctx context.Context) error
        WaitToFinish() error
        Notify(err error)
    }

    type DropCollectionTask struct {
        Condition
        *milvuspb.DropCollectionRequest
        ctx       context.Context
        rootCoord types.RootCoord
        result    *commonpb.Status
        chMgr     channelsMgr
        chTicker  channelsTimeTicker
    }
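The enqueue-and-wait pattern described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the Proxy implementation: the `condition` type, the channel used as a queue, and the `submit` and `worker` helpers are hypothetical names, and the task interface is trimmed to four methods.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a trimmed-down, hypothetical version of the Proxy task interface.
type task interface {
	Name() string
	Execute() error
	WaitToFinish() error
	Notify(err error)
}

// condition lets a caller block until a worker reports the task result.
type condition struct {
	done chan error
}

func newCondition() condition { return condition{done: make(chan error, 1)} }

func (c condition) WaitToFinish() error { return <-c.done }
func (c condition) Notify(err error)    { c.done <- err }

// dropCollectionTask is a hypothetical stand-in for DropCollectionTask.
type dropCollectionTask struct {
	condition
	collectionName string
}

func (t *dropCollectionTask) Name() string { return "DropCollectionTask" }

func (t *dropCollectionTask) Execute() error {
	if t.collectionName == "" {
		return errors.New("collection name is empty")
	}
	return nil
}

// submit enqueues the task and blocks until the worker calls Notify.
func submit(queue chan<- task, t task) error {
	queue <- t
	return t.WaitToFinish()
}

// worker drains the queue, playing the role of the background service.
func worker(queue <-chan task) {
	for t := range queue {
		t.Notify(t.Execute())
	}
}

func main() {
	queue := make(chan task, 16)
	go worker(queue)
	fmt.Println(submit(queue, &dropCollectionTask{condition: newCondition(), collectionName: "demo"}))
	fmt.Println(submit(queue, &dropCollectionTask{condition: newCondition()}))
}
```

The result channel is buffered with capacity one so that the worker can call Notify and move on even if the caller has not yet reached WaitToFinish.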
- There is a background service in Proxy. It takes the DropCollectionTask from DdTaskQueue and executes it in three phases:
  - PreExecute: performs static checks, such as whether the Collection Name is legal.
  - Execute: Proxy sends a DropCollection request to RootCoord via gRPC and waits for the response. The proto is defined as follows:

        service RootCoord {
            ...
            rpc DropCollection(milvus.DropCollectionRequest) returns (common.Status) {}
            ...
        }

  - PostExecute: Proxy deletes the Collection's meta from its global meta table.
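The three phases above can be sketched as follows. This is a minimal illustration, assuming that a failure in one phase stops the later phases; the `phasedTask` interface, the `runPhases` helper, and the `dropRemote` callback (standing in for the gRPC call to RootCoord) are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// phasedTask is a hypothetical three-phase task, reduced to what the sketch needs.
type phasedTask interface {
	PreExecute() error
	Execute() error
	PostExecute() error
}

// runPhases runs the phases in order and stops at the first error (an assumption).
func runPhases(t phasedTask) error {
	if err := t.PreExecute(); err != nil {
		return fmt.Errorf("PreExecute: %w", err)
	}
	if err := t.Execute(); err != nil {
		return fmt.Errorf("Execute: %w", err)
	}
	if err := t.PostExecute(); err != nil {
		return fmt.Errorf("PostExecute: %w", err)
	}
	return nil
}

type dropTask struct {
	name       string
	metaCache  map[string]int64        // stand-in for the Proxy's global meta table
	dropRemote func(name string) error // stand-in for the gRPC call to RootCoord
}

// PreExecute performs a static check on the collection name.
func (t *dropTask) PreExecute() error {
	if t.name == "" {
		return errors.New("collection name must not be empty")
	}
	return nil
}

// Execute forwards the drop request and waits for the response.
func (t *dropTask) Execute() error { return t.dropRemote(t.name) }

// PostExecute removes the collection's meta from the local table.
func (t *dropTask) PostExecute() error {
	delete(t.metaCache, t.name)
	return nil
}

func main() {
	cache := map[string]int64{"demo": 1}
	t := &dropTask{name: "demo", metaCache: cache, dropRemote: func(string) error { return nil }}
	fmt.Println(runPhases(t), len(cache))
}
```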
- RootCoord wraps the DropCollection request into a DropCollectionReqTask and then calls the function executeTask. executeTask does not return until the context is done or DropCollectionReqTask.Execute has returned.
    type reqTask interface {
        Ctx() context.Context
        Type() commonpb.MsgType
        Execute(ctx context.Context) error
        Core() *Core
    }

    type DropCollectionReqTask struct {
        baseReqTask
        Req *milvuspb.DropCollectionRequest
    }
- First, RootCoord deletes the Collection's meta from metaTable, including schema, partition, segment, and index. All of these delete operations are committed in one transaction.
- After the Collection's meta has been deleted from metaTable, Milvus considers the collection to have been deleted successfully.
- RootCoord allocates a timestamp from TSO before deleting the Collection's meta from metaTable. This timestamp is treated as the point in time at which the collection was deleted.
- RootCoord sends a DropCollectionRequest message into MsgStream, and the other components that have subscribed to that MsgStream are notified. The proto of DropCollectionRequest is defined as follows:
    message DropCollectionRequest {
        common.MsgBase base = 1;
        string db_name = 2;
        string collectionName = 3;
        int64 dbID = 4;
        int64 collectionID = 5;
    }
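The RootCoord steps above (allocate a timestamp, delete all of the collection's meta atomically, then publish a drop message) can be sketched with an in-memory store. Everything here is hypothetical: the `metaKV` type, the key layout, the `allocTs` callback standing in for TSO, and the Go channel standing in for MsgStream. The ordering simply follows the order in which the steps are listed.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// metaKV is a hypothetical in-memory stand-in for the metadata store.
type metaKV struct {
	mu   sync.Mutex
	data map[string]string
}

// removeByPrefixes deletes every key under the given prefixes while holding the
// lock, so other users of the lock see either all of the keys or none of them.
func (kv *metaKV) removeByPrefixes(prefixes ...string) int {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	removed := 0
	for k := range kv.data {
		for _, p := range prefixes {
			if strings.HasPrefix(k, p) {
				delete(kv.data, k)
				removed++
				break
			}
		}
	}
	return removed
}

// dropMsg mirrors two fields of the drop message for illustration only.
type dropMsg struct {
	CollectionID int64
	Timestamp    uint64
}

// dropCollection allocates the drop timestamp first, deletes the meta in one
// step, and then publishes the drop message to subscribers.
func dropCollection(kv *metaKV, allocTs func() uint64, stream chan<- dropMsg, collID int64) int {
	ts := allocTs() // allocated before the meta is deleted
	id := fmt.Sprint(collID)
	removed := kv.removeByPrefixes(
		"collection/"+id+"/", "partition/"+id+"/", "segment/"+id+"/", "index/"+id+"/")
	stream <- dropMsg{CollectionID: collID, Timestamp: ts}
	return removed
}

func main() {
	kv := &metaKV{data: map[string]string{
		"collection/1/schema":  "s",
		"partition/1/100":      "p",
		"segment/1/200":        "g",
		"index/1/300":          "i",
		"collection/10/schema": "other collection",
	}}
	var clock uint64
	stream := make(chan dropMsg, 1)
	removed := dropCollection(kv, func() uint64 { clock++; return clock }, stream, 1)
	fmt.Println(removed, len(kv.data), <-stream)
}
```

The trailing slash in each prefix keeps collection 1 from also matching collection 10.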
- After these operations, RootCoord updates its internal timestamp.
- RootCoord then sends a ReleaseCollection request to QueryCoord via gRPC, notifying QueryCoord to release all the resources related to this Collection. This gRPC request is issued in another goroutine, so it does not block the main thread. The proto is defined as follows:
    service QueryCoord {
        ...
        rpc ReleaseCollection(ReleaseCollectionRequest) returns (common.Status) {}
        ...
    }

    message ReleaseCollectionRequest {
        common.MsgBase base = 1;
        int64 dbID = 2;
        int64 collectionID = 3;
        int64 nodeID = 4;
    }
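Issuing the release call in another goroutine, so that the caller is not blocked, can be sketched as below. The `releaseAsync` helper and the returned channel are assumptions made for the example; the source does not say how, or whether, RootCoord observes the result.

```go
package main

import (
	"errors"
	"fmt"
)

// releaseAsync starts release in its own goroutine and returns immediately.
// The buffered channel lets the goroutine exit even if nobody reads the result.
func releaseAsync(release func() error) <-chan error {
	done := make(chan error, 1)
	go func() { done <- release() }()
	return done
}

func main() {
	gate := make(chan struct{})
	done := releaseAsync(func() error {
		<-gate // simulate a slow remote call
		return errors.New("query coord unreachable")
	})
	fmt.Println("caller continues while the release call is still pending")
	close(gate)
	fmt.Println("release result:", <-done)
}
```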
- Finally, RootCoord sends an InvalidateCollectionMetaCache request to each Proxy, notifying it to remove the Collection's meta. The proto is defined as follows:
    service Proxy {
        ...
        rpc InvalidateCollectionMetaCache(InvalidateCollMetaCacheRequest) returns (common.Status) {}
        ...
    }

    message InvalidateCollMetaCacheRequest {
        common.MsgBase base = 1;
        string db_name = 2;
        string collection_name = 3;
    }
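As a rough picture of what cache invalidation means on the Proxy side, the sketch below keeps a name-to-ID map per Proxy and removes one entry from every Proxy. The `proxyMetaCache` type and the `invalidateOnAllProxies` helper are hypothetical; direct method calls replace the gRPC requests.

```go
package main

import (
	"fmt"
	"sync"
)

// proxyMetaCache is a hypothetical per-Proxy cache of collection meta.
type proxyMetaCache struct {
	mu          sync.Mutex
	collections map[string]int64 // collection name -> collection ID
}

// invalidate drops the cached entry, so the next lookup must refetch it.
func (c *proxyMetaCache) invalidate(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.collections, name)
}

func (c *proxyMetaCache) has(name string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.collections[name]
	return ok
}

// invalidateOnAllProxies plays the role of RootCoord notifying each Proxy.
func invalidateOnAllProxies(proxies []*proxyMetaCache, name string) {
	for _, p := range proxies {
		p.invalidate(name)
	}
}

func main() {
	proxies := []*proxyMetaCache{
		{collections: map[string]int64{"demo": 1, "other": 2}},
		{collections: map[string]int64{"demo": 1}},
	}
	invalidateOnAllProxies(proxies, "demo")
	fmt.Println(proxies[0].has("demo"), proxies[0].has("other"), proxies[1].has("demo"))
}
```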
- The execution flow of QueryCoord.ReleaseCollection is shown in the following figure.
- QueryCoord wraps ReleaseCollection into a ReleaseCollectionTask and pushes the task into TaskScheduler.
- There is a background service in QueryCoord. It takes the ReleaseCollectionTask from TaskScheduler and executes it in three phases:
  - PreExecute: ReleaseCollectionTask only prints a debug log at this phase.
  - Execute: there are two jobs at this phase:
    - Send a ReleaseDQLMessageStream request to RootCoord via gRPC. RootCoord redirects the ReleaseDQLMessageStream request to each Proxy, notifying the Proxy to stop processing any message of this Collection. The proto is defined as follows:

          message ReleaseDQLMessageStreamRequest {
              common.MsgBase base = 1;
              int64 dbID = 2;
              int64 collectionID = 3;
          }

    - Send a ReleaseCollection request to each QueryNode via gRPC, notifying the QueryNode to release all the resources related to this Collection, including Index, Segment, FlowGraph, etc. QueryNode no longer reads any message from this Collection's MsgStream.

          service QueryNode {
              ...
              rpc ReleaseCollection(ReleaseCollectionRequest) returns (common.Status) {}
              ...
          }

          message ReleaseCollectionRequest {
              common.MsgBase base = 1;
              int64 dbID = 2;
              int64 collectionID = 3;
              int64 nodeID = 4;
          }

  - PostExecute: ReleaseCollectionTask only prints a debug log at this phase.
- After these operations, QueryCoord sends the ReleaseCollection response to RootCoord.
- At Step 8, RootCoord sent a DropCollectionRequest message into MsgStream. DataNode subscribes to this MsgStream, so DataNode is notified to release the resources. The execution flow is shown in the following figure.
- In DataNode, each MsgStream has a FlowGraph, and all messages are processed by that FlowGraph. When DataNode receives the DropCollectionRequest message, it notifies BackGroundGC, a background service on DataNode, to release the resources.
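The hand-off from the message-processing path to a background GC service can be sketched with a channel and a goroutine. All names here (`backgroundGC`, `onDropCollection`, the resource map) are hypothetical and only illustrate the notify-then-release pattern, not the DataNode implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// backgroundGC is a hypothetical background service that frees resources
// for collections whose IDs arrive on the signal channel.
type backgroundGC struct {
	mu        sync.Mutex
	resources map[int64][]string // collection ID -> held resources
	signal    chan int64
	wg        sync.WaitGroup
}

func newBackgroundGC(resources map[int64][]string) *backgroundGC {
	gc := &backgroundGC{resources: resources, signal: make(chan int64, 8)}
	gc.wg.Add(1)
	go gc.run()
	return gc
}

// run releases the resources of each collection it is notified about.
func (gc *backgroundGC) run() {
	defer gc.wg.Done()
	for collID := range gc.signal {
		gc.mu.Lock()
		delete(gc.resources, collID)
		gc.mu.Unlock()
	}
}

// onDropCollection is what the flow graph would call on a drop message.
func (gc *backgroundGC) onDropCollection(collID int64) { gc.signal <- collID }

// close stops the service after all pending notifications are handled.
func (gc *backgroundGC) close() {
	close(gc.signal)
	gc.wg.Wait()
}

func (gc *backgroundGC) holds(collID int64) bool {
	gc.mu.Lock()
	defer gc.mu.Unlock()
	_, ok := gc.resources[collID]
	return ok
}

func main() {
	gc := newBackgroundGC(map[int64][]string{1: {"segment-a"}, 2: {"segment-b"}})
	gc.onDropCollection(1)
	gc.close()
	fmt.Println(gc.holds(1), gc.holds(2))
}
```

Routing the release through a channel keeps the message-processing path short: it only enqueues an ID, while the potentially slow cleanup runs elsewhere.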
Notes:
- Currently, DataCoord does not respond to DropCollection, so the Collection's segment meta still exists in DataCoord's metaTable, and the Binlog files belonging to this Collection still exist on persistent storage.
- Currently, IndexCoord does not respond to DropCollection, so the Collection's index files still exist on persistent storage.