"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "def _process_val(data):\n",
+ "\n",
+ " contexts = [data[i][\"context\"] for i in range(len(data))]\n",
+ " questions = [data[i][\"question\"] for i in range(len(data))]\n",
+ "\n",
+ " tokenized_data_list = tokenizer(\n",
+ " questions,\n",
+ " contexts,\n",
+ " stride=doc_stride,\n",
+ " max_length=max_length,\n",
+ " return_dict=False\n",
+ " )\n",
+ "\n",
+ " for i, tokenized_data in enumerate(tokenized_data_list):\n",
+ " token_type_ids = tokenized_data[\"token_type_ids\"]\n",
+ " # 保存数据对应的 id\n",
+ " sample_index = tokenized_data[\"overflow_to_sample\"]\n",
+ " tokenized_data_list[i][\"example_id\"] = data[sample_index][\"id\"]\n",
+ "\n",
+ " # 将不属于 context 的 offset 设置为 None\n",
+ " tokenized_data_list[i][\"offset_mapping\"] = [\n",
+ " (o if token_type_ids[k] == 1 else None)\n",
+ " for k, o in enumerate(tokenized_data[\"offset_mapping\"])\n",
+ " ]\n",
+ "\n",
+ " return tokenized_data_list\n",
+ "\n",
+ "val_dataset.map(_process_val, batched=True, num_workers=5)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### 2.3 DataLoader\n",
+ "\n",
+ "最后使用 `PaddleDataLoader` 将数据集包裹起来即可。"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from fastNLP.core import PaddleDataLoader\n",
+ "\n",
+ "train_dataloader = PaddleDataLoader(train_dataset, batch_size=32, shuffle=True)\n",
+ "val_dataloader = PaddleDataLoader(val_dataset, batch_size=16)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 3. 模型训练:自己定义评测用的 Metric 实现更加自由的任务评测\n",
+ "\n",
+ "#### 3.1 损失函数\n",
+ "\n",
+ "对于阅读理解任务,我们使用的是 `ErnieForQuestionAnswering` 模型。该模型在接受输入后会返回两个值:`start_logits` 和 `end_logits` ,大小均为 `(batch_size, sequence_length)`,反映了每条数据每个词语为答案起始位置的可能性,因此我们需要自定义一个损失函数来计算 `loss`。 `CrossEntropyLossForSquad` 会分别对答案起始位置的预测值和真实值计算交叉熵,最后返回其平均值作为最终的损失。"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "class CrossEntropyLossForSquad(paddle.nn.Layer):\n",
+ " def __init__(self):\n",
+ " super(CrossEntropyLossForSquad, self).__init__()\n",
+ "\n",
+ " def forward(self, start_logits, end_logits, start_pos, end_pos):\n",
+ " start_pos = paddle.unsqueeze(start_pos, axis=-1)\n",
+ " end_pos = paddle.unsqueeze(end_pos, axis=-1)\n",
+ " start_loss = paddle.nn.functional.softmax_with_cross_entropy(\n",
+ " logits=start_logits, label=start_pos)\n",
+ " start_loss = paddle.mean(start_loss)\n",
+ " end_loss = paddle.nn.functional.softmax_with_cross_entropy(\n",
+ " logits=end_logits, label=end_pos)\n",
+ " end_loss = paddle.mean(end_loss)\n",
+ "\n",
+ " loss = (start_loss + end_loss) / 2\n",
+ " return loss"
+ ]
+ },
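+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A minimal smoke test for the loss (an illustrative sketch, not part of the original pipeline): random logits for a batch of 4 sequences of length 128, plus random gold positions, should yield a scalar loss."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Hypothetical smoke test; shapes follow the (batch_size, sequence_length) description above.\n",
+ "fake_logits = paddle.randn([4, 128])\n",
+ "fake_pos = paddle.randint(0, 128, [4])\n",
+ "# Reusing the same tensors for start and end is fine for a shape check.\n",
+ "print(CrossEntropyLossForSquad()(fake_logits, fake_logits, fake_pos, fake_pos))"
+ ]
+ },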
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### 3.2 定义模型\n",
+ "\n",
+ "模型的核心则是 `ErnieForQuestionAnswering` 的 `ernie-1.0-base-zh` 预训练模型,同时按照 `FastNLP` 的规定定义 `train_step` 和 `evaluate_step` 函数。这里 `evaluate_step` 函数并没有像文本分类那样直接返回该批次数据的评测结果,这一点我们将在下面为您讲解。"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "\u001b[32m[2022-06-27 19:00:15,825] [ INFO]\u001b[0m - Already cached /remote-home/shxing/.paddlenlp/models/ernie-1.0-base-zh/ernie_v1_chn_base.pdparams\u001b[0m\n",
+ "W0627 19:00:15.831080 21543 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 11.2\n",
+ "W0627 19:00:15.843276 21543 gpu_context.cc:306] device: 0, cuDNN Version: 8.1.\n"
+ ]
+ }
+ ],
+ "source": [
+ "from paddlenlp.transformers import ErnieForQuestionAnswering\n",
+ "\n",
+ "class QAModel(paddle.nn.Layer):\n",
+ " def __init__(self, model_checkpoint):\n",
+ " super(QAModel, self).__init__()\n",
+ " self.model = ErnieForQuestionAnswering.from_pretrained(model_checkpoint)\n",
+ " self.loss_func = CrossEntropyLossForSquad()\n",
+ "\n",
+ " def forward(self, input_ids, token_type_ids):\n",
+ " start_logits, end_logits = self.model(input_ids, token_type_ids)\n",
+ " return start_logits, end_logits\n",
+ "\n",
+ " def train_step(self, input_ids, token_type_ids, start_pos, end_pos):\n",
+ " start_logits, end_logits = self(input_ids, token_type_ids)\n",
+ " loss = self.loss_func(start_logits, end_logits, start_pos, end_pos)\n",
+ " return {\"loss\": loss}\n",
+ "\n",
+ " def evaluate_step(self, input_ids, token_type_ids):\n",
+ " start_logits, end_logits = self(input_ids, token_type_ids)\n",
+ " return {\"start_logits\": start_logits, \"end_logits\": end_logits}\n",
+ "\n",
+ "model = QAModel(MODEL_NAME)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### 3.3 自定义 Metric 进行数据的评估\n",
+ "\n",
+ "`paddlenlp` 为我们提供了评测 `SQuAD` 格式数据集的函数 `compute_prediction` 和 `squad_evaluate`:\n",
+ "- `compute_prediction` 函数要求传入原数据 `examples` 、处理后的数据 `features` 和 `features` 对应的结果 `predictions`(一个包含所有数据 `start_logits` 和 `end_logits` 的元组)\n",
+ "- `squad_evaluate` 要求传入原数据 `examples` 和预测结果 `all_predictions`(通常来自于 `compute_prediction`)\n",
+ "\n",
+ "在使用这两个函数的时候,我们需要向其中传入数据集,但显然根据 `fastNLP` 的设计,我们无法在 `evaluate_step` 里实现这一过程,并且 `FastNLP` 也并没有提供计算 `F1` 和 `EM` 的 `Metric`,故我们需要自己定义用于评测的 `Metric`。\n",
+ "\n",
+ "在初始化之外,一个 `Metric` 还需要实现三个函数:\n",
+ "\n",
+ "1. `reset` - 该函数会在验证数据集的迭代之前被调用,用于清空数据;在我们自定义的 `Metric` 中,我们需要将 `all_start_logits` 和 `all_end_logits` 清空,重新收集每个 `batch` 的结果。\n",
+ "2. `update` - 该函数会在在每个 `batch` 得到结果后被调用,用于更新 `Metric` 的状态;它的参数即为 `evaluate_step` 返回的内容。我们在这里将得到的 `start_logits` 和 `end_logits` 收集起来。\n",
+ "3. `get_metric` - 该函数会在数据集被迭代完毕后调用,用于计算评测的结果。现在我们有了整个验证集的 `all_start_logits` 和 `all_end_logits` ,将他们传入 `compute_predictions` 函数得到预测的结果,并继续使用 `squad_evaluate` 函数得到评测的结果。\n",
+ " - 注:`suqad_evaluate` 函数会自己输出评测结果,为了不让其干扰 `FastNLP` 输出,这里我们使用 `contextlib.redirect_stdout(None)` 将函数的标准输出屏蔽掉。\n",
+ "\n",
+ "综上,`SquadEvaluateMetric` 实现的评估过程是:将验证集中所有数据的 `logits` 收集起来,然后统一传入 `compute_prediction` 和 `squad_evaluate` 中进行评估。值得一提的是,`paddlenlp.datasets.load_dataset` 返回的结果是一个 `MapDataset` 类型,其 `data` 成员为加载时的数据,`new_data` 为经过 `map` 函数处理后更新的数据,因此可以分别作为 `examples` 和 `features` 传入。"
+ ]
+ },
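+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Before defining the metric, a quick sanity check (a sketch, assuming the processing above has been applied to `val_dataset`): `data` and `new_data` can be inspected directly to see the example/feature split."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# `data` holds the raw examples; `new_data` holds the features produced by `map`.\n",
+ "example = val_dataset.data[0]      # fields such as \"context\", \"question\", \"id\"\n",
+ "feature = val_dataset.new_data[0]  # fields such as \"input_ids\", \"offset_mapping\", \"example_id\"\n",
+ "print(sorted(example.keys()))\n",
+ "print(sorted(feature.keys()))"
+ ]
+ },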
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from fastNLP.core import Metric\n",
+ "from paddlenlp.metrics.squad import squad_evaluate, compute_prediction\n",
+ "import contextlib\n",
+ "\n",
+ "class SquadEvaluateMetric(Metric):\n",
+ " def __init__(self, examples, features, testing=False):\n",
+ " super(SquadEvaluateMetric, self).__init__(\"paddle\", False)\n",
+ " self.examples = examples\n",
+ " self.features = features\n",
+ " self.all_start_logits = []\n",
+ " self.all_end_logits = []\n",
+ " self.testing = testing\n",
+ "\n",
+ " def reset(self):\n",
+ " self.all_start_logits = []\n",
+ " self.all_end_logits = []\n",
+ "\n",
+ " def update(self, start_logits, end_logits):\n",
+ " for start, end in zip(start_logits, end_logits):\n",
+ " self.all_start_logits.append(start.numpy())\n",
+ " self.all_end_logits.append(end.numpy())\n",
+ "\n",
+ " def get_metric(self):\n",
+ " all_predictions, _, _ = compute_prediction(\n",
+ " self.examples, self.features[:len(self.all_start_logits)],\n",
+ " (self.all_start_logits, self.all_end_logits),\n",
+ " False, 20, 30\n",
+ " )\n",
+ " with contextlib.redirect_stdout(None):\n",
+ " result = squad_evaluate(\n",
+ " examples=self.examples,\n",
+ " preds=all_predictions,\n",
+ " is_whitespace_splited=False\n",
+ " )\n",
+ "\n",
+ " if self.testing:\n",
+ " self.print_predictions(all_predictions)\n",
+ " return result\n",
+ "\n",
+ " def print_predictions(self, preds):\n",
+ " for i, data in enumerate(self.examples):\n",
+ " if i >= 5:\n",
+ " break\n",
+ " print()\n",
+ " print(\"原文:\", data[\"context\"])\n",
+ " print(\"问题:\", data[\"question\"], \\\n",
+ " \"答案:\", preds[data[\"id\"]], \\\n",
+ " \"正确答案:\", data[\"answers\"][\"text\"])\n",
+ "\n",
+ "metric = SquadEvaluateMetric(\n",
+ " val_dataloader.dataset.data,\n",
+ " val_dataloader.dataset.new_data,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### 3.4 训练\n",
+ "\n",
+ "至此所有的准备工作已经完成,可以使用 `Trainer` 进行训练了。学习率我们依旧采用线性预热策略 `LinearDecayWithWarmup`,优化器为 `AdamW`;回调模块我们选择 `LRSchedCallback` 更新学习率和 `LoadBestModelCallback` 监视评测结果的 `f1` 分数。初始化好 `Trainer` 之后,就将训练的过程交给 `FastNLP` 吧。"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "[19:04:54] INFO Running evaluator sanity check for 2 batches. trainer.py:631\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[2;36m[19:04:54]\u001b[0m\u001b[2;36m \u001b[0m\u001b[34mINFO \u001b[0m Running evaluator sanity check for \u001b[1;36m2\u001b[0m batches. \u001b]8;id=367046;file://../fastNLP/core/controllers/trainer.py\u001b\\\u001b[2mtrainer.py\u001b[0m\u001b]8;;\u001b\\\u001b[2m:\u001b[0m\u001b]8;id=96810;file://../fastNLP/core/controllers/trainer.py#631\u001b\\\u001b[2m631\u001b[0m\u001b]8;;\u001b\\\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n"
+ ],
+ "text/plain": []
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:100 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m100\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 49.25899788285109,\n",
+ " \"f1#squad\": 66.55559127349602,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 49.25899788285109,\n",
+ " \"HasAns_f1#squad\": 66.55559127349602,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m49.25899788285109\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m66.55559127349602\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m49.25899788285109\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m66.55559127349602\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:200 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m200\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 57.37473535638673,\n",
+ " \"f1#squad\": 70.93036525200617,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 57.37473535638673,\n",
+ " \"HasAns_f1#squad\": 70.93036525200617,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m57.37473535638673\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m70.93036525200617\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m57.37473535638673\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m70.93036525200617\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:300 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m300\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 63.86732533521524,\n",
+ " \"f1#squad\": 78.62546663568186,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 63.86732533521524,\n",
+ " \"HasAns_f1#squad\": 78.62546663568186,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m63.86732533521524\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m78.62546663568186\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m63.86732533521524\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m78.62546663568186\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:400 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m400\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 64.92589978828511,\n",
+ " \"f1#squad\": 79.36746074079691,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 64.92589978828511,\n",
+ " \"HasAns_f1#squad\": 79.36746074079691,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m64.92589978828511\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m79.36746074079691\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m64.92589978828511\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m79.36746074079691\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:500 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m500\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 65.70218772053634,\n",
+ " \"f1#squad\": 80.33295482054824,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 65.70218772053634,\n",
+ " \"HasAns_f1#squad\": 80.33295482054824,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m65.70218772053634\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m80.33295482054824\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m65.70218772053634\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m80.33295482054824\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:600 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m600\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 65.41990119971771,\n",
+ " \"f1#squad\": 79.7483487059053,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 65.41990119971771,\n",
+ " \"HasAns_f1#squad\": 79.7483487059053,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m65.41990119971771\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m79.7483487059053\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m65.41990119971771\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m79.7483487059053\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:700 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m700\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 66.61961891319689,\n",
+ " \"f1#squad\": 80.32432238994133,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 66.61961891319689,\n",
+ " \"HasAns_f1#squad\": 80.32432238994133,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m66.61961891319689\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m80.32432238994133\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m66.61961891319689\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m80.32432238994133\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "---------------------------- Eval. results on Epoch:0, Batch:800 ----------------------------\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------------------------- Eval. results on Epoch:\u001b[1;36m0\u001b[0m, Batch:\u001b[1;36m800\u001b[0m ----------------------------\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " \"exact#squad\": 65.84333098094567,\n",
+ " \"f1#squad\": 79.23169801265415,\n",
+ " \"total#squad\": 1417,\n",
+ " \"HasAns_exact#squad\": 65.84333098094567,\n",
+ " \"HasAns_f1#squad\": 79.23169801265415,\n",
+ " \"HasAns_total#squad\": 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[1;34m\"exact#squad\"\u001b[0m: \u001b[1;36m65.84333098094567\u001b[0m,\n",
+ " \u001b[1;34m\"f1#squad\"\u001b[0m: \u001b[1;36m79.23169801265415\u001b[0m,\n",
+ " \u001b[1;34m\"total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_exact#squad\"\u001b[0m: \u001b[1;36m65.84333098094567\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_f1#squad\"\u001b[0m: \u001b[1;36m79.23169801265415\u001b[0m,\n",
+ " \u001b[1;34m\"HasAns_total#squad\"\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n"
+ ],
+ "text/plain": []
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "[19:20:28] INFO Loading best model from fnlp-ernie-squad/ load_best_model_callback.py:111\n",
+ " 2022-06-27-19_00_15_388554/best_so_far \n",
+ " with f1#squad: 80.33295482054824... \n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[2;36m[19:20:28]\u001b[0m\u001b[2;36m \u001b[0m\u001b[34mINFO \u001b[0m Loading best model from fnlp-ernie-squad/ \u001b]8;id=163935;file://../fastNLP/core/callbacks/load_best_model_callback.py\u001b\\\u001b[2mload_best_model_callback.py\u001b[0m\u001b]8;;\u001b\\\u001b[2m:\u001b[0m\u001b]8;id=31503;file://../fastNLP/core/callbacks/load_best_model_callback.py#111\u001b\\\u001b[2m111\u001b[0m\u001b]8;;\u001b\\\n",
+ "\u001b[2;36m \u001b[0m \u001b[1;36m2022\u001b[0m-\u001b[1;36m06\u001b[0m-\u001b[1;36m27\u001b[0m-19_00_15_388554/best_so_far \u001b[2m \u001b[0m\n",
+ "\u001b[2;36m \u001b[0m with f1#squad: \u001b[1;36m80.33295482054824\u001b[0m\u001b[33m...\u001b[0m \u001b[2m \u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ " INFO Deleting fnlp-ernie-squad/2022-06-27-19_0 load_best_model_callback.py:131\n",
+ " 0_15_388554/best_so_far... \n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[2;36m \u001b[0m\u001b[2;36m \u001b[0m\u001b[34mINFO \u001b[0m Deleting fnlp-ernie-squad/\u001b[1;36m2022\u001b[0m-\u001b[1;36m06\u001b[0m-\u001b[1;36m27\u001b[0m-19_0 \u001b]8;id=560859;file://../fastNLP/core/callbacks/load_best_model_callback.py\u001b\\\u001b[2mload_best_model_callback.py\u001b[0m\u001b]8;;\u001b\\\u001b[2m:\u001b[0m\u001b]8;id=573263;file://../fastNLP/core/callbacks/load_best_model_callback.py#131\u001b\\\u001b[2m131\u001b[0m\u001b]8;;\u001b\\\n",
+ "\u001b[2;36m \u001b[0m 0_15_388554/best_so_far\u001b[33m...\u001b[0m \u001b[2m \u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from fastNLP import Trainer, LRSchedCallback, LoadBestModelCallback\n",
+ "from paddlenlp.transformers import LinearDecayWithWarmup\n",
+ "\n",
+ "n_epochs = 1\n",
+ "num_training_steps = len(train_dataloader) * n_epochs\n",
+ "lr_scheduler = LinearDecayWithWarmup(3e-5, num_training_steps, 0.1)\n",
+ "optimizer = paddle.optimizer.AdamW(\n",
+ " learning_rate=lr_scheduler,\n",
+ " parameters=model.parameters(),\n",
+ ")\n",
+ "callbacks=[\n",
+ " LRSchedCallback(lr_scheduler, step_on=\"batch\"),\n",
+ " LoadBestModelCallback(\"f1#squad\", larger_better=True, save_folder=\"fnlp-ernie-squad\")\n",
+ "]\n",
+ "trainer = Trainer(\n",
+ " model=model,\n",
+ " train_dataloader=train_dataloader,\n",
+ " evaluate_dataloaders=val_dataloader,\n",
+ " device=1,\n",
+ " optimizers=optimizer,\n",
+ " n_epochs=n_epochs,\n",
+ " callbacks=callbacks,\n",
+ " evaluate_every=100,\n",
+ " metrics={\"squad\": metric},\n",
+ ")\n",
+ "trainer.run()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### 3.5 测试\n",
+ "\n",
+ "最后,我们可以使用 `Evaluator` 查看我们训练的结果。我们在之前为 `SquadEvaluateMetric` 设置了 `testing` 参数来在测试阶段进行输出,可以看到,训练的结果还是比较不错的。"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "原文: 爬行垫根据中间材料的不同可以分为:XPE爬行垫、EPE爬行垫、EVA爬行垫、PVC爬行垫;其中XPE爬\n",
+ "行垫、EPE爬行垫都属于PE材料加保鲜膜复合而成,都是无异味的环保材料,但是XPE爬行垫是品质较好的爬\n",
+ "行垫,韩国进口爬行垫都是这种爬行垫,而EPE爬行垫是国内厂家为了减低成本,使用EPE(珍珠棉)作为原料生\n",
+ "产的一款爬行垫,该材料弹性差,易碎,开孔发泡防水性弱。EVA爬行垫、PVC爬行垫是用EVA或PVC作为原材料\n",
+ "与保鲜膜复合的而成的爬行垫,或者把图案转印在原材料上,这两款爬行垫通常有异味,如果是图案转印的爬\n",
+ "行垫,油墨外露容易脱落。 \n",
+ "当时我儿子爬的时候,我们也买了垫子,但是始终有味。最后就没用了,铺的就的薄毯子让他爬。\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "原文: 爬行垫根据中间材料的不同可以分为:XPE爬行垫、EPE爬行垫、EVA爬行垫、PVC爬行垫;其中XPE爬\n",
+ "行垫、EPE爬行垫都属于PE材料加保鲜膜复合而成,都是无异味的环保材料,但是XPE爬行垫是品质较好的爬\n",
+ "行垫,韩国进口爬行垫都是这种爬行垫,而EPE爬行垫是国内厂家为了减低成本,使用EPE(珍珠棉)作为原料生\n",
+ "产的一款爬行垫,该材料弹性差,易碎,开孔发泡防水性弱。EVA爬行垫、PVC爬行垫是用EVA或PVC作为原材料\n",
+ "与保鲜膜复合的而成的爬行垫,或者把图案转印在原材料上,这两款爬行垫通常有异味,如果是图案转印的爬\n",
+ "行垫,油墨外露容易脱落。 \n",
+ "当时我儿子爬的时候,我们也买了垫子,但是始终有味。最后就没用了,铺的就的薄毯子让他爬。\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "问题: 爬行垫什么材质的好 答案: EPE(珍珠棉 正确答案: ['XPE']\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "问题: 爬行垫什么材质的好 答案: EPE(珍珠棉 正确答案: ['XPE']\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "原文: 真实情况是160-162。她平时谎报的168是因为不离脚穿高水台恨天高(15厘米) 图1她穿着高水台恨\n",
+ "天高和刘亦菲一样高,(刘亦菲对外报身高172)范冰冰礼服下厚厚的高水台暴露了她的心机,对比一下两者的\n",
+ "鞋子吧 图2 穿着高水台恨天高才和刘德华谢霆锋持平,如果她真的有168,那么加上鞋高,刘和谢都要有180?\n",
+ "明显是不可能的。所以刘德华对外报的身高174减去10-15厘米才是范冰冰的真实身高 图3,范冰冰有一次脱\n",
+ "鞋上场,这个最说明问题了,看看她的身体比例吧。还有目测一下她手上鞋子的鞋跟有多高多厚吧,至少超过\n",
+ "10厘米。\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "原文: 真实情况是160-162。她平时谎报的168是因为不离脚穿高水台恨天高(15厘米) 图1她穿着高水台恨\n",
+ "天高和刘亦菲一样高,(刘亦菲对外报身高172)范冰冰礼服下厚厚的高水台暴露了她的心机,对比一下两者的\n",
+ "鞋子吧 图2 穿着高水台恨天高才和刘德华谢霆锋持平,如果她真的有168,那么加上鞋高,刘和谢都要有180?\n",
+ "明显是不可能的。所以刘德华对外报的身高174减去10-15厘米才是范冰冰的真实身高 图3,范冰冰有一次脱\n",
+ "鞋上场,这个最说明问题了,看看她的身体比例吧。还有目测一下她手上鞋子的鞋跟有多高多厚吧,至少超过\n",
+ "10厘米。\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "问题: 范冰冰多高真实身高 答案: 160-162 正确答案: ['160-162']\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "问题: 范冰冰多高真实身高 答案: 160-162 正确答案: ['160-162']\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "原文: 防水作为目前高端手机的标配,特别是苹果也支持防水之后,国产大多数高端旗舰手机都已经支持防\n",
+ "水。虽然我们真的不会故意把手机放入水中,但是有了防水之后,用户心里会多一重安全感。那么近日最为\n",
+ "火热的小米6防水吗?小米6的防水级别又是多少呢? 小编查询了很多资料发现,小米6确实是防水的,但是为\n",
+ "了保持低调,同时为了不被别人说防水等级不够,很多资料都没有标注小米是否防水。根据评测资料显示,小\n",
+ "米6是支持IP68级的防水,是绝对能够满足日常生活中的防水需求的。\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "原文: 防水作为目前高端手机的标配,特别是苹果也支持防水之后,国产大多数高端旗舰手机都已经支持防\n",
+ "水。虽然我们真的不会故意把手机放入水中,但是有了防水之后,用户心里会多一重安全感。那么近日最为\n",
+ "火热的小米6防水吗?小米6的防水级别又是多少呢? 小编查询了很多资料发现,小米6确实是防水的,但是为\n",
+ "了保持低调,同时为了不被别人说防水等级不够,很多资料都没有标注小米是否防水。根据评测资料显示,小\n",
+ "米6是支持IP68级的防水,是绝对能够满足日常生活中的防水需求的。\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "问题: 小米6防水等级 答案: IP68级 正确答案: ['IP68级']\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "问题: 小米6防水等级 答案: IP68级 正确答案: ['IP68级']\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "原文: 这位朋友你好,女性出现妊娠反应一般是从6-12周左右,也就是女性怀孕1个多月就会开始出现反应,\n",
+ "第3个月的时候,妊辰反应基本结束。 而大部分女性怀孕初期都会出现恶心、呕吐的感觉,这些症状都是因\n",
+ "人而异的,除非恶心、呕吐的非常厉害,才需要就医,否则这些都是刚怀孕的的正常症状。1-3个月的时候可\n",
+ "以观察一下自己的皮肤,一般女性怀孕初期可能会产生皮肤色素沉淀或是腹壁产生妊娠纹,特别是在怀孕的\n",
+ "后期更加明显。 还有很多女性怀孕初期会出现疲倦、嗜睡的情况。怀孕三个月的时候,膀胱会受到日益胀\n",
+ "大的子宫的压迫,容量会变小,所以怀孕期间也会有尿频的现象出现。月经停止也是刚怀孕最容易出现的症\n",
+ "状,只要是平时月经正常的女性,在性行为后超过正常经期两周,就有可能是怀孕了。 如果你想判断自己是\n",
+ "否怀孕,可以看看自己有没有这些反应。当然这也只是多数人的怀孕表现,也有部分女性怀孕表现并不完全\n",
+ "是这样,如果你无法确定自己是否怀孕,最好去医院检查一下。\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "原文: 这位朋友你好,女性出现妊娠反应一般是从6-12周左右,也就是女性怀孕1个多月就会开始出现反应,\n",
+ "第3个月的时候,妊辰反应基本结束。 而大部分女性怀孕初期都会出现恶心、呕吐的感觉,这些症状都是因\n",
+ "人而异的,除非恶心、呕吐的非常厉害,才需要就医,否则这些都是刚怀孕的的正常症状。1-3个月的时候可\n",
+ "以观察一下自己的皮肤,一般女性怀孕初期可能会产生皮肤色素沉淀或是腹壁产生妊娠纹,特别是在怀孕的\n",
+ "后期更加明显。 还有很多女性怀孕初期会出现疲倦、嗜睡的情况。怀孕三个月的时候,膀胱会受到日益胀\n",
+ "大的子宫的压迫,容量会变小,所以怀孕期间也会有尿频的现象出现。月经停止也是刚怀孕最容易出现的症\n",
+ "状,只要是平时月经正常的女性,在性行为后超过正常经期两周,就有可能是怀孕了。 如果你想判断自己是\n",
+ "否怀孕,可以看看自己有没有这些反应。当然这也只是多数人的怀孕表现,也有部分女性怀孕表现并不完全\n",
+ "是这样,如果你无法确定自己是否怀孕,最好去医院检查一下。\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "问题: 怀孕多久会有反应 答案: 6-12周左右 正确答案: ['6-12周左右', '6-12周', '1个多月']\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "问题: 怀孕多久会有反应 答案: 6-12周左右 正确答案: ['6-12周左右', '6-12周', '1个多月']\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "原文: 【东奥会计在线——中级会计职称频道推荐】根据《关于提高科技型中小企业研究开发费用税前加计\n",
+ "扣除比例的通知》的规定,研发费加计扣除比例提高到75%。|财政部、国家税务总局、科技部发布《关于提\n",
+ "高科技型中小企业研究开发费用税前加计扣除比例的通知》。|通知称,为进一步激励中小企业加大研发投\n",
+ "入,支持科技创新,就提高科技型中小企业研究开发费用(以下简称研发费用)税前加计扣除比例有关问题发\n",
+ "布通知。|通知明确,科技型中小企业开展研发活动中实际发生的研发费用,未形成无形资产计入当期损益的\n",
+ ",在按规定据实扣除的基础上,在2017年1月1日至2019年12月31日期间,再按照实际发生额的75%在税前加计\n",
+ "扣除;形成无形资产的,在上述期间按照无形资产成本的175%在税前摊销。|科技型中小企业享受研发费用税\n",
+ "前加计扣除政策的其他政策口径按照《财政部国家税务总局科技部关于完善研究开发费用税前加计扣除政\n",
+ "策的通知》(财税〔2015〕119号)规定执行。|科技型中小企业条件和管理办法由科技部、财政部和国家税\n",
+ "务总局另行发布。科技、财政和税务部门应建立信息共享机制,及时共享科技型中小企业的相关信息,加强\n",
+ "协调配合,保障优惠政策落实到位。|上一篇文章:关于2016年度企业研究开发费用税前加计扣除政策企业所\n",
+ "得税纳税申报问题的公告 下一篇文章:关于提高科技型中小企业研究开发费用税前加计扣除比例的通知\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "原文: 【东奥会计在线——中级会计职称频道推荐】根据《关于提高科技型中小企业研究开发费用税前加计\n",
+ "扣除比例的通知》的规定,研发费加计扣除比例提高到75%。|财政部、国家税务总局、科技部发布《关于提\n",
+ "高科技型中小企业研究开发费用税前加计扣除比例的通知》。|通知称,为进一步激励中小企业加大研发投\n",
+ "入,支持科技创新,就提高科技型中小企业研究开发费用(以下简称研发费用)税前加计扣除比例有关问题发\n",
+ "布通知。|通知明确,科技型中小企业开展研发活动中实际发生的研发费用,未形成无形资产计入当期损益的\n",
+ ",在按规定据实扣除的基础上,在2017年1月1日至2019年12月31日期间,再按照实际发生额的75%在税前加计\n",
+ "扣除;形成无形资产的,在上述期间按照无形资产成本的175%在税前摊销。|科技型中小企业享受研发费用税\n",
+ "前加计扣除政策的其他政策口径按照《财政部国家税务总局科技部关于完善研究开发费用税前加计扣除政\n",
+ "策的通知》(财税〔2015〕119号)规定执行。|科技型中小企业条件和管理办法由科技部、财政部和国家税\n",
+ "务总局另行发布。科技、财政和税务部门应建立信息共享机制,及时共享科技型中小企业的相关信息,加强\n",
+ "协调配合,保障优惠政策落实到位。|上一篇文章:关于2016年度企业研究开发费用税前加计扣除政策企业所\n",
+ "得税纳税申报问题的公告 下一篇文章:关于提高科技型中小企业研究开发费用税前加计扣除比例的通知\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "问题: 研发费用加计扣除比例 答案: 75% 正确答案: ['75%']\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "问题: 研发费用加计扣除比例 答案: 75% 正确答案: ['75%']\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n"
+ ],
+ "text/plain": []
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "{\n",
+ " 'exact#squad': 65.70218772053634,\n",
+ " 'f1#squad': 80.33295482054824,\n",
+ " 'total#squad': 1417,\n",
+ " 'HasAns_exact#squad': 65.70218772053634,\n",
+ " 'HasAns_f1#squad': 80.33295482054824,\n",
+ " 'HasAns_total#squad': 1417\n",
+ "}\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "\u001b[1m{\u001b[0m\n",
+ " \u001b[32m'exact#squad'\u001b[0m: \u001b[1;36m65.70218772053634\u001b[0m,\n",
+ " \u001b[32m'f1#squad'\u001b[0m: \u001b[1;36m80.33295482054824\u001b[0m,\n",
+ " \u001b[32m'total#squad'\u001b[0m: \u001b[1;36m1417\u001b[0m,\n",
+ " \u001b[32m'HasAns_exact#squad'\u001b[0m: \u001b[1;36m65.70218772053634\u001b[0m,\n",
+ " \u001b[32m'HasAns_f1#squad'\u001b[0m: \u001b[1;36m80.33295482054824\u001b[0m,\n",
+ " \u001b[32m'HasAns_total#squad'\u001b[0m: \u001b[1;36m1417\u001b[0m\n",
+ "\u001b[1m}\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from fastNLP import Evaluator\n",
+ "evaluator = Evaluator(\n",
+ " model=model,\n",
+ " dataloaders=val_dataloader,\n",
+ " device=1,\n",
+ " metrics={\n",
+ " \"squad\": SquadEvaluateMetric(\n",
+ " val_dataloader.dataset.data,\n",
+ " val_dataloader.dataset.new_data,\n",
+ " testing=True,\n",
+ " ),\n",
+ " },\n",
+ ")\n",
+ "result = evaluator.run()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3.7.13 ('fnlp-paddle')",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.13"
+ },
+ "orig_nbformat": 4,
+ "vscode": {
+ "interpreter": {
+ "hash": "31f2d9d3efc23c441973d7c4273acfea8b132b6a578f002629b6b44b8f65e720"
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}